Site Reliability Engineer at Assurit

1 year ago | Network & Security | Middle | Full-Time

Assurit is currently seeking an experienced Site Reliability Engineer (Infrastructure Team) to support one of our clients.


Minimum Qualifications:

  • Ability to code in Python
  • Linux Administration (system administration & network configuration)
  • Debugging & Troubleshooting production performance issues (application and infrastructure)
  • Knowledge of Message Queue tools (Kafka, RabbitMQ)
  • Kubernetes Administration
  • CI/CD Tooling
  • DevOps Automation

Preferred Qualifications:

  • Shell Scripting
  • Knowledge of Containers
  • Exposure to distributed systems (Consul, ZooKeeper, MongoDB)
  • Knowledge of SaltStack
  • Knowledge of monitoring tools (Grafana, Prometheus)


Description

Responsibilities:

  • Design, write and deliver software and automation to dramatically improve the availability, scalability, latency, and efficiency of infrastructure
  • Improve system design and architecture to ensure high stability and performance of services across multiple data centers worldwide
  • Manage operations of data services and real-time/batch data pipelines, including SLA management, system deployment, performance tuning, on-call, and troubleshooting
  • Perform lifecycle management of production systems including change management, service deployment, operations and emergency response
  • Provide strong support during major events to ensure the system can handle large volumes of Internet traffic
  • Manage infrastructure services, including but not limited to deployment, operation, and troubleshooting
  • Work with the team to establish service level objectives and monitor services to ensure the objectives are met
  • Continually improve cloud operations automation and tooling to monitor and maintain enterprise cloud-based infrastructure
  • Execute automation for known cloud-operations tasks, and create new automation for new situations or issues you encounter; automate everything
  • Facilitate blame-free root cause analysis meetings after production incidents so that the team can learn from mistakes and improve our systems and runbooks
  • Be vigilant about security and adhere to best practices to secure our cloud infrastructure and real-time platform

Assurit is an award-winning, certified small business headquartered in Fairfax, VA. We offer a highly competitive compensation and benefits package, including medical and dental coverage and paid time off.

Founded in 2013, Assurit has become a trusted provider of cybersecurity expertise to customers across federal, state, and local governments, as well as the commercial sector. We are an employee-centric organization that focuses on the growth and development of our greatest asset – our people. We believe that if our team is trained and educated, we will always be able to deliver on our promise of customer success. If you enjoy work environments focused on continuous learning and growth, Assurit will be a great fit for you.

Whether you saw a specific job opening of ours or are simply interested in learning more about building your career at Assurit, feel free to submit your resume. Based on your request, the appropriate individual within our organization will get back to you within 2 business days.

Assurit is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.
