Site Reliability Engineer at Container Solutions B.V.

4 years ago DevOps & System Administration Senior Full-Time UTC+1 ±0

We are looking for Site Reliability Engineers based in Europe to join our new Customer Reliability Engineering (CRE) team.

What does it take?

Please make sure you meet these requirements before applying, as we will be checking for all of them in our hiring process.

Must-have requirements

A strong engineering or operations background, and the commitment to develop continuously in both disciplines
At least 3 years of experience working in a related field
A strong understanding and knowledge of the following:
— Kubernetes API, core principles and components
— Linux networking and security related to containers
— Distributed systems and common distributed system failure modes
Proven production experience with at least one item from each of the following lines:
— Common CI/CD systems such as GitHub Actions, Jenkins, GitLab CI, etc.
— Major cloud service providers such as Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure
— Modern infrastructure automation or configuration management systems such as Terraform, Puppet, Ansible, SaltStack, Chef, Pulumi, etc.
— Container platforms such as Kubernetes, Mesos, Nomad, or equivalent
— Programming languages such as Go, Python, Rust, C, or equivalent
Experience working with distributed architectures, e.g. microservices or service-oriented architectures
Experience operating and maintaining production systems on Linux in a public cloud
Ability to work effectively in a globally distributed team
An urge to collaborate and communicate asynchronously
An analytical mind; debugging and problem-solving skills are paramount
Attention to detail and excellent communication skills, both written and verbal
Ability to work on your own as well as part of a team
Comfortable with participating in an on-call schedule

Nice-to-have requirements

Bachelor's degree in computer science, engineering, math, or a relevant field
Experience being part of an on-call schedule
Experience working 100% remotely
Experience implementing monitoring solutions
Operations experience with a production user-facing application
Experience developing a Kubernetes controller, operator, or other platform component
A background in writing reliable software and/or automation tooling

About the role

As part of the Customer Reliability Engineering (CRE) team, you will be responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of our customers’ applications and infrastructure. We are building a remote-first team across multiple time zones, with the goal of eventually enabling a follow-the-sun work schedule.

Core Responsibilities

Be part of a fully remote team across multiple continents and time zones
Regularly engage with customers to consult and share information
Develop simple, sustainable, and repeatable solutions and processes
Participate in your team’s effort to continuously improve our customers’ production environments
Own your team's tech and tools stack and contribute to any relevant open-source projects
Design, analyse, and troubleshoot large-scale distributed systems
Participate in your team’s on-call rotation
Create and refine documentation and processes
Automate almost all the things
Provide ideas for future road-map items, based on customer, operational, and/or organisational needs
Learn and share by being part of the Cloud Native community through open-source tooling and processes, writing blog posts, and giving meetup or conference talks

If you are selected for this role and come to work for us on an employee basis, you can count on the following:

A competitive compensation package
Possibility of employee ownership and equity-based compensation
25 days of paid leave annually
Company-wide mental health days off
‘No Meeting’ Wednesdays
Access to leadership development programmes, coaching and mentoring
Access to our in-house psychologists

The selection process

Stage 1:

CV sift based on our core requirements for this role (+ optional call with the recruiter)
Technical interview: A 60-minute interview assessing your understanding of the commonly used technologies in this role.

Stage 2:

Personality Profile Assessment: You will complete an online personality assessment and undertake an interview based on your profile via Google Hangouts.

Stage 3:

Final Behavioural and Situational panel interview with two members of our engineering team.

———

About the team
We’re a new team with big ambitions. To deliver on our lofty goals, we have the following values to keep us on course:

Do the right thing; blameless culture, be fair and do the right thing, and always be the change you seek
Assume good intentions; always have good intentions, show integrity, and maintain levelheadedness
Tireless generosity; always be generous to your peers, and collaborate whenever possible
Communicate endlessly; communicate as much as possible, be as straightforward as possible, avoid jargon and weasel words whenever possible, and always be information seeking
Learn from reflection; work together to continually improve what we do, take time to reflect and learn from our actions and outcomes, and seek to understand others’ points of view

As a distributed, remote-first team, it’s important that we develop habits that get us interacting with each other, encourage us to collaborate, and generally bring us closer together. To achieve this, we hold regular events such as:

Weekly Virtual-Coffee (social event, non-work discussions)
Monthly Coding Katas (group coding exercises for learning)
Monthly Team drinks (all drinks are welcome: teas, juices, alcoholic, caffeinated, non-alcoholic)
Show & Tell sessions (ad-hoc technical or hobby related informal presentations)
Monthly Games Night (social event, e.g. Tetr.io, A Fake Artist Goes to New York, Among Us, etc.)

We’re always looking for more ways to facilitate this, so if you join the team and have ideas, you can help us expand this list.

🌍 World Wide kubernetes linux infrastructure-as-code cloud
