Jr. Site Reliability Engineer at ProShop ERP

2 years ago Network & Security Junior Full-Time

As a Jr. SRE, you will begin your day by checking vital services and responding to any client outages. You’ll be collaborating on our JIRA tickets from the previous week’s planning session.

Requirements

- U.S.-based.
- Must have 1-2 years of experience working in IT Operations.
- Must be willing to undergo a background check.
- Experience with AWS (EC2, EBS, Lambda, S3).
- Experience working in an Agile SRE/Backend team.
- Infrastructure scripting experience (e.g. Bash or PowerShell).
- A cloud security mindset.
- Exceptional verbal and written communication skills.
- Willingness to continuously learn.
- A “document everything” mindset.

Desirable Requirements

- Cloud certification (or a strong desire for certification).
- Experience in Golang or another strongly typed language.
- Experience managing, scaling, and monitoring a fleet of machines.
- Experience with metrics/alerting tools (e.g. CloudWatch, Prometheus, Honeycomb).
- Background in infrastructure-as-code tools like Terraform.
- Experience programmatically building machine images (e.g. using Packer).
- Experience implementing “AWS Well-Architected” (or the Google/Azure equivalents).

Salary Range: Compensation based on role, skills, and experience.

Description

The ProShop Mission Statement: We deliver powerful manufacturing software by deeply understanding our client's challenges in order to meaningfully improve their businesses, and in turn, their communities.

ProShop is a game-changing ERP/MES/QMS software product, which we call a Digital Manufacturing Ecosystem (DME), built specifically for companies in the metalworking and aerospace industries, such as machine shops, fab shops, and other similar businesses. We build this software by combining and translating our own diverse experience with deep industry-specific knowledge in order to optimize manufacturing processes for our clients. We are looking for excellent problem solvers and communicators who love a challenge to join our rapidly growing company and further the mission outlined above.

We believe that a great environment is a crucial part of doing great work. With flexible hours and remote work, we want team members to find the balance that’s right for them so they can produce their best work and feel engaged with their roles, colleagues, and our overall goals. We aren’t just about making money; we’re also about deeply engaging with our clients to help them take their business to the next level. We have a philosophy that we’re in this together, and we look for folks who are smart, kind, and motivated by a genuine desire to help each other and our clients. Check out our website: www.proshoperp.com

Location: This is a remote position, but the candidate must be based in the United States and have U.S. citizenship.

Job Description: Jr. Cloud Site Reliability Engineer

Are you someone who has a deep intellectual curiosity, is excellent at problem-solving, and is kind? If so, keep reading…

As ProShop ERP is expanding, we are looking for a Jr. Cloud Site Reliability Engineer (SRE) to join the IT-Ops Team to deliver insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.

As a Jr. SRE, you will begin your day by checking vital services and responding to any client outages. You’ll be collaborating on our JIRA tickets from the previous week’s planning session. This work will likely revolve around improving our cloud infrastructure and building out tooling and automation.

You will work closely with the engineering team, enabling them to get the best out of our cloud infrastructure. You’ll also occasionally work on ad-hoc or emergency tasks and everything that goes along with them, such as post-mortems.

The key points of a DevOps philosophy are:

- Reduce organizational silos.
- Accept failure as normal.
- Implement gradual change.
- Leverage tooling and automation.
- Measure everything.

Responsibilities

Reduce our organizational silos
- Building tooling to allow shared ownership of production infrastructure
- Building shared metrics and alerts for performance, errors, and usage
- Building operational transparency within the company

Accept failure as normal
- Running post-mortems and following up on the resulting actions
- Helping to define budgets/SLOs and building out the tooling to measure these
- Assisting with our on-call efforts

Implement gradual change across our fleet
- Gradually rolling out changes, patches, and updates across the fleet
- Canary testing new changes such as OS updates

Leverage tooling and automation
- Building out fleet management tooling such as our internal sea-power tool
- Helping us automate this year's job away :-)

Measure everything
- Ensuring we are capturing all relevant data points for measuring toil and reliability
- Ensuring signals aren’t lost in the noise

The perks!

- Extended health benefits.
- Retirement benefits.
- Education Fund.
- Flexible Work Hours (occasional work outside of business hours is required).
- Compensation and benefits commensurate with experience and skills.
- Dynamic, supportive, and high-achieving team.

We are an equal opportunity employer and love diversity at our company! We do not discriminate on the basis of race, gender, religion, color, national origin, sexual orientation, age, marital status, or disability status.

