🚀 Interim Site Reliability Engineer (gn)

Michael Page

💰 Earn $100,000 – $125,000 / year

📍 Location: WorkFromHome
📅 Posted: Oct 17, 2025

Exciting Company
Exciting Opportunities

About Our Client

Start: ASAP

Hourly rate: max. €65

Project duration: 6 months

Workload: 5 days per week

Location: Remote / Düsseldorf (once per quarter)

Industry: Sales

Project language: English

Job Description

This role focuses on maintaining and evolving a cloud-native environment. Responsibilities include designing resilient infrastructure, implementing automated CI/CD pipelines, monitoring application performance, and collaborating with cross-functional teams to drive continuous improvement.

Ideal Candidate Background

Software Engineering Foundation: Ideally, the candidate started their career in software development and built a solid foundation in coding, system design, and software lifecycle management. This background provides a deep understanding of the development process and of why operational efficiency and system reliability matter.
Transition to Infrastructure and Operations: After gaining experience in software engineering, the candidate moved into infrastructure and operations, driven by an interest in scaling, automating, and improving the reliability of cloud-native applications and systems.

Technical Skills And Experience

Cloud-Native Applications: Proficient in deploying, managing, and scaling applications in a cloud-native environment. This includes using containerization technologies like Docker and orchestrators such as Kubernetes to manage containerized applications across various environments.
Kubernetes Experience: Extensive experience with Kubernetes, including setting up clusters, deploying applications, managing stateful and stateless workloads, implementing autoscaling, and ensuring high availability. Familiarity with Kubernetes ecosystem tools (e.g., Helm, Kustomize) and practices is essential.
Hyperscaler Expertise: Strong experience with at least one major cloud services provider, preferably AWS; experience with Azure or Google Cloud Platform is also welcome. This includes managing cloud resources, implementing security best practices, and leveraging cloud-native services for operational efficiency.
Infrastructure as Code (IaC): Skilled in using IaC tools such as Terraform, Ansible, Chef, or Puppet to automate the provisioning and management of infrastructure, ensuring consistency and compliance.
Continuous Integration/Continuous Deployment (CI/CD): Experienced in setting up and managing CI/CD pipelines using tools like Jenkins, GitLab CI, or GitHub Actions to automate testing and deployment processes.
Monitoring and Logging: Proficient in implementing monitoring and logging solutions (e.g., Prometheus, Grafana, ELK stack) to ensure proactive issue identification and resolution.
Programming Languages/Frameworks: Familiarity with at least one of the following: Node.js, Golang, or Java Spring Boot, for effective automation, tooling, and incident response.

Operational Skills

On-Call Duties: Willingness to participate in an on-call rotation covering 18/7 and, in rare cases, 24/7, with an understanding of how critical this is to maintaining system reliability and performance.
Incident Management: Capable of quickly diagnosing and resolving issues, minimizing downtime, and learning from incidents to prevent future occurrences.
Cost Optimization: Ability to monitor, analyze, and optimize cloud resources for cost efficiency without compromising performance or security.

Soft Skills

English Communication Skills: Excellent verbal and written communication skills, with the ability to collaborate effectively with team members, stakeholders, and clients.
Teamwork and Collaboration: Ability to work well within a team, share knowledge, and contribute to a positive working environment.
Continuous Improvement: A strong desire for continuous learning and improvement, staying up-to-date with the latest technologies and best practices.
Problem-Solving: Strong analytical and problem-solving skills, with a proactive approach to identifying and addressing challenges.
High Adaptability: Exceptional adaptability is required for collaborating with multiple teams, quickly learning new technologies, and adjusting to changing project demands.

What's on Offer

Competitive hourly rate, remote working flexibility, and opportunities to work on high-impact cloud-native projects with a collaborative team.

Contact

Kai Gehrmann

Quote job ref: JN-

