🚀 Site Reliability Engineer
MPower Plus
- 📍 Location: Work from home
- 📅 Posted: Oct 20, 2025
About our client:
A leading global information technology, consulting, and business process services company. It operates in over 60 countries, delivering solutions across cloud, digital, engineering, and cybersecurity. It partners with clients to drive innovation, efficiency, and business transformation.
What the role offers: a fully remote position from anywhere in Germany; a mid-to-senior-level role with a global team; a competitive compensation package and benefits; continuous learning opportunities; and high-impact work supporting security for critical infrastructure and services.
About the Role
Job requirement no.: SRE
Location
Germany (fully remote within Germany; the candidate must be based in Germany)
Position Type
Full-time role
Responsibilities
Site Reliability Engineer (24x7 Operational Support) (f/m/d)
We are seeking a Site Reliability Engineer (SRE) with a strong background in observability, secure logging, and automation. The ideal candidate will have hands-on experience with Elasticsearch and/or Prometheus platforms. The role covers core platform operations: incident management, scheduled maintenance, and engineering tasks that improve system stability. The SRE is also expected to follow standard operating procedures (SOPs) and to help improve them by providing constructive feedback.
Key responsibilities:
- Platform Engineering & DevOps: Manage Kubernetes and container orchestration, including Helm chart configurations and CI/CD pipelines (Jenkins, ArgoCD). Develop automation scripts (Python, Bash, Go) and deploy Infrastructure-as-Code (IaC) solutions.
- Observability, Monitoring & Visualisation: Maintain Prometheus solutions (scrape configurations, alert rules, PromQL queries) and administer Thanos and Grafana.
- Elastic Stack Operations & Log Management: Configure and optimise Elasticsearch clusters, Logstash pipelines, and Kibana dashboards for secure, scalable log processing.
- Incident Response, Troubleshooting & Collaboration: Participate in 24x7 on-call rotations for rapid incident response; troubleshoot platform, data, and performance issues; and engage in Major Incident Management (MIM).
- Secure Operations & Compliance: Ensure system operations meet security and data protection requirements, maintain secure documentation, and manage access control policies.
Kindly share your CV with