Genentech

Senior Site Reliability Engineer (SRE)

Join Genentech as a Senior Site Reliability Engineer to enhance system reliability using ServiceNow, AWS, and Azure. Enjoy a competitive salary and growth opportunities.

Direct Hire

Senior

ServiceNow Role Type:

ServiceNow Modules:

DevOps

IT Service Management

Incident Management

Problem Management

ServiceNow Certifications (nice to have):

Job description

Posted on:

March 20, 2025

Roche is seeking a Senior Site Reliability Engineer (SRE) to join its global SRE team, responsible for designing and maintaining cutting-edge tools, scripts, and frameworks to automate repetitive tasks, streamline software deployment, and manage expansive systems with unparalleled efficiency. As a seasoned SRE, you will lead the charge in incident management and response, detect system anomalies, troubleshoot swiftly, and conduct thorough root cause analyses to prevent recurring issues. You will also champion continuous improvement by refining monitoring and alerting mechanisms, conducting insightful post-incident reviews, and embedding best practices in software lifecycle management.

Requirements

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent professional experience.
Approximately 5 years of experience in site reliability engineering, IT operations, DevOps, or related fields, or equivalent skills and experience.
Solid experience with AWS and/or Azure, including setting up, monitoring, and maintaining cloud resources (knowledge of Kubernetes, EKS, AKS, GKE, etc.).
Proficiency with monitoring and logging tools such as Datadog, Splunk On-Call, the ELK stack, Grafana, and Prometheus.
Hands-on experience with Jira and ServiceNow for tracking incidents, requests, and documentation.
Proficiency in Python or similar scripting languages for automation purposes.
Understanding of core SRE principles, along with in-depth knowledge of incident prioritization, escalation processes, and service level management (SLA/SLO/SLI).
Strong troubleshooting skills, especially in cloud and distributed system environments.
Excellent communication, teamwork, and documentation skills, with a proactive and self-motivated approach to improving system reliability and operational efficiencies.

Benefits

Competitive salary
Opportunities for professional growth and collaboration with industry leaders
Dynamic work environment with opportunities to make a direct impact on system resilience and reliability
Support for diversity and inclusion

Requirements Summary

Approximately 5 years of experience in SRE, IT operations, DevOps, or related fields, with proficiency in AWS and/or Azure and in monitoring tools