Genentech

Senior Site Reliability Engineer (SRE)

Join Genentech as a Senior Site Reliability Engineer to enhance system reliability using ServiceNow, AWS, and Azure. Enjoy competitive salary and growth opportunities.

ServiceNow Modules:
DevOps
IT Service Management
Incident Management
Problem Management

Job description

Posted on: March 20, 2025

Roche is seeking a Senior Site Reliability Engineer (SRE) to join its global SRE team, responsible for designing and maintaining cutting-edge tools, scripts, and frameworks to automate repetitive tasks, streamline software deployment, and manage expansive systems with unparalleled efficiency. As a seasoned SRE, you will lead the charge in incident management and response, detect system anomalies, troubleshoot swiftly, and conduct thorough root cause analyses to prevent recurring issues. You will also champion continuous improvement by refining monitoring and alerting mechanisms, conducting insightful post-incident reviews, and embedding best practices in software lifecycle management.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent professional experience.
  • Approximately 5 years of experience in site reliability engineering, IT operations, DevOps, or related fields, or equivalent skills and experience.
  • Solid experience with AWS and/or Azure, including setting up, monitoring, and maintaining cloud resources, plus knowledge of Kubernetes (EKS, AKS, GKE, etc.).
  • Proficiency with monitoring and logging tools such as Datadog, Splunk On-Call, the ELK stack, Grafana, and Prometheus.
  • Hands-on experience with Jira and ServiceNow for tracking incidents, requests, and documentation.
  • Proficiency in Python or similar scripting languages for automation purposes.
  • Understanding of core SRE principles, along with in-depth knowledge of incident prioritization, escalation processes, and service level management (SLA/SLO/SLI).
  • Strong troubleshooting skills, especially in cloud and distributed system environments.
  • Excellent communication, teamwork, and documentation skills, with a proactive and self-motivated approach to improving system reliability and operational efficiencies.

Benefits

  • Competitive salary
  • Opportunities for professional growth and collaboration with industry leaders
  • Dynamic work environment with opportunities to make a direct impact on system resilience and reliability
  • Support for diversity and inclusion

Requirements Summary

5+ years of experience in SRE, IT operations, DevOps, or related fields, with proficiency in AWS and/or Azure and in monitoring tools.
