We are seeking a highly skilled Technical Site Reliability Engineering (SRE) Manager to lead our SRE team. This role is pivotal in ensuring the scalability, availability, and reliability of our critical systems while driving automation, observability, and operational excellence.
Requirements
- 10+ years in SRE, DevOps, or infrastructure engineering, including 3+ years in a leadership role.
- Proven experience integrating AI/ML for observability, automation, and incident response.
- In-depth understanding of monitoring tools (LogicMonitor, Catchpoint, Redgate, ScienceLogic).
- Demonstrated expertise in implementing and optimizing OpenTelemetry (OTel) for comprehensive observability across endpoints, cloud environments, infrastructure, and SaaS applications, enabling proactive monitoring, tracing, and performance insights.
- Proficiency in scripting languages (Python, Go, Bash) and infrastructure tools (Terraform, Ansible), including their integration with AI/ML.
- In-depth knowledge of observability and data pipeline tools (Datadog, Prometheus, Splunk, and AI-driven platforms such as Cisco FSO).
- Extensive experience in incident management and on-call rotations, including AI-enhanced predictive approaches.
- Experience with CI/CD pipelines, GitOps, and infrastructure-as-code (IaC).
- Experience with data platforms or enterprise automation tools (e.g., ServiceNow, Salesforce, SAP).
- Knowledge of AI/ML-based data automation technologies.
- Familiarity with regulatory requirements for data privacy, such as GDPR and CCPA.
- A passion for leveraging emerging technologies to drive business transformation.
- A customer-first mentality and the ability to translate user feedback into actionable product features.
- Experience leading cross-functional teams in a matrixed organization.
- Strong communication and leadership skills, with the ability to engage and influence stakeholders across technical and non-technical teams.
- Ability to thrive in a rapidly evolving industry and adapt to new challenges and opportunities.
Benefits
- Work Personas: flexible, remote, or required in office
- Equal Opportunity Employer
- Accommodations: accessible and inclusive experience for all candidates
- Export Control Regulations: export control approval from government authorities may need to be obtained