CVS Health

Staff Observability Operations Engineer

Join CVS Health as a Staff Observability Operations Engineer (Remote, CT). Leverage ServiceNow ITOM for observability solutions, incident management, and performance optimization. 7+ years of IT operations experience required. Competitive benefits included.

ServiceNow Role Type:
Implementer
ServiceNow Modules:
Event Management
IT Operations Management
IT Service Management
Integration Hub
ServiceNow Certifications (nice to have):
Certified Implementation Specialist - Event Management

Job description

Posted on: April 17, 2025

CVS Health is seeking a Staff Observability Operations Engineer to oversee and optimize its observability platform. The role involves deploying observability solutions, managing event management platforms, handling release management, and troubleshooting incidents.

Requirements

  • 7+ years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications.
  • 5+ years in a Site Reliability Engineering (SRE) role deploying and managing modern observability solutions.
  • 5+ years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana).
  • Experience developing and administering ServiceNow ITOM event management solutions, ensuring seamless integration with observability tools.
  • Experience deploying and managing service reliability platforms (e.g., xMatters, OpsGenie, PagerDuty), configuring incident notifications, incident command workflows, and automating incident remediation workflows.
  • Deep knowledge of, and hands-on experience with, cloud environments, cloud monitoring platforms, and container orchestration tools (e.g., AWS/CloudTrail, Azure/Monitor, GCP/GCM, Kubernetes, OpenShift).
  • Proficiency in Python and in other scripting and automation tools such as Ansible, PowerShell, and Bash for automation and configuration.
  • Solution Implementation and Platform Management: Hands-on experience deploying, managing, and administering observability platforms.
  • Incident and Problem Resolution: Excellent problem-solving skills, with the ability to handle multiple tasks, prioritize effectively, and work under pressure.
  • Performance Monitoring and Optimization: Experience monitoring platform performance and implementing enhancements to support scalability and complexity.
  • Release and Configuration Management: Experience coordinating and managing release cycles for observability platforms.
  • Collaboration and Communication: Excellent communication skills, both verbal and written.
  • Continuous Improvement: Commitment to continuous improvement and staying current with industry trends and best practices.
  • Customer Focus: Strong customer service orientation with the ability to manage customer relationships effectively.
  • Compliance and Security: Knowledge of compliance and security standards related to observability platforms.

Benefits

  • Affordable medical plan options
  • 401(k) plan with matching company contributions
  • Employee stock purchase plan
  • No-cost programs for wellness screenings, tobacco cessation, and weight management
  • Confidential counseling and financial coaching
  • Benefit solutions for paid time off, flexible work schedules, family leave, and dependent care resources

Requirements Summary

7+ years of IT operations experience, 5+ years in an SRE role, and 5+ years of experience with observability and event management platforms.