We are seeking a Senior Incident Commander to join our Infrastructure and Platform Services team, servicing our Identity Security Cloud platform. The ideal candidate will be a self-starter who enjoys a fast-paced job, thrives on problem solving, and is committed to delivering seamless product availability to large enterprises around the world.
Requirements
- 5+ years of experience in 24x7 production operations, preferably supporting a highly available environment for a SaaS or cloud service provider
- 3 to 5 years of experience leading incident response efforts
- Experience with ticketing systems like Jira, Remedy, or ServiceNow
- Experience with cloud infrastructure environments, preferably AWS
- Experience with containerization technology, preferably Docker
- Experience leading RCA (Root Cause Analysis) / Post Mortems
- Experience with Java applications and the related J2EE technology stack
- Experience with release automation (Jenkins, etc.), system administration, system configuration, and system debugging
- Experience working with tools like Grafana, Splunk, Prometheus, and Confluence
- Experience using scripting languages (Ruby, Python, etc.), configuration management tools (Chef, Puppet, etc.), and command execution frameworks
- Strong understanding of system and networking concepts and troubleshooting techniques
- Strong interpersonal and teamwork skills, including the ability to set and enforce process and to influence engineers who are not direct reports
- Ability to operate in an agile, entrepreneurial start-up environment
- Strong communication skills, with English fluency at C1 level or better
- Bachelor's degree in Computer Science or another technical discipline, or equivalent experience, preferred but not required