The Lead Site Reliability Engineer - Director is responsible for supporting the applications and ensuring their stability, leading the team in change planning, deployment, and reviews, and performing proactive maintenance. The role requires 10+ years of work experience with a Bachelor's Degree or 8+ years of experience with an Advanced Degree. Key qualifications include expertise in delivering services through APIs, experience supporting Java applications, and excellent analytical and problem-solving skills.
Requirements
- 10 or more years of work experience with a Bachelor’s Degree or at least 8 years of work experience with an Advanced Degree
- Minimum 5 years of work experience in Production/Application Support
- Minimum 3 years working experience in a leadership role
- Expert-level experience in delivering services through APIs
- Experience supporting Java applications that run in Apache Tomcat, JWS, or similar containers
- In-depth experience working with analytics tools like Splunk and Grafana
- Fluency with container technologies such as Docker and Kubernetes
- Solid working experience in an application support function in a 24x7 environment, driving high availability in active/active environments
- Excellent analytical and problem-solving skills with a strong automation mindset
- Good ITIL knowledge of Change Requests, Incidents, and Problem Management, and experience with ticketing systems such as ServiceNow, Remedy, or an equivalent
- Experience planning, deploying, and reviewing changes for critical applications
- Experience with Continuous Integration/Continuous Deployment (CI/CD) systems such as Jenkins, Ansible, or equivalent
- Experience with log analysis tools such as Splunk, Grafana, or equivalent
- Experience troubleshooting and resolving incidents and conducting Root Cause Analysis
- Experience supporting distributed systems or microservices