The Lead Site Reliability Engineer will be hands-on and provide mentorship to other team members on core SRE principles and tools, while participating in end-to-end operational aspects of the Production environment. They will work closely with Architects, DevOps, Product, and development teams to ensure the best use of the software on the AWS platform.
Requirements
- Bachelor's or Master's degree in a Computer Science discipline.
- 5+ years of experience focused on Site Reliability Engineering or a related position on the AWS cloud platform.
- At least two AWS certifications are a must.
- Experience working with SQL, Windows Server, load balancers, and Linux.
- Deep experience with AWS, Docker, Kubernetes, CloudFormation, CloudWatch, CodeDeploy, DynamoDB, Lambda, SQS, Amazon FSx, Elasticsearch, and networking concepts is a must.
- Ability to program at a high level in at least one language, such as Java, C#, JavaScript, Python, or Ruby.
- Integration experience with PagerDuty, ServiceNow, Datadog, and CloudWatch.
- Good understanding of Site Reliability Engineering (SRE) philosophies, technologies, platforms, and tools, including SLO management, incident resolution, and automation.
- Ability to explain technical concepts in clear, non-technical language.
Benefits
- Hybrid Work Model
- Flexibility & Work-Life Balance
- Career Development and Growth
- Industry Competitive Benefits
- Culture
- Social Impact