The Site Reliability Engineering (SRE) team applies SRE practices to infrastructure operations, keeps services running smoothly, and continuously analyzes service data to identify improvement opportunities.
Requirements
- Bachelor's degree in computer science or related technical field, or equivalent practical experience.
- At least 5 years of demonstrated experience in similar roles/environments.
- Experience with at least one major cloud provider (AWS, Google Cloud, Azure, etc.), infrastructure architecture, and infrastructure as code (IaC).
- Proven experience with containerization and container orchestration tools.
- Proven experience working with various operating systems, including Unix, Linux, and Windows (on-premises and virtual).
- Proven experience with configuration management tools (Ansible, Puppet, Chef, etc.).
- Expertise in Git operations, branching strategies, versioning, and releasing.
- Practical experience and understanding of CI/CD pipelines.
- Strong scripting skills (Bash, PowerShell, Python, etc.) to automate various tasks.
- Exposure to monitoring tools (logs, metrics, traces) and alerting.
- Experience analyzing and troubleshooting on-premises and cloud systems.
- Previous experience with Agile delivery frameworks (e.g. Scrum, Kanban).
- Previous experience with hosting and network solutions.
- Previous programming experience across the software development lifecycle (SDLC) in at least two of the following: Python, Java, C#, JavaScript, TypeScript, Go, C, C++, etc.
- Experience with relational and non-relational databases.
- Hands-on experience with ServiceNow or a similar platform, including ITSM workflows and CMDB integrations.
- Proven ability to work effectively with cross-functional teams, including developers, QA, operations, and product management.
- Aptitude for mentoring less experienced team members and providing guidance on best practices, even without direct management responsibilities.
- Ability to quickly learn and adapt to new tools and technologies as needed.
- Expertise in designing and maintaining automated build, test, and deployment systems.
- Familiarity with modern architectural patterns, including microservices and serverless architectures.
- Systematic problem-solving approach, coupled with effective communication skills and a sense of drive.
- Previous work in highly regulated environments with security and compliance considerations.
- Experience working in globally distributed teams with cross-functional collaboration.
Benefits
- Flexible work arrangements
- Equal Employment Opportunity
- Disability Inclusion