Site Reliability Engineer
- Job Ref: 6493
- Location: Dublin, Ireland
- Type: Contract
The SRE is expected to:
- Increase application reliability at scale
- Overcome Dev and Operations silos and conflicts
- Improve project, end user, and business outcomes
We would expect you to:
- Understand what it takes to deliver and support high-quality solutions on the AWS and/or Azure platforms that align with the needs of our clients in an Agile and DevOps environment.
- Be passionate about Cloud technology and Cloud adoption.
- Develop and build upon client CI/CD platforms and domains.
- Be an unflappable problem solver, troubleshooting and improving existing DevOps processes as well as providing support and advice for development and platform teams.
- Be curious and want to develop your knowledge of current and future technologies and how they can be leveraged at Expleo client sites.
We are looking for Site Reliability Engineers with proven experience of cloud infrastructure, software engineering and SRE practices.
You will work within the client SRE and DevOps teams and proactively engage with other business areas, product teams and project teams on their use and exploitation of cloud resources to maximise the value generated.
On SRE practices, you should have evidence of working with error budget policies and practices; toil identification, mitigation and elimination; observability tools and techniques; blameless post-mortems; and incident management improvements.
This engagement will support both central IT teams and business teams in their design and delivery of technologies and solutions. The role will also support and guide other team members and work with Cloud Architects on function-wide and platform-wide features, technologies and configurations that ensure the platform is supportable, secure and consistent for use within client domains. Day to day, you will be required to onboard new initiatives and support the current cloud estate.
Our SRE teams are often an evolving part of the Site Reliability Engineering function within client domains. As a significant member of this function, you will find excellent opportunities to grow and broaden your skills and experience. We encourage all team members to share their passion for innovative problem solving, quality-first coding and DevOps/SRE practices to help evolve the team and function.
What you'll need:
It is essential that you have strong experience in and exposure to Enterprise API, Continuous Integration, Test Driven Development and Infrastructure as Code.
Unix OS experience is preferred, ideally with recent experience in containers and container orchestration (Docker, Kubernetes, etc.) and a scripting language (e.g. Python, Bash).
You will need experience of at least one major cloud provider (preferably Azure, GCP or AWS) and ideally Kubernetes experience. Strong experience of Site Reliability Engineering practices and principles is also required.
In addition to the above, you will also have:
- A proven track record of working in a complex and multi-faceted organisation using a wide variety of current and legacy IT technology components.
- Strong exposure to both waterfall-based and Agile/Scrum-based methodologies.
- Relevant and current software development and engineering experience covering all aspects, including security and privacy, error handling, monitoring and logging.
- Awareness of architecture practices.