Today’s organisations deal with a higher volume of change in a more complex technology environment, which increases the risk of outages and incidents. IT teams must therefore improve service reliability and system resilience. With automation and observability becoming key to faster, more efficient deployments, the SRE profile has become one of the fastest-growing job roles.
With Site Reliability Engineering (SRE) Foundation, you will learn about:
- SRE Principles and Practices
- Service Level Objectives and Error Budgets
- Reducing Toil
- Monitoring and Service Level Indicators
- SRE Tools and Automation
- Anti-Fragility and Learning from Failure
- Organisational Impact of SRE
- SRE, Other Frameworks, The Future
Benefits for Organisations
- Enhanced stability and reliability of services
- Better understanding of how production services work
- A better balance between technical investment in reliability and customer experience
- Greater appreciation within development teams of the operational impact of their services
- Improvements in staff morale and retention
Benefits for Individuals
- A better-balanced workload, with ring-fenced time for improvement work
- Less stressful on-call experiences and a reduction in overall call-out volumes
- A broader skill set that makes use of the latest automation
- Improvement in workplace culture
- Opportunities to “shift left” and help development teams deliver more reliable services