Here are the Important Differences Between SLI, SLO, and SLA

When embarking on your SRE journey, it can seem daunting to decipher all the acronyms. What are SLOs versus SLAs? What’s the difference between SLIs and SLOs? In this blog post, we’ll cover what SLI, SLO, and SLA mean and how they contribute to your reliability goals.

What’s the difference between SLI, SLO, and SLA?

Below are the definitions for each of these terms, as well as a brief description. Definitions are according to the Google SRE Handbook.

  1. SLIs that directly influence the health and the availability or the latency and performance of certain services.

How these terms help with reliability: an example case study

Imagine an organization is looking to increase reliability. The company has recently begun investigating expensive SLA breaches and wants to know why its reliability is suffering. This organization breaches its SLA for availability almost every month. As it onboards more customers with SLAs, these expenses will grow if it doesn’t meet its performance guarantees.

Identifying SLIs that matter to the user

The team knows it needs to examine availability and set SLOs for it, so it begins by looking at the user journey. The QA team has already documented some user journeys, so the team refers to those and augments the documentation with journeys of its own.

Establishing corresponding SLOs

After the team determines its SLIs, it’s time to set up the SLOs. The team is looking at availability of the site (a common complaint), as well as the latency issue on the expense page. While the team plans to add more SLOs later, these two will serve as the guinea pigs.
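
To make the two pilot SLOs concrete, here is a minimal sketch of how an SLI can be computed as a ratio of good events to total events and then compared against an SLO target. The targets (99.9% availability, 95% of expense page loads under a latency threshold) and the event counts are hypothetical values chosen for illustration; the post does not state the team's actual numbers.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    name: str
    target: float  # required fraction of good events, e.g. 0.999


def sli(good_events: int, total_events: int) -> float:
    """Return the fraction of events that counted as good."""
    if total_events == 0:
        return 1.0  # assumption: a window with no traffic meets the objective
    return good_events / total_events


def meets(slo: Slo, measured: float) -> bool:
    """True if the measured SLI is at or above the SLO target."""
    return measured >= slo.target


# Hypothetical targets and counts, for illustration only.
availability_slo = Slo("site availability", 0.999)
latency_slo = Slo("expense page loads under threshold", 0.95)

availability = sli(good_events=999_500, total_events=1_000_000)
latency = sli(good_events=9_300, total_events=10_000)

for slo, measured in ((availability_slo, availability), (latency_slo, latency)):
    status = "met" if meets(slo, measured) else "missed"
    print(f"{slo.name}: {measured:.4f} vs {slo.target} ({status})")
```

With these made-up counts, the availability objective is met and the latency objective is missed, which is the kind of signal the team would use to decide where to focus first.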

  • Alerting and on-call procedures for the service
  • Escalation policies in the event of error budget depletion
  • An agreement to halt feature development and focus on reliability once the error budget has been exceeded for a set period of time
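
The last two items depend on tracking the error budget, which is the share of failures the SLO tolerates. The sketch below shows one way to compute how much budget remains in a window and to decide when a "halt feature development" rule would apply. The function names, the 99.9% target, and the rule of "two consecutive exhausted windows" are assumptions made for this example, not details from the team's actual policy.

```python
from typing import Sequence


def error_budget_remaining(slo_target: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_bad = (1.0 - slo_target) * total
    if allowed_bad == 0:
        return 0.0  # no budget to spend (100% target or no traffic)
    actual_bad = total - good
    return 1.0 - actual_bad / allowed_bad


def should_halt_features(history: Sequence[float], limit: int) -> bool:
    """True if the most recent `limit` windows all exhausted their budget."""
    if limit <= 0 or len(history) < limit:
        return False
    return all(remaining <= 0.0 for remaining in history[-limit:])


# Hypothetical monthly windows against a 99.9% SLO with 1,000,000 requests each.
history = [
    error_budget_remaining(0.999, good=999_400, total=1_000_000),  # 600 failures
    error_budget_remaining(0.999, good=998_500, total=1_000_000),  # 1,500 failures
    error_budget_remaining(0.999, good=998_900, total=1_000_000),  # 1,100 failures
]

print([round(r, 2) for r in history])
print("halt feature work:", should_halt_features(history, limit=2))
```

In this example the budget allows about 1,000 failed requests per window, so the second and third windows overspend it and the two-window rule triggers.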

Agreeing on SLAs

SLAs are external commitments, so they are not set the same way as SLOs. An SLA is a business agreement with users that guarantees a certain level of service. The engineering team is aware of the SLAs but doesn’t set them. Instead, the team sets its SLOs more stringently than the SLAs, giving itself a buffer.
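
The size of that buffer is easy to work out as allowed downtime. The sketch below assumes a hypothetical SLA of 99.9% availability and an internal SLO of 99.95%, both measured over a 30-day window; these figures are illustrative and do not come from the organization in this example.

```python
from datetime import timedelta


def allowed_downtime(target: float, window: timedelta) -> timedelta:
    """Downtime a given availability target tolerates over the window."""
    return window * (1.0 - target)


WINDOW = timedelta(days=30)  # assumed measurement window
SLA_TARGET = 0.999           # hypothetical external commitment
SLO_TARGET = 0.9995          # hypothetical stricter internal objective

sla_downtime = allowed_downtime(SLA_TARGET, WINDOW)
slo_downtime = allowed_downtime(SLO_TARGET, WINDOW)
buffer = sla_downtime - slo_downtime

print(f"SLA allows {sla_downtime.total_seconds() / 60:.1f} minutes of downtime")
print(f"SLO allows {slo_downtime.total_seconds() / 60:.1f} minutes of downtime")
print(f"Buffer before an SLA breach: {buffer.total_seconds() / 60:.1f} minutes")
```

Under these assumptions the SLA tolerates roughly 43 minutes of downtime per window and the SLO roughly 22, so the team learns it has missed its internal objective while about 22 minutes remain before the customer-facing agreement is breached.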

Giving you all you need to know about Site Reliability Engineering. https://www.blameless.com/blog/
