What's the difference between an SLO and an SLA?

2017/1/26 posted in Cloud Computing and Big Data

The terms SLO and SLA are often confused, but as you’ll learn in this expert tip, there are key differences.

A service-level agreement (SLA) is a contract between an external service provider and its customers or between an IT department and the internal business units it serves. The agreement documents what services the provider or IT department will produce and what performance standards the provider or IT department is expected to meet. The performance metrics, or service levels, associated with an SLA are sometimes referred to as service-level objectives (SLOs). SLOs describe, usually in measurable terms, benchmarks or goals set by the parties involved around the services a provider furnishes a customer within a given time period. For example, when used as a call center metric, an SLO might be for the service provider's agents to answer 80% of incoming calls within one minute.
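The call center SLO above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a prescribed method: the function names and the sample wait times are hypothetical, and only the 80% target and the one-minute threshold come from the example in the text.

```python
def answered_within(wait_seconds, threshold_seconds=60.0):
    """Return the fraction of calls answered within the threshold."""
    if not wait_seconds:
        raise ValueError("no calls in the measurement period")
    on_time = sum(1 for wait in wait_seconds if wait <= threshold_seconds)
    return on_time / len(wait_seconds)


def meets_slo(wait_seconds, target=0.80, threshold_seconds=60.0):
    """True if the share of on-time answers reaches the SLO target."""
    return answered_within(wait_seconds, threshold_seconds) >= target


# Hypothetical wait times (seconds) for ten calls in one measurement period.
waits = [12, 45, 61, 30, 58, 120, 20, 59, 15, 40]

print(answered_within(waits))  # 0.8 (8 of 10 calls answered within 60 s)
print(meets_slo(waits))        # True
```

Eight of the ten sample calls were answered within one minute, so this period just meets the 80% objective.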

SLAs enable customers and internal business units (collectively referred to here as "customers") to measure service provider or IT department (collectively referred to here as "service provider") performance and confirm that it is delivering services per the contract. SLAs are typically established for each IT service area, such as call center, data center or application maintenance. Customers create an SLA for each service they buy from a service provider, and each SLA includes a subset of performance metrics.

Although there is no hard-and-fast rule governing how many SLOs an SLA may include, it makes sense to measure only what matters.

Each SLO corresponds to a single performance characteristic relevant to the delivery of an overall service. Examples of SLOs include system availability, help desk incident resolution time and application response time.

Eight components of an SLA

SLAs typically include the following components:

  • Service-level title
  • Service-level metric definition
  • Measurement calculation (the mathematical formula used to calculate the performance)
  • Measurement type (unit-based or event-based)
  • Measurement source (the tools used to monitor or measure)
  • Measurement period (the period of time over which performance is measured)
  • Default triggers (the measurable metric that must be met by the service provider to avoid service-level credit; e.g., resolution time for high-severity incidents is two hours)
  • Service-level credit (financial amount associated with a performance metric, which the customer is entitled to if the service level is not achieved)
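One way to keep the eight components together is a small record type. This is only an illustrative sketch: the class name, field names, and all example values except the two-hour trigger for high-severity incidents are assumptions, not part of any standard SLA template.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ServiceLevel:
    """One service level, with a field for each of the eight components."""
    title: str
    metric_definition: str
    measurement_calculation: str   # formula used to calculate performance
    measurement_type: str          # "unit-based" or "event-based"
    measurement_source: str        # tool used to monitor or measure
    measurement_period: str        # period over which performance is measured
    default_trigger: str           # level that must be met to avoid a credit
    service_level_credit: str      # amount owed if the level is missed

    def __post_init__(self):
        if self.measurement_type not in ("unit-based", "event-based"):
            raise ValueError("measurement_type must be unit-based or event-based")


# Hypothetical example; only the two-hour trigger comes from the list above.
incident_resolution = ServiceLevel(
    title="High-severity incident resolution",
    metric_definition="Time from ticket creation to confirmed resolution",
    measurement_calculation="resolved_at - created_at, per incident",
    measurement_type="event-based",
    measurement_source="Help desk ticketing system (assumed)",
    measurement_period="Calendar month (assumed)",
    default_trigger="Resolution time for high-severity incidents is two hours",
    service_level_credit="Percentage of monthly invoice (assumed)",
)

print(len(fields(ServiceLevel)))  # 8
```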

SLA methodology

The SLA methodology refers to how the SLAs function: if the service provider doesn't perform per the agreed-upon metrics, a credit results. The eighth component listed above, the service-level credit, also defines the maximum credit; for example, the contract methodology might state that no more than 15% of the monthly invoice will be credited due to performance failure. Another piece of the methodology describes the pool of points that the customer can allocate across the various service performance metrics to designate priorities among them. This allows customers to increase the weight or importance associated with a performance failure of a particular service level. Customers should be able to change the point allocation over time, based on shifting priorities.
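The interaction between the points pool and the maximum credit can be sketched as follows. The article does not give a formula, so the model here is an assumption: each allocated point is worth a fixed fraction of the monthly invoice (0.2% in this hypothetical), credits for failed metrics are summed, and the total is capped at 15% of the invoice. The function name, pool size, and sample allocation are likewise illustrative.

```python
from decimal import Decimal


def monthly_credit(invoice, points, failed,
                   point_value=Decimal("0.002"),
                   max_credit_fraction=Decimal("0.15"),
                   pool=100):
    """Credit owed for the failed metrics, capped at the contract maximum.

    points: metric name -> points the customer allocated to it
    failed: names of metrics whose default trigger was missed this month
    """
    if sum(points.values()) > pool:
        raise ValueError("allocated points exceed the pool")
    unknown = set(failed) - set(points)
    if unknown:
        raise KeyError(f"no points allocated for: {sorted(unknown)}")
    earned = invoice * point_value * sum(points[name] for name in failed)
    cap = invoice * max_credit_fraction
    return min(earned, cap).quantize(Decimal("0.01"))


# Hypothetical allocation: availability matters most to this customer.
points = {"availability": 50, "first_call_resolution": 30, "response_time": 20}
invoice = Decimal("100000")

print(monthly_credit(invoice, points, ["availability"]))  # 10000.00
print(monthly_credit(invoice, points, list(points)))      # 15000.00 (capped)
```

Moving points from one metric to another changes the credit a given failure produces, which is how a customer would express shifting priorities under this model.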

Examples of SLOs

Below are three IT functions and examples of SLOs associated with each. The eight components of an SLA listed above can be applied to each example:

  • Data center: Application availability; virtual instance provisioning; disaster recovery time (e.g., performance must be restored within two hours).
  • Help desk: First-call resolution (e.g., 60% of all problems should be resolved during the first call); live communication response time (e.g., 80% of calls should be answered within one minute); abandonment rate (e.g., no more than 3% of calls received should be abandoned by the caller).
  • Application maintenance: Regression error rates; patch implementations; application availability (e.g., 99.9% uptime).
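As an illustration of the help desk examples, the sketch below computes first-call resolution and abandonment rate from a list of call records and compares them with the 60% and 3% targets mentioned above. The record layout and function name are hypothetical, and it assumes first-call resolution is measured over calls that were actually handled, which a real contract would need to spell out.

```python
from dataclasses import dataclass


@dataclass
class Call:
    abandoned: bool
    resolved_first_call: bool = False


def help_desk_report(calls, fcr_target=0.60, abandonment_limit=0.03):
    """Compute two help desk SLO metrics and whether each target was met."""
    received = len(calls)
    if received == 0:
        raise ValueError("no calls in the measurement period")
    abandoned = sum(1 for c in calls if c.abandoned)
    handled = received - abandoned
    resolved = sum(1 for c in calls if not c.abandoned and c.resolved_first_call)
    # Assumption: abandoned calls are excluded from the FCR denominator.
    fcr = resolved / handled if handled else 0.0
    abandonment = abandoned / received
    return {
        "first_call_resolution": fcr,
        "abandonment_rate": abandonment,
        "fcr_met": fcr >= fcr_target,
        "abandonment_met": abandonment <= abandonment_limit,
    }


# Hypothetical month: 100 calls, 2 abandoned, 63 resolved on the first call.
calls = ([Call(abandoned=True)] * 2
         + [Call(abandoned=False, resolved_first_call=True)] * 63
         + [Call(abandoned=False)] * 35)

report = help_desk_report(calls)
print(report["fcr_met"], report["abandonment_met"])  # True True
```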

Benchmarking SLO metrics for optimal results

Besides understanding how to establish and enforce SLAs, customers need to determine the appropriate SLA default triggers based on industry and customer-specific requirements. This may mean deciding between 99.9% and 99.999% application uptime, for example, or between 20% and 60% first-call resolution.
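To make the gap between those two uptime figures concrete, the sketch below converts an availability percentage into the downtime it permits. The 30-day measurement period and the function name are assumptions chosen for illustration; the contract's own measurement period would apply in practice.

```python
from decimal import Decimal


def allowed_downtime_minutes(availability_percent, period_days=30):
    """Minutes of downtime permitted per period at a given uptime target."""
    unavailable = Decimal(1) - Decimal(availability_percent) / Decimal(100)
    minutes_in_period = Decimal(period_days) * 24 * 60
    return unavailable * minutes_in_period


three_nines = allowed_downtime_minutes("99.9")
five_nines = allowed_downtime_minutes("99.999")

print(f"99.9%:   {three_nines:.1f} minutes per 30 days")
print(f"99.999%: {five_nines * 60:.1f} seconds per 30 days")
```

Under these assumptions, 99.9% uptime allows about 43 minutes of downtime per 30-day period, while 99.999% allows only about 26 seconds, a hundredfold difference that should be weighed against what the business actually requires.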

Establishing benchmarks is a key step in creating an SLA because without them, customers may be asking for too little or too much from their service providers. Customers often lack data on their current performance if they haven't outsourced that particular function before. Or they may be dissatisfied with their current outsourced results because their SLAs use subpar performance metrics. The customers most satisfied with the performance of their suppliers are those who understand what the market will bear regarding SLA performance.

About the author:

Steven Kirz is a managing director at Pace Harmon, a consultancy for Fortune 500 and select midmarket companies in support of their outsourcing, technology sourcing, and transformation programs.