An SLA (Service Level Agreement) is an agreement that defines the level of service a company provides to its customers. The term is most common in IT and telecommunications.
Unlike standard service contracts, a Service Level Agreement provides a very detailed description of service quality, operating modes, response times to incidents, and other parameters.
A Service Level Agreement usually has the following characteristics:
Maximum possible transparency of all processes and interactions between the service provider and the client. When drafting the contract, vague wording that could be interpreted ambiguously in one direction or another is avoided.
Clearly defined rights and obligations understood by all participants in the agreement. For example, a provider commits to ensuring 99.9% service availability and to paying compensation if a lower figure is recorded, while the client has the right to request that compensation.
Expectation management. For instance, a client might expect 24/7, ultra-fast support even for minor issues, while the provider cannot offer such a service. In this case, the client should either lower their expectations or sign a contract with another provider. A third option is also possible: the provider may raise the service level if it benefits their business processes.
The agreement specifies the timeframes for fixing issues and solving other problems. It also describes the compensation the client may receive if the company fails to meet the declared metrics.
An SLA does not always need to be a large document. The main thing is that it clearly describes the core parameters of the service in understandable terms. For example, the AWS S3 SLA is only one page long. It lists monthly uptime percentages and the amount of compensation the client receives if the service fails to meet those thresholds.
The example above from Amazon Web Services is not a standard; it is just one possible format tailored to a specific service.
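To illustrate how uptime thresholds can map to compensation, here is a minimal sketch. The tier boundaries and credit percentages are hypothetical and chosen only for illustration; they are not the terms of AWS or any other provider, and the function names are assumptions as well.

```python
# Hypothetical credit tiers: (minimum monthly uptime %, credit as % of the monthly bill).
# These numbers are illustrative assumptions, not any provider's actual terms.
CREDIT_TIERS = [
    (99.9, 0),    # threshold met: no credit
    (99.0, 10),   # below 99.9% but at least 99.0%
    (95.0, 25),   # below 99.0% but at least 95.0%
    (0.0, 100),   # below 95.0%
]


def service_credit_percent(monthly_uptime_percent: float) -> int:
    """Return the credit (as % of the monthly bill) for a measured uptime."""
    if not 0 <= monthly_uptime_percent <= 100:
        raise ValueError("uptime must be between 0 and 100")
    # Tiers are ordered from highest to lowest threshold, so the first match wins.
    for min_uptime, credit in CREDIT_TIERS:
        if monthly_uptime_percent >= min_uptime:
            return credit
    raise AssertionError("unreachable: the last tier starts at 0.0")


def service_credit_amount(monthly_uptime_percent: float, monthly_bill: float) -> float:
    """Return the credit in the same currency as the monthly bill."""
    return monthly_bill * service_credit_percent(monthly_uptime_percent) / 100
```

With these assumed tiers, a month measured at 99.5% uptime on a bill of 200 would yield a credit of 20.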
An IT SLA often includes the following sections:
The procedure for using the service.
Responsibilities of both parties, including tools for mutual monitoring of performance.
Specific steps for troubleshooting and restoring functionality.
The agreement may also specify its term. In some cases, the parties describe in detail the procedure for adding new requirements for functionality or service availability.
When describing service quality, its parameters are also disclosed. These typically include:
Service availability.
Response time to a problem.
Time to fix incidents.
The SLA may also specify a metric for operating hours.
When describing payment procedures, the SLA may indicate the billing model (e.g., pay-as-you-go or a fixed rate). If penalties are provided, the SLA specifies the situations in which the provider must pay them. If the client is entitled to compensation, the SLA also describes the relevant situations and the payment procedure.
SLA parameters are metrics that can be measured. The agreement should not contain vague phrases like “issues will be resolved quickly, before you even notice.” Such wording is unclear and prevents all participants from organizing proper workflows.
For example, the support schedule metric should clearly define when and for which groups of users technical support is available.
Suppose a company divides its clients into several groups:
Group 1: 24/7 phone and chat support.
Group 2: phone and chat support only on weekdays.
Group 3: chat-only support on weekdays.
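The three groups above can be written down as data and checked mechanically. This is a minimal sketch: the names `SUPPORT_PLANS` and `is_support_available` are hypothetical, and it assumes that "weekdays" means Monday to Friday at any hour, since the example does not give working hours.

```python
from datetime import datetime

# Support plans from the example: allowed channels and whether weekends are excluded.
SUPPORT_PLANS = {
    1: {"channels": {"phone", "chat"}, "weekdays_only": False},  # 24/7
    2: {"channels": {"phone", "chat"}, "weekdays_only": True},
    3: {"channels": {"chat"}, "weekdays_only": True},
}


def is_support_available(group: int, channel: str, when: datetime) -> bool:
    """Return True if the given client group can use the channel at that moment."""
    plan = SUPPORT_PLANS[group]
    if channel not in plan["channels"]:
        return False
    # Monday is 0, so 5 and 6 are Saturday and Sunday.
    is_weekend = when.weekday() >= 5
    # Assumption: weekday plans cover the whole day, because no hours are specified.
    return not (plan["weekdays_only"] and is_weekend)
```

For example, a Group 2 client calling on a Saturday would get `False`, while a Group 1 client would get `True`.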
Metrics are necessary so that all participants understand which services they receive, when, and in what scope.
From this, several key characteristics follow:
Metrics must always be publicly available.
Their descriptions must be unambiguous for all parties.
Clients must be notified in advance about metric changes.
When defining metrics, it’s important not to set overly strict requirements, as this significantly increases costs.
For instance, suppose a typical specialist can resolve a problem in 4 hours, while a higher-level expert can do it in 2 hours. Writing “2 hours” as the SLA metric is not ideal: every incident would then require the expert, which immediately makes the service more expensive. If you specify “1 hour,” costs rise further due to the increased risk of penalties for non-compliance.
Other important metrics can include response time to a client request. The values may differ depending on the client’s status and problem criticality.
For example, a company providing IT outsourcing services might have:
Premium clients: response within 15 minutes.
Basic clients: response within 24 hours.
All of this must be clearly reflected in the SLA.
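A sketch of how such response targets might be checked, using the two values from the example. The names `RESPONSE_TARGETS`, `response_deadline`, and `response_met` are hypothetical, and the sketch assumes the clock starts when the request is opened and runs around the clock.

```python
from datetime import datetime, timedelta

# Response targets from the example above.
RESPONSE_TARGETS = {
    "premium": timedelta(minutes=15),
    "basic": timedelta(hours=24),
}


def response_deadline(tier: str, opened_at: datetime) -> datetime:
    """Latest moment at which a first response still meets the target."""
    return opened_at + RESPONSE_TARGETS[tier]


def response_met(tier: str, opened_at: datetime, first_response_at: datetime) -> bool:
    """Return True if the first response arrived on or before the deadline."""
    return first_response_at <= response_deadline(tier, opened_at)
```

A real agreement might count only business hours, or vary the target by problem criticality as well as client status; both would need extra logic beyond this sketch.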
In addition to response time, there’s also incident resolution time. The logic for this parameter is similar: even if a client is important, requests are prioritized based on criticality.
For example:
If a client’s local office network stops working and all processes halt, that issue must be prioritized. The SLA may state that local network troubleshooting should take no more than 5 hours.
If the same client needs to add a few new devices to an already working network, the resolution time may be several hours or even days.
Together, response time and resolution time make up the downtime the client experiences.
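The relationship between the three values can be sketched as follows. The `Incident` class and the timestamps are hypothetical; the sketch assumes resolution time is counted from the first response, and it reuses the 5-hour limit from the local network example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    opened_at: datetime
    responded_at: datetime
    resolved_at: datetime

    @property
    def response_time(self) -> timedelta:
        return self.responded_at - self.opened_at

    @property
    def resolution_time(self) -> timedelta:
        # Assumption: resolution time is counted from the first response.
        return self.resolved_at - self.responded_at

    @property
    def downtime(self) -> timedelta:
        return self.response_time + self.resolution_time


RESOLUTION_LIMIT = timedelta(hours=5)  # limit from the local network example

incident = Incident(
    opened_at=datetime(2024, 1, 3, 9, 0),
    responded_at=datetime(2024, 1, 3, 9, 10),
    resolved_at=datetime(2024, 1, 3, 13, 40),
)
print(incident.downtime)                             # 4:40:00
print(incident.resolution_time <= RESOLUTION_LIMIT)  # True
```

Here a 10-minute response plus a 4.5-hour fix adds up to 4 hours 40 minutes of downtime, with the fix itself inside the 5-hour limit.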
These and other parameters must be described in the SLA and accepted by all parties before cooperation begins. This approach reduces conflicts; everyone understands what to expect from each other.
For providers, one of the most important SLA parameters is service availability. It is usually expressed as a percentage of uptime over an agreed period, with the permitted downtime counted in days, hours, or minutes. For example, a provider guarantees that a cloud computing service will be available 99.99% of the time during a year.
At first glance, the difference between a 99% SLA and a 100% SLA may seem small. But in absolute terms, it’s significant.
At 99%, you agree that servers may be down for up to about 3.65 days (87.6 hours) per year.
At 100%, downtime should be zero, which is something no company can guarantee.
That’s why SLAs are usually written with “nines”: e.g., 99.9%, 99.99%, etc.
For example, Hostman.com guarantees 99.98% uptime, meaning total annual downtime will not exceed 1 hour 45 minutes.
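The arithmetic behind these figures is simple enough to sketch. The function name `max_downtime_hours` is hypothetical, and the sketch assumes a 365-day year (8,760 hours), ignoring leap years.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def max_downtime_hours(uptime_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Maximum downtime, in hours, that still satisfies the uptime percentage."""
    return period_hours * (100 - uptime_percent) / 100


# Print the yearly downtime budget for several common SLA levels.
for uptime in (99.0, 99.9, 99.98, 99.99, 99.999):
    hours = max_downtime_hours(uptime)
    print(f"{uptime}% -> {hours:.2f} h ({hours * 60:.1f} min) per year")
```

Under these assumptions, 99% allows 87.6 hours (about 3.65 days) of downtime per year, 99.98% allows about 1.75 hours (1 hour 45 minutes), and 99.999% allows a little over 5 minutes.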
Some providers promise “five nines”: 99.999% uptime, or a little over 5 minutes of downtime per year. But this is not always the best option. Two points to consider:
The higher the SLA percentage, the higher the cost.
Not every client needs such a high level. In most cases, 99.982% uptime (or slightly higher) is sufficient.
It’s important to check not only the number of nines but also the time unit used for measurement. By default, SLA indicators are calculated annually. For example, 99.95% availability equals no more than about 4 hours 23 minutes of downtime per year.
If the contract doesn’t explicitly say that the time unit is “per year,” be sure to clarify, as some providers disguise monthly values as annual.
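A short sketch of why the measurement period matters: the same percentage yields very different downtime budgets per month and per year. The function name and the 120-minute outage are hypothetical, and a month is assumed to be 30 days.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int) -> float:
    """Downtime budget in minutes for the given uptime over the given period."""
    return period_days * 24 * 60 * (100 - uptime_percent) / 100


outage_minutes = 120  # a single hypothetical two-hour outage

monthly_budget = allowed_downtime_minutes(99.95, 30)   # about 21.6 minutes
annual_budget = allowed_downtime_minutes(99.95, 365)   # about 262.8 minutes

print(f"Within a monthly 99.95% SLA: {outage_minutes <= monthly_budget}")  # False
print(f"Within an annual 99.95% SLA: {outage_minutes <= annual_budget}")   # True
```

The same two-hour outage breaks a 99.95% commitment measured per month but fits comfortably inside one measured per year, so the period needs to be stated explicitly in the contract.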
Another key concept is aggregate availability: when a service depends on several components, its overall availability cannot be higher than the lowest of the measured values.
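One common way to estimate aggregate availability is to multiply the availabilities of the individual components. This sketch assumes the components are chained in series (the service is up only when all of them are up) and fail independently; neither assumption comes from the text, and the component values are hypothetical.

```python
from math import prod


def serial_availability(component_percents: list[float]) -> float:
    """Availability (%) of a chain of components that must all be up at once."""
    return prod(p / 100 for p in component_percents) * 100


# Hypothetical components: network, storage, application.
components = [99.99, 99.95, 99.9]

combined = serial_availability(components)
print(f"Combined: {combined:.3f}%")          # Combined: 99.840%
print(f"Lowest:   {min(components):.3f}%")   # Lowest:   99.900%
```

Under these assumptions the combined figure never exceeds the lowest individual value, so the weakest component sets the ceiling for the whole service.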
Signing and adhering to an SLA benefits both parties.
For the company, it defines obligations and protects against unreasonable client demands, such as urgently fixing a minor issue in the middle of the night.
Other benefits include:
The provider can use the SLA to organize both external and internal processes, such as introducing different support levels depending on service criticality and client importance.
Clients gain clarity about what services they can expect, in what timeframes, and in what order, helping them plan their core operations.
An SLA can also be viewed as an indicator of user satisfaction, ranging from 0% to 100%.
Absolute satisfaction (100%) is impossible, just as it’s impossible to guarantee 100% uptime. Therefore, when choosing metrics, one should be realistic and select achievable values.
For example, if your team doesn’t provide 24/7 support, you shouldn’t promise it. When the team expands, you can update the SLA and delight clients by offering round-the-clock assistance.
To monitor service levels internally, a separate tool is used: the SLO (Service Level Objective). SLOs are the target values the provider aims to achieve.
Example: the team can currently handle 50 tickets per business day, working 9:00 to 18:00, five days a week. These metrics are fixed in the SLA and shown to clients.
Meanwhile, the SLO document sets internal goals, for example, increasing the number of handled tickets to 75 per day or switching to 24/7 support. This directly affects the company’s future service level.
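The difference between the two values can be sketched with the ticket numbers from the example. The constant and function names are hypothetical.

```python
SLA_TICKETS_PER_DAY = 50  # commitment shown to clients
SLO_TICKETS_PER_DAY = 75  # internal target, not promised externally


def classify_day(tickets_handled: int) -> str:
    """Compare one day's throughput with the external and internal targets."""
    if tickets_handled < SLA_TICKETS_PER_DAY:
        return "SLA breached"
    if tickets_handled < SLO_TICKETS_PER_DAY:
        return "SLA met, SLO missed"
    return "SLO met"
```

A day with 60 handled tickets meets the client-facing commitment but still falls short of the internal goal, which is the gap the SLO is meant to track.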
Start with a descriptive section, which usually includes:
The next section describes the services provided, giving the client a full understanding of what they can expect when signing with the provider.
Then comes the main section, describing the service level. It should include metrics that reflect quality and are easily measurable, as well as metric values that are specific numbers guiding all participants.
You can end the SLA with references to other documents that regulate service processes.
At all stages of preparing an SLA, remember: it is a regulatory document. Its main goal is control. The more control over the process, the better the SLA. If there is no control, such an agreement is meaningless.
If you are not signing but drafting an SLA to offer clients, pay attention to the following points:
Users. In large systems, divide users into groups and manage them separately. This helps allocate resources efficiently and avoid overload from different client types.
Services. Consider the criticality of each service for each client group. Example: You provide a CRM to trading companies. If they can’t use it, they lose money and complain, meaning it’s a high-criticality service. Printer replacement or user account creation can wait until tomorrow.
Service quality parameters. They must align with business goals and client needs. A typical example is support hours, e.g., 24/7 support versus 9 a.m. to 5 p.m. on weekdays only.
An SLA must be announced to all users, regardless of privilege level or service criticality, whenever it is introduced or updated.
SLA is a management tool that constantly evolves. You may find that current quality parameters harm business processes or no longer meet client expectations. In that case, management should decide to optimize processes or improve services.
The main goal of SLA indicators is not to attract users but to ensure open dialogue with them. Every participant accepts the agreement and commits to following it.
Violation of an SLA is grounds to claim compensation and terminate cooperation.