What Is a Service Level Agreement?
We often write about the complexity of custom software projects: the need for clearly outlined requirements, the carefully drafted roadmaps, and the flexibility to make adjustments when the developers encounter obstacles or the client requests a change. So with all those moving pieces, how can a software vendor and a client be sure they understand the scope and cost of the work? The key is the service level agreement (SLA), a contract between vendor and client that specifies the level of service expected during the engagement.
An SLA could be an addition to a commercial contract, or it could be published on the official site of the service provider. Either way, it should contain the information both parties need to fully understand the offering.
Why? When one is ordering a service such as software maintenance or support, it is important to clarify certain parameters in order to have the certainty needed to plan both business operations and budget. (The more urgency is required, the more the service will cost.) It is important to base this agreement on objective parameters. These parameters may be different in different industries and domains, but regardless, they must be defined for both parties’ protection. After all, if the agreement isn’t clear and measurable, it becomes nearly impossible to file a claim if the agreement is breached.
Let’s now take a look at the components of a typical SLA.
Recitals, or Definitions
It is best to start with a glossary as well as a short description of the system and the roles of the participants. This ensures that both parties share the same understanding of key terms and ideas. The name of the system is stated along with the technologies and any ready-made third-party solutions that were used to build it. Usually, this section also lists typical users such as regular users, key users, and help desk staff. It could also list the company departments involved in the process and state their roles.
Then, the document defines the limits of applicability: territorial, temporal, and functional. Here, we define where the service will be provided (remotely or on-site) and when (timeframe, work schedule, time zone, days off, and bank holidays). The functional applicability might describe the system's versions and modules, its interactions with other systems, etc. If your SLA covers any environment other than production (such as testing), this should be stated clearly in the document.
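To make the "when" part of applicability concrete, here is a minimal Python sketch of a support window check. Everything in it is hypothetical: the SupportWindow class, its field names, the 9:00 to 17:00 schedule, the fixed UTC+1 offset, and the single holiday are illustrative assumptions, not terms from any real agreement.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta, timezone


@dataclass(frozen=True)
class SupportWindow:
    """Hypothetical description of when support is provided under an SLA."""
    start: time            # start of the working day, local to `tz`
    end: time              # end of the working day, local to `tz`
    tz: timezone           # fixed UTC offset, chosen here for simplicity
    workdays: frozenset    # weekday numbers, Monday = 0
    holidays: frozenset    # days off and bank holidays

    def covers(self, moment: datetime) -> bool:
        """Return True if a timezone-aware datetime falls inside the window."""
        local = moment.astimezone(self.tz)
        if local.weekday() not in self.workdays:
            return False
        if local.date() in self.holidays:
            return False
        return self.start <= local.time() < self.end


# Illustrative values only: an 8/5 schedule in a UTC+1 time zone.
window = SupportWindow(
    start=time(9, 0),
    end=time(17, 0),
    tz=timezone(timedelta(hours=1)),
    workdays=frozenset({0, 1, 2, 3, 4}),
    holidays=frozenset({date(2024, 12, 25)}),
)
```

With a definition like this, a request logged on a Saturday or a bank holiday clearly falls outside the agreed hours, which is the kind of unambiguous answer the applicability section is meant to provide.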
Finally, the SLA’s recitals section should clearly define the process that is regulated by the main part of the document.
After the recitals and definitions, the main part of the agreement should describe what should be done under this SLA and what should not be done. This includes type of work, rules regarding deliverables and communication, guidelines for revisions, and any other details that govern the process of fulfilling the agreement.
If any ambiguity remains, you should work on the SLA until all parts are clear to all parties concerned. Ideally, an outsider (for example, an employee of a field-specific consulting company) should be able to read the agreement and say, “Yes, it’s all clear!”
Choosing the Metrics for the SLA
The metrics guiding the SLA are what make it possible for both parties to determine whether the work is being done in an effective and timely manner. This could include time boundaries — such as reaction timing or target resolution time — or quality markers. But whatever the metrics, they should possess these main characteristics:
- Metrics should reflect the quality of the service provided.
- Metrics should be easily measured.
- Metrics should follow both common sense and industry best practices.
- The number of metrics should be limited. Should there be more than one metric, the SLA should clearly define the priorities in order to ensure the vendor can address critical issues quickly to keep the project moving forward.
- Metrics should depend purely on the work of the vendor or party tasked with the job. Should the correlation between the metric and the responsible party be poor, the control would be lost and the SLA would not work.
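As an illustration of metrics that are limited in number, easy to measure, and prioritized, the sketch below checks a request against reaction and resolution targets. The priority names, the target durations, and the breaches function are all hypothetical values chosen for the example, not recommendations.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Target:
    """Time limits for one priority level (illustrative)."""
    reaction: timedelta     # time until the vendor acknowledges the request
    resolution: timedelta   # time until the request is resolved


# Hypothetical targets; a real SLA would define its own priorities and limits.
TARGETS = {
    "critical": Target(timedelta(minutes=30), timedelta(hours=4)),
    "normal": Target(timedelta(hours=4), timedelta(hours=24)),
}


def breaches(priority: str, reaction: timedelta, resolution: timedelta) -> list:
    """Return the names of the metrics that exceeded their targets."""
    target = TARGETS[priority]
    exceeded = []
    if reaction > target.reaction:
        exceeded.append("reaction")
    if resolution > target.resolution:
        exceeded.append("resolution")
    return exceeded
```

For instance, a critical request acknowledged in 20 minutes but resolved in 5 hours would be reported as breaching only the resolution target. Both figures come from the vendor's own request tracking, so the metrics stay within the vendor's control.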
Let’s take a look at an example of a bad metric. Imagine that the required availability of a given IT system is set at 99.99%, and the metric is attributed to the work of the help desk. The employees running the help desk have no influence on the downtime of the system. Should the system be down, the only thing they can do is inform the responsible administrator or department in a timely manner. The amount of time it takes the administrator or relevant department to fix the issue does not depend on the help desk. Hence, it would be unfair to blame the help desk for someone else’s work should something go haywire.
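To put the 99.99% figure in perspective, the downtime it permits can be worked out with simple arithmetic. The sketch below is illustrative only; the function name and the default 365-day measurement period are assumptions rather than terms of any particular SLA.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 365) -> float:
    """Return the downtime budget, in minutes, implied by an availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# Downtime permitted by a 99.99% target over a 365-day year.
yearly_budget = allowed_downtime_minutes(99.99)
```

Over a 365-day year, a 99.99% target leaves roughly 52.6 minutes of downtime in total. Whether that budget is met depends on the administrators and the infrastructure, not on how quickly the help desk passes the incident along.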
Beware of Excessive Requirements
While the metrics and requirements are designed to keep a project moving forward in a timely manner, it’s important to ensure they’re not so strict that they put the vendor (and, therefore, the project) in a bind. It might sound like a cliché, but it’s true: the tougher the metrics, the dearer the cost. Here are a few examples of excessively tough metrics:
Example #1. Let’s suppose that a certain type of service request takes on average four hours to resolve, though a senior developer could fix the issue within two. What would happen if, for this type of request, we were to write two hours instead of four (or even five)? This would mean that the project would require a senior developer, which would automatically bring the cost up. The senior developer’s expertise, not to mention their overqualification for many jobs and the volume of offers coming their way, would mean the cost of such a service would be much higher than normal.
Example #2. What if the SLA promised the same issue would be fixed within one hour? It may be tempting to write steep requirements into the SLA in order to impress the client, but when the requirements are unrealistic, they render the SLA useless and set the vendor up for trouble (and set the client up for frustration).
Example #3. Assuming the standard support time is 8/5 (eight hours a day, five days a week), if a client requests 24/7 support instead, the price tag will increase sharply, at least 5X. Why? Because in this scenario, three shifts per day would be employed, plus weekends and bank holidays, plus substitution for annual leave and sick leave. Clients should ask themselves: do we really need 24/7 support?
From our experience with successful SLAs, we advise both vendors and clients to think carefully about what is a “must” and what is a “nice to have” in an SLA. If users insist on unreasonable metrics, the vendor will ask them whether they are prepared to pay for them.
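The size of that increase can be roughly sanity-checked by comparing hours of coverage. In the sketch below, the function names are hypothetical, and the 20% absence allowance is an illustrative assumption, not a figure taken from this article or from any staffing standard.

```python
def weekly_hours(hours_per_day: int, days_per_week: int) -> int:
    """Hours of coverage per week for a given support schedule."""
    return hours_per_day * days_per_week


def staffing_multiplier(target_hours: int, baseline_hours: int,
                        absence_allowance: float = 0.0) -> float:
    """Ratio of staffed hours needed, optionally padded for leave cover.

    `absence_allowance` is an assumed fraction of extra capacity kept
    to substitute for annual leave and sick leave.
    """
    return (target_hours / baseline_hours) * (1 + absence_allowance)


standard = weekly_hours(8, 5)       # 8/5 support
round_the_clock = weekly_hours(24, 7)  # 24/7 support

raw_ratio = staffing_multiplier(round_the_clock, standard)
padded_ratio = staffing_multiplier(round_the_clock, standard, absence_allowance=0.2)
```

Coverage alone grows from 40 to 168 hours a week, a factor of 4.2. Under the assumed 20% allowance for leave cover, the factor rises to about 5, which is in line with the "at least 5X" estimate above. Night and weekend pay rates, which this sketch ignores, would push it higher still.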
SLAs could be further supplemented by referencing other documents describing the process, such as operating procedures or anything else applicable to the client’s domain. It would also be useful to mention the procedures and software used for tracking requests and include the relevant links.