Latency, latency, latency! It has always been a problem of the Internet. It was, it is, and it probably will be. Delivering data from one geographic point to another takes time.
However, latency can be reduced. This can be achieved in several ways:
Reduce the number of intermediate nodes on the data path from the remote server to the user. The fewer the handlers, the faster the data reaches the destination. But this is hardly feasible. The global Internet continues to grow and become more complex, increasing the number of nodes. More nodes = more power. That’s the global trend. Evolution!
Instead of regularly sending data over long distances, we can create copies of it on nodes closer to the user. Fortunately, the number of network nodes keeps growing, and the topology spreads ever wider. Eureka!
The latter option seems like the ideal solution. With a large number of geographically distributed nodes, it's possible to create a kind of content delivery network. In addition to its main function, speeding up loading, such a network brings several other benefits: traffic optimization, load balancing, and increased fault tolerance.
Wait a second! That's exactly what a CDN (Content Delivery Network) is. So let's look at what a CDN is, how it works, and what problems it solves.
A CDN (Content Delivery Network) is a geographically distributed network of servers designed to speed up the delivery of content (images, videos, HTML pages, JavaScript, CSS) by serving it from nodes close to the user.
Like a vast web, the CDN infrastructure sits between the server and the user, acting as an intermediary. Thus, content is not delivered directly from the server to the user but through the powerful "tentacles" of the CDN.
Since the early days of the Internet, content has been divided into two types:
Static (requires storage, large in size). Stored on a server as files and delivered unchanged to users upon request. Requires sufficient HDD or SSD capacity.
Dynamic (requires processing power, small in size). Generated on the server with each user request. Requires enough RAM and CPU power.
The volume of static content on the Internet far exceeds that of dynamic content. For instance, a website's layout weighs much less than the total size of the images embedded in it.
Storing static and dynamic content separately (on different servers) is considered good practice. While heavy multimedia requests are handled by one server, the core logic of the site runs on another.
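As a rough illustration of that split, here is a minimal sketch that decides which server should handle a request based on the file extension in the URL. The host names and the extension list are hypothetical placeholders, and real setups usually make this decision in a reverse proxy or in DNS rather than in application code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical host names and extension list, chosen only for illustration.
STATIC_HOST = "static.example.com"
APP_HOST = "app.example.com"
STATIC_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".css", ".js", ".mp4", ".woff2"}


def pick_backend(url: str) -> str:
    """Return the host that should serve the URL: static files go to one
    server, everything else goes to the server running the site logic."""
    path = urlsplit(url).path                      # drops the query string and fragment
    suffix = PurePosixPath(path).suffix.lower()    # e.g. ".png", or "" if there is none
    return STATIC_HOST if suffix in STATIC_EXTENSIONS else APP_HOST


if __name__ == "__main__":
    for url in ("https://example.com/img/logo.png?v=3", "https://example.com/cart/checkout"):
        print(url, "->", pick_backend(url))
```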
CDN technology takes this practice to the next level. It stores copies of static content taken from the origin server on many other remote servers. Each of these servers serves data only to nearby users, reducing load times to a minimum.
CDN infrastructure consists of many geographically distributed computing machines, each with a specific role in the global data exchange:
A single CDN infrastructure simultaneously includes many active users, origin servers, and edge nodes.
First, CDN nodes perform specific operations to manage the rotation of static content:
Second, CDN nodes have several configurable parameters that ensure the stable operation of the entire infrastructure:
Thus, static content flows from the origin server through edge nodes to users. Along the way it is cached according to specific caching rules and cleared once the TTL expires. Meanwhile, access restrictions are enforced on every edge node for security.
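The cache-then-expire cycle can be sketched in a few lines. This is a toy model, not how any particular CDN implements it: the `EdgeCache` class, its parameters, and the `HIT`/`MISS` labels are names invented for this example, and the origin is represented by a plain function.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy edge node: keeps copies of origin responses until their TTL expires."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expires_at)

    def get(self, path: str) -> Tuple[bytes, str]:
        """Return (body, status), where status is "HIT" or "MISS"."""
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            return entry[0], "HIT"
        # Missing or expired: fetch a fresh copy from the origin and cache it.
        body = self._fetch(path)
        self._store[path] = (body, now + self._ttl)
        return body, "MISS"


if __name__ == "__main__":
    edge = EdgeCache(lambda path: f"content of {path}".encode(), ttl_seconds=60)
    print(edge.get("/logo.png")[1])  # first request goes to the origin
    print(edge.get("/logo.png")[1])  # second request is served from the edge
```

The injectable `clock` is there only to make expiry easy to test without waiting for real time to pass.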
Let's see how a CDN works from the user's perspective. We can divide the process into several stages:
For instance, if a website’s origin server is in Lisbon and the user is in Warsaw, the CDN will automatically find the nearest server with cached static content—say, in Berlin.
If there is no nearby CDN server with cached content, the CDN will request the content from the origin server and cache it. Subsequent requests will then be served from the CDN.
The straight-line distance from Warsaw to Lisbon is about 2800 km, while the distance from Warsaw to Berlin is only about 570 km.
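To make the "nearest server" idea concrete, here is a small sketch that picks the closest node by great-circle distance. Treat it as a simplification: the coordinates are approximate city-center values, the node names are made up, and real CDNs typically route by network topology (DNS, anycast, measured latency) rather than by pure geography.

```python
from math import asin, cos, radians, sin, sqrt
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0
Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points on the Earth's surface."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_node(user: Coord, nodes: Dict[str, Coord]) -> str:
    """Return the name of the node geographically closest to the user."""
    return min(nodes, key=lambda name: haversine_km(user, nodes[name]))


# Approximate coordinates and hypothetical node names, for illustration only.
WARSAW: Coord = (52.23, 21.01)
NODES: Dict[str, Coord] = {
    "lisbon-origin": (38.72, -9.14),
    "berlin-edge": (52.52, 13.40),
}

if __name__ == "__main__":
    for name, coord in NODES.items():
        print(f"Warsaw -> {name}: {haversine_km(WARSAW, coord):.0f} km")
    print("nearest:", nearest_node(WARSAW, NODES))
```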
Someone unfamiliar with networking might wonder: “How can a CDN speed up content delivery if data travels through cables at the speed of light, 300,000 km/s?” (In optical fiber the signal is actually slower, roughly 200,000 km/s, but the point stands.)
In reality, delays in data transmission are due to technical, not physical, limitations:
Thus, the difference between 2800 km and 570 km amounts to only around ten milliseconds of pure signal propagation. But from a network infrastructure perspective, it makes a big difference.
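A quick back-of-the-envelope calculation shows how small the purely physical part of the delay is. The sketch assumes a signal speed in fiber of about two-thirds of the speed of light in vacuum, which is a commonly used approximation, and it ignores routing detours, queuing, and processing on intermediate nodes.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBER_VELOCITY_FACTOR = 2 / 3       # assumed typical slowdown in optical fiber


def one_way_delay_ms(distance_km: float, velocity_factor: float = FIBER_VELOCITY_FACTOR) -> float:
    """Ideal one-way propagation delay over a straight cable, in milliseconds."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor) * 1000


if __name__ == "__main__":
    for city, km in (("Lisbon", 2800), ("Berlin", 570)):
        print(f"Warsaw -> {city}: {one_way_delay_ms(km):.1f} ms one way")
```

Under these assumptions the Lisbon path costs about 14 ms one way and the Berlin path about 3 ms, so real-world latencies well above these figures come from the network itself rather than from physics.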
Moreover, a CDN server in Berlin, finding no cached content, might request it not from the origin server but from a neighboring CDN node in Prague, if that node has the content cached.
Therefore, CDN infrastructure nodes can also exchange cached content among themselves.
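The lookup order described here (own cache, then neighboring nodes, then the origin) could be sketched as follows. The `Node` class and `fetch` function are hypothetical names for this example, and the origin is modeled as a simple dictionary; real CDNs use their own protocols for cache sharing between nodes.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    """Toy CDN node with a local cache and a list of neighboring nodes."""

    def __init__(self, name: str, peers: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.cache: Dict[str, bytes] = {}
        self.peers: List["Node"] = peers or []


def fetch(edge: Node, path: str, origin: Dict[str, bytes]) -> Tuple[bytes, str]:
    """Return (body, source): try the edge itself, then its peers, then the origin."""
    if path in edge.cache:
        return edge.cache[path], edge.name
    for peer in edge.peers:
        if path in peer.cache:
            edge.cache[path] = peer.cache[path]  # keep a local copy for next time
            return edge.cache[path], peer.name
    body = origin[path]                          # last resort: the origin server
    edge.cache[path] = body
    return body, "origin"


if __name__ == "__main__":
    origin = {"/logo.png": b"png-bytes", "/app.css": b"css-bytes"}
    prague = Node("prague")
    prague.cache["/logo.png"] = origin["/logo.png"]
    berlin = Node("berlin", peers=[prague])
    for path in ("/logo.png", "/logo.png", "/app.css"):
        print(path, "served via", fetch(berlin, path, origin)[1])
```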
There are several ways to classify CDNs. The most obvious is based on the ownership of the infrastructure:
Each type has its own pros and cons:
| Criterion | Public | Private |
| --- | --- | --- |
| Connection speed | High | Low |
| Initial costs | Low | High |
| Maintenance complexity | Low | High |
| Cost of large-scale traffic | High | Low |
| Control capabilities | Low | High |
| Dependence on third parties | High | Low |
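The trade-off between low initial costs and a high cost of large-scale traffic can be expressed as a simple break-even calculation. Every number below is made up for illustration and is not taken from any provider's price list; the cost model (a per-gigabyte fee for a public CDN versus a fixed monthly cost plus a smaller per-gigabyte cost for a private one) is also a deliberate simplification.

```python
from typing import Optional


def break_even_gb(
    private_fixed_monthly: float,
    public_price_per_gb: float,
    private_cost_per_gb: float,
) -> Optional[float]:
    """Monthly traffic (in GB) above which a private CDN becomes cheaper.

    Returns None if the private CDN never pays off, i.e. its per-GB cost
    is not lower than the public CDN's per-GB price.
    """
    saving_per_gb = public_price_per_gb - private_cost_per_gb
    if saving_per_gb <= 0:
        return None
    return private_fixed_monthly / saving_per_gb


if __name__ == "__main__":
    # Hypothetical figures: 10,000 per month in fixed costs, 0.05 vs 0.01 per GB.
    threshold = break_even_gb(10_000, 0.05, 0.01)
    print(f"Private CDN pays off above roughly {threshold:,.0f} GB per month")
```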
Many CDN providers offer free access to their infrastructure resources to attract users. However, in such cases, there are limitations on:
Paid CDN providers use various pricing models:
Deploying your own CDN infrastructure is a serious step, usually justified by strong reasons:
Here are a few examples of private CDN networks used by major tech companies:
CDN technology has evolved to address several key tasks:
The CDN, being a global infrastructure, takes over nearly all core responsibilities for handling user requests for static content.
Despite solving many network issues, CDNs do have certain drawbacks:
Of course, we can minimize these drawbacks by carefully selecting the CDN provider and properly configuring the infrastructure they offer.
In today’s cloud-based reality, websites with multimedia content, high traffic, and a global audience are practically required to use CDN technology. Otherwise, they won’t be able to handle the load effectively.
Yes, a website can function without a CDN, but it will load more slowly, especially for users located far from the origin server.
Almost all major websites, online platforms, and services use CDNs for faster loading and increased resilience. These include:
However, CDNs aren’t just for the big players — smaller websites can benefit too. Several criteria suggest that a website needs distributed caching:
That said, there are cases where using a CDN makes little sense and only complicates the web project architecture:
Still, the main indicator for needing a CDN is a large volume of multimedia content.
While each CDN’s infrastructure is globally distributed, there are priority locations where CDN servers are most concentrated:
The smallest CDNs comprise 10 to 150 servers, while the largest can include 300 to 1,500 nodes.
Here are some of the most popular, large, and technologically advanced CDN providers. Many offer CDN infrastructure as an add-on to their cloud services:
There are also more affordable options:
Some providers specialize in CDN infrastructure for specific content types, such as video, streams, music, or games:
Choosing the right CDN depends on the business goals, content type, and budget. To find the optimal option, you should consider a few key factors:
In practice, it’s best to test several suitable CDN providers to find the right one for long-term use.
In a way, choosing a CDN provider is like choosing a cloud provider. They all offer similar services, but the implementation always differs.
It’s important to understand that a CDN is not the primary storage for static data: the originals stay on the origin server, and the CDN only distributes cached copies across its nodes to shorten the distance between the content and the user.
Therefore, the main role of a CDN is to speed up loading and optimize traffic. This is made possible through the caching mechanism for static data, which is distributed according to defined rules between the origin server and CDN nodes.