What Is A CDN & Where Does It Shine?
A Content Delivery Network (CDN) is a geographically distributed network of servers and their data centers that delivers content to users with minimal delay.
It does this by bringing the content closer to the geographical location of users through strategically located data centers called Points of Presence (PoPs). CDNs also use caching servers, which store and deliver cached files to accelerate webpage loading times and reduce bandwidth consumption. We go into more detail about exactly how CDNs work below.
CDN services are essential for businesses which rely on delivering content to users.
Consider the following:
- Large news publications with readers in many countries
- Social media sites that need to deliver multimedia content on users’ feeds
- Entertainment websites like Netflix delivering high-definition web content in real-time
- E-commerce platforms with millions of customers
- Gaming companies with graphics-heavy content being accessed by geographically distributed users
All these businesses need fast content delivery, highly available services, scalable resources, and secure web applications. This is where CDN services shine.
A Brief History of CDNs
CDNs were created almost twenty years ago to address the challenge of pushing massive amounts of data rapidly to end users on the internet. Today, they have become the driving force behind website content delivery and continue to be researched and improved by academia and commercial developers.
The first Content Delivery Networks were built in the late 90s and these are still responsible for 15-30 percent of global internet traffic. Following that, the growth of broadband content and streaming of audio, video and associated data over the internet has seen more CDNs being developed. Broadly speaking, the evolution of CDNs can be divided into four stages, a pre-formation period followed by three generations:
Pre-formation Period: Before the actual creation of CDNs, the technologies and infrastructure needed were being developed. This period was characterized by the rise of server farms, hierarchical caching, improvements in web servers and caching proxy deployment. Mirroring, caching and multihoming were also technologies that paved the way for the creation and growth of CDNs.
First Generation: The first iterations of CDNs focused primarily on dynamic and static content delivery, as these were the only two content types on the web. The principal mechanisms then were the creation and implementation of replicas, intelligent routing and edge computing methods. Apps and info were split across the servers.
Second Generation: Next came CDNs which focused on streaming video and audio content, or Video-on-Demand services like Netflix, for users and news services. This generation also cleared a path for delivering website content to mobile users and saw the use of P2P and cloud computing techniques.
Third Generation: The third generation of CDNs is where we are now, and it is still evolving with new research and development. We can expect CDNs in the future to be increasingly modelled for communities, meaning that the systems will be driven by average users and regular individuals. Self-configuration is expected to be the new technological mechanism, along with self-managing and autonomic content delivery. Quality of experience for end users is expected to be the primary driver going forward.
CDNs initially evolved to deal with extreme bandwidth pressures, as video streaming was growing in demand along with the number of CDN service providers. With connectivity advancements and new consumption trends in each generation, the pricing of CDN services dropped, allowing CDNs to become a mass-market technology. And as cloud computing became widely adopted, CDNs have come to play a key role in all layers of business operations. They are key to models such as SaaS (Software as a Service), IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and BPaaS (Business Process as a Service).
How Does a CDN Work?
CDNs work by reducing the physical distance between a user and the origin (a web or an application server). A CDN is a globally distributed network of servers that store content much closer to the user than the origin is. To understand this better, it helps to examine how a user accesses the web content from a website with and without a CDN.
Without a CDN
When a user enters the website address into the browser, the browser establishes a connection similar to the one in the following figure. The website name is resolved to an IP address by the Local DNS or LDNS (such as the DNS server provided by the ISP or a public DNS resolution server). If the LDNS cannot resolve the name itself, it recursively asks upstream DNS servers for resolution. Ultimately, the request may pass to the authoritative DNS server where the zone is hosted. This DNS server resolves the address and returns it to the user.
Then the user’s browser directly connects to the origin and downloads the website content. Each subsequent request is served by the origin directly, and the static assets are cached locally on the user’s machine. If another user from a similar or other location tries to access the same site, that user will go through the same sequence. Every time, user requests will hit the origin and the origin will reply with content. Each step along the way adds a delay, or “latency”. If the origin is located far from the user, response times will suffer from significant latency, delivering a poor user experience.
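To make the effect of distance concrete, the sketch below adds up the delays for a first request. It assumes a simplified model that is not taken from the article: one DNS lookup followed by three network round trips (TCP handshake, TLS 1.3 handshake, HTTP request and response). The function name and all millisecond values are illustrative, not measurements.

```python
def time_to_first_byte_ms(dns_ms, rtt_ms, round_trips=3):
    """Estimate the delay before the first response byte arrives.

    Simplified model (an assumption, not a measurement): one DNS lookup plus
    `round_trips` full round trips between the browser and the server.
    """
    return dns_ms + rtt_ms * round_trips


# Illustrative numbers only: the same site reached over a long and a short path.
far_origin = time_to_first_byte_ms(dns_ms=30, rtt_ms=150)
near_server = time_to_first_byte_ms(dns_ms=30, rtt_ms=20)
print(far_origin, near_server)  # 480 90
```

Because every round trip pays the full distance, shortening the path reduces each of them at once.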
With a CDN
In the presence of a CDN, however, the process is slightly different. When the user-initiated DNS requests are received by the LDNS, it forwards the requests to one of the CDN’s DNS servers. These servers are part of the Global Server Load Balancer (or “GSLB”) infrastructure. The GSLB provides load balancing that continuously measures conditions across the Internet and keeps track of all available resources and their performance. With this knowledge, the GSLB resolves the DNS request using the best performing edge address (usually in proximity to the user). An “edge” is a set of servers that caches and delivers the web content.
After DNS resolution is completed, the user makes the HTTPS request to the edge. When the edge receives the request, the GSLB servers help the edge servers forward the request along the optimal route to the origin. The edge servers then fetch the requested data, deliver it to the end user who requested it, and store that data locally. Subsequent user requests for the same content will be served from the local copy without having to query the origin server again. Content stored on the edge can be delivered even if the origin becomes unavailable for any reason.
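A minimal sketch of the selection step described above, assuming the GSLB already holds a latency measurement, a load figure and a health flag for each edge. The `Edge` fields, the `pick_edge` function, the load threshold and the addresses (taken from the 198.51.100.0/24 documentation range) are all hypothetical; a real GSLB weighs many more signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One candidate edge location (all fields are illustrative)."""
    address: str
    rtt_ms: float    # measured round-trip time towards the user's resolver
    load: float      # current utilization, 0.0 (idle) to 1.0 (saturated)
    healthy: bool


def pick_edge(edges, max_load=0.9):
    """Return the address of the best-performing usable edge, or None."""
    usable = [e for e in edges if e.healthy and e.load < max_load]
    if not usable:
        return None
    # Lowest latency wins; load breaks ties between equally close edges.
    return min(usable, key=lambda e: (e.rtt_ms, e.load)).address


edges = [
    Edge("198.51.100.10", rtt_ms=12.0, load=0.95, healthy=True),   # close but saturated
    Edge("198.51.100.20", rtt_ms=18.0, load=0.40, healthy=True),
    Edge("198.51.100.30", rtt_ms=9.0, load=0.10, healthy=False),   # closest but down
    Edge("198.51.100.40", rtt_ms=85.0, load=0.05, healthy=True),
]
print(pick_edge(edges))  # 198.51.100.20
```

The closest edge is not always chosen: here the two nearest candidates are skipped because one is overloaded and the other is unhealthy.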
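The fetch, deliver and store behaviour of an edge can be sketched as a small cache in front of an origin. This is a toy model under stated assumptions: the `EdgeCache` class, its time-to-live (`ttl_seconds`) and the `HIT`/`MISS`/`STALE` labels are illustrative and do not describe any particular CDN's implementation.

```python
import time


class EdgeCache:
    """Toy edge cache: serve local copies, fall back to the origin on a miss."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin   # callable(path) -> bytes; may raise on failure
        self._ttl = ttl_seconds           # assumed freshness lifetime of a cached copy
        self._clock = clock
        self._store = {}                  # path -> (body, stored_at)

    def get(self, path):
        """Return (body, status) where status is 'HIT', 'MISS' or 'STALE'."""
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0], "HIT"
        try:
            body = self._fetch(path)
        except Exception:
            if cached is not None:
                # Origin is unreachable: serve the expired copy rather than fail.
                return cached[0], "STALE"
            raise
        self._store[path] = (body, now)
        return body, "MISS"


origin_calls = []


def demo_origin(path):
    """Stand-in for the real origin server; records every request it receives."""
    origin_calls.append(path)
    return b"content of " + path.encode()


cache = EdgeCache(demo_origin)
print(cache.get("/logo.png")[1])  # MISS (fetched from the origin and stored)
print(cache.get("/logo.png")[1])  # HIT  (served from the edge)
print(len(origin_calls))          # 1
```

Only the first request reaches the origin; the second is answered locally, which is the source of both the latency and the bandwidth savings.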
Why Use a CDN?
CDNs help businesses deliver content to end users effectively by minimizing latency, improving website performance and reducing bandwidth costs.
Another useful feature of CDNs is that they allow the edge servers to prefetch content in advance. This ensures that the data you are going to deliver is already stored in all CDN data centers. In CDN parlance, these data centers are called Points of Presence (or “PoPs”). PoPs help minimize the round-trip time by bringing the web content closer to the website visitor.
For example, assume that you run an ad campaign and advertise your service or product among millions of potential customers. You may expect a large number of customers to rush to your site after reading the post. If you deal with influencers who have good audience engagement rates, the volume of traffic can see an even bigger spike. Can you be sure that your origin server will be able to handle this spike in volume all at once?
In such a scenario, CDNs can help distribute the load between the edge servers, and everyone will get the response. Because only a small fraction of requests will reach the origin, your servers will not experience massive traffic spikes, 502 errors, and overloaded upstream network channels.
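A quick back-of-the-envelope calculation shows how much of a spike the origin still sees. The function name and the 95 percent cache hit ratio are assumptions chosen for illustration; actual hit ratios depend on the content and the cache configuration.

```python
def origin_requests(total_requests, cache_hit_ratio):
    """Requests that still reach the origin, given the share served by edge caches."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return round(total_requests * (1.0 - cache_hit_ratio))


# Illustrative spike: one million requests, 95% answered at the edge.
print(origin_requests(1_000_000, 0.95))  # 50000
```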
Benefits of CDNs
Depending on the size and needs of your business, the benefits of CDNs can be broken down into 4 different components:
Improving website page load times
By distributing web content closer to website visitors through a nearby CDN server (among other optimizations), a CDN gives visitors faster webpage loading times. Visitors are usually more inclined to click away or bounce from a website with a high page load time, and slow loading can also negatively affect the webpage’s ranking on search engines. So having a CDN can reduce bounce rates and increase the amount of time that people spend on the site. In other words, a website that loads quickly will keep more visitors around longer.
Reducing bandwidth costs
Every time an origin server responds to a request, bandwidth is consumed. The cost of bandwidth consumption is a major expense for businesses. Through caching and other optimizations, CDNs are able to reduce the amount of data an origin server must provide, thus reducing hosting costs for website owners.
Increasing content availability and redundancy
Large amounts of web traffic or hardware failures can interrupt normal website function and lead to downtime. Thanks to its distributed nature, a CDN can handle more web traffic and withstand hardware failure better than many origin servers. Moreover, if one or more of the CDN servers go offline for some reason, other operational servers can pick up the web traffic and keep the service uninterrupted.
Improving website security
The same process by which CDNs handle traffic spikes makes them ideal for mitigating DDoS attacks. These are attacks where malicious actors overwhelm your application or origin servers by sending a massive number of requests. When the server goes down due to the volume, the downtime can affect the website’s availability for customers. A CDN essentially acts as a DDoS protection and mitigation platform with the GSLB and edge servers distributing the load equally across the entire capacity of the network. CDNs can also provide certificate management and automatic certificate generation and renewal.
How Else Can A CDN Be Helpful?
The CDN is not limited to the benefits explained above. A modern CDN platform delivers many more advantages to your business and engineering teams.
It can be used to manage access from different regions on the planet. While you allow access for some regions, you can deny access to others.
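Region-based access control can be pictured as a simple allow/deny check. The sketch assumes the CDN has already mapped the client's IP address to an ISO 3166-1 alpha-2 country code; the function name, its parameters and the rule that a deny entry overrides an allow entry are illustrative choices, not a specific product's behaviour.

```python
def is_region_allowed(country_code, allowed=None, denied=frozenset()):
    """Return True if a request from `country_code` may proceed.

    `allowed=None` means "no allow-list": every region not denied is accepted.
    A region listed in `denied` is always rejected.
    """
    code = country_code.upper()
    if code in denied:
        return False
    if allowed is not None and code not in allowed:
        return False
    return True


print(is_region_allowed("de", allowed={"DE", "FR"}))  # True
print(is_region_allowed("US", allowed={"DE", "FR"}))  # False
print(is_region_allowed("US", denied={"BR"}))         # True
```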
You can easily offload application logic to the edge and close to your customers. You can process and transform the request/response headers and body, route requests between different origins based on request attributes, or delegate authentication tasks to the edge.
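As a rough illustration of edge logic, the sketch below routes a request to one of several origins by path prefix and rewrites its headers before forwarding. Everything in it is hypothetical: the `Request` type, the origin host names, the `X-Debug` and `X-Client-Country` headers and the routing table are stand-ins, and a real deployment would express this in the CDN's own configuration language.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Minimal stand-in for an incoming HTTP request."""
    path: str
    headers: dict = field(default_factory=dict)


# Hypothetical routing table: first matching path prefix wins.
ROUTES = [
    ("/api/", "api-origin.example.internal"),
    ("/static/", "assets-origin.example.internal"),
]
DEFAULT_ORIGIN = "web-origin.example.internal"


def choose_origin(request):
    """Pick an origin based on a request attribute (here, the path prefix)."""
    for prefix, origin in ROUTES:
        if request.path.startswith(prefix):
            return origin
    return DEFAULT_ORIGIN


def rewrite_headers(request, client_country):
    """Return the headers to forward: drop a debug header, add a geo header."""
    headers = {k: v for k, v in request.headers.items() if k.lower() != "x-debug"}
    headers["X-Client-Country"] = client_country
    return headers


req = Request("/api/orders", {"Accept": "application/json", "X-Debug": "1"})
print(choose_origin(req))          # api-origin.example.internal
print(rewrite_headers(req, "DE"))  # {'Accept': 'application/json', 'X-Client-Country': 'DE'}
```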
Large amounts of traffic require an infrastructure for log collection and processing for further analysis. CDNs collect the logs and provide an interface to conveniently analyze the data generated by the visitors.
It is only natural that something becomes easy to use when you are already familiar with it. For that reason, CDN Pro edges are NGINX based. This means you can perform tasks using standard NGINX directives.
Our engineering team spent thousands of hours extending NGINX.
Data Security & CDNs
Information security is an integral part of a CDN. CDNs help protect a website’s data in the following ways.
By providing TLS/SSL certificates
A CDN can help protect a site by providing Transport Layer Security (TLS)/Secure Sockets Layer (SSL) certificates that ensure a high standard of authentication, encryption, and integrity. These certificates ensure that certain protocols are followed in the transfer of data between a user and a website.
When data is transferred across the internet, it becomes vulnerable to interception by malicious actors. This is addressed by encrypting the data using a protocol such that only the intended recipient can decode and read the information. TLS and SSL are such protocols that encrypt the data sent over the Internet; TLS is the more advanced successor to Secure Sockets Layer (SSL). You can tell that a website uses a TLS/SSL certificate if its address starts with https:// rather than http://, indicating that communication between the browser and the server is encrypted.
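For readers who want to see what a verified TLS connection involves on the client side, here is a small sketch using Python's standard `ssl` and `socket` modules. The helper names `make_client_context` and `describe_connection` are illustrative, the TLS 1.2 minimum is an assumed policy, and `describe_connection` needs network access, so it is defined here but not called.

```python
import socket
import ssl


def make_client_context():
    """Build a client-side TLS context that verifies certificates and host names."""
    context = ssl.create_default_context()
    # Assumed policy for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def describe_connection(host, port=443, timeout=5.0):
    """Connect to `host` and return (negotiated TLS version, peer certificate dict)."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.getpeercert()


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True
```

With these defaults the handshake fails if the server's certificate is invalid or does not match the requested host name, which is the authentication guarantee the https:// prefix relies on.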
Mitigating DDoS attacks
Since the CDN is deployed at the edge of the network, it acts as a virtual high-security fence against attacks on your website and web application. The scattered infrastructure and the on-edge position also make a CDN ideal for blocking DDoS floods. Since these floods need to be mitigated outside of your core network infrastructure, the CDN will process them on different PoPs according to their origin, preventing server saturation.
Blocking bots and crawlers
CDNs are also capable of blocking threats and limiting abusive bots and crawlers from using up your bandwidth and server resources. This helps limit other spam and hack attacks and keeps your bandwidth costs down.
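One common way to limit abusive clients is per-client rate limiting. The sketch below uses a token bucket, which is a general technique and not necessarily what any given CDN runs; the class names, the default rate and burst size, and the use of a plain client identifier (for example an IP address) are assumptions made for illustration.

```python
import time


class TokenBucket:
    """Allow `burst` requests at once, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, burst, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = float(burst)
        self.clock = clock
        self.tokens = float(burst)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to the time elapsed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keep one token bucket per client identifier."""

    def __init__(self, rate_per_second=5.0, burst=10, clock=time.monotonic):
        self._args = (rate_per_second, burst, clock)
        self._buckets = {}

    def allow(self, client_id):
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = self._buckets[client_id] = TokenBucket(*self._args)
        return bucket.allow()


limiter = RateLimiter(rate_per_second=1.0, burst=3)
results = [limiter.allow("203.0.113.7") for _ in range(5)]
print(results.count(True) >= 3)  # True: the burst of three is always admitted
```

A client that stays under the refill rate is never blocked, while a crawler that hammers the site quickly drains its bucket and has further requests rejected before they consume origin bandwidth.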
Static & Dynamic Acceleration
Dynamic acceleration applies to content that cannot be cached on the edge due to its dynamic nature. Imagine a WebSocket application that listens for events from a server, or an API endpoint whose response differs depending on credentials, geographic location, or other parameters. It is hard to leverage the cache machinery on the edge in a way that is similar to caching static content. In some cases, tighter integration between the app and the CDN may help; in others, something other than caching should be used. For dynamic acceleration, the CDN’s optimized network infrastructure and advanced request/response routing algorithms are used.
Billing model or “What do I pay for?”
Conventionally in a CDN, you pay for the traffic consumed by your end-users and the number of requests. Additionally, HTTPS requests require more computing resources than HTTP requests, which creates more load on the CDN provider’s equipment. For this reason, you may pay additional costs for HTTPS requests, while HTTP requests are not billed at an additional cost.
As computation moves to the edge, CPU time becomes a billable resource. Requests might have various processing pipelines and, as a result, will require different amounts of CPU time. It is then impractical to bill by request count alone; it is more practical to bill by traffic volume plus CPU time used.
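The traffic plus CPU time model can be written as a one-line formula. The prices below are made-up placeholders, not any provider's actual rates, and the function name is illustrative; `Decimal` is used so that currency amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Hypothetical unit prices, for illustration only.
PRICE_PER_GB = Decimal("0.04")
PRICE_PER_CPU_SECOND = Decimal("0.00002")


def usage_charge(traffic_gb, cpu_seconds,
                 price_per_gb=PRICE_PER_GB,
                 price_per_cpu_second=PRICE_PER_CPU_SECOND):
    """Charge = traffic volume * price per GB + CPU time * price per CPU second.

    `traffic_gb` and `cpu_seconds` are expected to be integers in this sketch.
    """
    total = Decimal(traffic_gb) * price_per_gb + Decimal(cpu_seconds) * price_per_cpu_second
    return total.quantize(Decimal("0.01"))


# Same traffic, different processing pipelines: the heavier one costs more.
print(usage_charge(5000, 120000))  # 202.40
print(usage_charge(5000, 600000))  # 212.00
```

Two customers who move the same number of bytes pay differently when one of them runs more computation at the edge, which a pure per-request price would not capture.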
Who Uses CDN?
CDNs are used by businesses of various sizes to optimize their network presence and availability and to provide a superior user experience for customers. CDNs are particularly popular in the following industries:
- Digital Publishing
- Online Video & Audio
- Gaming
- Online Education
- Public Sector
- Financial Services