
How to make a simple CDN for testing

Blog Post created by B-3-125GP9D Employee on Feb 4, 2016

We have always used the term Akamai Intelligent Platform, but what does a regular, basic platform look like? It's actually very simple. A regular platform is an origin-pull or push CDN that purely proxies content with limited static options; it will not adapt to failures or dynamically adjust to current global network conditions. We are quite far beyond that type of CDN today. So here is a lab to help you make your own little CDN and learn how it works.

 

Note: Do this in a lab or test environment, not in production. Even a simple production CDN is far more advanced than this sample lab.

 

Process: You need to install the NGiNX web server on your test system. You can refer to the documentation, How to Install NGINX Open Source | NGINX. Most Linux distributions come with a pre-packaged version of NGiNX in their repositories. Once the web server is installed, it is time to configure a virtual host. For a quick test, you can place and edit the configuration in /etc/nginx/conf.d/, or in /etc/nginx/sites-enabled/ on Ubuntu. Remember to disable the firewall (or open port 80) and make sure you have network access to your test server for later tests.

 

The full sample configuration file is available on GitHub.

 

Place this configuration in the NGiNX configuration directory. Create the directory /tmp/cache and give the NGiNX service user write permission to it. Reload the NGiNX service. If you did everything right, the service will reload without errors. You can then edit the file as you go through the options and reload the service to apply the changes.

 

I will break down the main lines and explain them. The most important one is right at the top.

 

proxy_cache_path /tmp/cache levels=1:2 keys_zone=my_cache:100m max_size=1g inactive=60m;

 

This sets the directory where NGiNX will store cache files and how many levels of subdirectories it creates beneath it. The keys_zone parameter names the shared memory zone that holds the cache keys and metadata (the cached responses themselves live on disk). Here we size that zone at 100 MB and cap the disk cache at 1 GB. The inactive setting can confuse some people, but for now let's just say that if a cached object is not accessed for 1 hour, it is removed to make space for objects that are being accessed. Just remember this: it has nothing to do with what you tell the browser to cache, or with HTTP 304 revalidation. It's purely a maintenance feature. I will get into this later.
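As a reading aid, here is the same directive split across lines with each parameter annotated. This is only a sketch of how the line above breaks down. The figure of roughly 8 thousand keys per megabyte comes from the NGiNX documentation, and the example file path is illustrative.

```nginx
# Belongs in the http context (files under /etc/nginx/conf.d/ are normally included there).
#
#   /tmp/cache         directory that holds the cached responses on disk
#   levels=1:2         two subdirectory levels named from the tail of the MD5 of the
#                      cache key, e.g. /tmp/cache/c/29/<32-character md5>
#   keys_zone=...      shared memory zone "my_cache", 100 MB, for keys and metadata
#                      (about 8 thousand keys per megabyte)
#   max_size=1g        upper bound on disk usage for cached responses
#   inactive=60m       entries not requested for 60 minutes are removed
proxy_cache_path /tmp/cache
    levels=1:2
    keys_zone=my_cache:100m
    max_size=1g
    inactive=60m;
```

NGiNX accepts a directive spread over several lines as long as it ends with a semicolon, so this form behaves the same as the one-line version.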

 

1. The Hostnames for your application. You can define multiple hostnames by separating them with a space.

 

  listen [::]:80;

  listen 80;

  #edit your actual site name
  server_name www.example.com;

 

The above simply tells the web server which hostname(s) we will be serving and which ports to listen on. Here I listen on all IPv4 and IPv6 addresses on port 80. Replace www.example.com with any site; it can even be a host on your local network that has a site for you to try. It can be an actual site!
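For orientation, a minimal sketch of how these lines might sit inside a server block with more than one hostname. The second name, static.example.com, is a made-up placeholder and not part of the sample file.

```nginx
server {
    listen 80;        # all IPv4 addresses, port 80
    listen [::]:80;   # all IPv6 addresses, port 80

    # Several hostnames, separated by spaces.
    # static.example.com is a hypothetical second name for illustration.
    server_name www.example.com static.example.com;
}
```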

 

2. Origin is everything.

  location / {
  proxy_pass http://origin.example.com;
  proxy_set_header Host $host;
  proxy_set_header True-Client-IP $remote_addr;
  }

 

This shouldn't come as a surprise. The origin hostname can be an IP address. I use a LAN IP or my VM's IP as the origin. You can use anything except the server_name itself when it resolves back to this proxy, because logically you would just be creating a loop. If you make this mistake, it shows up as failed requests and errors in the NGiNX error log.

The proxy_pass directive tells the web server where to forward all requests that match the location "/", which is the root of the site and ANY path under it not defined elsewhere.

The proxy_set_header directive sends additional headers to the forward host. "Host" sets the correct Host header, and $host is the built-in variable holding the hostname from the client's request, which matches a "server_name" we set in part 1. You can always specify a static name instead, as long as the remote server can respond to it. The second header sends the actual client IP to the origin in the special True-Client-IP header we are familiar with. Remember that our real origin sits behind our simple CDN, so every request appears to come from the proxy's address rather than the client's. We can add this header for testing.

 

This rule basically tells the NGiNX proxy to forward all requests to origin, nothing too complex.
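If your origin is only reachable by IP, a variant along these lines may help. It is a sketch under assumptions: 192.0.2.10 is a documentation-only placeholder address, and the static Host value stands in for whatever name your origin actually answers to.

```nginx
location / {
    # Assumption: the origin listens on this address (placeholder from the
    # reserved documentation range, replace with your LAN or VM IP).
    proxy_pass http://192.0.2.10;

    # A static Host header, for origins that select the site by hostname.
    proxy_set_header Host www.example.com;

    # Pass the real client address along, as in the rule above.
    proxy_set_header True-Client-IP $remote_addr;
}
```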

 

3. Let's define a list of static objects

location ~* \.(jpg|png|gif|jpeg|webp|css|mp3|wav|swf|mov|doc|pdf|xls|ppt|docx|pptx|xlsx)$ {...

 

This last block adds a rule alongside our default "root" rule in part 2. It contains a number of directives I have added to show what is possible. If a request's file extension matches this rule, the entire block is applied instead of the root rule. I will explain the main ones; the rest you can investigate in your tests.

 

proxy_cache_valid 200 301 7d;

Any response with status 200 OK or 301 Moved Permanently is cached for 7 days. I mentioned "inactive=60m" in the first configuration line; here is more on that. Let's say you have a funny.gif file that gets cached because it matches all the criteria. Ideally the web server also tells the browser to cache it for 7 days, and the simple CDN keeps it in storage for 7 days, provided it is accessed at least once every 60 minutes. Otherwise it is ejected. Because it may be ejected after an hour of inactivity, you will continue to see requests for it at the origin.

 

proxy_cache_min_uses 2;

At least two requests are required before an object is considered for caching. If a file is accessed only once, it is simply served from the origin.

 

proxy_cache my_cache;

This enables caching for the block using the shared memory zone we declared in the first line of the configuration. Not to be confused with the cache key.
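Putting the three directives together, a static-objects block could look roughly like the sketch below. It is not the full block from the sample file: the extension list is shortened, and repeating proxy_pass and the Host header is my assumption, based on the fact that a regex location does not inherit proxy_pass from location "/".

```nginx
location ~* \.(jpg|png|gif|css)$ {
    # A regex location needs its own proxy_pass; it is not inherited from "/".
    proxy_pass http://origin.example.com;
    proxy_set_header Host $host;

    # Use the zone declared by proxy_cache_path (keys_zone=my_cache:100m).
    proxy_cache my_cache;

    # Keep 200 and 301 responses for 7 days.
    proxy_cache_valid 200 301 7d;

    # Only store an object once it has been requested twice.
    proxy_cache_min_uses 2;
}
```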

 

4. Let's fire it up.

To test, all you have to do is spoof DNS! Once you reload the web server, edit the hosts file on a system with a web browser, point www.example.com to your simple CDN's IP address, and load the site in the browser a couple of times. Check the /tmp/cache directory on the test server and you will see new directories and files in there. These are the disk cache files!
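If you would rather see cache behavior from the browser than from the filesystem, one optional addition is a response header carrying the cache status. The header name X-Cache-Status is an arbitrary choice of mine and not part of the sample file; $upstream_cache_status is a built-in NGiNX variable.

```nginx
# Place inside the server block, or inside the static-objects location.
# Values include MISS, HIT and EXPIRED; when no cache is involved the
# variable is empty and the header is not sent.
add_header X-Cache-Status $upstream_cache_status;
```

With this in place, the browser's developer tools should show MISS on the first requests for a static object and HIT once it has been cached.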

 

This is a good way to learn the basics of caching and CDNs. It's nowhere near the level of the Akamai Intelligent Platform, but it allows you to assess any site that you want to evaluate, as well as look for odd behavior in a test application that isn't yet publicly accessible. You can of course add a lot of options and directives. NGiNX is very programmable and also has Lua extensions (beta) if you like scripting.

 

A simple CDN has many limitations that you will start to observe, and they are not limited to this lab configuration, where I have skipped things like no-cache cookies. I hope to hear some questions and discussion in the comments.

 

I wrote this NGiNX configuration without actually running it through a system; you can check it with the nginx -t option. If you spot bugs, please let me know or submit a pull request.

 

Kristle Chia, thanks for helping me write my first post in the community.
