Pascal Maugeri

Design pattern for implementing Site Failover in Property Manager

Blog post created by Pascal Maugeri (Employee) on Jul 23, 2015

This post describes how to implement Site Failover in Property Manager. The rules needed to implement this function are grouped in a subtree that can be inserted into your Property Manager configuration. The post also explains how to simulate a failure of your origin server, so you don't need to physically disconnect or stop it.

 

The design pattern presented here is based on best practices collected by the Akamai Paris Professional Services team.

 

Requirements

Before you start configuring, make sure the Site Failover module appears in the list of modules available to your configuration.

The configuration presented here fails over from your origin to a sorry page hosted in a NetStorage storage group. The second requirement is therefore a configured NetStorage group with an HTML page sorry_page.html stored in the /failover/ folder.

 

Overview

We are going to configure several rules following this model:

[Screenshot: the "Failover Rules" subtree in Property Manager]

 

 

Rule: Failover Rules

The root of “Failover Rules” is used to configure how the health of the origin is detected (number of retries, interval between retries and maximum reconnects).

 

Rule: Failover Trigger

This rule defines the criteria for considering that an origin has failed: generally an origin timeout or specific status codes. The Site Failover behaviour is enabled here and specifies the hostname and path used to serve the sorry page.
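
The trigger condition can be pictured as a simple predicate. The Python sketch below is only a conceptual model, not Property Manager syntax: the list of status codes and the names `OriginResult` and `should_fail_over` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of origin status codes treated as failures (an assumption;
# choose the codes that make sense for your own origin).
FAILURE_STATUS_CODES = frozenset({500, 502, 503, 504})


@dataclass
class OriginResult:
    """Outcome of one forward request to the origin (illustrative model)."""
    status: Optional[int]      # None when no response was received
    timed_out: bool = False


def should_fail_over(result: OriginResult) -> bool:
    """Return True when the origin is considered failed."""
    if result.timed_out or result.status is None:
        return True
    return result.status in FAILURE_STATUS_CODES
```

In this model a timeout always triggers failover, while a status code triggers it only if it is in the configured set.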

 

Rule: Netstorage Map

In this rule, the criteria match requests that are served from the alternate hostname and the failover path.

 

Rule: Sorry Page Settings

When Site Failover is triggered, a new internal request is fired. A separate rule is needed to catch this subsequent request, for instance to adapt the returned status code.

 

Note: some customers add a ‘Content Provider Code’ behaviour here, in order to display statistics on the failover page in the monitor panel.
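
The two-pass flow described above (original request, then an internal request for the sorry page) can be modelled roughly as follows. This is a sketch under stated assumptions: the alternate hostname `failover.example.com`, the 503 status code and the names `Request`, `to_failover_request` and `response_status` are all hypothetical; only the path `/failover/sorry_page.html` comes from the requirements above.

```python
from dataclasses import dataclass, replace

ALTERNATE_HOST = "failover.example.com"      # hypothetical alternate hostname
SORRY_PAGE_PATH = "/failover/sorry_page.html"
SORRY_PAGE_STATUS = 503                       # assumed status for the sorry page


@dataclass(frozen=True)
class Request:
    host: str
    path: str
    is_failover: bool = False


def to_failover_request(original: Request) -> Request:
    """Build the internal request fired when Site Failover is triggered."""
    return replace(original, host=ALTERNATE_HOST, path=SORRY_PAGE_PATH,
                   is_failover=True)


def response_status(request: Request, upstream_status: int) -> int:
    """Model of "Sorry Page Settings": adapt the status on the failover request only."""
    return SORRY_PAGE_STATUS if request.is_failover else upstream_status
```

The point of the model is that the status adaptation applies only to the second, internal request, which is why it lives in its own rule.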

 

Rule: Test on Staging

It can be very useful to simulate an origin failure. This rule shows how to do so on the Akamai Staging network, triggering the failure with a query string parameter.
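
A small helper can build the test URL. This is a minimal sketch: the parameter name `simulate_failover` and its value are hypothetical and must be replaced by whatever your "Test on Staging" rule actually matches on.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query string parameter; use the one configured in your rule.
FAILOVER_TEST_PARAM = "simulate_failover"


def add_failover_trigger(url: str, param: str = FAILOVER_TEST_PARAM,
                         value: str = "1") -> str:
    """Return `url` with the failover test parameter appended to its query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, value))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `add_failover_trigger("https://www.example.com/")` returns `https://www.example.com/?simulate_failover=1`. The request still has to reach the Staging network for the rule to match; how you point your client at Staging is outside this sketch.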

 

Rule: Origin Timeout

If you need to simulate a timeout, you can trigger a real one with this rule: www.akamai.com:81 is blocked by the Akamai firewall, so the connection genuinely times out.

Because this rule uses an advanced part, it has to be set by an Akamai representative.

 


Important Update: add the criterion CDNNetwork=Staging to this rule as well, to protect your Production.

 

 

Rule: Static Content

Site Failover is deactivated in the "Static Content" rule so that a failed request for something that is not a page (such as a static object) is not answered with the sorry page.

 

Note: for this to work, the rule must be defined after the “Failover Trigger” rule, because it resets the failover behaviour. Otherwise, the failover mechanism serves the alternate HTML content even for those static resources.
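
The ordering constraint can be illustrated with a toy evaluator in which every matching rule is applied in order and a later rule overwrites the behaviours set by an earlier one. This is a simplified model, not Property Manager's actual engine; the file extensions and all names below are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Request = Dict[str, object]
Rule = Tuple[str, Callable[[Request], bool], Dict[str, object]]


def evaluate(rules: List[Rule], request: Request) -> Dict[str, object]:
    """Apply every matching rule in order; later rules overwrite earlier ones."""
    behaviours: Dict[str, object] = {}
    for _name, matches, settings in rules:
        if matches(request):
            behaviours.update(settings)
    return behaviours


def origin_failed(req: Request) -> bool:
    return bool(req["origin_failed"])


def is_static(req: Request) -> bool:
    # Hypothetical list of static extensions.
    return str(req["path"]).endswith((".css", ".js", ".png", ".jpg"))


failover_trigger: Rule = ("Failover Trigger", origin_failed, {"site_failover": True})
static_content: Rule = ("Static Content", is_static, {"site_failover": False})

request: Request = {"path": "/img/logo.png", "origin_failed": True}

# Static Content after Failover Trigger: failover is switched off for the image.
correct = evaluate([failover_trigger, static_content], request)
# Reversed order: the trigger re-enables failover and the image gets the sorry page.
wrong = evaluate([static_content, failover_trigger], request)
```

With the rules in the recommended order, `correct["site_failover"]` is `False`; with the order reversed, `wrong["site_failover"]` is `True`.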

 

 

 

 

 

 

 

Outcomes