Pascal Maugeri

HOWTO generate a simple Excel report based on your web page HAR

Blog post created by Pascal Maugeri on Oct 1, 2015

This post describes a simple approach I use to analyse a web page and see at a glance the characteristics of all the objects loaded (e.g. Content-Type, size), and in particular which objects are cached on the Akamai platform.

 

The HTTP Archive (HAR) format is based on JSON and can be generated using different web browsers or proxies. The file contains everything about the HTTP requests and responses exchanged while your browser loads a web page, and may also include the content of each object. It is the data used to construct your web page's waterfall chart.
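As an illustration, here is a minimal hand-written HAR skeleton (a real capture has many more fields) together with a jq query that counts the entries, i.e. the objects loaded by the page:

```shell
# Minimal, hand-written HAR skeleton -- illustrative only.
cat > minimal.har <<'EOF'
{
  "log": {
    "version": "1.2",
    "creator": { "name": "example", "version": "1.0" },
    "entries": [
      {
        "request":  { "method": "GET", "url": "http://www.example.com/" },
        "response": {
          "status": 200,
          "bodySize": 1270,
          "headers": [ { "name": "Content-Type", "value": "text/html" } ]
        }
      }
    ]
  }
}
EOF

# Each entry is one request/response pair; count them:
jq '.log.entries | length' minimal.har
```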

 

Generating your HAR file


Browser based method

The simplest way to generate the HAR corresponding to your web page is to use the Developer Tools that come built into the Chrome browser.

When the DevTools window is open, just load your web page and use the contextual menu in the Network tab to save the capture as HAR (see screenshot below).


[Screenshot: Chrome DevTools Network tab with the contextual menu open]


Using PhantomJS

You can also use PhantomJS from a command line to generate the HAR file.

I have slightly modified the original netsniff.js script provided on the PhantomJS website to include the Pragma headers used on the Akamai platform:

 

var page = require('webpage').create(),
    system = require('system');

// Send the Akamai debug Pragma headers with every request so that the
// responses include the corresponding X-* diagnostic headers.
page.customHeaders = {
  "Pragma": "akamai-x-feo-trace, akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-check-cacheable, akamai-x-get-cache-key, akamai-x-get-extracted-values, akamai-x-get-nonces, akamai-x-get-ssl-client-session-id, akamai-x-get-true-cache-key, akamai-x-serial-no, akamai-x-get-request-id"
};


Note that the complete updated script is attached to this post.

 

With these Pragma headers sent, you will receive additional information on how each object is handled by the Akamai platform (cached/not cached, etc.).
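To make this concrete, here is a sketch of what those debug headers look like once they land in the HAR, with a jq one-liner that pulls out the cacheability verdict per URL. The file and header values below are illustrative, not a real capture:

```shell
# Illustrative HAR fragment carrying the Akamai debug response headers.
cat > debug.har <<'EOF'
{ "log": { "entries": [ {
  "request":  { "url": "http://www.akamai.com/" },
  "response": { "status": 200, "bodySize": 1024, "headers": [
    { "name": "X-Check-Cacheable", "value": "YES" },
    { "name": "X-Cache", "value": "TCP_HIT from a23-0-0-0.deploy.akamaitechnologies.com (AkamaiGHost)" }
  ] } } ] } }
EOF

# URL plus cacheability verdict, one CSV row per object:
jq -r '.log.entries[] | [ .request.url,
        ((.response.headers[] | select(.name == "X-Check-Cacheable").value) // "") ]
       | @csv' debug.har
```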

 

Then you can run it as follows:


$ phantomjs phantomjs_netsniff.js http://www.akamai.com > akamai.har



Export to CSV


To convert the data in the HAR file, I use the command-line JSON processor jq. Note that if you don't want to install the tool in your working environment, you may use it online.


Here is the jq directive I am using to export the HAR file to CSV format:

 

[ "URL", "Status", "Body size", "Content-Type", "Content-Encoding", "Server", "X-Check-Cacheable", "X-Cache" ],
(.log.entries[] | [
    .request.url,
    .response.status,
    .response.bodySize,
    ((.response.headers[] | select(.name == "Content-Type").value) // ""),
    ((.response.headers[] | select(.name == "Content-Encoding").value) // ""),
    ((.response.headers[] | select(.name == "Server").value) // ""),
    ((.response.headers[] | select(.name == "X-Check-Cacheable").value) // ""),
    ((.response.headers[] | select(.name == "X-Cache").value) // "")
]) | @csv
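The `// ""` part is jq's alternative operator: when a response carries no matching header, select() produces nothing and the empty string is emitted instead, so every CSV row keeps the same number of columns. A tiny sketch:

```shell
# Server header present, Content-Type absent: the // operator fills the gap.
echo '{"headers":[{"name":"Server","value":"Apache"}]}' \
  | jq -c '[ ((.headers[] | select(.name == "Content-Type").value) // ""),
             ((.headers[] | select(.name == "Server").value) // "") ]'
```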


Putting it together, here is the complete command line when running jq locally:

 

$ cat akamai.har | jq '[ "URL", "Status", "Body size", "Content-Type", "Content-Encoding", "Server", "X-Check-Cacheable", "X-Cache" ],
(.log.entries[] | [
    .request.url,
    .response.status,
    .response.bodySize,
    ((.response.headers[] | select(.name == "Content-Type").value) // ""),
    ((.response.headers[] | select(.name == "Content-Encoding").value) // ""),
    ((.response.headers[] | select(.name == "Server").value) // ""),
    ((.response.headers[] | select(.name == "X-Check-Cacheable").value) // ""),
    ((.response.headers[] | select(.name == "X-Cache").value) // "")
]) | @csv' | sed 's/\\"//g' | sed 's/"//g' > akamai.csv
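A word of caution on the two sed calls: stripping every double quote also breaks fields that legitimately contain commas (the X-Cache value often does). A variation I find safer, sketched below on a tiny stand-in file rather than a real capture, is to let jq emit the rows raw with its -r flag and keep the CSV quoting intact:

```shell
# Tiny stand-in for a real HAR capture (illustrative only).
echo '{"log":{"entries":[{"request":{"url":"http://www.example.com/"},"response":{"status":200,"bodySize":42,"headers":[]}}]}}' > tiny.har

# -r prints the @csv strings raw, so no sed post-processing is needed.
jq -r '[ "URL", "Status", "Body size" ],
       (.log.entries[] | [ .request.url, .response.status, .response.bodySize ])
       | @csv' tiny.har > tiny.csv

cat tiny.csv
```

Excel and most spreadsheet tools handle the quoted fields correctly on import.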


You are now ready to import the file in Excel :-)




