05/27/2022 | Press release | Distributed by Public on 05/27/2022 09:56
2022-05-27 | 9 min read
Web applications are growing in complexity, and it is becoming harder to see through a highly dynamic UI and track requests and responses, authentication packets, redirects, and so on. To make life a little easier, there is a standard logging mechanism that captures all the HTTP traffic behind fancy front-end pages: the HTTP Archive (HAR) file, which is easily accessible in most modern web browsers. In this blog we will look at HAR files and how they provide key insight into web traffic.
What is a HAR file?
The HAR (HTTP Archive) format is a JSON-structured file format that records the network traffic between a browser and the servers providing content for a webpage. The HAR specification defines an archival format for HTTP transactions, which a web browser can use to export detailed performance data about the pages it loads.
A draft specification for this format was produced by the Web Performance Working Group of the W3C. Although it was never officially published as a W3C standard, the format is widely used by many applications.
Why HAR files?
HAR files supply the information needed to debug and visualize a web page's performance and functionality. Here are several use cases where a HAR file could be useful:
How to capture HAR files?
HAR export is supported in almost all modern web browsers. In this section we will discuss how to open the developer tools and export a HAR file from there.
Once the developer tools are open, the process is the same across browsers.
Image: Google Chrome devtools network tab.
In the developer tools, navigate to the Network tab, which lists all network activity as a web page loads. Some important steps to perform before exporting a HAR file are:
Finally, to export, click the download icon (⭳) and the whole logged session will be exported as a ".har" file. (In Firefox, click ⚙ in the developer tools and then click "Save All As HAR".)
What is there inside a HAR file?
At a very high level, a HAR file can be viewed as a log object with three key components: creator, pages, and entries. It also records the HAR version (typically 1.2).
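As a rough illustration of that top-level layout, the sketch below parses a tiny HAR document with Python's standard json module and reports the version, creator, and the number of pages and entries. The sample document, the browser name in it, and the summarize_log helper are made up for this example; a real export is far larger.

```python
import json

# Hypothetical, heavily trimmed HAR document used only for illustration.
SAMPLE_HAR = """
{
  "log": {
    "version": "1.2",
    "creator": {"name": "ExampleBrowser", "version": "1.0"},
    "pages": [
      {"id": "page_1", "title": "https://example.com/",
       "startedDateTime": "2022-05-27T09:56:00.000Z", "pageTimings": {}}
    ],
    "entries": []
  }
}
"""


def summarize_log(har_text: str) -> dict:
    """Return the top-level facts about a HAR document."""
    log = json.loads(har_text)["log"]
    return {
        "version": log["version"],
        "creator": log["creator"]["name"],
        # "pages" may be absent when the exporter does not group by page.
        "page_count": len(log.get("pages", [])),
        "entry_count": len(log["entries"]),
    }


print(summarize_log(SAMPLE_HAR))
```

For a real capture, the same helper could be fed the text of the exported ".har" file instead of the inline sample.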
Image: Pages object in a HAR file.
The pages object is a list of all the pages visited, and each entry in it contains:
Image: entries object in a HAR file.
Each object in the entries list contains:
i. _priority - Priority of the transaction.
ii. _resourceType - Type of resources to be loaded.
iii. pageref - Reference to the parent page from which this transaction originated.
iv. startedDateTime - Timestamp for the transaction start.
v. time - Total elapsed time of the request in milliseconds.
vi. request - Detailed info about the HTTP request.
vii. response - Detailed info about the HTTP response.
viii. cache - Info about cache usage.
ix. timings - Detailed timing info about request/response round trip.
x. serverIPAddress - IP address of the webserver.
xi. connection - Unique ID of the parent TCP/IP connection. This field is optional and is not guaranteed to be unique.
The request and response objects have additional info such as URLs, headers, and decrypted data. For more details about each of the fields you can visit this page.
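To make the entry fields above more concrete, here is a minimal sketch that ranks entries by their time field and adds up request time per pageref. The sample entries and both helper names are hypothetical, and the per-page sum is cumulative request time rather than page load time, because requests usually overlap.

```python
from collections import defaultdict

# Hypothetical entries, trimmed to the fields used below; real exports carry many more.
SAMPLE_ENTRIES = [
    {"pageref": "page_1", "time": 120.5,
     "request": {"method": "GET", "url": "https://example.com/"},
     "response": {"status": 200}},
    {"pageref": "page_1", "time": 30.25,
     "request": {"method": "GET", "url": "https://example.com/app.js"},
     "response": {"status": 200}},
    {"pageref": "page_2", "time": 480.0,
     "request": {"method": "POST", "url": "https://example.com/login"},
     "response": {"status": 302}},
]


def slowest_entries(entries, limit=3):
    """Return up to `limit` entries, longest total elapsed time first."""
    return sorted(entries, key=lambda e: e["time"], reverse=True)[:limit]


def request_time_by_page(entries):
    """Sum each entry's `time` (milliseconds) under its parent page id."""
    totals = defaultdict(float)
    for entry in entries:
        # pageref is optional in HAR, so fall back to a placeholder key.
        totals[entry.get("pageref", "<no page>")] += entry["time"]
    return dict(totals)


for entry in slowest_entries(SAMPLE_ENTRIES, limit=2):
    request, response = entry["request"], entry["response"]
    print(f'{entry["time"]:8.1f} ms  {response["status"]}  {request["method"]} {request["url"]}')
print(request_time_by_page(SAMPLE_ENTRIES))
```

With a real capture, the list passed in would be the entries array of the parsed log object.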
HAR vs Proxy Tools for Application Research
At Keysight ATI we research and analyze web traffic extensively. We have found some advantages of using HAR files over a more traditional proxy setup.
HAR Files | Proxy Tools |
Taking a HAR capture is easy and requires almost no setup. | Setting up a proxy tool for traffic decryption can be quite a complicated task. |
No third-party tools are required to log and export HAR files; the feature is built into most modern browsers. | A MITM proxy relies on third-party tools such as Fiddler, sslsplit, or mitmproxy. |
No additional certificate needs to be installed to get decrypted application-layer traffic through HAR files. | A MITM proxy typically requires a third-party certificate to be installed. |
HTTP/2 and HTTP/3 traffic is readily available in decrypted form in HAR logs. | For HTTP/2 and HTTP/3 decryption, additional steps are usually required. |
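One way to check the last row against a capture is to count entries by the httpVersion string recorded in each response object. This is only a sketch: the sample entries and the protocol_counts helper are hypothetical, and the exact protocol labels (for example "h2" versus "HTTP/2") differ between browsers, so the values below are assumptions.

```python
from collections import Counter

# Hypothetical entries; the httpVersion labels are assumed and vary by browser.
SAMPLE_PROTOCOL_ENTRIES = [
    {"request": {"url": "https://example.com/"},
     "response": {"status": 200, "httpVersion": "h2"}},
    {"request": {"url": "https://example.com/app.js"},
     "response": {"status": 200, "httpVersion": "h2"}},
    {"request": {"url": "http://example.org/legacy"},
     "response": {"status": 200, "httpVersion": "http/1.1"}},
]


def protocol_counts(entries):
    """Count entries per protocol label, lower-cased for consistency."""
    return Counter(
        entry["response"].get("httpVersion", "unknown").lower()
        for entry in entries
    )


for protocol, count in protocol_counts(SAMPLE_PROTOCOL_ENTRIES).most_common():
    print(f"{protocol}: {count}")
```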