Keysight Technologies Inc.

05/27/2022 | Press release | Distributed by Public on 05/27/2022 09:56

A Comprehensive Guide on HAR Files

2022-05-27 |  9 min read 

Web applications are growing in complexity, and it is becoming harder to see through a highly dynamic UI and track requests and responses, authentication exchanges, redirects, and so on. To make life a little easier, most modern web browsers provide a standard logging mechanism that captures the HTTP traffic behind the front-end pages: the HTTP Archive (HAR) file. In this blog we will look at HAR files and how they provide key insight into web traffic.

What is a HAR File?

A HAR (HTTP Archive) file is a JSON-formatted file that records the network traffic between a browser and the servers providing content for a web page. The HAR specification defines an archival format for HTTP transactions, which a web browser can use to export detailed performance data about the pages it loads.

The specification for this format was produced by the Web Performance Working Group of the W3C. Although the W3C never officially published it, the specification is widely used by many applications.

Why HAR files?

HAR files supply the information needed to debug and visualize a web page's performance and functionality. Here are several use cases where a HAR file can be useful:

  • Blocked Requests: Requests and responses can be blocked because of a user's network, and it is often very difficult for a developer to understand why their endpoints or CDN are not loading properly. HAR logs can provide the insight needed in this scenario.
  • Network Visualization: A HAR file contains detailed load times for the different components of a web page, broken down by network request. HAR logs lend themselves well to visualization, which helps in understanding a web page's performance.
  • Decrypted Traffic Forensics: Some middleboxes (e.g., a proxy) can modify the contents of a request or response. HAR files are very useful for making sure everything in a web page is loading properly, because the contents are decrypted in the HAR file and isolated from the rest of the network traffic flowing through the machine.
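
As a rough illustration of the first two use cases, the sketch below scans the entries of a HAR log for requests that failed or were blocked and lists the slowest ones. The field names (log, entries, response.status, time) come from the HAR 1.2 format. The sample data, the function names, and the choice to treat status 0 as "no response received" are assumptions made for this example, so check them against your own captures.

```python
import json

# Hypothetical, heavily trimmed HAR log used only for illustration.
SAMPLE_HAR = {
    "log": {
        "version": "1.2",
        "entries": [
            {"time": 120.5,
             "request": {"method": "GET", "url": "https://example.com/"},
             "response": {"status": 200}},
            {"time": 950.0,
             "request": {"method": "GET", "url": "https://cdn.example.com/app.js"},
             "response": {"status": 0}},
            {"time": 40.2,
             "request": {"method": "GET", "url": "https://example.com/api/user"},
             "response": {"status": 403}},
        ],
    }
}


def load_har(path):
    """Load a HAR capture from disk; a HAR file is plain JSON."""
    # "utf-8-sig" also tolerates a leading byte-order mark.
    with open(path, encoding="utf-8-sig") as handle:
        return json.load(handle)


def failed_requests(har):
    """Return (status, url) for entries with status 0 or an HTTP error (>= 400)."""
    failed = []
    for entry in har["log"]["entries"]:
        status = entry["response"]["status"]
        # Assumption: status 0 means the browser never received a response.
        if status == 0 or status >= 400:
            failed.append((status, entry["request"]["url"]))
    return failed


def slowest_requests(har, top=5):
    """Return (elapsed_ms, url) for the `top` slowest entries, slowest first."""
    entries = sorted(har["log"]["entries"], key=lambda e: e["time"], reverse=True)
    return [(e["time"], e["request"]["url"]) for e in entries[:top]]


if __name__ == "__main__":
    for status, url in failed_requests(SAMPLE_HAR):
        print(f"failed ({status}): {url}")
    for elapsed, url in slowest_requests(SAMPLE_HAR, top=2):
        print(f"{elapsed:8.1f} ms  {url}")
```

To run the same checks on a real capture, replace SAMPLE_HAR with the result of load_har() called on the path of an exported ".har" file.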

How to capture HAR files?

HAR export is supported in almost all modern web browsers. In this section we will discuss how to open the developer tools and export a HAR file from there.

  • Chrome - Open the developer tools from (Menu -> More Tools -> Developer Tools)
  • Firefox - Open the developer tools from (Menu -> More tools -> Web Developer Tools)
  • Edge - Open the developer tools from (Menu -> More tools -> Developer Tools)

Once the developer tools are open, the process is the same across browsers.

Image: Google Chrome devtools network tab.

In the developer tools, navigate to the Network tab. All network activity is listed here when a web page is opened. Some important steps to perform before exporting a HAR file are:

  1. The "Preserve log" checkbox should be checked; otherwise the Network tab will clear previous records when you visit a new web page (even on the same website).
  2. The "Disable cache" checkbox should be checked. Otherwise, content that is already cached is served from the cache and its network exchange will not be captured.
  3. Last but not least, HAR files contain decrypted requests, responses, and cookies, and might contain personal information. Always capture HAR files in an Incognito window and check them carefully before sharing the logs with others, especially if the recorded activity included logging into an account.
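
To make the last point concrete, here is a minimal sketch of scrubbing a HAR log before sharing it. It blanks out cookie values and a short list of sensitive headers. The header list, the "REDACTED" placeholder, the function name, and the sample data are all assumptions for illustration; the headers and cookies arrays on the request and response objects are part of the HAR format.

```python
import copy
import json

# Assumed list of headers worth hiding; extend it for your own application.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie"}

# Hypothetical single-entry HAR log used only for illustration.
SAMPLE_HAR = {
    "log": {
        "version": "1.2",
        "entries": [{
            "request": {
                "method": "POST",
                "url": "https://example.com/login",
                "headers": [
                    {"name": "Cookie", "value": "session=abc123"},
                    {"name": "Accept", "value": "*/*"},
                ],
                "cookies": [{"name": "session", "value": "abc123"}],
            },
            "response": {
                "status": 200,
                "headers": [{"name": "Set-Cookie", "value": "session=def456; HttpOnly"}],
                "cookies": [{"name": "session", "value": "def456"}],
            },
        }],
    }
}


def redact_har(har):
    """Return a copy of the HAR with cookie and sensitive header values replaced."""
    clean = copy.deepcopy(har)  # leave the original capture untouched
    for entry in clean["log"]["entries"]:
        for message in (entry["request"], entry["response"]):
            for header in message.get("headers", []):
                if header["name"].lower() in SENSITIVE_HEADERS:
                    header["value"] = "REDACTED"
            for cookie in message.get("cookies", []):
                cookie["value"] = "REDACTED"
    return clean


if __name__ == "__main__":
    print(json.dumps(redact_har(SAMPLE_HAR), indent=2))
```

This sketch only covers headers and cookies. Request bodies, response bodies, and URL query strings can also carry credentials or personal data, so a manual review is still needed before sharing.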

Finally, to export, click the download icon and the entire logged session will be exported as a ".har" file. (In Firefox, use the "Save All As HAR" option in the developer tools.)

What is inside a HAR file?

At a very high level, a HAR file can be viewed as a log object with three key components: creator, pages, and entries. It also records the version of the HAR format (typically 1.2).

  • Creator: The creator section consists of the name and version of the tool used to capture the logs. It may also contain an optional field called comment.
  • Pages: This section is a list of all the pages that have been exported.

Image: Pages object in a HAR file.

Each entry in the list of visited pages contains:

  1. The start timestamp of the page load.
  2. A unique identifier for the page within the scope of the log (used by the entries section).
  3. The title of the page.
  4. Several page load timings.

  • Entries: This section is a list of request-response objects, one for each exported request-response pair, irrespective of the page.

Image: entries object in a HAR file.

The contents of an object in the entries list are:

i. _priority - Priority of the transaction.

ii. _resourceType - Type of resources to be loaded.

iii. pageref - Reference to the parent page from which this transaction originated.

iv. startedDateTime - Timestamp for the transaction start.

v. time - Total elapsed time of the request in milliseconds.

vi. request - Detailed info about the HTTP request.

vii. response - Detailed info about the HTTP response.

viii. cache - Info about cache usage.

ix. timings - Detailed timing info about request/response round trip.

x. serverIPAddress - IP address of the webserver.

xi. connection - Unique ID of the parent TCP/IP connection. This field is optional and is not guaranteed to be unique.
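
The structure described above can be explored with a few lines of Python. The following is a minimal sketch, assuming a log with the version, creator, pages, and entries fields; the sample values, the summarize function, and the keys of the dictionary it returns are made up for illustration.

```python
from collections import Counter

# Hypothetical, trimmed HAR log used only for illustration.
SAMPLE_HAR = {
    "log": {
        "version": "1.2",
        "creator": {"name": "WebInspector", "version": "537.36"},
        "pages": [{
            "startedDateTime": "2022-05-27T09:56:00.000Z",
            "id": "page_1",
            "title": "https://example.com/",
            "pageTimings": {"onContentLoad": 410.2, "onLoad": 812.4},
        }],
        "entries": [
            {"pageref": "page_1", "startedDateTime": "2022-05-27T09:56:00.010Z",
             "time": 120.5},
            {"pageref": "page_1", "startedDateTime": "2022-05-27T09:56:00.150Z",
             "time": 30.0},
        ],
    }
}


def summarize(har):
    """Summarize the HAR version, creator, and each page with its request count."""
    log = har["log"]
    creator = log["creator"]
    # Each entry points back to its parent page through "pageref".
    requests_per_page = Counter(
        entry.get("pageref", "(no page)") for entry in log["entries"]
    )
    pages = []
    for page in log.get("pages", []):
        pages.append({
            "id": page["id"],
            "title": page["title"],
            "started": page["startedDateTime"],
            "on_load_ms": page.get("pageTimings", {}).get("onLoad"),
            "requests": requests_per_page[page["id"]],
        })
    return {
        "har_version": log["version"],
        "creator": f'{creator["name"]} {creator["version"]}',
        "pages": pages,
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE_HAR)
    print(summary["har_version"], "captured by", summary["creator"])
    for page in summary["pages"]:
        print(page["id"], page["title"], page["requests"], "requests")
```

Field names that start with an underscore, such as _priority and _resourceType, are custom fields added by the exporting tool, so code that reads them should not assume they are always present.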

The request and response objects have additional info such as URLs, headers, and decrypted data. For more details about each of the fields you can visit this page.
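
As a small sketch of reading the request and response objects, the snippet below pulls the method, URL, status, HTTP version, and content type out of a single entry. The sample entry, the helper names, and the keys of the returned dictionary are hypothetical; method, url, status, httpVersion, headers, and content are fields of the HAR format.

```python
# Hypothetical entry from the entries list, used only for illustration.
SAMPLE_ENTRY = {
    "request": {
        "method": "GET",
        "url": "https://example.com/api/items",
        "httpVersion": "HTTP/2",
        "headers": [{"name": "Accept", "value": "application/json"}],
    },
    "response": {
        "status": 200,
        "httpVersion": "HTTP/2",
        "headers": [{"name": "content-type", "value": "application/json"}],
        "content": {"size": 27, "mimeType": "application/json",
                    "text": '{"items": ["a", "b", "c"]}'},
    },
}


def header_value(headers, name):
    """Return the first header value matching `name` (case-insensitive), or None."""
    for header in headers:
        if header["name"].lower() == name.lower():
            return header["value"]
    return None


def describe_entry(entry):
    """Collect the most commonly inspected request/response details of an entry."""
    request, response = entry["request"], entry["response"]
    return {
        "method": request["method"],
        "url": request["url"],
        "status": response["status"],
        "http_version": response.get("httpVersion"),
        "content_type": header_value(response.get("headers", []), "Content-Type"),
        "body_size": response.get("content", {}).get("size"),
    }


if __name__ == "__main__":
    for key, value in describe_entry(SAMPLE_ENTRY).items():
        print(f"{key}: {value}")
```

The exact httpVersion strings differ between browsers, so it is safer to compare them loosely than to rely on one spelling.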

HAR vs Proxy Tools for Application Research

At Keysight ATI we extensively analyze and research web traffic. We have found some advantages to using HAR files over a more traditional proxy setup.

HAR Files | Proxy Tools
Taking a HAR capture is easy and requires almost no setup. | Setting up a proxy tool for traffic decryption can be quite complicated.
Third-party tools are not required to log and export HAR files. This feature is built into most modern browsers. | A MITM proxy relies on third-party tools such as Fiddler, sslsplit, or mitmproxy.
No additional certificate needs to be installed to get decrypted application-layer traffic through HAR files. | A MITM proxy typically requires third-party certificates to be installed.
HTTP/2 and HTTP/3 traffic is readily available in decrypted form in HAR logs. | Decrypting HTTP/2 and HTTP/3 traffic usually requires additional steps.

In this blog I have tried to summarize information about HAR files. When choosing a decryption tool, we need to keep in mind that a HAR file is focused on the application layer and does not contain information from the lower layers of the OSI model. That focus makes it more powerful when we are researching only the layer 7 part of the stack.