04/05/2021 | News release | Distributed by Public on 04/05/2021 07:18
Introduction
Malware reverse engineers perform dynamic code analysis to inspect a program during execution. This typically involves using a debugger to monitor a suspect process. A complementary approach is to interrogate a running process using Dynamic Binary Instrumentation (DBI) frameworks. While a debugger allows you to attach to a process, DBI techniques allow you to inject and execute code within a process to examine its internals.
Well-known DBI frameworks include Intel's Pin, DynamoRIO, and Frida. These frameworks are often used to assess proprietary programs and evaluate program performance, but they can also be applied to accelerate malware analysis. They allow analysts to hook functions to observe API calls, assess their inputs and outputs, and even modify instructions and data during execution. DBI frameworks target both desktop and mobile operating systems (i.e., Windows, macOS, GNU/Linux, iOS, Android™, and QNX) and provide well-documented APIs to facilitate tool development.
This blog demonstrates how to use Frida to automate reverse engineering workflows. Specifically, we will use the framework to identify and dump deobfuscated executable content. We will discuss Frida's key features and explain the core components of a Frida Python script. With this knowledge, analysts can rapidly build custom tools to perform binary analysis.
Getting Started with Frida
Frida is free and open-source software created by Ole André V. Ravnås. It allows analysts to inject JavaScript into programs to observe, intercept, and modify the inputs and outputs of function calls during execution. It works on a variety of desktop and mobile operating systems. Frida provides command-line tools for those who want immediate access to its benefits, but the framework's functionality and flexibility are best experienced using the available Python bindings.
Frida requires a Python 3 install on a Windows, macOS or GNU/Linux operating system. This article will focus on Windows malware, so we will use a Windows environment for analysis.
To install Frida, run the following command from an Internet-connected machine:
pip install frida-tools
Frida-trace for API Tracing
Frida-trace is one of several command-line tools in the Frida framework that has clear benefits for malware analysis. Malware analysts often spend time tracing API calls; this tool helps automate tracing by allowing analysts to display and process the inputs and outputs of a specified function. Frida should not be mistaken for an emulator framework. It actually executes the target program, so it should be used in an isolated environment.
To briefly demonstrate frida-trace, we will examine this publicly available 64-bit Windows executable. For our analysis we'll refer to the file as sample.exe. To spawn a process and begin tracing function calls, use the following command line format:
frida-trace -f <executable> -i <function(s) to monitor>
There are numerous APIs worth tracing for malware analysis. For example, it is often helpful to monitor files that are created or opened during execution. We could execute the following command to monitor all APIs that include the string 'CreateFile':
frida-trace -f sample.exe -i *CreateFile*
While this command provides comprehensive coverage of CreateFile variants, the output (not shown here) becomes overwhelming. To further refine this command, we will focus on tracing only CreateFileA, located in kernel32.dll. We chose this API because it appears in the program's Import Address Table (IAT). The following command focuses on this API and results in the output seen in Figure 1:
The output shows a single call to the API we specified but offers no supporting information. This is a good starting point, but we need more detailed output. For example, what specific files or I/O device(s) does this function call reference?
The Frida output references a handler called CreateFileA.js. Within the framework, handler files define actions that occur when the function is called and when it returns. CreateFileA.js contains the following default content:
There is quite a bit of text here to provide context. Our focus is on the references to onEnter and onLeave. They are displayed in orange for emphasis. The function onEnter specifies code to execute when CreateFileA is called. By default, it contains a log statement to print the API name to make it clear this function was called, as seen in Figure 1.
As described in the comments, the args parameter is an array of arguments passed to the function. For example, we access the first argument via args[0] and the second argument via args[1].
onLeave specifies code to execute when the function returns. By default, there is no code here. The retval parameter holds the function's return value, which Frida represents as a NativePointer object. We will take advantage of it later.
The Microsoft documentation for CreateFileA explains that the first argument points to the file or device to be created or opened. We can direct Frida to output this argument by modifying onEnter as shown below:
This new output now refers to the target file or device. In this case we see a named pipe used for interprocess communication. The specific format of this named pipe is often encountered when analyzing Cobalt Strike binaries, suggesting this sample is associated with that red team toolset.
Cobalt Strike is a popular tool among red teams, penetration testers, and adversary groups. During incident response it is common to discover Cobalt Strike loaders on disk that launch the 'Beacon' - the primary payload - to control a target machine. The loader may be a 'stager', which downloads and executes the Beacon from a malicious server. Or it may be 'stageless', where the Beacon payload is obfuscated and embedded within the primary executable.
Analyzing Cobalt Strike binaries usually involves working through multiple stages of deobfuscation and execution, revealing shellcode and additional Windows executables in memory along the way. We will first explore any deobfuscation this program performs and then attempt to automate that process using Frida.
Choosing APIs for Monitoring
Cobalt Strike executables typically deobfuscate content during execution to launch the underlying code. This process involves:
There are multiple Windows APIs used to allocate and interact with memory regions; we will focus on VirtualAlloc and VirtualProtect.
VirtualAlloc is used to allocate memory in the current process. As described on Microsoft's website, the syntax for this API is:
LPVOID VirtualAlloc(
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flAllocationType,
DWORD flProtect
);
The second of these four parameters specifies the size (in bytes) of the memory region to be allocated. If VirtualAlloc executes successfully, the return value specifies the starting address of the allocated memory region. The memory region is zeroed out in preparation for usage.
VirtualProtect is used to change the permissions (e.g., read, write, execute) of a memory region. As described on Microsoft's website, the syntax for this API is:
BOOL VirtualProtect(
LPVOID lpAddress,
SIZE_T dwSize,
DWORD flNewProtect,
PDWORD lpflOldProtect
);
The first of the four parameters specifies the starting address of the memory region whose permissions will change. The third parameter specifies the new permission to be applied (Microsoft refers to this as the 'memory protection constant', with options listed here).
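To make the third VirtualProtect parameter easier to read in tool output, the memory protection constants can be decoded by name. The sketch below is a minimal illustration: the constant values are those published in Microsoft's documentation, while the helper names `describe_protection` and `is_executable` are hypothetical and not part of any script in this article.

```python
# Base protection values (exactly one applies to a page) from Microsoft's
# "Memory Protection Constants" documentation.
PAGE_FLAGS = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
    0x80: "PAGE_EXECUTE_WRITECOPY",
}

# Modifier bits that may be OR'ed onto a base value.
PAGE_MODIFIERS = {
    0x100: "PAGE_GUARD",
    0x200: "PAGE_NOCACHE",
    0x400: "PAGE_WRITECOMBINE",
}

# All four PAGE_EXECUTE* values live in the high nibble of the low byte.
EXECUTE_MASK = 0xF0


def describe_protection(value):
    """Return a readable name for a flProtect / flNewProtect value."""
    base = value & 0xFF
    names = [PAGE_FLAGS.get(base, hex(base))]
    names += [name for bit, name in PAGE_MODIFIERS.items() if value & bit]
    return "|".join(names)


def is_executable(value):
    """True when the protection value grants execute access."""
    return bool(value & EXECUTE_MASK)


print(describe_protection(0x40), is_executable(0x40))
print(describe_protection(0x04), is_executable(0x04))
```

A check like `is_executable` is one way to decide which VirtualProtect calls deserve a closer look, since a region changing to an executable protection is a natural point to inspect its contents.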
If we loaded sample.exe into a debugger, we could set breakpoints on VirtualAlloc and VirtualProtect to monitor calls to these APIs. This would allow us to observe how allocated memory regions change during execution. Instead, we will use Frida to take a closer look at these API references.
Scripting with Frida-Python
Frida-trace is a great way to initially benefit from the framework. However, writing our own script using Frida's Python bindings provides more control and flexibility over API monitoring. This section presents the key components of a functional Frida script to monitor regions in memory for executable content. We'll refer to our script as script.py, and it will perform the following:
In this script there are several functions (highlighted in yellow) that require further explanation:
For more detail, see the Frida JavaScript API documentation.
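As a rough illustration of how the pieces of a Frida Python script fit together, the sketch below spawns a target, injects JavaScript that hooks VirtualAlloc and VirtualProtect, and prints what the hooks report. It is not the script.py listing from this article: the hook bodies, the message fields, and the names `AGENT_JS`, `format_event`, `on_message`, and `run` are assumptions made for illustration, and the JavaScript calls may need adjusting for the Frida version in use.

```python
import sys

# JavaScript injected into the target process (illustrative hook bodies).
AGENT_JS = r"""
const kernel32 = Process.getModuleByName('kernel32.dll');

Interceptor.attach(kernel32.getExportByName('VirtualAlloc'), {
  onEnter(args) {
    // dwSize is the second argument; keep it for onLeave via 'this'.
    this.size = args[1].toInt32();
  },
  onLeave(retval) {
    // The return value is the base address of the new region.
    send({ api: 'VirtualAlloc', address: retval.toString(), size: this.size });
  }
});

Interceptor.attach(kernel32.getExportByName('VirtualProtect'), {
  onEnter(args) {
    send({
      api: 'VirtualProtect',
      address: args[0].toString(),
      size: args[1].toInt32(),
      protect: args[2].toInt32()
    });
  }
});
"""


def format_event(payload):
    """Render one hook report as a single line of text."""
    parts = [f"{key}={value}" for key, value in payload.items() if key != "api"]
    return f"{payload['api']}: " + ", ".join(parts)


def on_message(message, data):
    """Handle messages from the injected script."""
    if message["type"] == "send":
        print(format_event(message["payload"]))
    else:
        # Script errors arrive here; show them rather than hide them.
        print(message, file=sys.stderr)


def run(target):
    """Spawn the target suspended, install the hooks, then resume it."""
    import frida  # imported lazily so the helpers above work without Frida

    pid = frida.spawn(target)
    session = frida.attach(pid)
    script = session.create_script(AGENT_JS)
    script.on("message", on_message)
    script.load()
    frida.resume(pid)
    sys.stdin.read()  # keep the host script alive while the target runs
```

Calling `run("sample.exe")` from an isolated analysis machine would start the trace. Spawning the process suspended and resuming it only after the script has loaded means the hooks are in place before the program's own code runs.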
We will now run the script and view the first few lines of output:
The dumped file is a 64-bit DLL. It has a single export named ReflectiveLoader and the imphash value d4e3ba1e8984c7a31117a5643de9ed85. Researching these values, along with some additional analysis, confirms this is a Cobalt Strike Beacon DLL. Further analysis of this DLL is out of scope for this article. However, briefly researching open sources will reveal techniques and tools for analyzing a Beacon DLL, including config extraction.
It is important to understand a few caveats associated with the 'MZ' matches described earlier. First, this simple check for an executable could identify incomplete or invalid executable files. Second, this approach may locate mapped executables, which represent the target executable loaded in memory. Mapped executables are similar to their on-disk counterparts. However, differences between the file in memory and on disk mean the dumped executable content may require fixing to be runnable (our very own PE Tree can make these tweaks).
For example, the alignment of PE file sections in memory differs from that of the file on disk. On disk, the first section typically begins at file offset 1024. In memory, however, the first section typically begins at offset 4096. While we won't explore this detail further, this difference is one approach to differentiating a mapped executable in memory from an unmapped one. The code presented here is only a starting point. We leave it to the reader to address additional scenarios.
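One way to tighten the 'MZ' check is to follow the DOS header to the PE signature and read the first section header, which records both the in-memory and on-disk offsets discussed above. The sketch below works on a dumped buffer in plain Python. The function names are hypothetical, the field offsets follow the published PE/COFF layout, and this is a starting point rather than a full validator.

```python
import struct


def looks_like_pe(buf):
    """Check for 'MZ', a sane e_lfanew, and the 'PE\\0\\0' signature."""
    if len(buf) < 0x40 or buf[:2] != b"MZ":
        return False
    # e_lfanew (offset of the PE signature) sits at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    if e_lfanew + 4 > len(buf):
        return False
    return buf[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


def first_section_offsets(buf):
    """Return (VirtualAddress, PointerToRawData) of the first section."""
    if not looks_like_pe(buf):
        return None
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    if e_lfanew + 24 > len(buf):
        return None
    # COFF header follows the 4-byte signature: NumberOfSections at +6,
    # SizeOfOptionalHeader at +20 (both relative to the signature).
    (num_sections,) = struct.unpack_from("<H", buf, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", buf, e_lfanew + 20)
    section = e_lfanew + 24 + opt_size
    if num_sections == 0 or section + 40 > len(buf):
        return None
    # Within a 40-byte section header: VirtualAddress at +12,
    # PointerToRawData at +20.
    (virtual_address,) = struct.unpack_from("<I", buf, section + 12)
    (raw_pointer,) = struct.unpack_from("<I", buf, section + 20)
    return virtual_address, raw_pointer
```

The two values only describe where the headers say the first section belongs. Deciding whether a given dump is mapped or unmapped still requires checking at which of those offsets the section data actually appears.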
Identifying and Extracting Shellcode
Let's explore another Cobalt Strike executable, a publicly available stager that aims to download the Beacon payload into memory and execute it. Similar to the first example, this program deobfuscates code during execution. We'll refer to this 32-bit Windows executable as sample2.exe. If we loaded sample2.exe into a debugger, we could set breakpoints on VirtualAlloc and VirtualProtect to monitor any memory regions referenced by this program. In time, by monitoring each region allocated and referenced, we would discover two instances of shellcode deobfuscated in memory.
We can run the first version of our Python script (see Figure 5) against sample2.exe to observe the VirtualAlloc and VirtualProtect calls, along with details:
We see references to four different addresses:
We first explore the contents of memory at 0x24d0000. With the final call to VirtualProtect, we can use the same hexdump code described in Figure 7. Including that single line to create a hexdump provides some detail for the final call to VirtualProtect (other output omitted for brevity):
The starting bytes FC E8 are commonly found at the beginning of shellcode, including executable content used by Cobalt Strike and Metasploit. Similar to the previous example, we could automatically search for common sequences of bytes to identify potential shellcode. Common byte sequences we encounter when identifying shellcode include:
This is not an exhaustive list, and identifying code based on these short byte sequences could lead to false positives. Our goal is to identify regions that are likely to include executable content, but additional vetting may be necessary.
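The byte-sequence check can be sketched in a few lines of Python operating on bytes read from a memory region. Only the two sequences named in this article (FC E8 and 55 8B EC) are included. The table and function names are hypothetical, and a match is a hint that warrants further vetting, not proof of shellcode.

```python
# Leading byte sequences mentioned in this article, with their x86 meaning.
SHELLCODE_PREFIXES = {
    bytes.fromhex("fce8"): "cld; call (seen at the start of Cobalt Strike and Metasploit shellcode)",
    bytes.fromhex("558bec"): "push ebp; mov ebp, esp (x86 function prologue)",
}


def match_shellcode_prefix(buf):
    """Return a description if buf starts with a known sequence, else None."""
    data = bytes(buf)
    for prefix, label in SHELLCODE_PREFIXES.items():
        if data.startswith(prefix):
            return label
    return None


print(match_shellcode_prefix(bytes.fromhex("fce889000000")))
print(match_shellcode_prefix(b"\x00\x00\x00\x00"))
```

Keeping the sequences in a table makes it easy to add further patterns as they are encountered, at the cost of more false positives with each short sequence added.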
To automatically identify and dump shellcode, we can modify VirtualProtect's onEnter function:
Each highlighted group of code performs the following:
Running this script, which we will call script2.py, produces the following (some output omitted for brevity):
This output provides more detail than scdbg. We see the IP address and port in the call to InternetConnectA (in the third argument, 0x50 is 80 in decimal). Also, the call to HttpOpenRequestA reveals the request target '/M7ph', and the call to HttpSendRequestA specifies the user-agent string. While we will not explore this sample further, its purpose is to download the Beacon DLL payload.
Additional Approaches to Identify Code
Our identification of the MZ header and shellcode was triggered by a call to VirtualProtect which modified permissions to include execute access. However, calling VirtualProtect is not necessary for code execution. For example, in the shellcode example we discussed, the first memory region was allocated via VirtualAlloc with read/write/execute permissions. Once that function call completes, code can be copied to this region and executed without any further changes to the protection attributes. Continuing with our goal to monitor memory regions for executable code, there are two more options to consider:
Key activities in this code occur at the following locations:
This code makes it clear that memcpy is used to copy content to the newly allocated memory region. If we examine the memory region specified by the second argument within x32dbg, we see:
The first bytes 55 8B EC, as discussed earlier, are common opcodes for a function prologue in x86. This confirms our suspicion that shellcode is copied into the allocated memory region and reveals the specific API used - memcpy.
To automate this process, we can intercept calls to memcpy and evaluate the second (i.e., source) operand for common shellcode opcodes. Further testing of this approach reveals there are numerous calls to memcpy during execution, many copying only a few bytes. We can reduce the noise by limiting our evaluation of the source address to only those memcpy calls that copy a minimum threshold of bytes. For this example, we will set our minimum threshold at 500 bytes. The resulting code is almost identical to our interceptor for VirtualProtect. While this points to a clear opportunity to optimize our script, the code is shown here for completeness:
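The size filter described above amounts to a simple gate in front of the byte check. The sketch below models it in Python on recorded copy events. The event layout (dictionaries with `src` and `size` keys), the sample values, and the function name are assumptions for illustration. In practice the same test would sit inside the memcpy interceptor.

```python
MIN_COPY_SIZE = 500  # minimum number of bytes copied, as chosen in the text


def copies_worth_scanning(events, min_size=MIN_COPY_SIZE):
    """Keep only the copy events large enough to be worth inspecting."""
    return [event for event in events if event["size"] >= min_size]


# Hypothetical events standing in for intercepted memcpy calls.
events = [
    {"src": 0x401000, "size": 4},
    {"src": 0x402000, "size": 500},
    {"src": 0x403000, "size": 798},
]

for event in copies_worth_scanning(events):
    print(hex(event["src"]), event["size"])
```

Exposing the threshold as a parameter makes it straightforward to tune: a lower value catches smaller payloads but inspects many more of the short copies a program performs during normal execution.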
This output is not as helpful as our earlier effort to emulate shellcode execution and indicates a more complicated scenario. The program encountered an error because the shellcode accessed another location in memory (i.e., the memory region allocated via VirtualAlloc at the line beginning with 40136e). This attempt failed due to the constraints of emulation. A case like this often requires a more manual debugging effort, which is out of the scope of this write-up. Closer review reveals that the code does jump to the region in memory to continue executing. However, the additional code only decodes the final stage of shellcode, which we already extracted.
While we will not cover other APIs here, other functions to consider hooking include VirtualAllocEx, VirtualProtectEx, HeapAlloc, WriteProcessMemory, and NtWriteVirtualMemory.
Track Allocated Memory Regions
In addition to investigating more APIs for monitoring, performing robust tracking of allocated memory regions may improve our detection of executable content. We would prefer to set the equivalent of a persistent memory breakpoint within the Frida framework to identify executable content across multiple allocated memory regions. This would alert us when content in a memory region is executed. The memory access monitor API approximates this functionality. It monitors access to one or more memory ranges and executes a specified function when access occurs. Unfortunately, it only provides notification on first access (i.e., it's a one-shot) and therefore is not a persistent solution.
We could create an array of memory regions to monitor whenever a new one is allocated or referenced to work around this limitation. Then, as we encounter calls to APIs like VirtualAlloc and VirtualProtect, we can iterate through this list of addresses, checking for the presence of executable content. This automates the common malware analysis workflow of monitoring multiple regions in memory during execution.
The following code highlights script updates that define an array of memory regions and add to it when a new region is allocated via VirtualAlloc:
Note the use of 'this' in onEnter, which provides access to the argument from onLeave. Each object in our array of memory regions requires an address and size. The size of the region is passed in an argument to VirtualAlloc.
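The bookkeeping itself is independent of Frida and can be modeled in plain Python. The sketch below is a hypothetical `RegionTracker` class, not the JavaScript from this article's script. It assumes each VirtualAlloc report supplies a base address and size, and it shows how a later address (for example, the first argument to VirtualProtect) could be mapped back to a tracked region.

```python
class RegionTracker:
    """Remember allocated regions and look up which one holds an address."""

    def __init__(self):
        self._regions = {}  # base address -> size in bytes

    def add(self, address, size):
        """Record a region; a repeated base address updates its size."""
        self._regions[address] = size

    def find(self, address):
        """Return (base, size) of the region containing address, or None."""
        for base, size in self._regions.items():
            if base <= address < base + size:
                return base, size
        return None

    def regions(self):
        """Return all tracked regions sorted by base address."""
        return sorted(self._regions.items())


tracker = RegionTracker()
tracker.add(0x24D0000, 0x1000)  # e.g. reported by a VirtualAlloc hook
print(tracker.find(0x24D0010))
print(tracker.find(0x24D1000))
```

A dictionary keyed by base address keeps the sketch short. With many regions, a sorted structure would make lookups faster, but a linear scan is usually adequate for the handful of allocations a loader performs.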
The complete implementation of an array to track memory regions is left to the reader. To take advantage of this approach, additional code should:
The BlackBerry Research and Intelligence team examines emerging and persistent threats, providing intelligence analysis for the benefit of defenders and the organizations they serve.