So, why a loader? The main reason was that I wanted something I could re-use when reversing future ESP8266 firmware dumps.
Our loader will be quite simple. IDA loaders typically define the following functions:
def accept_file(li, n):
def load_file(li, neflags, format):
The first identifies an applicable file based on its signature, and is executed when you open a file in IDA for analysis. The second interprets the file: it sets the processor and entry points, then loads and names the segments accordingly. Our loader won’t perform any sanity checking, but should be able to load an image for us.
My loader is derived from the existing loader classes shipped with IDA and, of course, is built to take into account the format we’ve dissected above. It attempts to identify the firmware image by its signature (image magic), then loads each segment into memory, guessing the name and type of each segment from its load address.
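To make the two steps concrete outside of IDA, here is a minimal standalone sketch (Python 3, no IDA APIs) of the identify-then-walk-segments logic. It is not the loader itself. It assumes the commonly documented ESP8266 image layout: an 8-byte header (magic 0xE9, segment count, two flash configuration bytes, 32-bit little-endian entry point) followed by segments that each carry an 8-byte header (load address, size). The names parse_image and ESP_IMAGE_MAGIC are mine, purely for illustration.

```python
import struct

# Assumption: ESP8266 firmware images start with the magic byte 0xE9.
ESP_IMAGE_MAGIC = 0xE9


def parse_image(data):
    """Return (entry_point, [(load_addr, payload), ...]), or None if the magic does not match."""
    if len(data) < 8:
        return None
    # Assumed header layout: magic, segment count, flash mode, flash size/freq, entry point.
    magic, seg_count, _flash_mode, _flash_cfg, entry = struct.unpack_from("<BBBBI", data, 0)
    if magic != ESP_IMAGE_MAGIC:
        return None

    offset = 8
    segments = []
    for _ in range(seg_count):
        if offset + 8 > len(data):
            raise ValueError("truncated segment header")
        # Assumed segment header layout: load address, then payload size.
        load_addr, size = struct.unpack_from("<II", data, offset)
        offset += 8
        payload = data[offset:offset + size]
        if len(payload) != size:
            raise ValueError("truncated segment payload")
        segments.append((load_addr, payload))
        offset += size
    return entry, segments
```

In a real loader, the magic check would sit in accept_file and the segment walk in load_file, with each payload copied into a newly created IDA segment.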
Below is the Python code for our loader, which lives in IDA’s loader directory:
As you can see, the user segment loading loop, which iterates over each of the segments within ROM 1, attempts to perform some basic classification and naming based on the load address of the given segment, per our rules mentioned earlier.
elif(seg_addr > 0x40100000):
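As a rough illustration of this kind of address-based classification, the sketch below maps a load address to a guessed name and type. Only the seg_addr > 0x40100000 comparison comes from the loader; the other boundaries (0x3FFE8000 for data RAM, 0x40100000 for instruction RAM) are my assumptions from the commonly documented ESP8266 memory map, and the function name and segment names are hypothetical.

```python
def classify_segment(seg_addr, index):
    """Guess a (name, type) pair for a segment from its load address.

    All addresses except the final '> 0x40100000' check are assumptions
    based on the commonly documented ESP8266 memory map.
    """
    if seg_addr == 0x40100000:
        # Assumed start of instruction RAM: the user ROM's code.
        return ".user_rom_text", "CODE"
    elif seg_addr == 0x3FFE8000:
        # Assumed start of data RAM: the user ROM's data.
        return ".user_rom_data", "DATA"
    elif seg_addr <= 0x3FFFFFFF:
        # Anything else below the code region is treated as data.
        return ".data_seg_%d" % index, "DATA"
    elif seg_addr > 0x40100000:
        # Anything above the start of instruction RAM is treated as code.
        return ".code_seg_%d" % index, "CODE"
    # Fallback for addresses that match none of the rules above.
    return ".unknown_seg_%d" % index, "CODE"
```

The index is only there to keep generated names unique when several segments fall into the same bucket.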
With this loader in use, IDA now recognises our firmware image:
Our segments look a lot tidier:
And we have an entry point! (of the user ROM):
Whilst we’re in a good state to perform cursory analysis, we don’t have any function names to base our analysis on. Ideally, we’d like to identify the routine(s) responsible for connecting to a given port and locate the references to that function, as well as make sense of any other library function calls. This will allow us to discover the ports knocked on, as well as the order in which the knocking should take place.
Performing library recognition
There are known and documented methods to identify library functions within a statically linked, stripped image. The best known is to use IDA’s Fast Library Acquisition for Identification and Recognition (FLAIR) tools, which are used to create Fast Library Identification and Recognition Technology (FLIRT) signatures.
The process of creating FLIRT signatures usually requires a number of prerequisite conditions to exist:
A pattern file must be created via either pelf or similar, followed by use of sigmake
A compiled, relocatable library containing the functions and associated names for which signatures are to be generated must exist
The library must be in a recognised format and use a supported instruction set
This poses two problems. The first is that we don’t have such a library available to us at present. The second is that Xtensa is not a supported processor type, as shown below.
ELF parser. Copyright (c) 2000-2015 Hex-Rays SA. Version 1.16
Supported processors: MIPS, I960, ARM, IBM PC, M6812, SuperH
Usage: ./pelf [-switch or @file or $env_var] file [pattern-file]
(wildcards are allowed)
The result is that we can’t create pattern files using IDA’s traditional toolset.
The solutions to these problems, which we’ll tackle in a moment (not without obstacles of their own), are as follows:
We need to install a suitable IDE capable of compiling code for the ESP8266
We need to write code that, hopefully, uses the same libraries as our target
We need to compile our code into an ELF file that is statically linked, unstripped and with debug info.
We need to find a way to create signatures from said ELF file
The first step is involved and beyond the scope of this blog post. I’ve opted to use Arduino IDE and configured it to compile for a generic ESP8266 module, with verbose compiler output enabled.
With our environment configured, we can look up example sketches for the ESP8266. We want to find one that performs a similar function to our target. Fortunately, a GitHub repository of example code exists, which can help us.
Searching the repository, we see a promising file, WiFiClient.ino, which contains the following code:
This sketch sends data via HTTP GET requests to data.sparkfun.com service.
You need to get streamId and privateKey at data.sparkfun.com and paste them
below. Or just customize this script to talk to other HTTP servers.
const char* ssid = "your-ssid";
const char* password = "your-password";
const char* host = "data.sparkfun.com";
const char* streamId = "....................";
const char* privateKey = "....................";
// We start by connecting to a WiFi network
Serial.print("Connecting to ");
/* Explicitly set the ESP8266 to be a WiFi-client, otherwise, it by default,
would try to act as both a client and an access-point and could cause
network-issues with your other WiFi-devices on your WiFi-network. */
This is a good sign, as it’s indicative that at the very least, we’re compiling a Sketch which uses the relevant, identical or similar libraries (there may be version discrepancies) to our target firmware image. This increases the likelihood of successful function identification, based on the signatures we’ll obtain.
Compiling the above sketch results in the following notable compiler output:
/tmp/arduino_build_867542/sketch_may24a.ino.elf: ELF 32-bit LSB executable, Tensilica Xtensa, version 1 (SYSV), statically linked, with debug_info, not stripped
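If you want to confirm programmatically that a build artefact really is a 32-bit, little-endian Xtensa ELF before loading it into IDA, the first 20 bytes of the file are enough. The sketch below reads the standard ELF identification bytes and the e_machine field (EM_XTENSA is 94 in the ELF specification). The function name describe_elf is hypothetical, and it does not check whether the file is statically linked or unstripped.

```python
import struct

EM_XTENSA = 94  # e_machine value for Tensilica Xtensa


def describe_elf(header):
    """Summarise the class, byte order and machine type from the start of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is EI_CLASS (1 = 32-bit), e_ident[5] is EI_DATA (1 = little-endian).
    ei_class, ei_data = struct.unpack_from("BB", header, 4)
    endian = "<" if ei_data == 1 else ">"
    # e_type and e_machine follow the 16-byte e_ident array.
    _e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "little_endian": ei_data == 1,
        "is_xtensa": e_machine == EM_XTENSA,
    }
```

To use it, read the first 20 bytes of the .elf file in binary mode and pass them in.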
Loading this ELF file into IDA, we can see we’ve got sensible function names, as depicted below:
So, how can we generate a pattern file from the above ELF to create a FLIRT signature? After much research, I found FireEye’s IDB2PAT tool, created by the FLARE team at FireEye.
This tool is described as follows:
This script allows you to easily generate function patterns from an existing IDB database that can then be turned into FLIRT signatures to help identify similar functions in new files. More information is available at: https://www.fireeye.com/blog/threat-research/2015/01/flare_ida_pro_script.html
Having installed this plugin, I found it didn’t work at all with my version of IDA (6.8). This appeared to be because later versions of IDA (7.x) use Qt5 rather than PySide, and the plugin had been migrated to support IDA 7.x, not 6.8.
Scrolling through the plugin’s known issues, someone pointed out the above and recommended an earlier version be used, which worked with IDA 6.8. I checked out an earlier commit. No more IDA plugin errors.
Did the plugin work? No. It got stuck in an infinite loop upon being launched. It turned out that the version I had contained a bug, where functions shorter than 32 bytes would cause an infinite loop. To fix this, I downloaded the latest version of the individual script file, in which the bug was apparently fixed.
The result, yet another issue:
This was seemingly due to a version discrepancy between the installed and targeted IDA SDK. I fixed the plugin by changing the relevant function call from “get_name(…)” to “GetFunctionName(…)”. I also added code to skip functions whose names start with “sub_”, as these carry IDA’s auto-generated placeholder names rather than real symbols and were not useful to me.
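The name filter is simple enough to show in isolation. This is only a sketch of the idea, not the change I made to the plugin: it works on a plain list of names rather than on IDA’s function list, and the function name keep_named_functions is hypothetical.

```python
def keep_named_functions(names, skip_prefix="sub_"):
    """Drop placeholder names so that only functions with real symbols are exported."""
    return [name for name in names if not name.startswith(skip_prefix)]
```

For example, given the hypothetical list ["sub_40100010", "espconn_connect", "sub_40100200"], only "espconn_connect" survives.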
See the documentation to learn how to resolve collisions.
We can see six collisions have occurred. In this context, a collision occurs when sigmake derives the same signature for more than one function. When this happens, it generates a .exc file listing the collisions, which we can edit to tell sigmake which function to prefer for each colliding signature.
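With only six collisions, editing the .exc file by hand is quick, but for larger signature sets it can be scripted. The sketch below rests on my understanding of the .exc layout, which you should check against your own file: header lines beginning with “;” must be removed before sigmake will read the file, colliding functions are grouped into blocks separated by blank lines, and a “+” prefix selects a function. It blindly selects the first function in each group, which is a crude policy, and the function name resolve_collisions is hypothetical.

```python
def resolve_collisions(exc_text):
    """Select the first function in every collision group of a sigmake .exc file.

    Assumed layout: ';' comment header, then groups of colliding
    functions separated by blank lines.
    """
    out = []
    first_in_group = True
    for line in exc_text.splitlines():
        if line.startswith(";"):
            # Drop the header comments so that sigmake accepts the file.
            continue
        if not line.strip():
            # A blank line ends the current group.
            first_in_group = True
            out.append(line)
            continue
        if first_in_group:
            # Mark the first entry of the group as the one to keep.
            line = "+" + line
            first_in_group = False
        out.append(line)
    return "\n".join(out) + "\n"
```

After writing the result back to the .exc file, re-running sigmake should produce the .sig file. Reviewing each group manually is still worthwhile where the choice of name matters.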