Не рекомендуется к прочтению тем, кто обладает проблемами с нервной системой.
Когда я кидала статью на утверждение, меня спросили: “А где же тут ИБ, это же схемотехника!” Хочу от себя заметить, что ИБ помимо сферы ИТ распространяется не только на стандартизацию, безопасность программного обеспечения, защиту от утечек информации по сторонним каналам (сюда в том числе входит изучение распространения радиоволн и электромагнитных сигналов), но и на аппаратный уровень.
Обра́тная разрабо́тка (обратное проектирование, обратный инжиниринг, реверс-инжиниринг; англ. reverse engineering) — исследование некоторого готового устройства или программы, а также документации на него с целью понять принцип его работы; например, чтобы обнаружить недокументированные возможности (в том числе программные закладки), сделать изменение или воспроизвести устройство, программу или иной объект с аналогичными функциями, но без прямого копирования.
Схемоте́хника — научно-техническое направление, занимающееся проектированием, созданием и отладкой (синтезом и анализом) электронных схем и устройств различного назначения.
К сожалению, в данной статье мы ничего проектировать и создавать не будем. В том числе, заниматься отладкой (синтезом и анализом), в том варианте, в котором ей занимаются при создании и проектировании устройств, то есть перед их запуском в производство.
В данной статье мы рассмотрим, как методом подбора комбинации логических значений можно научиться управлять проприентарным устройством для нарушения целостности информации. Мы будем использовать недокументированные возможности устройства, то есть не предусмотренные инструкцией.
Однажды (тысячу лет назад) мне в руки попался интереснейший девайс, я не смогла удержаться и разобрала его. Постараюсь описать самое интересное из того что я увидела внутри и подтолкнуть к определенным выводам тех, кто захочет развить данную тему.
Уточнять, что за устройство мы рассматриваем я не буду по этическим причинам.
Итак, перед нами сравнительно небольшая плата:
Рассматриваемый нами девайс содержит дисплей, который имеет шесть выводов. Изображения, которые может выводить дисплей – фиксированные, то есть они заранее созданы и отображаются, когда на выводы дисплея поступает определенный набор сигналов с контролирующей микросхемой.
С другой стороны мы можем увидеть:
– отсек с батарейками (1),
– пассивные компоненты,
– микросхему, залитую компаундом – для того, чтобы защитить ее от внешних воздействий и реверсеров (2),
– контакты датчика(3),
– кусок пластика с прорезями.
После удаления куска пластика с прорезями мы можем наблюдать 4 светодиода и один фотодиод (группа элементов справа). Это удивительно, так как в корпусе как раз нет отверстий ни для первых, ни для второго. Можно предположить, что фотодиод служит для определения момента закрытия крышки корпуса, но опять же – достаточного отверстия для поступления света там нет.
Еще одним следствием работы фотодиода может быть то, что после разбора корпуса, при попадании на него света прошивка перестала вести себя статично и микросхема начала отправлять меандр на контакты дисплея, от чего тот начал мигать.
Меандр – последовательность прямоугольных импульсов с равной длиной импульса и паузой, в логических устройствах может считываться как 0-1-0-1-0-1-0 и так далее.
Чтобы определить на скорую руку на какие контакты приходит сигнал – меандр, был использован светодиод. Путем преставления ног светодиода между контактами (одна – на земле, вторая на сигнальном контакте) можно легко определить какие из контактов – «мерцают». Так же, это можно проделать с помощью мультиметра в режиме измерения постоянного тока.
Так как отковыривать компаунд дело долгое и неблагодарное, а внесение изменений в прошивку данного девайса в срочном порядке вряд ли будет популярно, далее будет предложено простое и действенное изменение значения на дисплее нужное нам.
Практическим образом было установлено, что, если использовать постоянный сигнал с источника питания девайса, можно добиться фиксации изображения на экране. Эксперимент проводился с двумя контактами, но можно их припаивать и пробовать узнавать режимы работы дисплея с изменением большего количества информационных значений.
Мы разобрались, что при закорачивании контактов 3 и 5 можно добиться специфической статической надписи. Для того, чтобы это проделать требуется:
Разобрать корпус (он крайне легко разбирается руками).
Кинуть минимум две перемычки от батарейки к контактам 3 и 5.
Если сделать это аккуратно, то можно легко справиться за 60 секунд. При учете специфических условий из данных нескольких минут, останется еще время на посидеть в мессенджере.
Отрицательный результат достигается закорачиванием 3 и 6 контакта соответственно, но здесь нас предательски выдает неверная надпись, под результатом, логически несовместимая с ним.
[+] За счет пайки и использования оставшихся контактов добиться вывода на экране полностью статичной надписи.
In this article, we tell the story of how we found a logical bug using the WinAFL fuzzer and exploited it in WinRAR to gain full control over a victim’s computer. The exploit works by just extracting an archive, and puts over 500 million users at risk. This vulnerability has existed for over 19 years(!) and forced WinRAR to completely drop support for the vulnerable format.
A few months ago, our team built a multi-processor fuzzing lab and started to fuzz binaries for Windows environments using the WinAFL fuzzer. After the good results we got from our Adobe Research, we decided to expand our fuzzing efforts and started to fuzz WinRAR too.
One of the crashes produced by the fuzzer led us to an old, dated dynamic link library (dll) that was compiled back in 2006 without a protection mechanism (like ASLR, DEP, etc.) and is used by WinRAR.
We turned our focus and fuzzer to this “low hanging fruit” dll, and looked for a memory corruption bug that would hopefully lead to Remote Code Execution. However, the fuzzer produced a test case with “weird” behavior. After researching this behavior, we found a logical bug: Absolute Path Traversal. From this point on it was simple to leverage this vulnerability to a remote code execution.
Perhaps it’s also worth mentioning that a substantial amount of money in various bug bounty programs is offered for these types of vulnerabilities.
Figure 1: Zerodium tweet on purchasing WinRAR vulnerability.
What is WinRAR?
WinRAR is a trialware file archiver utility for Windows which can create and view archives in RAR or ZIP file formats and unpack numerous archive file formats.
According to the WinRAR website, over 500 million users worldwide make WinRAR the world’s most popular compression tool today.
This is what the GUI looks like:
Figure 2: WinRAR GUI.
The Fuzzing Process Background
These are the steps taken to start fuzzing WinRAR:
Creation of an internal harness inside the WinRAR main function which enables us to fuzz any archive type, without stitching a specific harness for each format. This is done by patching the WinRAR executable.
Eliminate GUI elements such as message boxes and dialogs which require user interaction. This is also done by patching the WinRAR executable. There are some message boxes that pop up even in CLI mode of WinRAR.
Use a giant corpus from an interesting piece of research conducted around 2005 by the University of Oulu.
Fuzz the program with WinAFL using WinRAR command line switches. These force WinRAR to parse the “broken archive” and also set default passwords (“-p” for password and “-kb” for keep broken extracted files). We found those options in a WinRAR manual/help file.
After a short time of fuzzing, we found several crashes in the extraction of several archive formats such as RAR, LZH and ACE that were caused by a memory corruption vulnerability such as Out-of-Bounds Write. The exploitation of these vulnerabilities, though, is not trivial because the primitives supplied limited control over the overwritten buffer.
However, a crash related to the parsing of the ACE format caught our eye. We found that WinRAR uses a dll named unacev2.dll for parsing ACE archives. A quick look at this dll revealed that it’s an old dated dll compiled in 2006 without a protection mechanism. In the end, it turned out that we didn’t even need to bypass them.
Build a Specific Harness
We decided to focus on this dll because it looked like it would be quick and easy to exploit.
Also, as far as WinRAR is concerned, as long as the archive file has a .rar extension, it would handle it according to the file’s magic bytes, in our case – the ACE format.
To improve the fuzzer performance, and to increase the coverage only on the relevant dll, we created a specific harness for unacev2.dll .
To do that, we need to understand how unacev2.dll is used. After reverse engineering the code calling unacev2.dll for ACE archive extraction, we found that two exported functions should be called for extraction in the following order:
An initialization function named ACEInitDll, with the following signature: INT __stdcall ACEInitDll(unknown_struct_1 *struct_1); • struct_1: pointer to an unknown struct
An extraction function named ACEExtract , with the following signature: INT __stdcall ACEExtract(LPSTR ArchiveName, unknown_struct_2 *struct_2); •ArchiveName: string pointer to the path to the ace file to be extracted •struct_2: pointer to an unknown struct
Both of these functions required structs that are unknown to us. We had two options to try to understand the unknown struct: reversing and debugging WinRAR, or trying to find an open source project that uses those structs.
The first option is more time consuming, so we opted to try the second one. We searched github.com for the exported function ACEInitDll and found a project named FarManager that uses this dll and includes a detailed header file for the unknown structs. Note: The creator of this project is also the creator of WinRAR.
After loading the header files to IDA, it was much easier to understand the previously “unknown structs” to both functions (ACEInitDll and ACEExtract ), as IDA displayed the correct name and type for each struct member.
From the headers we found in the FarManager project, we came up with the following signature:
To mimic the way that WinRAR uses unacev2.dll , we assigned the same struct member just as WinRAR did.
We started to fuzz this specific harness, but we didn’t find new crashes and the coverage did not expand in the first few hours of the fuzzing. We tried to understand the reason for this limitation.
We started by looking for information about the ACE archive format.
Understanding the ACE Format
We didn’t find a RFC for that format, but we did find vital information over the internet.
1. Creating an ACE archive is protected by a patent. The only software that is allowed to create an ACE archive is WinACE. The last version of this program was compiled in November 2007. The company’s website has been down since August 2017. However, extracting an ACE archive is not protected by a patent.
2. A pure Python project named acefile is mentioned in this Wikipedia page. Its most useful features are:
It can extract an ACE archive.
It contains a brief explanation about the ACE file format.
It has a very helpful feature that prints the file format header with an explanation.
To understand the ACE file format, let’s create a simple .txt file (named “simple_file.txt”), and compress it using WinACE. We will then check the headers of the ACE file using acefile .
This is simple_file.txt
Figure 3: File before compression.
These are the options we selected in WinACEto create our example:
Figure 4: WinACE compression GUI.
This option creates the subdirectories \users\nadavgr\Documents under the chosen extraction directory and extracts simple_file.txt to that relative path.
Figure 5: The simple_file.ace produced using WinACE’s “store” compression option for visibility.
Running acefile.py from the acefile project using headers flags displays information about the archive headers:
Figure 6: Parsing ACE file header using acefile.py.
This results in:
Figure 7: acefile.py header parsing output.
Consider each “\\” from the filename field in the image above as a single slash “\”, this is just python escaping.
For clarity, the same fields are marked with the same color in the hex dump and in the output fromacefile.
Summary of the important fields:
hdr_crc (marked in pink): Two CRC fields are present in 2 headers. If the CRC doesn’t match the data, the extraction is interrupted. This is the reason why the fuzzer didn’t find more paths (expand its coverage).To “solve” this issue we patched all the CRC* checks in unacev2.dll .*Note – The CRC is a modified implementation of the regular CRC-32.
filename (marked in green): It contains the relative path to the file. All the directories specified in the relative path are created during the extracting process (including the file). The size of the filename is defined by 2 bytes (little endian) marked by a black frame in the hex dump.
advert (marked in yellow) The advert field is automatically added by WinACE, during the creation of an ACE archive, if the archive is created using an unregistered version of WinACE.
“origsize ” – The content’s size. The content itself is positioned after the header that defines the file (“hdr_type” field == 1).
“hdr_size ” – The header size. Marked by a gray frame in the hex dump.
At offset 70 (0x46) from the second header, we can find our file content: “Hello From Check Point!”
Because the filename field contains the relative path to the file, we did some manual modification attempts to the field to see if it is vulnerable to “Path Traversal.” For example, we added the trivial path traversal gadget “\..\” to the filename field and more complex “Path Traversal” tricks as well, but without success.
After patching all the structure checks, such as the CRC validation, we once again activated our fuzzer. After a short time of fuzzing, we entered the main fuzzing directory and found something odd. But let’s first describe our fuzzing machine for some necessary background.
The Fuzzing Machine
To increase the fuzzer performance and to prevent an I\O bottleneck, we used a RAM disk drive that uses the ImDisk toolkit on the fuzzing machine.
The Ram disk is mapped to drive R:\, and the folder tree looks like this:
Figure 8: Fuzzer’s folders hierarchy
Detecting the Path Traversal Bug
A short time after starting the fuzzer, we found a new folder named sourbe in a surprising location, in the root of drive R:\
Figure 9: ”sourbe”, the unexpected folder which created during fuzzing.
The harness is instructed to extract the fuzzed archive to sub-directories under “output_folders”. For example, R:\ACE_FUZZER\output_folders\Slave_2\ . So why do we have a new folder created in the parent directory?
Inside the sourbe folder we found a file named RED VERSION_¶ with the following content:
Figure 10: Content of the file that produced by the fuzzer in the unexpected path “R:\sourbe\RED VERSION_¶”.
This is the hex dump of the test case that triggers the vulnerability:
Figure 11: A hex dump of the file that produced by the fuzzer in the unexpected path “R:\sourbe\RED VERSION_¶”.
We made some minor changes to this test case, (such as adjusting the CRC) to make it parsable by acefile.
For convenience, fields are marked with the same color in the hex dump and in the output from acefile.
Figure 12: Header parsing output from acefile.py for the file that produced by the fuzzer in the unexpected path.
These are the first three things that we noticed when we looked at the hex dump and the output from acefile:
The fuzzer copied parts of the “advert” field to other fields:
The content of the compressed file is “SIO”, marked in an orange frame in the hex dump. It’s part of the advert string “*UNREGISTERED VERSION*”.
The filename field contain the string “RED VERSION*” which is part of the advert string “*UNREGISTERED VERSION*”.
The path in the filename field was used in the extraction process as an “absolute path” instead of a relative path to the destination folder (the backslash is the root of the drive).
The extract file name is “RED VERSION_¶”. It seems that the asterisk from the filename field was converted to an underscore and the \x14\ (0x14) value represented as “¶” in the extract file name. The other content of the filename field is ignored because there is a null char which terminates the string, after the \x14\ (0x14) value.
To find the constraints that caused it to ignore the destination folder and use the filename field as an absolute path during the extraction, we did the following attempts, based on our assumptions.
Our first assumption was the first character of the filename field (the ‘\’ char) triggers the vulnerability. Unfortunately, after a quick check we found out that this is not the case. After additional checks we arrived at these conclusions:
The first char should be a ‘/’ or a ‘\’.
‘*’ should be included in the filename at least once; the location doesn’t matter.
Example of a filename field that triggers the bug: \some_folder\some_file*.exe will be extracted to C:\some_folder\some_file_.exe , and the asterisk is converted to an underscore (_).
Now that it worked on our fuzzing harness, it is time to test our crafted archive (e.g. exploit file) file on WinRAR.
Trying the exploit on WinRAR
At first glance, it looked like the exploit worked as expected on WinRAR, because the sourbe directory was created in the root of drive C:\ . However, when we entered the “sourbe” folder (C:\sourbe ) we noticed that the file was not created.
These behaviors raised two questions:
Why did the harness and WinRAR behave differently?
Why were the directories that were specified in the exploit file created, and the extracted file was not created?
Why did the harness and WinRAR behave differently?
We expected that the exploit file would behave the same on WinRAR as it behaved in our harness, for the following reasons:
The dll (unacev2.dll ) extracts the files to the destination folder, and not the outer executable (WinRAR or our harness).
Our harness mimics WinRAR perfectly when passing parameters / struct members to the dll.
A deeper look showed that we had a false assumption in our second point. Our harness defines 4 callbacks pointers, and our implemented callbacks differ from WinRAR’s callbacks. Let’s return to our harness implementation.
We mentioned this signature when calling the exported function named ACEInitDll.
These callbacks are called by the dll (unacev2.dll ) during the extraction process. The callbacks are used as external validators for operations that about to happen, such as the creation of a file, creation of a directory, overwriting a file, etc. The external callback/validators get information about the operation that’s about to occur, for example, file extraction, and returns its decision to the dll.
If the operation is allowed, the following constant is returned to the dll: ACE_CALLBACK_RETURN_OK Otherwise, if the operation is not allowed by the callback function, it returns the following constant: ACE_CALLBACK_RETURN_CANCEL, and the operation is aborted.
Our harness returned ACE_CALLBACK_RETURN_OK for all the callback functions except for the ErrorCallbackProc, where it returned ACE_CALLBACK_RETURN_CANCEL.
It turns out, WinRAR does validation for the extracted filename (after they are extracted and created), and because of those validations in the WinRAR callback’s, the creation of the file was aborted. This means that after the file is created, it is deleted by WinRAR.
WinRAR Validators / Callbacks
This is part of the WinRAR callback’s validator pseudo-code that prevents the file creation:
Figure 13: WinRAR validator/callback pseudo-code.
“SourceFileName” represents the relative path to the file that will be extracted.
The function does the following checks:
The first char does not equal “\” or “/”.
The File Name doesn’t start with the following strings “..\” or “../” which are gadgets for “Path Traversal”.
The following “Path Traversal” gadgets does not exist in the string:
The extraction function in unacv2.dll calls StateCallbackProc in WinRAR, and passes the filename field of the ACE format as the relative path to be extracted to.
The relative path is checked by the WinRAR callback’s validator. The validators return ACE_CALLBACK_RETURN_CANCEL to the dll, (because the filename field starts with backslash “\”) and the file creation is aborted.
The following string passes to the WinRAR callback’s validator:
Note: This is the original filename with fields “\sourbe\RED VERSION*¶”. “unacev2.dll ” replaces the “*” with an underscore.
Why were the folders that were specified in the exploit file created and the extracted file was not created?
Because of a bug in the dll (“unacev2.dll ”), even if ACE_CALLBACK_RETURN_CANCEL returned from the callback, the folders specified in the relative path (filename field in ACE archive) will be created by the dll.
The reason for this is that unacev2.dll calls the external validator (callback) before the folder creation, but it checks the return value from the callbacks too late – after the creation of the folder. Therefore, it aborts the extraction operation just before writing content to the extracted file, before the call to WriteFile API.
It actually creates the extracted file, without writing content to it. It calls to CreateFile API and then checks the return code from the callback function. If the return code is ACE_CALLBACK_RETURN_CANCEL, it actually deletes the file that previously created by the call to CreateFile API.
We found a way to bypass the deletion of the file, but it allows us to create empty files only. We can bypass the file deletion by adding “:” to the end of the file, which is treated as Alternate Data Streams. If the callback returns ACE_CALLBACK_RETURN_CANCEL, dll tries to delete the Alternate Data Stream of the file instead of the file itself.
There is another filter function in the dll code that aborts the extraction operation if the relative path string starts with “\” (slash). This happens in the first extraction stages, before the calls to any other filter function. However, by adding “*”or “?” characters (wildcard characters) to the relative path (filename field) of the compressed file, this check is skipped and the code flow can continue and (partially) trigger the Path Traversal vulnerability. This is why the exploit file which was produced by the fuzzer triggered the bug in our harness. It doesn’t trigger the bug in WinRAR because of the callback validator in the WinRAR code.
Summary of Intermediate Findings
We found a Path Traversal vulnerability in unacev2.dll . It enables our harness to extract the file to an arbitrary path, and completely ignore the destination folder, and treats the extracted file relative path as the full path.
Two constraints lead to the Path Traversal vulnerability (summarized in previous sections): 1. The first char should be a ‘/’ or a ‘\’. 2. ‘*’ should be included in the filename at least once. The location does not matter.
WinRAR is partially vulnerable to the Path Traversal:
unacev2.dll doesn’t abort the operation after getting the abort code from the WinRAR callback (ACE_CALLBACK_RETURN_CANCEL). Due to this delayed check of the return code from WinRAR callback, the directories specified in the exploit file are created.
The extracted file is created as well, on the full path specified in the exploit file (without content), but it is deleted right after checking the returned code from the callback (before the call to WriteFile API).
We found a way to bypass the deletion of the file, but it allows us to create empty files only.
Finding the Root Cause
At this point, we wanted to figure out why the destination folder is ignored, and the relative path of the archive files (filename field) is treated as the full path.
To achieve this goal, we could use static analysis and debugging, but we decided on a much quicker method. We used DynamoRio to record the code coverage in unacev2.dll of a regular ACE file and of our exploit file which triggered the bug. We then used the lighthouse plugin for IDA and subtracted one coverage path from the other.
These are the results we got:
Figure 14: Lighthouse’s coverage overview window. You can see the coverage subtraction in the “Composer” form, and one result highlighted in purple.
In the “Coverage Overview” window we can see a single result. This means there is only one basic block that was executed in the first attempt (marked in A) and wasn’t reached on the second attempt (marked in B).
The Lighthouse plugin marked the background of the diffed basic block in blue, as you can see in the image below.
Figure 15: IDA graph view of the main bug in unacev2.dll. Lighthouse marked the background of the diffed basic block in blue.
From the code coverage results, you can understand that the exploit file is not going through the diffed basic block (marked in blue), but it takes the opposite basic block (the false condition, marked with a red arrow).
If the code flow goes through the false condition (red arrow) the line that is inside the green frame replaces the destination folder with "" (empty string), and the later call to sprintf function, which concatenates the destination folder to the relative path of the extracted file.
The code flow to the true and false conditions, marked with green and red arrows respectively, is influenced by the call to the function named GetDevicePathLen (inside the red frame).
If the result from the call to GetDevicePathLen equals 0, the sprintf looks like this:
The last sprintf is the buggy code that triggers the Path Traversal vulnerability.
This means that the relative path will actually be treated as a fullpath to the file/directory that should be written/created.
Let’s look at GetDevicePathLen function to get a better understanding of the root cause:
Figure 16: GetDevicePathLen code.
The relative path of the extracted file is passed to GetDevicePathLen. It checks if the device or drive name prefix appears in the Path parameter, and returns the length of that string, like this:
The function returns 3 for this path: C:\some_folder\some_file.ext
The function returns 1 for this path: \some_folder\some_file.ext
The function returns 15 for this path: \\LOCALHOST\C$\some_folder\some_file.ext
The function returns 21 for this path: \\?\Harddisk0Volume1\some_folder\some_file.ext
The function returns 0 for this path: some_folder\some_file.ext
If the return value from GetDevicePathLen is greater than 0, the relative path of the extracted file will be considered as the full path, because the destination folder is replaced by an empty string during the call to sprintf, and this leads to Path Traversal vulnerability.
However, there is a function that “cleans” the relative path of the extract file, by omitting any sequences that are not allowed before the call to GetDevicePathLen.
This is a pseudo-code that cleans the path “CleanPath”.
Figure 17: Pseudo-code of CleanPath.
The function omits trivial Path Traversal sequences like “\..\” (it only omits the “..\” sequence if it is found in the beginning of the path) sequence, and it omits drive sequence like: “C:\”, “C:”, and for an unknown reason, “C:\C:” as well.
Note that it doesn’t care about the first letter; the following sequence will be omitted as well: “_:\”, “_:”, “_:\_:” (In this case underscore represents any value).
Putting It All Together
To create an exploit file, which causes WinRAR to extract an archived file to an arbitrary path (Path Traversal), extract to the Startup Folder (which gains code execution after reboot) instead of to the destination folder.
We should bypass two filter functions to trigger the bug.
To trigger the concatenation of an empty string to the relative path of the compressed file, instead of the destination folder:
The result from GetDevicePathLen function should be greater than 0. It depends on the content of the relative path (“file_relative_path”). If the relative path starts the device path this way:
option 1: C:\some_folder\some_file.ext
option 2: \some_folder\some_file.ext (The first slash represents the current drive.)
The return value from GetDevicePathLen will be greater than 0. However, there is a filter function in unacev2.dll named CleanPath (Figure 17) that checks if the relative path starts with C:\ and removes it from the relative path string before the call to GetDevicePathLen.
It omits the “C:\” sequence from the option 1 string but doesn’t omit “\” sequence from the option 2 string.
To overcome this limitation, we can add to option 1 another “C:\” sequence which will be omitted by CleanPath (Figure 17), and leave the relative path to the string as we wanted with one “C:\”, like:
However, there is a callback function in WinRAR code (Figure 13), that is used as a validator/filter function. During the extraction process, unacev2.dll is called to the callback function that resides in the WinRAR code.
The callback function validates the relative path of the compressed file. If the blacklist sequence is found, the extraction operation will be aborted.
One of the checks that is made by the callback function is for the relative path that starts with “\” (slash). But it doesn’t check for the “C:\”. Therefore, we can use option 1’ to exploit the Path Traversal Vulnerability!
We also found an SMB attack vector, which enables it to connect to an arbitrary IP address and create files and folders in arbitrary paths on the SMB server.
We change the .ace extension to .rar extension, because WinRAR detects the format by the content of the file and not by the extension.
This is the output from acefile:
Figure 18: Header output by acefile.py of the simple exploit file.
We trigger the vulnerability by the crafted string of the filename field (in green).
This archive will be extracted to C:\some_folder\some_file.txt no matter what the path of the destination folder is.
Creating a Real Exploit
We can gain code execution, by extracting a compressed executable file from the ACE archive to one of the Startup Folders. Any files that reside in the Startup folders will be executed at boot time. To craft an ACE archive that extracts its compressed files to the Startup folder seems to be trivial, but it’s not. There are at least 2 Startup folders at the following paths:
The first path of the Startup folder demands high privileges / high integrity level (in case the UAC is on). However, WinRAR runs by default with a medium integrity level.
The second path of the Startup folder demands to know the name of the user.
We can try to overcome it by creating an ACE archive with thousands of crafted compressed files, any one of which contains the path to the Startup folder but with different <user name>, and hope that it will work in our target.
The Most Powerful Vector
We have found a vector which allows us to extract a file to the Startup folder without caring about the <user name>.
By using the following filename field in the ACE archive:
Because the CleanPath function removes the “C:\C:” sequence.
Moreover, this destination folder will be ignored because the GetDevicePathLen function (Figure 16) will return 2 for the last “C:” sequence.
Let’s analyze the last path:
The sequence “C:” is translated by Windows to the “current directory” of the running process. In our case, it’s the current path of WinRAR.
If WinRAR is executed from its folder, the “current directory” will be this WinRAR folder: C:\Program Files\WinRAR
However, if WinRAR is executed by double clicking on an archive file or by right clicking on “extract” in the archive file, the “current directory” of WinRAR will be the path to the folder that the archive resides in.
For example, if the archive resides in the user’s Downloads folder, the “current directory” of WinRAR will be: C:\Users\<user name>\Downloads If the archive resides in the Desktop folder, the “current directory” path will be: C:\Users\<user name>\Desktop
To get from the Desktop or Downloads folder to the Startup folder, we should go back one folder “../” to the “user folder”, and concatenate the relative path to the startup directory: AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\ to the following sequence: “C:../”
This is the end result: C:../AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\some_file.exe
Remember that there are 2 checks against path traversal sequences:
In the CleanPath function which skips such sequences.
In WinRAR’s callback function which aborts the extraction operation.
CleanPath checks for the following path traversal pattern: “\..\”
The WinRAR’s callback function checks for the following patterns:
Because the first slash or backslash are not part of our sequence “C:../”, we can bypass the path traversal validation. However, we can only go back one folder. It’s all we need to extract a file to the Startup folder without knowing the user name.
Note: If we want to go back more than one folder, we should concatenate the following sequence “/../”. For example, “C:../../” and the “/../” sequence will be caught be the callback validator function and the extraction will be aborted.
Toward the end of our research, we discovered that WinACE created an extraction utility like unacev2.dll for linux which is called unace-nonfree (compiled using Watcom compiler). The source code is available. The source code for Windows (which unacev2.dll was built from) is included as well, but it’s older than the last version of unacev2.dll , and can’t be compiled/built for Windows. In addition, some functionality is missing in the source code – for example, the checks in Figure 17 are not included.
However, Figure 16 was taken from the source code. We also found the Path Traversal bug in the source code. It looks like this:
Figure 20: The path traversal bug in the source code of unace-nonfree
“Nadav Grossman from Check Point Software Technologies informed us about a security vulnerability in UNACEV2.DLL library.
Aforementioned vulnerability makes possible to create files in arbitrary folders inside or outside of destination folder
when unpacking ACE archives.
WinRAR used this third party library to unpack ACE archives.
UNACEV2.DLL had not been updated since 2005 and we do not have access to its source code.
So we decided to drop ACE archive format support to protect security of WinRAR users.
We are thankful to Check Point Software Technologies for reporting this issue.“
Check Point’s SandBlast Agent Behavioral Guard protect against these threats.
Check Point’s IPS blade provides protections against this threat: “RARLAB WinRAR ACE Format Input Validation Remote Code Execution (CVE-2018-20250)”
n this article, we are going to show you our journey of exploiting the Insecure Deserialization vulnerability and we will take WebGoat 8 deserialization challenge (deployed on Docker) as an example. The challenge can be solved by just executing sleepfor 5 seconds. However, we are going to move further for fun and try to get a reverse shell.
Object serialization mainly allows developers to convert in-memory objects to binary and textual data formats for storage or transfer. However, deserializing objects from untrusted data can cause an attacker to achieve remote code execution.
As mentioned in the challenge, the vulnerable page takes a serialized Java object in Base64 format from the user input and it blindly deserializes it. We will exploit this vulnerability by providing a serialized object that triggers a Property Oriented Programming Chain (POP Chain) to achieve Remote Command Execution during the deserialization.
By firing up Burp and installing a plugin called Java-Deserialization-Scanner. The plugin is consisting of 2 features: one of them is for scanning and the other one is for generating the exploit based on the ysoserial tool.
After scanning the remote endpoint the Burp plugin will report:
Hibernate 5 (Sleep): Potentially VULNERABLE!!!
Let’s move to the next step and go to the exploitation tab to achieve arbitrary command execution.
Huh?! It seems an issue with ysoserial. Let’s dig deeper into the issue and move to the console to see what is the issue exactly.
By looking at ysoserial, we see that two different POP chains are available for Hibernate. By using those payloads we figure out that none of them is being executed on the target system.
How the plugin generated this payload to trigger the sleep command then?
We decided to look at the source code of the plugin on the following link:
We noticed that the payload is hard-coded in the plugin’s source code, so we need to find a way to generate the same payload in order to get it working.
Based on some research and help, we figured out that we need to modify the current version of ysoserial in order to get our payloads working.
We downloaded the source code of ysoserialand decided to recompile it using Hibernate 5. In order to successfully build ysoserial with Hibernate 5 we need to add the javax.el package to the pom.xml file.
We also have sent out a Pull Request to the original project in order to fix the build when the hibernate5 profile is selected.
We can proceed to rebuild ysoserialwith the following command:
We can verify that our command was executed by accessing the docker container with the following command:
docker exec -it <CONTAINER_ID> /bin/bash
As we can see our payload was successfully executed on the machine!
We proceed to enumerate the binaries on the target machine.
webgoat@1d142ccc69ec:/$ which php
webgoat@1d142ccc69ec:/$ which python
webgoat@1d142ccc69ec:/$ which python3
webgoat@1d142ccc69ec:/$ which wget
webgoat@1d142ccc69ec:/$ which curl
webgoat@1d142ccc69ec:/$ which nc
webgoat@1d142ccc69ec:/$ which perl
webgoat@1d142ccc69ec:/$ which bash
Only Perl and Bash are available. Let’s try to craft a payload to send us a reverse shell.
We looked at some one-liners reverse shells on Pentest Monkeys:
Open a simple reverse shell on a target machine using C# code and bypassing AV solutions.
Open a reverse shell with a little bit of persistence on a target machine using C++ code and bypassing AV solutions.
Open C# Reverse Shell via Internet using Proxy Credentials.
Open Reverse Shell via C# on-the-fly compiling with Microsoft.Workflow.Compiler.exe.
Open Reverse Shell via PowerShell & C# live compiling
Open Reverse Shell via Excel Macro, PowerShell and C# live compiling
C# Simple Reverse Shell Code writing
Looking on github there are many examples of C# code that open reverse shells via cmd.exe. In this case i copied part of the codes and used the following simple C# program. No evasion, no persistence, no hiding code, only simple “open socket and launch the cmd.exe on victim machine”:
C++ Reverse Shell with a little bit of persistence
Trying to go deeper i found different C++ codes with the same goal of the above reverse shell but one has aroused my attention. In particular i founded @NinjaParanoid’s code that opens a reverse shell with a little bit of persistence. Following some details of the code. For all the details go to the original article.
This script has 3 main advantages:
while loop that try to reconnect after 5 seconds
invisible cmd instance
takes arguments if standard attackers ip change
After compiling the code I analyzed it with Windows Defender and no threats were detected. At this time the exe behavior begins to be a bit borderline between malicious and non. As you can imagine as soon as you run the file the shell will be opened after 5 seconds in “silent mode”.
From user side nothing appears on screen. There is only the background process that automatically reconnects to the Kali every 5 sec if something goes wrong.
Open C# Reverse Shell via Internet using Proxy Credentials
Reasoning on how to exploit the proxy credentials to open a reverse shell on the internet from an internal company network I developed the following code:
combine the peewpw script to dump Proxy credentials (if are present) from Credential Manager without admin privileges
encode the dumped credentials in Base64
insert them into Proxy authorization connect.
… and that’s it…
…before compile the code you need only the Proxy IP/PORT of the targeted company. For security reason i cannot share the source code for avoid the in the wild exploitation but if you have a little bit of programming skills you will write yourself all the steps chain. Obviously this attack has a very high failure rate because the victim may not have saved the domain credentials on the credential manager making the attack ineffective.
Also in this case no threats were detected by Windows Defender and other enterprise AV solutions.
Thanks to @SocketReve for helping me to write this code.
Open Reverse Shell via C# on-the-fly compiling with Microsoft.Workflow.Compiler.exe
Passing over and looking deeper i found different articles that talks about arbitrary, unsigned code execution in Microsoft.Workflow.Compiler.exe. Here the articles: 1– 2– 3.
As a result of these articles I thought … why not use this technique to open my reverse shell written in C#?
In short, the articles talk about how to abuse the Microsoft.Workflow.Compiler.exe service in order to compile C# code on-the-fly. Here an command example:
The REV.txt must need the following XOML structure:
Below you will find the RAW structure of the C# code that will be compiled (same of the C# reverse shell code described above):
After running the command, the following happens:
Not fileless: the C# source code is fetched from the Rev.Shell file.
Fileless: the C# payload is compiled and executed.
Fileless: the payload opens the reverse shell.
Open Reverse Shell via PowerShell & C# live compiling
At this point I thought … what could be the next step to evolve this attack to something more usable in a red team or in a real attack?
Easy… to give Microsoft.Workflow.Compiler.exe the files to compile, why not use PowerShell? …and here we are:
With this command the PS will download the two files described above and save them on the file system. Immediately afterwards it will abuse the Microsoft.Workflow.Compiler.exe to compile the C # live code and open the reverse shell. Following the gist links:
Many of the detections concern the macro that launch powershell and not for the actual behavior of the same. This means that if an attacker were able to obfuscate the code for not being detected or used other service to download the two files it could, without being detected, open a reversed shell as shown above.
Through the opening of several reverse shells written in different ways, this article wants to show that actions at the limit between good and evil are hardly detected by antivirus on the market. The first 2 shells are completely undetectable for all the AV on the market. The signatures related to the malicious macro concern only generic powershell and not the real abuse of microsoft services.
Critically, the arbitrary code execution technique using Microsoft.Workflow.Compiler.exe relies only on the ability to call a command, not on PowerShell. There is no need for the attacker to use some known PowerShell technique that might be detected and blocked by a security solution in place. You gain benefits such as bypassing application whitelisting and new ways of obfuscating malicious behavior. That said, when abusing Microsoft.Workflow.Compiler.exe, a temporary DLL will be created and may be detected by anti-virus.
Not long after the iOS 12 developer beta was released, I started analyzing the new kernelcaches in IDA to look for interesting changes. I immediately noticed that ida_kernelcache, my kernelcache analysis toolkit, was failing on the iPhone 6 Plus kernelcache: it appeared that certain segments, notably the prelink segments like __PRELINK_TEXT, were empty. Even stranger, when I started digging around the kernelcache, I noticed that the pointers looked bizarre, starting with 0x0017 instead of the traditional 0xffff.
It appears that Apple may be making significant changes to the iOS kernelcache on some devices, including abandoning the familiar split-kext design in favor of a monolithic Mach-O and introducing some form of static pointer tagging, most likely as a space-saving optimization. In this post I’ll describe some of the changes I’ve found and share my analysis of the tagged pointers.
The new kernelcache format
The iOS 12 beta kernelcache comes in a new format for some devices. In particular, if you look at build 16A5308e on the iPhone 6 Plus (iPhone7,1), you’ll notice a few differences from iOS 11 kernelcaches:
The __TEXT, __DATA_CONST, __TEXT_EXEC, __DATA, and __LINKEDIT segments are much bigger due to the integration of the corresponding segments from the kexts.
There are several new sections:
__DATA.__kmod_init, which is the combination of the kexts’ __mod_init_func sections.
__DATA.__kmod_term, which is the combination of the kexts’ __mod_term_func sections.
The segments __PRELINK_TEXT and __PLK_TEXT_EXEC, __PRELINK_DATA, and __PLK_DATA_CONST are now 0 bytes long.
The prelink info dictionary in the section __PRELINK_INFO.__info no longer has the _PrelinkLinkKASLROffsetsand _PrelinkKCID keys; only the _PrelinkInfoDictionary key remains.
So far, Apple hasn’t implemented this new format on all devices. The iPhone 7 (iPhone9,1) kernelcache still has split kexts and the traditional ~4000 symbols.
However, even on devices with the traditional split-kext layout, Apple does appear to be tweaking the format. The layout appears largely the same as before, but loading the kernelcache file into IDA 6.95 generates numerous warnings:
Loading prelinked KEXTs
FFFFFFF005928300: loading com.apple.iokit.IONetworkingFamily
entries start past the end of the indirect symbol table (reserved1 field greater than the table size)
FFFFFFF005929E00: loading com.apple.iokit.IOTimeSyncFamily
entries start past the end of the indirect symbol table (reserved1 field greater than the table size)
FFFFFFF00592D740: loading com.apple.kec.corecrypto
entries start past the end of the indirect symbol table (reserved1 field greater than the table size)
Thus, there appear to be at least 3 distinct kernelcache formats:
11-normal: The format used on iOS 10 and 11. It has split kexts, untagged pointers, and about 4000 symbols.
12-normal: The format used on iOS 12 beta for iPhone9,1. It is similar to 11-normal, but with some structural changes that confuse IDA 6.95.
12-merged: The format used on iOS 12 beta for iPhone7,1. It is missing prelink segments, has merged kexts, uses tagged pointers, and, to the dismay of security researchers, is completely stripped.
Unraveling the mystery of the tagged pointers
I first noticed that the pointers in the kernelcache looked weird when I jumped to __DATA_CONST.__mod_init_func in IDA. In the iPhone 7,1 16A5308e kernelcache, this section looks like this:
__DATA_CONST.__mod_init_func:FFFFFFF00748C000 ; Segment type: Pure data
__DATA_CONST.__mod_init_func:FFFFFFF00748C000 AREA __DATA_CONST.__mod_init_func, DATA, ALIGN=3
__DATA_CONST.__mod_init_func:FFFFFFF00748C000 off_FFFFFFF00748C000 DCQ 0x0017FFF007C95908
__DATA_CONST.__mod_init_func:FFFFFFF00748C000 ; DATA XREF: sub_FFFFFFF00794B1F8+438o
__DATA_CONST.__mod_init_func:FFFFFFF00748C000 ; sub_FFFFFFF00794B1F8+4CC7w
__DATA_CONST.__mod_init_func:FFFFFFF00748C008 DCQ 0x0017FFF007C963D0
__DATA_CONST.__mod_init_func:FFFFFFF00748C010 DCQ 0x0017FFF007C99E14
__DATA_CONST.__mod_init_func:FFFFFFF00748C018 DCQ 0x0017FFF007C9B7EC
__DATA_CONST.__mod_init_func:FFFFFFF00748C020 DCQ 0x0017FFF007C9C854
__DATA_CONST.__mod_init_func:FFFFFFF00748C028 DCQ 0x0017FFF007C9D6B4
This section should be filled with pointers to initialization functions; in fact, the values look almost like pointers, except the first 2 bytes, which should read 0xffff, have been replaced with 0x0017. Aside from that, the next 4 digits of the “pointer” are fff0, as expected, and the pointed-to values are all multiples of 4, as required for function pointers on arm64.
This pattern of function pointers was repeated in the other sections. For example, all of __DATA.__kmod_init and __DATA.__kmod_term have the same strange pointers:
__DATA.__kmod_init:FFFFFFF0088D4000 ; Segment type: Pure data
__DATA.__kmod_init:FFFFFFF0088D4000 AREA __DATA.__kmod_init, DATA, ALIGN=3
__DATA.__kmod_init:FFFFFFF0088D4000 DCQ 0x0017FFF007DD3DC0
__DATA.__kmod_init:FFFFFFF0088D4008 DCQ 0x0017FFF007DD641C
__DATA.__kmod_init:FFFFFFF0088D6600 DCQ 0x0017FFF0088C9E54
__DATA.__kmod_init:FFFFFFF0088D6608 DCQ 0x0017FFF0088CA1D0
__DATA.__kmod_init:FFFFFFF0088D6610 DCQ 0x0007FFF0088CA9E4
__DATA.__kmod_init:FFFFFFF0088D6610 ; __DATA.__kmod_init ends
__DATA.__kmod_term:FFFFFFF0088D6618 ; ===========================================================================
__DATA.__kmod_term:FFFFFFF0088D6618 ; Segment type: Pure data
__DATA.__kmod_term:FFFFFFF0088D6618 AREA __DATA.__kmod_term, DATA, ALIGN=3
__DATA.__kmod_term:FFFFFFF0088D6618 DCQ 0x0017FFF007DD3E68
__DATA.__kmod_term:FFFFFFF0088D6620 DCQ 0x0017FFF007DD645C
However, if you look carefully, you’ll see that the last pointer of __kmod_init actually begins with 0x0007 rather than 0x0017. After seeing this, I began to suspect that this was some form of pointer tagging: that is, using the upper bits of the pointer to store additional information. Thinking that this tagging could be due to some new kernel exploit mitigation Apple was about to release, I decided to work out exactly what these tags mean to help understand what the mitigation might be.
My next step was to look for different types of pointers to see if there were any other possible tag values. I first checked the vtable for AppleKeyStoreUserClient. Since there are no symbols, you can find the vtable using the following trick:
Search for the “AppleKeyStoreUserClient” string in the Strings window.
Look for cross-references to the string from an initializer function. In our case, there’s only one xref, so we can jump straight there.
The “AppleKeyStoreUserClient” string is being loaded into register x1 as the first explicit argument in the call to to OSMetaClass::OSMetaClass(char const*, OSMetaClass const*, unsigned int). The implicit this parameter passed in register x0 refers to the global AppleKeySoreUserClient::gMetaClass, of type AppleKeyStoreUserClient::MetaClass, and its vtable is initialized just after the call. Follow the reference just after the call and you’ll be looking at the vtable for AppleKeyStoreUserClient::MetaClass.
From there, just look backwards to the first vtable before that one, and that’ll be AppleKeyStoreUserClient’s vtable.
This is what those vtables look like in the new kernelcache:
As you can see, most of the pointers still have the 0x0017 tag, but there are also 0x0047 and 0x0057 tags.
You may also notice that the last valid entry in the AppleKeyStoreUserClient::MetaClass vtable is0x0057fff00844be28, which corresponds to the untagged pointer 0xfffffff00844be28, which is the address of the function sub_FFFFFFF00844BE28 that references AppleKeyStoreUserClient’s vtable. This supports the hypothesis that only the upper 2 bytes of each pointer are changed: the metaclass method at index 14 should be AppleKeyStoreUserClient::MetaClass::alloc, which needs to reference the AppleKeyStoreUserClient vtable when allocating a new instance of the class, and so everything fits together as expected.
At this point, I decided to gather more comprehensive information about the tagged pointers. I wrote a quick idapython script to search for 8-byte values that would be valid pointers except that the first 2 bytes were not 0xffff. Here’s the distribution of tagged pointers by section:
While studying the results, it became obvious that the distribution of tags across the 2-byte tag space (0x0000 to 0xffff) was not uniform: most of the tags seemed to use 0x0017, and almost all the tags started with the first digit 0. Additionally, almost all tags ended in 7, and the rest ended in f; no tags ended in any other digit.
I next examined whether there was a pattern to what each tagged pointer referenced, under the theory that the tags might describe the type of referred object. I wrote a script to print the section being referenced by one tagged pointer chosen at random for each tag. Unfortunately, the results didn’t offer any particularly illuminating insights:
Finally, while looking at the examples of various tags given by the previous script, I noticed a pattern: 0x0017 seemed to be found in the middle of sequences of pointers, while other tags appeared at the ends of sequences, when the following value was not a pointer. On further inspection, the second-to-last digit seemed to suggest how many (8-byte) words to skip before you’d get to the next tagged pointer: 0x0017 meant the following value was a pointer, 0x0027 meant value after next was a pointer, 0x0037 meant skip 2 values, etc.
After more careful analysis, I discovered that this pattern held for all tags I manually inspected:
The tag 0x0007 was usually found at the end of a section.
The tag 0x0017 was always directly followed by another pointer.
The tag 0x001f was followed by a pointer after 4 intermediate bytes.
The tag 0x0027 was followed by a pointer after 8 bytes.
The tag 0x0037 was followed by a pointer after 16 bytes.
Extrapolating from these points, I derived the following relation for tagged pointers: For a tagged pointer P at address A, the subsequent pointer will occur at address A + ((P >> 49) & ~0x3).
Even though tags as spans between pointers made little sense as a mitigation, I wrote a script to check whether all the tagged pointers in the kernelcache followed this pattern. Sure enough, all pointers except for those with tag 0x0007 were spot-on. The exceptions for 0x0007 tags occurred when there was a large gap between adjacent pointers. Presumably, if the gap is too large, 0x0007 is used even when the section has not ended to indicate that the gap cannot be represented by the tag.
Pointer tags as a kASLR optimization
So, the pointer tags describe the distance from each pointer to the next in the kernelcache, and there’s a formula that can compute the address of the next pointer given the address and tag of the previous one, kind of like a linked list. We understand the meaning, but not the purpose. Why did Apple implement this pointer tagging feature? Is it a security mitigation or something else?
Even though I initially thought that the tags would turn out to be a mitigation, the meaning of the tags as links between pointers doesn’t seem to support that theory.
In order to be a useful mitigation, you’d want the tag to describe properties of the referred-to value. For example, the tag might describe the length of the memory region referred to by the pointer so that out-of-bound accesses can be detected. Alternatively, the tag might describe the type of object being referred to so that functions can check that they are being passed pointers of the expected type.
Instead, these tags seem to describe properties of the address of the pointer rather than the value referred to by the pointer. That is, the tag indicates the distance to the pointer following this one regardless of to what or to where this pointer actually points. Such a property would be impossible to maintain at runtime: adding a pointer to the middle of a data structure would require searching backwards in memory for the pointer preceding it and updating that pointer’s tag. Thus, if this is a mitigation, it would be of very limited utility.
However, there’s a much more plausible theory. Buried among the other changes, the new kernelcache’s prelink info dictionary has been thinned down by removing the _PrelinkLinkKASLROffsets key. This key used to hold a data blob describing the offsets of all the pointers in the kernelcache that needed to be slid in order to implement kASLR. In the new kernelcache without the kASLR offsets, iBoot needs another way to identify where the pointers in the kernelcache are, and it just so happens that the pointer tags connect each pointer to the next in a linked list.
Thus, I suspect that the pointer tags are the new way for iBoot to find all the pointers in the kernelcache that need to be updated with the kASLR slide, and are not part of a mitigation at all. During boot, the tagged pointers would be replaced by untagged, slid pointers. This new implementation saves space by removing the large list of pointer offsets that used to be stored in the prelink info dictionary.
The new kernelcache format and pointer tagging make analysis using IDA difficult, but now that I have a plausible theory for what’s going on, I plan on extending ida_kernelcache to make working with these new kernelcaches easier. Since the tags are probably not present at runtime, it should be safe to replace all the tagged pointers with their untagged values. This will allow IDA to restore all the cross-references broken by the tags. Unfortunately, the loss of symbol information will definitely make analysis more difficult. Future versions of ida_kernelcache may have to incorporate known symbol lists or parse the XNU source to give meaningful names to labels.
In part 4 we extracted the entire firmware from the router and decompressed it. As I explained then, you can often get most of the firmware directly from the manufacturer’s website: Firmware upgrade binaries often contain partial or entire filesystems, or even entire firmwares.
In this post we’re gonna dig through the firmware to find potentially interesting code, common vulnerabilities, etc.
I’m gonna explain some basic theory on the Linux architecture, disassembling binaries, and other related concepts. Feel free to skip some of the parts marked as [Theory]; the real hunt starts at ‘Looking for the Default WiFi Password Generation Algorithm’. At the end of the day, we’re just: obtaining source code in case we can use it, using grep and common sense to find potentially interesting binaries, and disassembling them to find out how they work.
One step at a time.
Gathering and Analysing Open Source Components
GPL Licenses — What They Are and What to Expect [Theory]
Linux, U-Boot and other tools used in this router are licensed under the General Public License. This license mandates that the source code for any binaries built with GPL’d projects must be made available to anyone who wants it.
Having access to all that source code can be a massive advantage during the reversing process. The kernel and the bootloader are particularly interesting, and not just to find security issues.
When hunting for GPL’d sources you can usually expect one of these scenarios:
The code is freely available on the manufacturer’s website, nicely ordered and completely open to be tinkered with. For instance: apple products or theamazon echo
The source code is available by request
They send you an email with the sources you requested
They ask you for “a reasonable amount” of money to ship you a CD with the sources
In the case of this router, the source code was available on their website, even though it was a huge pain in the ass to find; it took me a long time of manual and automated searching but I ended up finding it in the mobile version of the site:
But what if they’re hiding something!? How could we possibly tell whether the sources they gave us are the same they used to compile the production binaries?
Challenges of Binary Verification [Theory]
Theoretically, we could try to compile the source code ourselves and compare the resulting binary with the one we extracted from the device. In practice, that is extremely more complicated than it sounds.
The exact contents of the binary are strongly tied to the toolchain and overall environment they were compiled in. We could try to replicate the environment of the original developers, finding the exact same versions of everything they used, so we can obtain the same results. Unfortunately, most compilers are not built with output replicability in mind; even if we managed to find the exact same version of everything, details like timestamps, processor-specific optimizations or file paths would stop us from getting a byte-for-byte identical match.
If you’d like to read more about it, I can recommend this paper. The authors go through the challenges they had to overcome in order to verify that the official binary releases of the application ‘TrueCrypt’ were not backdoored.
Introduction to the Architecture of Linux [Theory]
In multiple parts of the series, we’ve discussed the different components found in the firmware: bootloader, kernel, filesystem and some protected memory to store configuration data. In order to know where to look for what, it’s important to understand the overall architecture of the system. Let’s quickly review this device’s:
The bootloader is the first piece of code to be executed on boot. Its job is to prepare the kernel for execution, jump into it and stop running. From that point on, the kernel controls the hardware and uses it to run user space logic. A few more details on each of the components:
Hardware: The CPU, Flash, RAM and other components are all physically connected
Linux Kernel: It knows how to control the hardware. The developers take the Open Source Linux kernel, write drivers for their specific device and compile everything into an executable Kernel. It manages memory, reads and writes hardware registers, etc. In more complex systems, “kernel modules” provide the possibility of keeping device drivers as separate entities in the file system, and dynamically load them when required; most embedded systems don’t need that level of versatility, so developers save precious resources by compiling everything into the kernel
libc (“The C Library”): It serves as a general purpose wrapper for the System Call API, including extremely common functions like printf, malloc or system. Developers are free to call the system call API directly, but in most cases, it’s MUCH more convenient to use libc. Instead of the extremely common glibc (GNU C library) we usually find in more powerful systems, this device uses a version optimised for embedded devices: uClibc.
User Applications: Executable binaries in /bin/ and shared objects in /lib/(libraries that contain functions used by multiple binaries) comprise most of the high-level logic. Shared objects are used to save space by storing commonly used functions in a single location
Bootloader Source Code
As I’ve mentioned multiple times over this series, this router’s bootloader is U-Boot. U-Boot is GPL licensed, but Huawei failed to include the source code in their website’s release.
Having the source code for the bootloader can be very useful for some projects, where it can help you figure out how to run a custom firmware on the device or modify something; some bootloaders are much more feature-rich than others. In this case, I’m not interested in anything U-Boot has to offer, so I didn’t bother following up on the source code.
Kernel Source Code
Let’s just check out the source code and look for anything that might help. Remember the factory reset button? The button is part of the hardware layer, which means the GPIO pin that detects the button press must be controlled by the drivers. These are the logs we saw coming out of the UART port in a previous post:
With some simple grep commands we can see how the different components of the system (kernel, binaries and shared objects) can work together and produce the serial output we saw:
Having the kernel can help us find poorly implemented security-related algorithms and other weaknesses that are sometimes considered ‘accepted risks’ by manufacturers. Most importantly, we can use the drivers to compile and run our own OS on the device.
User Space Source Code
As we can see in the GPL release, some components of the user space are also open source, such as busybox and iptables. Given the right (wrong) versions, public vulnerability databases could be enough to find exploits for any of these.
That being said, if you’re looking for 0-days, backdoors or sensitive data, your best bet is not the open source projects. Devic specific and closed source code developed by the manufacturer or one of their providers has not been so heavily tested and may very well be riddled with bugs. Most of this code is stored as binaries in the user space; we’ve got the entire filesystem, so we’re good.
Without the source code for user space binaries, we need to find a way to read the machine code inside them. That’s where disassembly comes in.
Binary Disassembly [Theory]
The code inside every executable binary is just a compilation of instructions encoded as Machine Code so they can be processed by the CPU. Our processor’s datasheet will explain the direct equivalence between assembly instructions and their machine code representations. A disassembler has been given that equivalence so it can go through the binary, find data and machine code andtranslate it into assembly. Assembly is not pretty, but at least it’s human-readable.
Due to the very low-level nature of the kernel, and how heavily it interacts with the hardware, it is incredibly difficult to make any sense of its binary. User space binaries, on the other hand, are abstracted away from the hardware and follow unix standards for calling conventions, binary format, etc. They’re an ideal target for disassembly.
There are lots of disassemblers for popular architectures like MIPS; some better than others both in terms of functionality and usability. I’d say these 3 are the most popular and powerful disassemblers in the market right now:
IDA Pro: By far the most popular disassembler/debugger in the market. It is extremely powerful, multi-platform, and there are loads of users, tutorials, plugins, etc. around it. Unfortunately, it’s also VERY expensive; a single person license of the Pro version (required to disassemble MIPS binaries) costs over $1000
Radare2: Completely Open Source, uses an impressively advanced command line interface, and there’s a great community of hackers around it. On the other hand, the complex command line interface -necessary for the sheer amount of features- makes for a rather steep learning curve
Binary Ninja: Not open source, but reasonably priced at $100 for a personal license, it’s middle ground between IDA and radare. It’s still a very new tool; it was just released this year, but it’s improving and gaining popularity day by day. It already works very well for some architectures, but unfortunately it’s still missing MIPS support (coming soon) and some other features I needed for these binaries. I look forward to giving it another try when it’s more mature
In order to display the assembly code in a more readable way, all these disasemblers use a “Graph View”. It provides an intuitive way to follow the different possible execution flows in the binary:
Such a clear representation of branches, and their conditionals, loops, etc. is extremely useful. Without it, we’d have to manually jump from one branch to another in the raw assembly code. Not so fun.
If you read the code in that function you can see the disassembler makes a great job displaying references to functions and hardcoded strings. That might be enough to help us find something juicy, but in most cases you’ll need to understand the assembly code to a certain extent.
Gathering Intel on the CPU and Its Assembly Code [Theory]
Because ELF headers are designed to be platform-agnostic, we can easily find out some info about our binaries. As you can see, we know the architecture (32-bit MIPS), endianness (LSB), and whether it uses shared libraries.
We can verify that information thanks to the Ralink’s product brief, which specifies the processor core it uses: MIPS24KEc
Once we know the basics we can just drop the binary into the disassembler. It will help validate some of our findings, and provide us with the assembly code. In order to understand that code we’re gonna need to know the architecture’s instruction sets and register names:
MIPS Alternate Register Names: In MIPS, there’s no real difference between registers; the CPU doesn’t about what they’re called. Alternate register names exist to make the code more readable for the developer/reverser: $a0 to $a3 for function arguments, $t0 to $t9 for temporary registers, etc.
Beyond instructions and registers, some architectures may have some quirks. One example of this would be the presence of delay slots in MIPS: Instructions that appear immediately after branch instructions (e.g. beqz, jalr) but are actually executed before the jump. That sort of non-linearity would be unthinkable in other architectures.
Following up on the reset key example we were using for the Kernel, we’ve got the code that generated some of the UART log messages, but not all of them. Since we couldn’t find the ‘button has been pressed’ string in the kernel’s source code, we can deduce it must have come from user space. Let’s find out which binary printed it:
3 files contain the next string found in the logs: 2 executables in /bin/ and 1 shared object in /lib/. Let’s take a look at /bin/equipcmd with IDA:
If we look closely, we can almost read the C code that was compiled into these instructions. We can see a “clear configuration file”, which would match the ERASEcommands we saw in the SPI traffic capture to the flash IC. Then, depending on the result, one of two strings is printed: restore default success or restore default fail . On success, it then prints something else, flushes some buffers and reboots; this also matches the behaviour we observed when we pressed the reset button.
That function is a perfect example of delay slots: the addiu instructions that set both strings as arguments —$a0— for the 2 puts are in the delay slots of the branch if equals zero and jump and link register instructions. They will actually be executed before branching/jumping.
As you can see, IDA has the name of all the functions in the binary. That won’t necessarily be the case in other binaries, and now’s a good time to discuss why.
Function Names in a Binary — Intro to Symbol Tables [Theory]
The ELF format specifies the usage of symbol tables: chunks of data inside a binary that provide useful debugging information. Part of that information are human-readable names for every function in the binary. This is extremely convenient for a developer debugging their binary, but in most cases it should be removed before releasing the production binary. The developers were nice enough to leave most of them in there 🙂
In order to remove them, the developers can use tools like strip, which know what must be kept and what can be spared. These tools serve a double purpose: They save memory by removing data that won’t be necessary at runtime, and they make the reversing process much more complicated for potential attackers. Function names give context to the code we’re looking at, which is massively helpful.
In some cases -mostly when disassembling shared objects- you may see somefunction names or none at all. The ones you WILL see are the Dynamic Symbols in the .dymsym table: We discussed earlier the massive amount of memory that can be saved by using shared objects to keep the pieces of code you need to re-use all over the system (e.g. printf()). In order to locate pieces of data inside the shared object, the caller uses their human-readable name. That means the names for functions and variables that need to be publicly accessible must be left in the binary. The rest of them can be removed, which is why ELF uses 2 symbol tables: .dynsym for publicly accessible symbols and .symtab for the internal ones.
If you recall, these are the default WiFi credentials in my router:
So what do we know?
Each device is pre-configured with a different set of WiFi credentials
The credentials could be hardcoded at the factory or generated on the device. Either way, we know from previous posts that both SSID and password are stored in the reserved area of Flash memory, and they’re right next to each other
If they were hardcoded at the factory, the router only needs to read them from a known memory location
If they are generated in the device and then stored to flash, there must be an algorithm in the router that -given the same inputs- always generates the same outputs. If the inputs are public (e.g. the MAC address) and we can find, reverse and replicate the algorithm, we could calculate default WiFi passwords for any other router that uses the same algorithm
Let’s see what we can do with that…
Finding Hardcoded Strings
Let’s assume there IS such algorithm in the router. Between username and password, there’s only one string that remains constant across devices:TALKTALK-. This string is prepended to the last 6 characters of the MAC address. If the generation algorithm is in the router, surely this string must be hardcoded in there. Let’s look it up:
2 of those 3 binaries (nmbd and smbd) are part of samba, the program used to use the USB flash drive as a network storage device. They’re probably used to identify the router over the network. Let’s take a look at the other one: /bin/cms.
Reversing the Functions that Uses Them
That looks exactly the way we’d expect the SSID generation algorithm to look. The code is located inside a rather large function called ATP_WLAN_Init, and somewhere in there it performs the following actions:
Find out the MAC address of the device we’re running on:
Unfortunately, right after this branch the function simply does an ATP_DBSave and moves on to start running commands and whatnot. e.g.:
Further inspection of this function and other references to ATP_DBSave did not reveal anything interesting.
After some time using this process to find potentially relevant pieces of code, reverse them, and analyse them, I didn’t find anything that looked like the password generation algorithm. That would confirm the suspicions I’ve had since we found the default credentials in the protected flash area: The manufacturer used proper security techniques and flashed the credentials at the factory, which is why there is no algorithm. Since the designers manufacture their own hardware, the decision makes perfect sense for this device. They can do whatever they want with their manufacturing lines, so they decided to do it right.
I might take another look at it in the future, or try to find it in some other router (I’d like to document the process of reversing it), but you should know this method DOES work for a lot of products. There’s a long history of freely available default WiFi password generators.
Since we already know how to find relevant code in the filesystem binaries, let’s see what else we can do with that knowledge.
Looking for Command Injection Vulnerabilities
One of the most common, easy to find and dangerous vulnerabilities is command injection. The idea is simple; we find an input string that is gonna be used as an argument for a shell command. We try to append our own commands and get them to execute, bypassing any filters that the developers may have implemented. In embedded devices, such vulnerabilities often result in full root control of the device.
These vulnerabilities are particularly common in embedded devices due to their memory constraints. Say you’re developing the web interface used by the users to configure the device; you want to add the possibility to ping a user-defined server from the router, because it’s very valuable information to debug network problems. You need to give the user the option to define the ping target, and you need to serve them the results:
Once you receive the data of which server to target, you have two options: You find a library with the ICMP protocol implemented and call it directly from the web backend, or you could use a single, standard function call and use the router’s already existing ping shell command. The later is easier to implement, saves memory, etc. and it’s the obvious choice. Taking user input (target server address) and using it as part of a shell command is where the danger comes in. Let’s see how this router’s web application, /bin/web, handles it:
A call to libc’s system() (not to be confused with a system call/syscall) is the easiest way to execute a shell command from an application. Sometimes developers wrap system() in custom functions in order to systematically filter all inputs, but there’s always something the wrapper can’t do or some developer who doesn’t get the memo.
Looking for references to system in a binary is an excellent way to find vectors for command injections. Just investigate the ones that look like may be using unfiltered user input. These are all the references to system() in the /bin/web binary:
Even the names of the functions can give you clues on whether or not a reference to system() will receive user input. We can also see some references to PIN and PUK codes, SIMs, etc. Seems like this application is also used in some mobile product…
I spent some time trying to find ways around the filtering provided byatp_gethostbyname (anything that isn’t a domain name causes an error), but I couldn’t find anything in this field or any others. Further analysis may prove me wrong. The idea would be to inject something to the effects of this:
Which would result in this final string being executed as a shell command: ping google.com -c 1; reboot; ping 192.168.1.1 > /dev/null. If the router reboots, we found a way in.
As I said, I couldn’t find anything. Ideally we’d like to verify that for all input fields, whether they’re in the web interface or some other network interface. Another example of a network interface potentially vulnerable to remote command injections is the “LAN-Side DSL CPE Configuration” protocol, or TR-064. Even though this protocol was designed to be used over the internal network only, it’s been used to configure routers over the internet in the past. Command injection vulnerabilities in some implementations of this protocol have been used to remotely extract data like WiFi credentials from routers with just a few packets.
This router has a binary conveniently named /bin/tr064; if we take a look, we find this right in the main() function:
That’s the private RSA key we found in Part 2 being used for SSL authentication. Now we might be able to supplant a router in the system and look for vulnerabilities in their servers, or we might use it to find other attack vectors. Most importantly, it closes the mistery of the private key we found while scouting the firmware.
Looking for More Complex Vulnerabilities [Theory]
Even if we couldn’t find any command injection vulnerabilities, there are always other vectors to gain control of the router. The most common ones are good old buffer overflows. Any input string into the router, whether it is for a shell command or any other purpose, is handled, modified and passed around the code. An error by the developer calculating expected buffer lengths, not validating them, etc. in those string operations can result in an exploitable buffer overflow, which an attacker can use to gain control of the system.
The idea behind a buffer overflow is rather simple: We manage to pass a string into the system that contains executable code. We override some address in the program so the execution flow jumps into the code we just injected. Now we can do anything that binary could do -in embedded systems like this one, where everything runs as root, it means immediate root pwnage.
Developing an exploit for this sort of vulnerability is not as simple as appending commands to find your way around a filter. There are multiple possible scenarios, and different techniques to handle them. Exploits using more involved techniques like ROP can become necessary in some cases. That being said, most household embedded systems nowadays are decades behind personal computers in terms of anti-exploitation techniques. Methods like Address Space Layout Randomization(ASLR), which are designed to make exploit development much more complicated, are usually disabled or not implemented at all.
If you’d like to find a potential vulnerability so you can learn exploit development on your own, you can use the same techniques we’ve been using so far. Find potentially interesting inputs, locate the code that manages them using function names, hardcoded strings, etc. and try to trigger a malfunction sending an unexpected input. If we find an improperly handled string, we might have an exploitable bug.
Once we’ve located the piece of disassembled code we’re going to attack, we’re mostly interested in string manipulation functions like strcpy, strcat,sprintf, etc. Their more secure counterparts strncpy, strncat, etc. are also potentially vulnerable to some techniques, but usually much more complicated to work with.
Even though I’m not sure that function -extracted from /bin/tr064— is passed any user inputs, it’s still a good example of the sort of code you should be looking for. Once you find potentially insecure string operations that may handle user input, you need to figure out whether there’s an exploitable bug.
Try to cause a crash by sending unexpectedly long inputs and work from there. Why did it crash? How many characters can I send without causing a crash? Which payload can I fit in there? Where does it land in memory? etc. etc. I may write about this process in more detail at some point, but there’s plenty of literature available online if you’re interested.
Don’t spend all your efforts on the most obvious inputs only -which are also more likely to be properly filtered/handled-; using tools like the burp web proxy (or even the browser itself), we can modify fields like cookies to check for buffer overflows.
Web vulnerabilities like CSRF are also extremely common in embedded devices with web interfaces. Exploiting them to write to files or bypass authentication can lead to absolute control of the router, specially when combined with command injections. An authentication bypass for a router with the web interface available from the Internet could very well expose the network to being remotely man in the middle’d. They’re definitely an important attack vector, even though I’m not gonna go into how to find them.
Decompiling Binaries [Theory]
When you decompile a binary, instead of simply translating Machine Code to Assembly Code, the decompiler uses algorithms to identify functions, loops, branches, etc. and replicate them in a higher level language like C or Python.
That sounds like a brilliant idea for anybody who has been banging their head against some assembly code for a few hours, but an additional layer of abstraction means more potential errors, which can result in massive wastes of time.
In my (admittedly short) personal experience, the output just doesn’t look reliable enough. It might be fine when using expensive decompilers (IDA itself supports a couple of architectures), but I haven’t found one I can trust with MIPS binaries. That being said, if you’d like to give one a try, the RetDec online decompiler supports multiple architectures- including MIPS.
Even as a ‘high level’ language, the code is not exactly pretty to look at.
Whether we want to learn something about an algorithm we’re reversing, to debug an exploit we’re developing or to find any other sort of vulnerability, being able to execute (and, if possible, debug) the binary on an environment we fully control would be a massive advantage. In some/most cases -like this router-, being able to debug on the original hardware is not possible. In the next post, we’ll work on CPU emulation to debug the binaries in our own computers.
Thanks for reading! I’m sorry this post took so long to come out. Between work,hardwear.io and seeing family/friends, this post was written about 1 paragraph at a time from 4 different countries. Things should slow down for a while, so hopefully I’ll be able to publish Part 6 soon. I’ve also got some other reversing projects coming down the pipeline, starting with hacking the Amazon Echo and a router with JTAG. I’ll try to get to those soon, work permitting… Happy Hacking 🙂
Tips and Tricks
Mistaken xrefs and how to remove them
Sometimes an address is loaded into a register for 16bit/32bit adjustments. The contents of that address have no effect on the rest of the code; it’s just a routinary adjustment. If the address that is assigned to the register happens to be pointing to some valid data, IDA will rename the address in the assembly and display the contents in a comment.
It is up to you to figure out whether an x-ref makes sense or not. If it doesn’t, select the variable and press o in IDA to ignore the contents and give you only the address. This makes the code much less confusing.
Setting function prototypes so IDA comments the args around calls for us
Set the cursor on a function and press y. Set the prototype for the function: e.g. int memcpy(void *restrict dst, const void *restrict src, int n);. Note:IDA only understands built-in types, so we can’t use types like size_t.
Once again we can use the extern declarations found in the GPL source code. When available, find the declaration for a specific function, and use the same types and names for the arguments in IDA.
Taking Advantage of the GPL Source Code
If we wanna figure out what are the 1st and 2nd parameters of a function likeATP_DBSetPara, we can sometimes rely on the GPL source code. Lots of functions are not implemented in the kernel or any other open source component, but they’re still used from one of them. That means we can’t see the code we’re interested in, but we can see the extern declarations for it. Sometimes the source will include documentation comments or descriptive variable names; very useful info that the disassembly doesn’t provide:
Unfortunately, the function documentation comment is not very useful in this case -seems like there were encoding issues with the file at some point, and everything written in Chinese was lost. At least now we know that the first argument is a list of keys, and the second is something they call ParamCMO. ParamCMO is a constant in our disassembly, so it’s probably just a reference to the key we’re trying to set.
Disassembly Methods — Linear Sweep vs Recursive Descent
The structure of a binary can vary greatly depending on compiler, developers, etc. How functions call each other is not always straightforward for a disassembler to figure out. That means you may run into lots of ‘orphaned’ functions, which exist in the binary but do not have a known caller.
Which disassembler you use will dictate whether you see those functions or not, some of which can be extremely important to us (e.g. the ping function in theweb binary we reversed earlier). This is due to how they scan binaries for content:
Linear Sweep: Read the binary one byte at a time, anything that looks like a function is presented to the user. This requires significant logic to keep false positives to a minimum
Recursive Descent: We know the binary’s entry point. We find all functions called from main(), then we find the functions called from those, and keep recursively displaying functions until we’ve got “all” of them. This method is very robust, but any functions not referenced in a standard/direct way will be left out
Make sure your disassembler supports linear sweep if you feel like you’re missing any data. Make sure the code you’re looking at makes sense if you’re using linear sweep.
In Parts 1 to 3 we’ve been gathering data within its context. We could sniff the specific pieces of data we were interested in, or observe the resources used by each process. On the other hand, they had some serious limitations; we didn’t have access to ALL the data, and we had to deal with very minimal tools… And what if we had not been able to find a serial port on the PCB? What if we had but it didn’t use default credentials?
In this post we’re gonna get the data straight from the source, sacrificing context in favour of absolute access. We’re gonna dump the data from the Flash IC and decompress it so it’s usable. This method doesn’t require expensive equipment and is independent of everything we’ve done until now. An external Flash IC with a public datasheet is a reverser’s great ally.
Dumping the Memory Contents
As discussed in Part 3, we’ve got access to the datasheet for the Flash IC, so there’s no need to reverse its pinout:
We also have its instruction set, so we can communicate with the IC using almost any device capable of ‘speaking’ SPI.
We also know that powering up the router will cause the Ralink to start communicating with the Flash IC, which would interfere with our own attempts to read the data. We need to stop the communication between the Ralink and the Flash IC, but the best way to do that depends on the design of the circuit we’re working with.
Do We Need to Desolder The Flash IC? [Theory]
The perfect way to avoid interference would be to simply desolder the Flash IC so it’s completely isolated from the rest of the circuit. It gives us absolute control and removes all possible sources of interference. Unfortunately, it also requires additional equipment, experience and time, so let’s see if we can avoid it.
The second option would be to find a way of keeping the Ralink inactive while everything else around it stays in standby. Microcontrollers often have a Reset pin that will force them to shut down when pulled to 0; they’re commonly used to force IC reboots without interrupting power to the board. In this case we don’t have access to the Ralink’s full datasheet (it’s probably distributed only to customers and under NDA); the IC’s form factor and the complexity of the circuit around it make for a very hard pinout to reverse, so let’s keep thinking…
What about powering one IC up but not the other? We can try applying voltage directly to the power pins of the Flash IC instead of powering up the whole circuit. Injecting power into the PCB in a way it wasn’t designed for could blow something up; we could reverse engineer the power circuit, but that’s tedious work. This router is cheap and widely available, so I took the ‘fuck it’ approach. The voltage required, according to the datasheet, is 3V; I’m just gonna apply power directly to the Flash IC and see what happens. It may power up the Ralink too, but it’s worth a try.
We start supplying power while observing the board and waiting for data from the Ralink’s UART port. We can see some LEDs light up at the back of the PCB, but there’s no data coming out of the UART port; the Ralink must not be running. Even though the Ralink is off, its connection to the Flash IC may still interfere with our traffic because of multiple design factors in both power circuit and the silicon. It’s important to keep that possibility in mind in case we see anything dodgy later on; if that was to happen we’d have to desolder the Flash IC (or just its data pins) to physically disconnect it from everything else.
The LEDs and other static components can’t communicate with the Flash IC, so they won’t be an issue as long as we can supply enough current for all of them. I’m just gonna use a bench power supply, with plenty of current available for everything. If you don’t have one you can try using the Master’s power lines, or some USB power adapter if you need some more current. They’ll probably do just fine.
Time to connect our SPI Master.
Connecting to the Flash IC
Now that we’ve confirmed there’s no need to desolder the Ralink we can connect any device that speaks SPI and start reading memory contents block by block. Any microcontroller will do, but a purpose-specific SPI-USB bridge will often be much faster. In this case I’m gonna be using a board based on the FT232H, which supports SPI among some other low level protocols.
We’ve got the pinout for both the Flash and my USB-SPI bridge, so let’s get everything connected.
Now that the hardware is ready it’s time to start pumping data out.
Dumping the Data
We need some software in our computer that can understand the USB-SPI bridge’s traffic and replicate the memory contents as a binary file. Writing our own wouldn’t be difficult, but there are programs out there that already support lots of common Masters and Flash ICs. Let’s try the widely known and open source flashrom.
flashrom is old and buggy, but it already supports both the FT232H as Master and the FL064PIF as Slave. It gave me lots of trouble in both OSX and an Ubuntu VM, but ended up working just fine on a Raspberry Pi (Raspbian):
Success! We’ve got our memory dump, so we can ditch the hardware and start preparing the data for analysis.
Splitting the Binary
The file command has been able to identify some data about the binary, but that’s just because it starts with a header in a supported format. In a 0-knowledge scenario we’d use binwalk to take a first look at the binary file and find the data we’d like to extract.
Binwalk is a very useful tool for binary analysis created by the awesome hackers at /dev/ttyS0; you’ll certainly get to know them if you’re into hardware hacking.
In this case we’re not in a 0-knowledge scenario; we’ve been gathering data since day 1, and we obtained a complete memory map of the Flash IC in Part 2. The addresses mentioned in the debug message are confirmed by binwalk, and it makes for much cleaner splitting of the binary, so let’s use it:
With the binary and the relevant addresses, it’s time to split the binary into its 4 basic segments. dd takes its parameters in terms of block size (bs, bytes), offset (skip, blocks) and size (count, blocks); all of them in decimal. We can use a calculator or let the shell do the hex do decimal conversions with $(()):
$ dd if=spidump.bin of=bootloader.bin bs=1 count=$((0x020000))
131072+0 records in
131072+0 records out
131072 bytes transferred in 0.215768 secs (607467 bytes/sec)
$ dd if=spidump.bin of=mainkernel.bin bs=1 count=$((0x13D000-0x020000)) skip=$((0x020000))
1167360+0 records in
1167360+0 records out
1167360 bytes transferred in 1.900925 secs (614101 bytes/sec)
$ dd if=spidump.bin of=mainrootfs.bin bs=1 count=$((0x660000-0x13D000)) skip=$((0x13D000))
5386240+0 records in
5386240+0 records out
5386240 bytes transferred in 9.163635 secs (587784 bytes/sec)
$ dd if=spidump.bin of=protect.bin bs=1 count=$((0x800000-0x660000)) skip=$((0x660000))
1703936+0 records in
1703936+0 records out
1703936 bytes transferred in 2.743594 secs (621060 bytes/sec)
We have created 4 different binary files:
bootloader.bin: U-boot. The bootloader. It’s not compressed because the Ralink wouldn’t know how to decompress it.
mainkernel.bin: Linux Kernel. The basic firmware in charge of controlling the bare metal. Compressed using lzma
mainrootfs.bin: Filesystem. Contains all sorts of important binaries and configuration files. Compressed as squashfs using the lzma algorithm
protect.bin: Miscellaneous data as explained in Part 3. Not compressed
Extracting the Data
Now that we’ve split the binary into its 4 basic segments, let’s take a closer look at each of them.
Binwalk found the uImage header and decoded it for us. U-Boot uses these headers to identify relevant memory areas. It’s the same info that the file command displayed when we fed it the whole memory dump because it’s the first header in the file.
We don’t care much for the bootloader’s contents in this case, so let’s ignore it.
Compression is something we have to deal with before we can make any use of the data. binwalk has confirmed what we discovered in Part 2, the kernel is compressed using lzma, a very popular compression algorithm in embedded systems. A quick check with strings mainkernel.bin | less confirms there’s no human readable data in the binary, as expected.
There are multiple tools that can decompress lzma, such as 7z or xz. None of those liked mainkernel.bin:
$ xz --decompress mainkernel.bin
xz: mainkernel.bin: File format not recognized
The uImage header is probably messing with tools, so we’re gonna have to strip it out. We know the lzma data starts at byte 0x40, so let’s copy everything but the first 64 bytes.
And when we try to decompress…
$ xz --decompress mainkernel_noheader.lzma
xz: mainkernel_noheader.lzma: Compressed data is corrupt
xz has been able to recognize the file as lzma, but now it doesn’t like the data itself. We’re trying to decompress the whole mainkernel Flash area, but the stored data is extremely unlikely to be occupying 100% of the memory segment. Let’s remove any unused memory from the tail of the binary and try again:
xz seems to have decompressed the data successfully. We can easily verify that using the strings command, which finds ASCII strings in binary files. Since we’re at it, we may as well look for something useful…
The Wi-Fi Easy and Secure Key Derivation string looks promising, but as it turns out it’s just a hardcoded string defined by the Wi-Fi Protected Setup spec. Nothing to do with the password generation algorithm we’re interested in.
We’ve proven the data has been properly decompressed, so let’s keep moving.
The mainrootfs memory segment does not have a uImage header because it’s relevant to the kernel but not to U-Boot.
SquashFS is a very common filesystem in embedded systems. There are multiple versions and variations, and manufacturers sometimes use custom signatures to make the data harder to locate inside the binary. We may have to fiddle with multiple versions of unsquashfs and/or modify the signatures, so let me show you what the signature looks like in this case:
Since the filesystem is very common and finding the right configuration is tedious work, somebody may have already written a script to automate the task. I came across this OSX-specific fork of the Firmware Modification Kit, which compiles multiple versions of unsquashfs and includes a neat script called unsquashfs_all.sh to run all of them. It’s worth a try.
Wasn’t that easy? We got lucky with the SquashFS version and supported signature, and unsquashfs_all.sh managed to decompress the filesystem. Now we’ve got every binary in the filesystem, every symlink and configuration file, and everything is nice and tidy:
In the complete file tree we can see we’ve got every file in the system, (other than runtime files like those in /var/, of course).
Using the intel we have been gathering on the firmware since day 1 we can start looking for potentially interesting binaries:
If we were looking for network/application vulnerabilities in the router, having every binary and config file in the system would be massively useful.
As we discussed in Part 3, this memory area is not compressed and contains all pieces of data that need to survive across reboots but be different across devices. strings seems like an appropriate tool for a quick overview of the data:
Everything in there seems to be just the curcfg.xml contents, some logs and those few isolated strings in the picture. We already sniffed and analysed all of that data in Part 3, so there’s nothing else to discuss here.
At this point all hardware reversing for the Ralink is complete and we’ve collected everything there was to collect in ROM. Just think of what you may be interested in and there has to be a way to find it. Imagine we wanted to control the router through the UART debug port we found in Part 1, but when we try to access the ATP CLI we can’t figure out the credentials. After dumping the external Flash we’d be able to find the XML file in the protect area, and discover the credentials just like we did in Part 2 (The Rambo Approach to Intel Gathering, admin:admin).
If you couldn’t dump the memory IC for any reason, the firmware upgrade files provided by the manufacturers will sometimes be complete memory segments; the device simply overwrites the relevant flash areas using code previously loaded to RAM. Downloading the file from the manufacturer would be the equivalent of dumping those segments from flash, so we just need to decompress them. They won’t have all the data, but it may be enough for your purposes.
Now that we’ve got the firmware we just need to think of anything we may be interested in and start looking for it through the data. In the next post we’ll dig a bit into different binaries and try to find more potentially useful data.
The best thing about hardware hacking is having full access to very bare metal, and all the electrical signals that make the system work. With ingenuity and access to the right equipment we should be able to obtain any data we want. From simply sniffing traffic with a cheap logic analyser to using thousands of dollars worth of equipment to obtain private keys by measuring the power consumed by the device with enough precision (power analysis side channel attack); if the physics make sense, it’s likely to work given the right circumstances.
In this post I’d like to discuss traffic sniffing and how we can use it to gather intel.
Traffic sniffing at a practical level is used all the time for all sorts of purposes, from regular debugging during the delopment process to reversing the interface of gaming controllers, etc. It’s definitely worth a post of its own, even though this device can be reversed without it.
Full disclosure: I’m in contact with Huawei’s security team. I tried to contact TalkTalk, but their security staff is nowhere to be seen.
Data Flows In the PCB
Data is useless within its static memory cells, it needs to be read, written and passed around in order to be useful. A quick look at the board is enough to deduce where the data is flowing through, based on IC placement and PCB traces:
We’re not looking for hardware backdoors or anything buried too deep, so we’re only gonna look into the SPI data flowing between the Ralink and its external Flash.
Pretty much every IC in the market has a datasheet documenting all its technical characteristics, from pinouts to power usage and communication protocols. There are tons of public datasheets on google, so find the ones relevant to the traffic you want to sniff:
Now we’ve got pinouts, electrical characteristics, protocol details… Let’s take a first look and extract the most relevant pieces of data.
Understanding the Flash IC
We know which data flow we’re interested: The SPI traffic between the Ralink IC and Flash. Let’s get started; the first thing we need is to figure out how to connect the logic analyser. In this case we’ve got the datasheet for the Flash IC, so there’s no need to reverse engineer any pinouts:
Standard SPI communication uses 4 pins:
MISO (Master In Slave Out): Data line Ralink<-Flash
MOSI (Master Out Slave In): Data line Ralink->Flash
SCK (Clock Signal): Coordinates when to read the data lines
CS# (Chip Select): Enables the Flash IC when set to 0 so multiple of them can share MISO/MOSI/SCK lines.
We know the pinout, so let’s just connect a logic analyser to those 4 pins and capture some random transmission:
In order to set up our logic analyser we need to find out some SPI configuation options, specifically:
Transmission endianness [Standard: MSB First]
Number of bits per transfer [Standard: 8]. Will be obvious in the capture
CPOL: Default state of the clock line while inactive [0 or 1]. Will be obvious in the capture
CPHA: Clock edge that triggers the data read in the data lines [0=leading, 1=trailing]. We’ll have to deduce this
The datasheet explains that the flash IC understands only 2 combinations of CPOL and CPHA: (CPOL=0, CPHA=0) or (CPOL=1, CPHA=1)
Let’s take a first look at some sniffed data:
In order to understand exactly what’s happenning you’ll need the FL064PIF’s instruction set, available in its datasheet:
Now we can finally analyse the captured data:
In the datasheet we can see that the FL064PIF has high-performance features for read and write operations: Dual and Quad options that multiplex the data over more lines to increase the transmission speed. From taking a few samples, it doesn’t seem like the router uses these features much -if at all-, but it’s important to keep the possibility in mind in case we see something odd in a capture.
Transmission modes that require additional pins can be a problem if your logic analyser is not powerful enough.
The Importance of Your Sampling Rate [Theory]
A logic analyser is a conceptually simple device: It reads signal lines as digital inputs every x microseconds for y seconds, and when it’s done it sends the data to your computer to be analysed.
For the protocol analyser to generate accurate data it’s vital that we record digital inputs faster than the device writes them. Otherwise the data will be mangled by missing bits or deformed waveforms.
Unfortunately, your logic analyser’s maximum sampling rate depends on how powerful/expensive it is and how many lines you need to sniff at a time. High-speed interfaces with multiple data lines can be a problem if you don’t have access to expensive equipment.
I recorded this data from the Ralink-Flash SPI bus using a low-end Saleae analyser at its maximum sampling rate for this number of lines, 24 MS/s:
As you can see, even though the clock signal has the 8 low to high transitions required for each byte, the waveform is deformed.
Since the clock signal is used to coordinate when to read the data lines, this kind of waveform deformation may cause data corruption even if we don’t drop any bits (depending partly on the design of your logic analyser). There’s always some wiggle room for read inaccuracies, and we don’t need 100% correct data at this point, but it’s important to keep all error vectors in mind.
Let’s sniff the same bus using a higher performance logic analyser at 100 MS/s:
As you can see, this clock signal is perfectly regular when our Sampling Rate is high enough.
If you see anything dodgy in your traffic capture, consider how much data you’re willing to lose and whether you’re being limited by your equipment. If that’s the case, either skip this Reversing vector or consider investing in a better logic analyser.
Seeing the Data Flow
We’re already familiar with the system thanks to the overview of the firmware we did in Part 2, so we can think of some specific SPI transmissions that we may be interested in sniffing. Simply connecting an oscilloscope to the MISO and MOSI pins will help us figure out how to trigger those transmissions and yield some other useful data.
Here’s a video (no audio) showing both the serial interface and the MISO/MOSI signals while we manipulate the router:
This is a great way of easily identifying processes or actions that trigger flash read/write actions, and will help us find out when to start recording with the logic analyser and for how long.
Analysing SPI Traffic — ATP’s Save Command
In Post 2 I mentioned ATP CLI has a save command that stores something to flash; unfortunately, the help menu (save ?) won’t tell you what it’s doing and the only output when you run it is a few dots that act as a progress bar. Why don’t we find out by ourselves? Let’s make a plan:
Wait until boot sequence is complete and the router is idle so there’s no unexpected SPI traffic
Start the ATP Cli as explained in Part 1
Connect the oscilloscope to MISO/MOSI and run save to get a rough estimate of how much time we need to capture data for
Set a trigger in the enable line sniffed by the logic analyser so it starts recording as soon as the flash IC is selected
Analyse the captured data
Steps 3 and 4 can be combined so you see the data flow in real time in the scopewhile you see the charge bar for the logic analyser; that way you can make sure you don’t miss any data. In order to comfortably connect both scope and logic sniffer to the same pins, these test clips come in very handy:
Once we’ve got the traffic we can take a first look at it:
Let’s consider what sort of data could be extracted from this traffic dump that might be useful to us. We’re working with a memory storage IC, so we can see the data that is being read/written and the addresses where it belongs. I think we can represent that data in a useful way by 2 means:
Traffic map depicting which Flash areas are being written, read or erased in chronological order
Create binary files that replicate the memory blocks that were read/written, preferably removing all the protocol rubbish that we sniffed along with them.
Saleae’s SPI analyser will export the data as a CSV file. Ideally we’d improve their protocol analyser to add the functionality we want, but that would be too much work for this project. One of the great things about low level protocols like SPI is that they’re usually very straightforward; I decided to write some python spaghetti code to analyse the CSV file and extract the data we’re looking for: binmaker.py andtraffic_mapper.py
The workflow to analyse a capture is the following:
Export sniffed traffic as CSV
Run the script:
Iterate through the CSV file
Identify different commands by their index
Recognise the command expressed by the first byte
Process its arguments (addresses, etc.)
Identify the read/write payload
Convert ASCII representation of each payload byte to binary
Write binary blocks to different files for MISO (read) and MOSI (write)
Read the traffic map (regular text) and the binaries (hexdump -C output.bin | less)
Replicated Memory Blocks, Split by address: Files list
The traffic map is much more useful when combined with the Flash memory map we found in Part 2:
From the traffic map we can see the bulk of the save command’s traffic is simple:
Read about 64kB of data from the protect area
Overwrite the data we just read
In the MISO binary we can see most of the read data was just tons of 1s:
Most of the data in the MOSI binary is plaintext XML, and it looks exactly like the /var/curcfg.xml file we discovered in Part 2. As we discussed then, this “current configuration” file contains tons of useful data, including the current WiFi credentials.
It’s standard to keep reserved areas in flash; they’re mostly for miscellaneous data that needs to survive across reboots and be configurable by user, firmware or factory. It makes sense for a command called save to write data to such area, it explains why the data is perfectly readable as opposed to being compressed like the filesystem, and why we found the XML file in the /var/ folder of the filesystem (it’s a folder for runtime files; data in the protect area has to be loaded to memory separately from the filesystem).
The Pot of Gold at the End of the Firmware [Theory]
During this whole process it’s useful to have some sort of target to keep you digging in the same general direction.
Our target is an old one: the algorithm that generates the router’s default WiFi password. If we get our hands on such algorithm and it happens to derive the password from public information, any HG533 in the world with default WiFi credentials would probably be vulnerable.
That exact security issue has been found countless times in the past, usually deriving the password from public data like the Access Point’s MAC address or its SSID.
That being said, not all routers are vulnerable, and I personally don’t expect this one to be. The main reason behind targeting this specific vector is that it’s caused by a recurrent problem in embedded engineering: The need for a piece of data that is known by the firmware, unique to each device and known by an external entity. From default WiFi passwords to device credentials for IoT devices, this problem manifests in different ways all over the Industry.
Future posts will probably reference the different possibilities I’m about to explain, so let me get all that theory out of the way now.
The Sticker Problem
In this day and era, connecting to your router via ethernet so there’s no need for default WiFi credentials is not an option, using a display to show a randomly generated password would be too expensive, etc. etc. etc. The most widely adopted solution for routers is to create a WiFi network using default credentials, print those credentials on a sticker at the factory and stick it to the back of the device.
The WiFi password is the ‘unique piece of data’, and the computer printing the stickers in the factory is the ‘external entity’. Both the firmware and the computer need to know the default WiFi credentials, so the engineer needs to decide how to coordinate them. Usually there are 2 options available:
The same algorithm is implemented in both the device and the computer, and its input parameters are known to both of them
A computer generates the credentials for each device and they’re stored into each device separately
Developer incompetence aside, the first approach is usually taken as a last resort; if you can’t get your hardware manufacturer to flash unique data to each device or can’t afford the increase in manufacturing cost.
The second approach is much better by design: We’re not trusting the hardware with data sensitive enough to compromise every other device in the field. That being said, the company may still decide to use an algorithm with predictable outputs instead of completely random data; that would make the system as secure as the weakest link between the algorithm -mathematically speaking-, the confidentiality of their source code and the security of the computers/network running it.
Sniffing Factory Reset
So now that we’ve discussed our target, let’s gather some data about it. The first thing we wanna figure out is which actions will kickstart the flow of relevant data on the PCB. In this case there’s 1 particular action: Pressing the Factory Reset button for 10s. This should replace the existing WiFi credentials with the default ones, so the default creds will have to be generated/read. If the key or the generation algorithm need to be retrieved from Flash, we’ll see them in a traffic capture.
That’s exactly what we’re gonna do, and we’re gonna observe the UART interface, the oscilloscope and the logic analyser during/after pressing the reset button. The same process we followed for ATP’s save gives us these results:
The traffic map tells us the device first reads and overwrites 2 large chunks of data from the protect area and then reads a smaller chunk of data from the filesystem (possibly part of the next process to execute):
Once again, we combine transmission map and binary files to gain some insight into the system. In this case, the ‘factory reset’ code seems to:
Read ATP_LOG from Flash; it contains info such as remote router accesses or factory resets. It ends with a large chunk of 1s (0xff)
Overwrite that memory segment with 1s
write a ‘new’ ATP_LOG followed by the “current configuration” curcfg.xmlfile
Read compressed (unintelligible to us) memory chunk from the filesystem
The chunk from the filesystem is read AFTER writing the new password to Flash, which doesn’t make sense for a password generation algorithm. That being said, the algorithm may be already loaded into memory, so its absence in the SPI traffic is not conclusive on whether or not it exists.
As part of the MOSI data we can see the new WiFi password be saved to Flash inside the XML string:
What about the default password being read? If we look in the MISO binary, it’s nowhere to be seen. Either the Ralink is reading it using a different mode (secure/dual/quad/?) or the credentials/algorithm are already loaded in RAM (no need to read them from Flash again, since they can’t change). The later seems more likely, so I’m not gonna bother updating my scripts to support different read modes. We write down what we’ve found and we’ll get back to the default credentials in the next part.
Since we’re at it, let’s take a look at the SPI traffic generated when setting new WiFi credentials via HTTP: Map, MISO, MOSI. We can actually see the default credentials being read from the protect area of Flash this time (not sure why the Ralink would load it to set a new password; it’s probably incidental):
As you can see, they’re in plain text and separated from almost anything else in Flash. This may very well mean there’s no password generation algorithm in this device, but it is NOT conclusive. The developers could have decided to generate the credentials only once (first boot?) and store them to flash in order to limit the number of times the algorithm is accessed/executed, which helps hide the binary that contains it. Otherwise we could just observe the running processes in the router while we press the Factory Reset button and see which ones spawn or start consuming more resources.
Now that we’ve got the code we need to create binary recreations of the traffic and transmission maps, getting from a capture to binary files takes seconds. I captured other transmissions such as the first few seconds of boot (map, miso), but there wasn’t much worth discussing. The ability to easily obtain such useful data will probably come in handy moving forward, though.
In the next post we get the data straight from the source, communicating with the Flash IC directly to dump its memory. We’ll deal with compression algorithms for the extracted data, and we’ll keep piecing everything together.
In part 1 we found a debug UART port that gave us access to a Linux shell. At this point we’ve got the same access to the router that a developer would use to debug issues, control the system, etc.
This first overview of the system is easy to access, doesn’t require expensive tools and will often yield very interesting results. If you want to do some hardware hacking but don’t have the time to get your hands too dirty, this is often the point where you stop digging into the hardware and start working on the higher level interfaces: network vulnerabilities, ISP configuration protocols, etc.
These posts are hardware-oriented, so we’re just gonna use this access to gather some random pieces of data. Anything that can help us understand the system or may come in handy later on.
Full disclosure: I’m in contact with Huawei’s security team; they’ve had time to review the data I’m going to reveal in this post and confirm there’s nothing too sensitive for publication. I tried to contact TalkTalk, but their security staff is nowhere to be seen.
Picking Up Where We Left Off
We get our serial terminal application up and running in the computer and power up the router.
We press enter and get the login prompt from ATP Cli; introduce the credentials admin:admin and we’re in the ATP command line. Execute the command shell and we get to the BusyBox CLI (more on BusyBox later).
-----Welcome to ATP Cli------
Password: #Password is ‘admin'
BusyBox vv1.9.1 (2013-08-29 11:15:00 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.
var usr tmp sbin proc mnt lib init etc dev bin
At this point we’ve seen the 3 basic layers of firmware in the Ralink IC:
U-boot: The device’s bootloader. It understands the device’s memory map, kickstarts the main firmware execution and takes care of some other low level tasks
Linux: The router is running Linux to keep overall control of the hardware, coordinate parallel processes, etc. Both ATP CLI and BusyBox run on top of it
Busybox: A small binary including reduced versions of multiple linux commands. It also supplies the shell we call those commands from.
Lower level interfaces are less intuitive, may not have access to all the data and increase the chances of bricking the device; it’s always a good idea to start from BusyBox and walk your way down.
For now, let’s focus on the boot sequence itself. The developers thought it would be useful to display certain pieces of data during boot, so let’s see if there’s anything we can use.
Boot Debug Messages
We find multiple random pieces of data scattered across the boot sequence. We’ll find useful info such as the compression algorithm used for some flash segments:
Intel on how the external flash memory is structured will be very useful when we get to extracting it.
And more compression intel:
We’ll have to deal with the compression algorithms when we try to access the raw data from the external Flash, so it’s good to know which ones are being used.
What Are ATP CLI and BusyBox Exactly? [Theory]
The Ralink IC in this router runs a Linux kernel to control memory and parallel processes, keep overall control of the system, etc. In this case, according to the Ralink’s product brief, they used the Linux 2.6.21 SDK. ATP CLI is a CLI running either on top of Linux or as part of the kernel. It provides a first layer of authentication into the system, but other than that it’s very limited:
Welcome to ATP command line tool.
If any question, please input "?" at the end of command.
help doesn’t mention the shell command, but it’s usually either shell orsh. This ATP CLI includes less than 10 commands, and doesn’t support any kind of complex process control or file navigation. That’s where BusyBox comes in.
BusyBox is a single binary containing reduced versions of common unix commands, both for development convenience and -most importantly- to save memory. From ls and cd to top, System V init scripts and pipes, it allows us to use the Ralink IC somewhat like your regular Linux box.
One of the utilities the BusyBox binary includes is the shell itself, which has access to the rest of the commands:
BusyBox vv1.9.1 (2013-08-29 11:15:00 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.
var usr tmp sbin proc mnt lib init etc dev bin
# ls /bin
zebra swapdev printserver ln ebtables cat
wpsd startbsp pppc klog dns busybox
wlancmd sntp ping kill dms brctl
web smbpasswd ntfs-3g iwpriv dhcps atserver
usbserver smbd nmbd iwconfig dhcpc atmcmd
usbmount sleep netstat iptables ddnsc atcmd
upnp siproxd mount ipp date at
upg sh mldproxy ipcheck cwmp ash
umount scanner mknod ip cp adslcmd
tr111 rm mkdir igmpproxy console acl
tr064 ripd mii_mgr hw_nat cms ac
telnetd reg mic ethcmd cli
tc radvdump ls equipcmd chown
switch ps log echo chmod
You’ll notice different BusyBox quirks while exploring the filesystem, such as the symlinks to a busybox binary in /bin/. That’s good to know, since any commands that may contain sensitive data will not be part of the BusyBox binary.
Exploring the File System
Now that we’re in the system and know which commands are available, let’s see if there’s anything useful in there. We just want a first overview of the system, so I’m not gonna bother exposing every tiny piece of data.
The top command will help us identify which processes are consuming the most resources. This can be an extremely good indicator of whether some processes are important or not. It doesn’t say much while the router’s idle, though:
One of the processes running is usbmount, so the router must support connecting ‘something’ to the USB port. Let’s plug in a flash drive in there…
usb 1-1: new high speed USB device using rt3xxx-ehci and address 2
++++++sambacms.c 2374 renice=renice -n +10 -p 1423
The USB is recognised and mounted to /mnt/usb1_1/, and a samba server is started. These files show up in /etc/samba/:
/var/ and /etc/ always contain tons of useful data, and some of it makes itself obvious at first sight. Does that say /etc/serverkey.pem??
It’s not unusual to find private keys for TLS certificates in embedded systems. By accessing 1 single device via hardware you may obtain the keys that will help you attack any other device of the same model.
This key could be used to communicate with some server from Huawei or the ISP, although that’s less common. On the other hand, it’s also very common to findpublic certs used to communicate with remote servers.
In this case we find 2 certificates next to the private key; both are self-signed by the same ‘person’:
/etc/servercert.pem: Most likely the certificate for the serverkey
/etc/root.pem: Probably used to connect to a server from the ISP or Huawei. Not sure.
And some more data in /etc/ppp256/config and /etc/ppp258/config:
These credentials are also available via the HTTP interface, which is why I’m publishing them, but that’s not the case in many other routers (more on this later).
With so many different files everywhere it can be quite time consuming to go through all the info without the right tools. We’re gonna copy as much data as we can into the USB drive and go through it on our computer.
The Rambo Approach to Intel Gathering
Once we have as many files as possible in our computer we can check some things very quick. find . -name *.pem reveals there aren’t any other TLS certificates.
What about searching the word password in all files? grep -i -r password .
We can see lots of credentials; most of them are for STUN, TR-069 and local services. I’m publishing them because this router proudly displays them all via the HTTP interface, but those are usually hidden.
If you wanna know what happens when someone starts pulling from that thread, check out Alexander Graf’s talk “Beyond Your Cable Modem”, from CCC 2015. There are many other talks about attacking TR-069 from DefCon, BlackHat, etc. etc.
The credentials we can see are either in plain text or encoded in base64. Of course, encoding is worthless for data protection:
$ echo "QUJCNFVCTU4=" | base64 -D
That is the current WiFi password set in the router. It leads us to 2 VERY interesting files. Not just because of their content, but because they’re a vital part of how the router operates:
/var/curcfg.xml: Current configuration file. Among other things, it contains the current WiFi password encoded in base64
/etc/defaultcfg.xml: Default configuration file, used for ‘factory reset’. Does not include the default WiFi password (more on this in the next posts)
Exploring ATP’s CLI
The ATP CLI includes very few commands. The most interesting one -besidesshell— is debug. This isn’t your regular debugger; debug display will simply give you some info about the commands igmpproxy, cwmp, sysuptime or atpversion. Most of them don’t have anything juicy, but what about cwmp? Wasn’t that related to remote configuration of routers?
Once again, these are the CWMP (TR-069) credentials used for remote router configuration. Not even encoded this time.
The rest of the ATP commands are pretty useless: clear screen, help menu, save to flash and exit. Nothing worth going into.
Exploring Uboot’s CLI
The bootloader’s command line interface offers raw access to some memory areas. Unfortunately, it doesn’t give us direct access to the Flash IC, but let’s check it out anyway.
Please choose operation:
3: Boot system code via Flash (default).
4: Entr boot command line interface.
You choosed 4
Stopped Uboot WatchDog Timer.
4: System Enter Boot Command Line Interface.
U-Boot 1.1.3 (Aug 29 2013 - 11:16:19)
RT3352 # help
? - alias for 'help'
bootm - boot application image from memory
cp - memory copy
erase - erase SPI FLASH memory
go - start application at address 'addr'
help - print online help
md - memory display
mdio - Ralink PHY register R/W command !!
mm - memory modify (auto-incrementing)
mw - memory write (fill)
nm - memory modify (constant address)
printenv- print environment variables
reset - Perform RESET of the CPU
rf - read/write rf register
saveenv - save environment variables to persistent storage
setenv - set environment variables
uip - uip command
version - print monitor version
Don’t touch commands like erase, mm, mw or nm unless you know exactly what you’re doing; you’d probably just force a router reboot, but in some cases you may brick the device. In this case, md (memory display) and printenv are the commands that call my atention.
We can see settings like the UART baudrate, as well as some interesting memory locations. Those memory addresses are not for the Flash IC, though. The flash memory is only addressed by 3 bytes: [0x00000000, 0x00FFFFFF].
Let’s take a look at some of them anyway, just to see the kind of access this interface offers.What about kernel_addr=BFC40000?
Nope, that badd message means bad address, and it has been hardcoded in md to let you know that you’re trying to access invalid memory locations. These are good addresses, but they’re not accessible to u-boot at this point.
It’s worth noting that by starting Uboot’s CLI we have stopped the router from loading the linux Kernel onto memory, so this interface gives access to a very limited subset of data.
We can find random pieces of data around memory using this method (such as thatSPI Flash Image string), but it’s pretty hopeless for finding anything specific. You can use it to get familiarised with the memory architecture, but that’s about it. For example, there’s a very obvious change in memory contents at 0x000d0000:
And just because it’s about as close as it gets to seeing the girl in the red dress, here is the md command in action. You’ll notice it’s very easy to spot that change in memory contents at 0x000d0000.
In the next post we combine firmware and bare metal, explain how data flows and is stored around the device, and start trying to manipulate the system to leak pieces of data we’re interested in.
In this series of posts we’re gonna go through the process of Reverse Engineering a router. More specifically, a Huawei HG533.
At the earliest stages, this is the most basic kind of reverse engineering. We’re simple looking for a serial port that the engineers who designed the device left in the board for debug and -potentially- technical support purposes.
Even though I’ll be explaining the process using a router, it can be applied to tons of household embedded systems. From printers to IP cameras, if it’s mildly complex it’s quite likely to be running some form of linux. It will also probably have hidden debug ports like the ones we’re gonna be looking for in this post.
Finding the Serial Port
Most UART ports I’ve found in commercial products are between 4 and 6 pins, usually neatly aligned and sometimes marked in the PCB’s silkscreen somehow. They’re not for end users, so they almost never have pins or connectors attached.
After taking a quick look at the board, 2 sets of unused pads call my atention (they were unused before I soldered those pins in the picture, anyway):
This device seems to have 2 different serial ports to communicate with 2 different Integrated Circuits (ICs). Based on the location on the board and following their traces we can figure out which one is connected to the main IC. That’s the most likely one to have juicy data.
In this case we’re simply gonna try connecting to both of them and find out what each of them has to offer.
Identifying Useless Pins
So we’ve found 2 rows of pins that -at first sight- could be UART ports. The first thing you wanna do is find out if any of those contacts is useless. There’s a very simple trick I use to help find useless pads: Flash a bright light from the backside of the PCB and look at it from directly above. This is what that looks like:
We can see if any of the layers of the PCB is making contact with the solder blob in the middle of the pad.
Connected to something (we can see a trace “at 2 o’clock”)
100% connected to a plane or thick trace. It’s almost certainly a power pin, either GND or Vcc
Connections at all sides. This one is very likely to be the other power pin. There’s no reason for a data pin in a debug port to be connected to 4 different traces, but the pad being surrounded by a plane would explain those connections
Connected to something
Soldering Pins for Easy Access to the Lines
In the picture above we can see both serial ports.
The pads in these ports are through-hole, but the holes themselves are filled in with blobs of very hard, very high melting point solder.
I tried soldering the pins over the pads, but the solder they used is not easy to work with. For the 2nd serial port I decided to drill through the solder blobs with a Dremel and a needle bit. That way we can pass the pins through the holes and solder them properly on the back of the PCB. It worked like a charm.
Identifying the Pinout
So we’ve got 2 connectors with only 3 useful pins each. We still haven’t verified the ports are operative or identified the serial protocol used by the device, but the number and arrangement of pins hint at UART.
Let’s review the UART protocol. There are 6 pin types in the spec:
Tx [Transmitting Pin. Connects to our Rx]
Rx [Receiving Pin. Connects to our Tx]
GND [Ground. Connects to our GND]
Vcc [The board’s power line. Usually 3.3V or 5V. DO NOT CONNECT]
CTS [Typically unused]
DTR [Typically unused]
We also know that according to the Standard, Tx and Rx are pulled up (set to 1) by default. The Transmitter of the line (Tx) is in charge of pulling it up, which means if it’s not connected the line’s voltage will float.
So let’s compile what we know and get to some conclusions:
Only 3 pins in each header are likely to be connected to anything. Those must be Tx, Rx and GND
Two pins look a lot like Vcc and GND
One of them -Tx- will be pulled up by default and be transmitting data
The 3rd of them, Rx, will be floating until we connect the other end of the line
That information seems enough to start trying different combinations with your UART-to-USB bridge, but randomly connecting pins you don’t understand is how you end up blowing shit up.
Let’s keep digging.
A multimeter or a logic analyser would be enough to figure out which pin is which, but if you want to understand what exactly is going on in each pin, nothing beats a half decent oscilloscope:
After checking the pins out with an oscilloscope, this is what we can see in each of them:
GND and Vcc verified — solid 3.3V and 0V in pins 2 and 3, as expected
Tx verified — You can clearly see the device is sending information
One of the pins floats at near-0V. This must be the device’s Rx, which is floating because we haven’t connected the other side yet.
So now we know which pin is which, but if we want to talk to the serial port we need to figure out its baudrate. We can find this with a simple protocol dump from a logic analyser. If you don’t have one, you’ll have to play “guess the baudrate” with a list of the most common ones until you get readable text through the serial port.
This is a dump from a logic analyser in which we’ve enabled protocol analysis and tried a few different baudrates. When we hit the right one, we start seeing readable text in the sniffed serial data (\n\r\n\rU-Boot 1.1.3 (Aug...)
Once we have both the pinout and baudrate, we’re ready to start communicating with the device:
Connecting to the Serial Ports
Now that we’ve got all the info we need on the hardware side, it’s time to start talking to the device. Connect any UART to USB bridge you have around and start wandering around. This is my hardware setup to communicate with both serial ports at the same time and monitor one of the ports with an oscilloscope:
And when we open a serial terminal in our computer to communicate with the device, the primary UART starts spitting out useful info. These are the commands I use to connect to each port as well as the first lines they send during the boot process:
Please choose operation:
3: Boot system code via Flash (default).
4: Entr boot command line interface.
‘Command line interface’?? We’ve found our way into the system! When we press 4we get a command line interface to interact with the device’s bootloader.
Furthermore, if we let the device start as the default 3, wait for it to finish booting up and press enter, we get the message Welcome to ATP Cli and a login prompt. If the devs had modified the password this step would be a bit of an issue, but it’s very common to find default credentials in embedded systems. After a few manual tries, the credentials admin:admin succeeded and I got access into the CLI:
-----Welcome to ATP Cli------
Password: #Password is ‘admin'
BusyBox vv1.9.1 (2013-08-29 11:15:00 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.
var usr tmp sbin proc mnt lib init etc dev bin
Running the shell command in ATP will take us directly into Linux’s CLI with root privileges 🙂
This router runs BusyBox, a linux-ish interface which I’ll talk about in more detail in the next post.
Now that we have access to the BusyBox CLI we can start nosing around the software. Depending on what device you’re reversing there could be plain text passwords, TLS certificates, useful algorithms, unsecured private APIs, etc. etc. etc.
In the next post we’ll focus on the software side of things. I’ll explain the differences between boot modes, how to dump memory, and other fun things you can do now that you’ve got direct access to the device’s firmware.