Coldcard isolation bypass

Coldcard isolation bypass

Original text by benma

Coldcard isolation bypass

Shift Crypto responsibly disclosed the remote exploit to Coinkite (Coldcard) on August 4th, 2020, and mutually agreed to a 90-day disclosure embargo period. Coinkite created a fix on September 30th but, as of this date, have not yet released a firmware update to mitigate against potential exploits in the wild. The fix on September 30th explicitly described the issue and attack scenarios in the changelog, and the 90-day embargo period, including requested extensions, has lapsed. Therefore, we now disclose the issue, and we encourage Coldcard users to take appropriate precautions until an update is available.

The original issue

On August 4th 2020, @mo_nokh disclosed a vulnerability in the Ledger hardware wallets in which a user could unknowingly confirm a Bitcoin transaction that was masquerading as an altcoin or testnet transaction:

An attacker can exploit this method to transfer Bitcoin while the user is under the impression that a transaction of another, less valuable altcoin (e.g. Litecoin, Testnet Bitcoins, Bitcoin Cash, etc.) is being executed

A quick high-level summary of how this is possible:

When you create a transaction proposal on your computer, the transaction is sent to the hardware wallet to be confirmed by the user and formally signed. The computer also tells the hardware wallet what kind of coin we are dealing with, e.g. “this is a Litecoin transaction, please show the amounts as Litecoin and display the addresses in Litecoin’s address formats”. The hardware wallet will just take that info, and show, for example “Sending 1 LTC to ltc1…”.

Litecoin, Bitcoin, their testnets and some other coins all have the exact same transaction representation under the hood. For example, there is nothing in a Bitcoin transaction that says it is a Bitcoin transaction and not a Litecoin transaction. From a hardware wallet’s point of view, a valid Litecoin transaction proposal is also a valid Bitcoin transaction proposal.

As a consequence, a compromised wallet on the computer can simply create a transaction spending bitcoins to the attacker, and send it to the hardware wallet as a Litecoin transaction. The user will only see Litecoin information on the device (addresses in Litecoin’s address format, amounts denominated in Litecoin, etc) while not suspecting that they are in fact sending bitcoins.

The problem is mitigated by allocating separate private keys to separate coins, as described by BIP44. If all your bitcoins use one set of private keys, and all your litecoins use a different set of private keys, then signing a Litecoin transaction with Litecoin keys can never spend actual bitcoins. The BitBox02 has always enforced this key separation strictly.


When the isolation bypass vulnerability was disclosed, Coldcard responded quickly that they were not affected:

This, unfortunately, was not true. While the Coldcard does not support “shitcoins”, it does support testnet. A quick test confirmed that the Coldcard was in fact vulnerable in the exact same way as Ledger. A user confirming a testnet transaction on the device could be spending mainnet (i.e. the real thing) funds without their knowledge.

For example, this real mainnet transaction:

…is confirmed and signed like this when sending the same real mainnet transaction to the Coldcard while it is in testnet mode:

Impact and severity

As a hardware wallet user, you should assume your computer is compromised. That is the reason to use a hardware wallet in the first place. Starting from there, exploiting this attack is not very far fetched. The attacker merely has to convince the user to e.g. “try a testnet transaction”, or to buy an ICO with testnet coins (I’ve heard there was a ICO like this recently) or any number of social engineering attacks to make the user perform a testnet transaction. After the user confirms a testnet transaction, the attacker receives mainnet bitcoin in the same amount.

Since the attack can be performed remotely, it counts as a critical issue according to Shift’s security assessment guideline. Severity points are deducted as the attack does not scale very well:

  • the device has to be unlocked and user interaction is required
  • the attacker has to convince the user to make a testnet transaction to the attacker’s testnet address


I did not discover the isolation bypass attack, all credit goes to @mo_nokh, who published the original attack on the Ledger. After reading about it, I remembered that the Coldcard has testnet support. I quickly built a proof-of-concept to see if this attack is feasible. Seeing that it indeed is, I immediately responsibly disclosed the issue to the Coldcard team.

Disclosure timeline:

  • The issue was responsibly disclosed to Coinkite on Aug. 4th, 2020.
  • Coinkite acknowledged the issue on Aug. 5th and asked for 90 days to fix the issue.
  • Coinkite created a fix on Sept. 30th, publicly disclosing the issue and attack scenario in the changelog:
  • Noticing the fix, on Oct. 14th we asked when to expect a firmware release. Coinkite informed that additional items are being added in advance of the next firmware release.
  • On Nov 12th, one week beyond the embargo period, we asked again when to expect a firmware release and gave notice that we would release our disclosure soon. Coinkite informed that the release would be further delayed, but imminent, as additional items were being included.
  • On Nov 21st, we informed Coinkite that we would publish our disclosure on Nov 24th.
  • On Nov 23rd, Coldcard published a BETA version of the next firmware with the fix included:

Discuss on Reddit.

Originally published at

Finding and exploiting CVE-2018–7445 (unauthenticated RCE in MikroTik’s RouterOS SMB)

Original text by @maxi.

Summary for the anxious reader

  • CVE-2018–7445 is a stack buffer overflow in the SMB service binary present in all RouterOS versions and architectures prior to 6.41.3/6.42rc27.
  • It was found using dumb-fuzzing assisted with the Mutiny Fuzzer tool from Cisco Talos and reported/fixed about a year ago.
  • The vulnerable binary was not compiled with stack canaries.
  • The exploit does ROP to mark the heap as executable and jumps to a fixed location in the heap. The heap base was not randomized.
  • Dumb fuzzing still found bugs in interesting targets in 2018 (although I’m sure there must be none left for 2019!)
  • The post describes the full process from target selection to identifying a vulnerability and then producing a working exploit.


The last few years have seen a surge in the number of public vulnerabilities found and reported in MikroTik RouterOS devices. From a remote buffer overflow affecting the built-in web server included in the CIA Vault 7 leak to a plethora of other vulnerabilities reported by Kirils Solovjovs from Possible Security and Jacob Baines from Tenable that result in full remote compromise.

MikroTik was recently added to the list of eligible router brands in the exploit acquisition program maintained by Zerodium, including a one-month offer to buy pre-auth RCEs for $100,000. This might reflect an increasing interest in MikroTik products and their security posture.

This blog post is an attempt to make a small contribution to the ongoing MikroTik RouterOS vulnerability research. I will outline the steps we took with my colleague Juan (thanks Juan!) during our time together at Core Security to find and exploit CVE-2018–7445, a remote buffer overflow in MikroTik’s RouterOS SMB service that could be triggered from the perspective of an unauthenticated attacker.

The vulnerability is easy to find and exploitation is straight-forward, so the idea is to provide a detailed walk-through that will (hopefully!) be useful for other beginners interested in memory corruption. I will try to cover the full process from “hey! let’s look at this MikroTik thing” to actually finding a vulnerability in a network service and writing an exploit for it.

The original advisory can be found here.

Mandatory disclaimer: I am no longer affiliated with Core Security, so the content of this post does not reflect its views or represents the company in any way.


The vulnerability is present in all architectures and devices running RouterOS prior to 6.41.3/6.42rc27, so the first step is getting a vulnerable system running.

MikroTik makes this very easy by maintaining an archive of all previously released versions. It is also possible to download the Cloud Hosted Router version of RouterOS, which is available as a virtual machine that boasts full RouterOS features. This allows running RouterOS in x86–64 architectures using popular hypervisors without needing an actual hardware device.

Let’s get the 6.40.5 version of the Cloud Hosted Router from here and create the virtual machine on VirtualBox.

Default administrator credentials consist of admin as the username and an empty password.

The RouterOS console is a restricted environment and does not allow the user to execute any command outside of a pre-defined set of configuration options.

In order to replicate the vulnerability discovery, the SMB service needs to be enabled. This can be achieved with the ip smb set enabled=yes command.

Note that the fact that the service is not enabled by default makes the likelihood of active exploitation much smaller. In addition, you should probably not be exposing your SMB service to public networks, but well, there’s always those pesky users in the internal network that might have access to this service.

The restricted console is not suitable to perform proper debugging, so before looking for vulnerabilities it is useful to have full shell access. Kirils Solovjovs has published extensive research on jailbreaking RouterOS, including the release of a tool that can be used to jailbreak 6.40.5. It would not make sense to repeat the underlying details here, so head to Kirils’ research hub or the more recent Jacob Baines’ post for newer versions where the entry point used for 6.40.5 has been patched.

Jailbreaking RouterOS 6.40.5 is as easy as cloning the repository and running the interactive exploit-backup/ exploit pointing to our VM.

Finally, download a pre-compiled version of GDB from and upload it to the system using FTP.

Connecting to the device via Telnet would allow us to attach to running processes and debug them properly.

We are now ready to start looking for vulnerabilities in the network services.

Target selection

There are lots of services running in RouterOS. A quick review shows common services such as HTTP, FTP, SSH, and Telnet, and some other RouterOS-specific services such as the bandwidh test server running on port 2000.

Jacob Baines pointed out that over 90 different binaries that implement network services can be reached by speaking the Winbox protocol (see The Real Attack Surface in his excellent blog post).

We were not aware of all that reachable functionality when we started poking around with RouterOS and did not invest the time to reverse engineer the binaries that spoke Winbox, so we just went ahead and looked at the few binaries that were explicitly listening on the network.

Most (all?) services in RouterOS seem to have been implemented from scratch, so there are thousands of lines of custom low level code waiting to be audited.

Our objective was to achieve unauthenticated remote code execution, and on a first look the binaries for common services such as FTP or Telnet did not provide much reachable functionality without providing credentials. This made us turn to other services that might not be enabled by default but require the implementation of a rich feature set. The fact that these services are not enabled by default means they might have been neglected by other attackers wanting to maximize their ROI on vulnerabilities that affect default installations of RouterOS and are therefore much more valuable.

By following this rationale and inspecting the available services we decided to take a look at the SMB implementation.

Finding the vulnerability

We know we want to find vulnerabilities in the SMB service. We have the virtual machine setup, the service running, we have full shell access to the device and we can debug any processes. How do we find a vulnerability?

One option would be to disassemble the binary and look for insecure coding patterns. We would identify interesting operations such as strcpymemcpy, etc. and see if the correct size checks are in place. We would then see if those code paths can be reached with user controlled input. We could combine this with dynamic analysis and use our ability to attach to a running process with GDB to inspect registers at runtime, memory locations, etc. However, this can be time consuming and it is easy to feel frustrated if you do not have experience doing reverse engineering, especially if it is a large binary.

Another option is to fuzz the network service. This approach consists of sending data to the remote service and checking if it causes an unexpected behavior or a crash. This data will contain malformed messages, invalid sizes, very long strings, etc.

There are different ways to conduct the fuzzing process. The two most popular strategies are generation and mutation-based fuzzing. Generation-based fuzzing requires knowledge of the protocol to build test cases that comply with the format specified by the protocol and will (most likely) result in a more thorough coverage. More coverage means more chances of hitting vulnerable code paths and therefore more bugs. On the other hand, mutation-based fuzzing assumes no prior knowledge of the protocol being fuzzed and takes much less effort at the cost of potentially poor code coverage and additional difficulties in protocols that need to compute checksums to ensure data integrity.

We decided to try our luck with a dumb fuzzer and chose the Mutiny Fuzzer tool that had been released a few months earlier by the Cisco Talos team. Mutiny takes a sample of legitimate network traffic and replays it through a mutational fuzzer. In particular, Mutiny uses Radamsa to mutate the traffic.

Performing this kind of fuzzing has the benefit of being very quick to get up running and, as we will see, might provide great results if we have a good selection of test cases that stress various features.

Putting this together, the steps to fuzz a network service are:

  • Capture legitimate traffic
  • Create a Mutiny fuzzer template from the resulting PCAP file
  • Run Mutiny to mutate the traffic and replay it to the service
  • Observe what happens with the running service

Mutiny does provide a monitoring script that can be used to (d’oh!) monitor the service and identify weird behavior. This can be accomplished by implementing the monitorTarget function as described in Sample checks could be pinging the remote service or connecting to it to assess its availability, monitoring the process, logs, or whatever else might signal weird behavior.

In this case, the SMB service will take a while to restart after a crash and log a stack trace message, so we decided it was not worth scripting any monitoring actions. Instead, we just captured the traffic throughout the fuzz process with Wireshark and relied on the default behavior of Mutiny, which is to exit when the request fails due to a connection refused error, meaning that the service is down. This is rather rudimentary and leaves a lot of room for improvement, but it was enough for our tests.

It is important to enable full logging before we initiate the fuzzing process. This could prove useful to track any crashes that might occur, as the full stack trace will be included in the logs that are located in /rw/logs/backtrace.log. This can be configured from RouterOS’ web interface.

Another thing that proved useful was running the binary in an interactive console to get the debug output in real time. This can be achieved by killing the running process and relaunching it from the full-fledged terminal. Errors and general status of processed requests will be printed.

Now that we have a high level overview of the steps involved, let’s recap and actually fuzz the SMB service.

First we clone and follow the setup instructions.

The next step in our plan consists of generating some network traffic. In order to do this, open Wireshark and attempt to access a resource on the router with smbclient.

Smbclient will send a Negotiate Protocol request to port 445/TCP and receive a response which we do not care about. This can be observed in the Wireshark capture.

We want to use this request as the starting point to produce (hopefully!) meaningful mutations. Stop the Wireshark capture and save the request packet by going to File -> Export Specified Packets with the request packet selected. Output format should be PCAP.

Once we have the PCAP containing the request to fuzz, we prepare Mutiny’s .fuzzer file with the interactive script.

It is a good idea to review the resulting file to identify any weirdness that could come up during conversion.

Here we could configure Mutiny to fuzz only parts of the message. This would be useful if we wanted for example to focus our efforts on individual fields. In this case we will fuzz the entire message. It is worth mentioning that Mutiny can also handle multi-message exchanges.

If the test cases we use as initial templates contain parts that do not cause the program to take different paths, then all the modifications we make to this data will never increase code coverage, which results in wasted time and inefficient fuzzing.

Without going into much detail of the SMB protocol, we can observe that the request contains a list of about a dozen Requested Dialects. Each dialect corresponds to a specific set of supported commands. This could be interesting if we were fuzzing a particular set of commands, but right now we do not care about this.

Providing a shorter list of one or two dialects would result in Radamsa creating more meaningful mutations and a larger variety of SMB request types being sent. Our reasoning is that maybe mutating one dialect or the other will not make the application take very different paths on a single message conversation, so we do this and edit the template to look as follows:

outbound fuzz ‘\x00\x00\x00\xbe\xffSMBr\x00\x00\x00\x00\x18C\xc8\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfe\xff\x00\x00\x00\x00\x00\x9b\x00\x02NT LANMAN 1.0\x00\x02NT LM 0.12\x00’

With the template in place, we can begin fuzzing. Remember to capture the full session with Wireshark. Mutiny can also record each packet sent, but we found it easier to look at Wireshark for when the server stopped responding after a crash.

Open a telnet connection to the router using the devel account and run pkill smb; /nova/bin/smb to start a new SMB process and observe its output.

The following command will instruct Mutiny to sleep half a second between packets and log all requests: ./ -s 0.5 — logAll negotiate_protocol_request-0.fuzzer HOST

The verbose output will show packets of different sizes being sent and the numeric identifier of each of them. This value is useful to repeat the exact same sequence of mutations and provide a way to reproduce crashes. This is important because, even if we find a crash, previous requests could have corrupted something or altered the application state in a way that is required for the crash to happen. If we cannot recreate the state before the crash, we might be left empty-handed even if we identify which particular request ended up causing the crash.

If the fuzzing session is interrupted and we do not want to replay the previous mutations, it is possible to use the -r parameter to instruct Mutiny to start sending mutations from that iteration onwards (e.g: -r1500- will send mutations 1500, 1501, 1502, and so on).

If we observe Wireshark while the fuzzer runs, we will see that not all packets conform to the expected format, which is a good thing for us. Vulnerabilities usually arise when the application cannot handle unexpected data in a proper manner.

The terminal where we are running the SMB binary will also contain useful data to confirm that we are in fact feeding malformed requests to the service.

Now we let the fuzzer run. We can play with different delay values and see if the server can process requests that fast, but two requests per second is OK for this proof of concept.

A few minutes later Mutiny finishes running after trying to connect to the service and not being able to do so.

If we take a look at the terminal running the binary, we will be greeted with a Segmentation fault message.

As mentioned before, the backtrace.log file contains the register dump and a bit more information about what caused the crash.

Finally, by inspecting Wireshark we can see that the last packet sent to the server is described as “Session request to Illegal NetBIOS name”.

Understanding the crash

First we will make sure that we can reproduce the crash at will. Copying the last packet sent by Mutiny or extracting the message from Wireshark is equivalent here. We are not interested in the layers below NetBIOS, as we will create a small script to send the packet over TCP.

Extract the raw bytes from the exported file.

>>> open(“req”).read()
‘\x81\x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00 \x00\x00’

Create a simple python script that sends the payload over to the remote service. Note that I have replaced the whitespace with its equivalent hex representation for clarity.

Run the script a few times after spawning a new SMB process (pkill smb && /nova/bin/smb) and see what happens. We now have a reliable way to reproduce the crash with a single request.

In this case we are dealing with a protocol for which Wireshark has a dissector, so we can use that information to understand the cause of the crash at a protocol level. Apparently, sending a NetBIOS session request (message type 0x81) exercises a vulnerable code path in the SMB binary.

Let’s extract the binary from the router so we can open it in a disassembler. Copy /nova/bin/smb to /flash/rw/pckg so it can be accessed via FTP and download it.

This is also a good time to be able to debug the process with GDB. I like to use PEDA to enhance GDB’s display and add some useful commands.

We open two connections to the target router. In one we do pkill smb && /nova/bin/smb to get real time output, and on the other we start gdbserver attached to the newly spawned process.

Finally, we open GDB in our testing machine and connect to the debugging server with target remote IP:PORT.

It is also useful to instruct GDB which binary we are attached to by doing file smb. The next time it connects to the debugging server it will attempt to resolve symbols for the loaded libraries.

Press c on the debugging session so execution continues as usual.

Running the proof of concept will cause the service to stop due to a SIGSEGV as expected. Here we see that a NULL pointer was dereferenced when performing a copy operation.

Now, I must admit that I am very bad doing static analysis, especially when C++ programs are involved. In a lame attempt to overcome this limitation, I will rely as much as I can on dynamic analysis and, in this particular case, on the information provided by the Wireshark dissector that gives more insight about the protocol fields.

As can be observed, the first byte of the NetBIOS Session Service packet we are sending sets the message type to Session request (0x81).

The second byte contains the flags, which in our proof of concept has all bits set to zero.

Next two bytes represent the message length, which is set to 32.

Finally, the remaining 32 bytes are referred to as Illegal NetBIOS names.

We can assume that this size is being read at some point, and since it is the only message length information we are sending, it might be related to the vulnerability. To test this assumption, we are going to place breakpoints on common functions such as read and recv to identify where the application is reading the packet from the socket.

After running the script, the program breaks in read().

We navigate to the next instructions using ni and stop right after the read system call is executed.

The definition for read looks as follows:

ssize_t read(int fd, void *buf, size_t count);

EAX holds the number of bytes read, which seems to be 0x24 (36). This corresponds to the header we analyzed before: 1 byte for the message type — 1 byte for the flags — 2 bytes for the message length — 32 bytes for the NetBIOS names.

ECX contains the address of the buffer where the data read is stored. We can use vmmap $ecx or vmmap 0x8075068 to verify that this corresponds to the heap area.

Finally, EDX states that the read operation was called to read up to 0x10000 bytes from the socket.

From here on we can continue stepping through the execution and add watchpoints to see what happens with our data.

Since Wireshark did not identify anything relevant to the analyzed protocol in the NetBIOS names, let’s change our payload to contain more distinguishable characters such as “A”s so it is easier to identify that payload in our debugging session. This is also a good idea to see if additional copy operations might be triggered that would otherwise stop at the first NULL byte.

We have two bytes to play with different sizes, so before moving forward it is interesting to try sending different message lengths and see if the crash still happens.

We already tried 32 bytes, so let’s get crazy and do 64, 250, 1000, 4000, 16000, 65500.

  • 64 (payload = “\x81\x00\x00\x40”+”A”*0x40)

Same as original proof of concept. Registers look the same.

  • 250 (payload = “\x81\x00\x00\xfa”+”A”*0xfa)

This is a very interesting variation. We see most registers set to 0x41414141, which is our input, we see the stack filled with lots of “A”s as well and even EIP seems to have been corrupted.

  • 1000 (payload = “\x81\x00\x03\xe8”+”A”*0x3e8)

Same as previous payload.

  • 4000 (payload = “\x81\x00\x0f\xa0”+”A”*0xfa0)

Same as previous payload.

  • 16000 (payload = “\x81\x00\x3e\x80”+”A”*0x3e80)

This is crashing at a different instruction, although we see that the stack is corrupted as well.

  • 65500 (payload = “\x81\x00\xff\xdc”+”A”*0xffdc)

Same as previous payload.

So… we see the program crashes when executing different instructions. However, the common thing we can observe is that the stack has been corrupted at some point when parsing NetBIOS names from a single NetBIOS session request message, and that most registers included parts of our payload when we sent a 250 bytes message. This makes it particularly interesting for analysis, since we have a direct EIP overwrite and control over the stack.

Note that we cannot really ensure that all crashes are due to the exact same bug at this point. Maybe sending a larger buffer took us down a different path that ended up being more easily exploitable, so you will have to answer that question yourself.

There are also some seemingly random number of “.” (0x2e) characters in between. We will see what they are later on.

Right before the crash the program prints a message that reads “New connection:”. This can be useful to get some situational awareness without having to add watchpoints to our buffer and track dozens of read operations (you can add a read watchpoint in GDB with rwatch *addr and execution will be stopped whenever the program accesses that memory address).

We open the /nova/bin/smb binary in Binary Ninja and search for the string.

There is only one occurrence at 0x80709fb. Inspecting the cross references shows a single usage, which is probably what we want.

If we go to the beginning of sub_806b11c, we will notice that a couple of conditions need to be met in order for us to get to the block that prints the string.

The first condition is a byte comparison with 0x81, which is the message type we are sending.

Putting a breakpoint at 0x806b12e and following the execution allows us to inspect the register values and get a better picture of what is happening. We can observe for example that the size we send in the request needs to be above 0x43 to enter the interesting block.

Based on our previous tests, we know that one of the functions called from this block needs to be the one corrupting the stack. We continue going through each instruction in GDB using n instead of s to avoid stepping into the functions. After each function run, we take a look at the stack.

The first function we encounter is 0x805015e.

After it runs we see that the stack seems to be OK, so this is probably not the function responsible for the overflow.

A few instructions later we have the next candidate, function at 0x8054607. Once again, we let it run and observe the stack and register context afterwards.

Aaaaand we found our culprit. Take a look at EBP and observe that the stack frame has been corrupted. Continue debugging until the function is about to return. Here various registers are popped from the stack that contains our data.

Unpopular thought here: you do not really need to understand what this function is doing at all to exploit the vulnerability. We already have EIP control and most of the stack looks more or less as uncorrupted input data.

Taking some time to review the function at 0x8054607 with GDB’s help results in the following pseudo-code:

int parse_names(char *dst, char *src) {
int len;
int i;
int offset;

// take the length of the first string
len = *src;
offset = 0;

while (len) {
// copy the bytes of the string into the destination buffer
for (i = offset; (i - offset) < len; ++i) {
dst[i] = src[i+1];

// take the length of the next string
len = src[i+1];

// if it exists, then add a separator
if (len) {
dst[i] = ".";

// start over with the next string
offset = i + 1;

// nul-terminate the string
dst[offset] = 0;

return offset;

In essence, the function receives two stack-allocated buffers, where the source buffer is expected to be in the format SIZE1 — BUF1, SIZE2 — BUF2, SIZE3 — BUF3, etc. “.” is used as the entry separator.

The first byte of the source buffer is read and used as the size for the copy operation. The function then copies that amount of bytes into the destination buffer. Once that is done, the next byte of the source buffer is read and used as the new size. This loop finishes when the size to copy is equal to zero. No validation is done to ensure that the data fits on the destination buffer, resulting in a stack overflow.

Writing an exploit

How to approach the exploitation depends on the specifics of the targeted device and architecture. Here we are only interested in the Cloud Hosted Router x86 binary.

It is worth mentioning that there might be several different ways to achieve reliable exploitation of this vulnerability, so we are going to review the one *we* used, which might not be the most elegant or efficient way to do it.

Tobias Klein’s checksec script is a great resource to check which mitigations we will need to fight against. This script can be invoked from PEDA.

The lack of stack canaries is probably the most relevant mitigation that is missing, enabling stack based buffer overflows to be easily exploited. If the program had been compiled with stack canaries, then our previous tests would have had a very different result. Stack canaries place random values before important data in each function frame that allocates buffers, and these values are checked before the function returns. In case an overflow occurs, execution will be terminated and no further exploitation possible.

PIE disabled means we can rely on fixed locations for the program code, and a disabled RELRO means we can overwrite entries in the Global Offset Table.

To sum up, we will only be dealing with NX, which restricts execution from writable areas such as the stack or the heap.

Another mitigation that is implemented at a system level is ASLR. This is a 32 bits system, so a partial overwrite or even brute forcing may be considered a feasible bypass of ASLR. In this case, that will not be necessary.

Inspecting the memory mappings for any program in RouterOS shows that the stack base is indeed randomized but the heap is not. This can be verified running cat /proc/self/maps a few times and comparing the results.

The first step to build our exploit is getting the exact offset to get control of EIP. In order to do this we can generate a unique pattern with PEDA with the command pattern create 256 and plug it into our exploit skeleton. Note that the first byte after the header will be the size parsed by the vulnerable function, so we specify 0xFF to read 256 bytes unaltered and avoid the “.” character being placed in the middle of our payload.

When the crash takes place, it is possible to use the accompanying pattern offset VALUE command to determine the exact location to overwrite EIP.

Alter the payload and verify that EIP can be set to an arbitrary value.

We do not observe annoying “.” characters, which is good.

Now that we control EIP and the rest of the stack, we can use the borrowed code chunks technique, which is better known as return oriented programming or ROP (some people seem to be very annoyed with those using the latter term, so it is probably better to mention all the alternatives).

The main idea is that we will chain various code snippets that end in a RET instruction to execute more or less arbitrary code. Given enough of these gadgets we should be able to run anything we want. In this particular case though, we only want to mark the heap area as executable. The end goal is to store something in the heap (which already contains messages read from the client) and jump there taking advantage of the static base address.

The relevant function here is mprotect, which looks as follows:

int mprotect(void *addr, size_t len, int prot);

The address will be 0x8072000, which is the base of the heap. This needs to be page-aligned for it to work.

Len can be anything we want, but let’s change the protection of the whole 0x14000 bytes.

Finally, prot refers to a bitwise-or of the desired protections to enforce. 7 refers to PROT_READ | PROT_WRITE | PROT_EXEC, which is essentially the RWX we are aiming for.

There are various tools that can attempt to create the chain automatically, such as ROPGadget and Ropper. We will use ropper but build the ROP chain manually to show how it can be done.

As per the Linux system call convention, EAX will contain the syscall number, which is 0x7d for mprotect. EBX will contain the address parameter, ECX the size and EDX the desired protection.

Let’s start setting EBX to 0x8072000. We look for a gadget that contains the POP EBX instruction and has the least side-effects possible.

We choose the smaller gadget and start constructing our chain. This looks as follows. Execution will be redirected to 0x804c39d, which will first execute a POP EBX instruction, setting EBX to the desired value of 0x8072000. Next, POP EBP will be executed, so we need to provide some dummy value so there is something to pop from the stack. Finally, the RET instruction is executed, which pops whatever is next in the stack and jumps there. This needs to be our next step in the chain.

All values are packed as little-endian unsigned integers.

We do the same process to set the desired size in ECX. It is important to understand that the order matters, as we could be unknowingly overwriting the registers we have already set if we are not careful. Moreover, sometimes the gadgets will not look as nice as POP DESIRED_REG; RET and we will have to deal with potential side effects that need additional adjustments.

Here we will choose the more benign 0x080664f5. This gadget alters the value of EAX, but we do not rely on anything specific set in EAX at this time, so it is useful. We append this to our ROP chain.

We repeat the process, this time to set EDX to 7, which is the RWX protection level.

This time we select the gadget at 0x08066f24, which does not mess with our previously set registers.

Finally, we need to set EAX to the syscall number 0x7d. We search gadgets containing POP EAX and do not find anything that will not alter our current setup.

We could try to reorder our gadgets in a different way, but we will just search for another gadget that does XCHG EAX, EBP and chain it with the ubiquitous POP EBP; RET.

From here we take 0x804f94a and 0x804c39e and append them.

The registers are now configured as desired to execute the mprotect system call. In order to do this, we need to call INT 0x80, which notifies the kernel that we want to execute the system call.

However, when we look for gadgets containing this instruction, we find none. This can make things a bit more difficult.

Luckily, there is another place where we can find this kind of gadget. All user space applications have a small shared library mapped into their address space by the kernel that is called vDSO (virtual dynamic shared object). This exists for performance reasons and is a way for the kernel to export certain functions to user space and avoid the context switch for functions that are called very often. If we take a look at the man page, we will see something interesting:

This means there is a function in the vDSO that might know how to perform system calls. We can inspect what this function does in GDB.

As can be observed in the screenshot above, __kernel_vsyscall contains a useful gadget. We execute the process a few times and realize that this mapping is not affected by ASLR, which allows us to use this gadget. The values of EBX, ECX and EBP do not really matter right now, as they will be set after the system call is executed anyway.

We update the exploit code to send the chain we built and attach GDB to the running SMB binary.

EIP will redirect execution to our first gadget, so it is a good idea to put a breakpoint at 0x804c39d, which is the start of the chain.

Use stepi to observe how the registers are set to the desired values. Right after INT 0x80, we can list the mapped areas and if everything worked OK, the heap will be marked as RWX.

The remaining piece consists of storing arbitrary code in the heap at a known location so we can jump there and get a shell, but how can we do this?

When we put a breakpoint in read(), we observed that the request data was being stored somewhere in the heap. In addition, we had various samples of Negotiate Protocol Request requests, so it is possible to determine that if the message type byte is set to 0x00 then we are going to reach some path in the program where the payload will be processed and stored in the heap.

To test this assumption, let’s put a breakpoint in read() again and change the PoC payload to send a benign Negotiate Protocol Request message with 512 “A”s as content. As a reminder, the format is:

message type (1 byte) — flags (1 byte) — message length (2 bytes) —message

This time message type will be set to NETBIOS_SESSION_MESSAGE (0x00). We are not using another Session Request message (0x81) to avoid accidentally triggering the vulnerability and having to deal with the “.” characters that the vulnerable function places in between.

Step through the read function until the 0x204 bytes (512 “A”s + the 4 byte header) are read from the network. As stated before, ECX contains the address of the buffer.

Inspecting the memory contents at the specified address shows our payload.

Press c to allow execution to continue normally and send a new request to check if the previous one gets overwritten or if it is just left there in the heap for now.

When the breakpoint is reached again, we try to print the contents of the read buffer and unfortunately we realize that it has been zeroed out.

However, the previous request could still be lingering somewhere else if the application made a copy that was not zeroed out. It is possible to search the current address space with PEDA by using the find or searchmem commands. Our message consists of 512 “A”s, so we attempt to find a contiguous block of “A”s. The commands take an optional parameter separated by a white-space to confine the search to a specific area. We are only interested in results that might be present in the heap.

This means that the contents of the request are being copied and left in some buffer that is not cleared out. We need to make a few more tests to be able to trust this location to store our payload. In particular, if we change the script to send 512 “B”s instead of “A”s, we will see that 0x8085074 will end up containing the “B”s after the request is processed. We need data to persist throughout additional requests, so this is not good.

However, if we first send 512 “A”s and then let’s say 256 “B”s, it will become evident that the first half is overwritten but the second half will still contain the bytes from the previous request. The weird looking 0x00000e89 is chunk metadata from the heap control structures and is not relevant to our scenario.

Knowing that the data will be persistent across at least two requests, we can craft the following plan:

  1. Send a Negotiate Protocol Request with the code we want to execute. The first part will be a few hundred NOP instructions because these bytes will be overwritten when we issue the second request with the corresponding Session Request message that triggers the vulnerability.
  2. Send a Session Request message that corrupts the stack, ROPs to mprotect, marks the heap as executable and jumps to the hard-coded location where the payload from #1 is stored, abusing the fact that the heap base is not randomized.

We make an arbitrary decision to leave 512 bytes for the second request, so we will jump at the hard-coded location of 0x8085074 + 512 = 0x8085270. This address needs to be appended to our ROP chain. The previous gadget will execute its final RET instruction, 0x8085270 will be popped from the stack and the program execution will follow.

The first version of the shellcode will consist only of INT3 instructions so the debugger breaks upon execution. The opcode for INT3 is CC.

The script is also modified to open two connections, one for each request.

Attach to a new SMB process and run the exploit.

We are now executing arbitrary code. Let’s generate a reverse shell payload with msfvenom.

We modify the first stage to store this payload and run the exploit again.

This time we open a netcat listener at the specified port so we can receive the connection.

…and we have a shell. Hooray!


Fuzzing a network service using a mutation-based approach can be done with very little effort and may produce great results.

If you are looking for vulnerabilities in applications that talk arcane proprietary protocols or are just too lazy to build a comprehensive template, give dumb fuzzing a go. You can even leave the fuzzer running with minimum effort while you apply your ninja reverse engineering skills to understand the protocol and build something better.

RouterOS powered devices are now everywhere and the lack of modern (for arbitrary definitions of modern) exploit mitigations is a bit worrisome. Having full ASLR enabled would make the life of exploit writers a bit more difficult, and most stack overflows would be rendered unexploitable in the absence of info-leak vulnerabilities if the binaries were compiled with stack canaries support.

It is worth mentioning that MikroTik’s response and patch times were great. At first the changelog did not hint a security vulnerability existed:

What's new in 6.41.3 (2018-Mar-08 11:55):
*) smb - improved NetBIOS name handling and stability;

However, they seem to be more serious now. They include more detailed comments in their changelogs regarding security vulnerabilities and seem to have a blog where they post official announcements regarding these types of issues as well.

The astute reader might have noticed that if you reproduced the steps outlined in this post, you might even have found a few additional 0days in RouterOS SMB.

Have fun!

Additional resources

Debug UEFI code by single-stepping your Coffee Lake-S hardware CPU

Original text by Teddy Reed V

In the post I will cover:

  • Configuring an ASRock H370M-ITX/ac to allow DCI DbC debugging
  • Using Intel System Studio and System Debugger to single-step a Coffee Lake-S i7-8700 CPU
  • Debugging an example exploitable UEFI application on hardware

USB DCI DbC Debugging (JTAG over USB3)

TL;DR, if you have a newer CPU & chipset you can purchase a $15 off-the-shelf cable and single-step your hardware threads. The cable is a USB 3.0 debugging cable; and is similar to an ethernet crossover cable in the sense that the internal wiring is crossed. Be careful with this cable as unsupported machines will have undefined behavior due to the electronics of USB.

Newer Intel CPUs support debugging over USB3 via a proprietary Direct Connection Interface (DCI) with the use of off-the-shelf hardware. This applies to some 6th-generation CPU and chipset combinations, and most 7th-generation and newer setups. I have not found the specific CPU/chipsec combinations but my educated guess from the Core series is as follows:

  • Kaby Lake / Intel 100 or 200 series SunrisePoint
  • Coffee Lake-S / Intel Z370, H370, H310, or B360
  • Kaby Lake R / 6th-gen Intel Core
  • Whiskey Lake-U (8565U, 8265U, 8145U)
  • Coffee Lake-S / H370, H310, B360

These combinations should support «DCI USB 3.x Debug Class» debugging. This means you only need the inexpensive debug cable linked above. Note that if debug-cable debugging is not support then a proprietary interposing device is required via a purchase from Intel.

From the documentation I’ve read, the USB3 hardware on a supported machine decodes DCI commands, forwards them to an appropriate hardware module on the target CPU that translates them to JTAG sequences. Intel provides a free-to-use, renewably-licenced, Intel System Studio and System Debugger software along with a DCI implementation called OpenDCI. This debugging environment is built with Eclipse and supported on macOS, Linux, and Windows. I’ve only found OpenDCI support for DbC-compatible targets on the Windows version.

You will need a Windows 10 install and Intel System Studio if you are following along.

Enable DCI on the ASRock H370M-ITX/ac

TL;DR you will need to enable and disable undocumented settings within UEFI by flipping several bits in a UEFI variable.

If you are doing casual research on DCI you will find several references to using a BIOS version with DCI enabled or using a UEFI debug build. I am sure they will be very helpful but it is not possible to acquire this in a general sense. However, we can still follow guidance on «modding» our UEFI to enable DCI. I found eiselekd’s DCI-enable guidance extremely helpful.

  1. Use chipsec to dump your SPI contents to disk. e.g., chipsec_util spi dump rom.bin
  2. Open rom.bin with UEFITool and extract GUID 899407D7-99FE-43D8-9A21-79EC328CAC21 (the Setup UEFI variable).
  3. Use IFRExtractor to print a textual representation of the variable options.

The variables settings required for the H370M-ITX/ac are as follows, tested on version 3.10 and 4.00 UEFI releases:

  • Enable/Disable IED (Intel Enhanced Debug): offset 0x960, set to enabled 0x1
  • CPU Run Control: offset 0x663, set to enabled 0x1
  • CPU Run Control Lock: offset 0x664, set to disabled 0x0
  • Platform Debug COnnect: offset 0x114F, set to 0x03 to enable DCI DbC
  • xDCI Support: offset 0xABD, set to enabled 0x1

To modify and save these offsets follow the guidance above to use the UEFI Shell and RU.efi application by James Wang.

You can confirm that DCI is enabled by reading the USB3 device class label when you connect the debug cable into your host and target machines. The host should have Intel System Studio installed and the target is the H370M-ITC/ac. The host USB driver will read «Intel USB Native Debug Class Devices» if DCI is enabled. If there is an error you will see «Port Reset Failed«. An easy way to view the detailed USB device information is with USB Tree View. Chipsec will also report if DCI is enabled but I found that DbC-specific availability is not reported; so use the USB device driver selection in Windows to confirm the UEFI options are set correctly.

Single-stepping the i7-8700

To recap the requirements and setup:

  • You have a host machine running Windows 10 with Intel System Studio installed
  • The host machine and target i7-8700/H370M-ITX/ac are connected via a USB3 DbC cabled
  • The host machine shows a connected «Intel USB Native Debug Class Device» USB device

Interrupt the target machine’s boot such that you enter UEFI Setup (press F2). This is not required but it will help while following along with the address space and other layout details. I have not figured out how to halt the CPU on reset with DCI and DbC.

In Intel System Studio you should open System Debugger and configure your target connection to use «8th Gen Intel Core Processors (Coffee Lake-S) _ Intel H370 Chipset Intel H310 Chipset Intel B360 Chipset for Consumer (Cannon Lake PCH)» using the connection method: «Intel(R) DCI USB 3.x Debug Class«

Upon success you will see status output similar to the following:

22:02:20 [INFO ] TCA - IPConnection: Open Connection, configuration: CFL_CNP_OpenDCI_DBC_Only_ReferenceSettings.
22:02:57 [INFO ] Starting DAL ...
22:02:57 [DAL  ] The system cannot find the batch label specified - SetScriptPath
22:02:58 [DAL  ] Registering MasterFrame...
22:03:00 [DAL  ] Using Intel DAL 1.1905.602.100 
22:03:00 [DAL  ] Using python.exe 2.7.15 (64bit), .NET 2.0.50727.8940, Python.NET 2.0.19, pyreadline 2.1.1
22:03:02 [DAL  ]     Note:    The 'coregroupsactive' control variable has been set to 'GPC'
22:03:10 [DAL  ] Using CFL_CNP_OpenDCI_DBC_Only_ReferenceSettings
22:03:10 [DAL  ] >>? DAL startup completed
22:03:10 [INFO ] Connection Manager: Status change: CONNECTED
    Connection: 8th Gen Intel Core Processors (Coffee Lake-S) _ Intel H370 Chipset Intel H310 Chipset Intel B360 Chipset for Consumer (Cannon Lake PCH)
    Target: 8th Gen Intel Core Processors (Coffee Lake-S) / Intel H370 Chipset, Intel H310 Chipset, Intel B360 Chipset for Consumer (Cannon Lake PCH)
    Connection Method: Intel(R) DCI USB 3.x Debug Class

And output similar to the following screen captures:

The connection will also pause the CPU threads and show you the nearby disassembly. If the CPU is not paused and clicking the «pause» button fails you have not enabled DCI completely. For example, if you encounter either, ExecutionControlUnableToHaltAllException, or operation not allowed while the processor is in state 'running' then double-check the UEFI Setup variable options.

A successful connection will show a UI similar to the following:

And you can now View and inspect memory as well as other common JTAG-debugging features.

Debugging an example exploitable UEFI application on hardware

TL;DR this is extremely simple and thus a great toy example, due to the lack of platform runtime security in UEFI and lack of build and compile security in the UEFI development kit (EDK/UDK).

The goal is to build a «toy» vulnerable UEFI application, trigger the exploitation, and observe the behavior within the System Debugger on the connected host. The first step is to configure the edk2 build environment. This is well-documented in several places.

I will modify the HelloWorld application and replace the MdeModulePkg/Application/HelloWorld/HelloWorld.c with the following content.

#include <Uefi.h>
#include <Library/UefiLib.h>
#include <Library/UefiApplicationEntryPoint.h>

#include <Protocol/LoadedImage.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Library/MemoryAllocationLib.h>

VOID RunAsm();

CHAR16* GetArgv(IN EFI_HANDLE ImageHandle)
  EFI_GUID loaded_image_protocol = LOADED_IMAGE_PROTOCOL;
  gBS->HandleProtocol(ImageHandle, &loaded_image_protocol, (void**) &li);

  CHAR16* wargv = (CHAR16 *)li->LoadOptions;
  return wargv;

VOID RunMe()
  Print(L"You win\n");

UINT32 StrLenChar(CHAR8* src) {
  UINT32 ret = 0;
  while (src[ret++] != 0) {}
  return ret - 1;

VOID StrCpy(CHAR8* dst, CHAR16* src, UINT32 length) {
  CHAR8 *src8 = (CHAR8*)src;
  for (UINT32 i = 0; i < length; i++) {
    dst[i] = src8[(i*2)];

  UINT64 loc = (UINT64)&RunMe;
  dst[length - 1] = 0;
  dst[length - 2] = 0;
  dst[length - 3] = 0;
  dst[length - 4] = 0;
  dst[length - 5] = ((loc >> (8 * 3)) & 0xFF);
  dst[length - 6] = ((loc >> (8 * 2)) & 0xFF);
  dst[length - 7] = ((loc >> (8 * 1)) & 0xFF);
  dst[length - 8] = ((loc >> (8 * 0)) & 0xFF);

 __attribute__((noinline)) VOID
 TestBufferOverflow(CHAR16* input)
  /* Test stack buffer overflow */

  // Compiled with EDKII that auto-adds (-fno-stack-protector)
  CHAR8 buffer[32];
  StrCpy((CHAR8*)buffer, input, StrLen(input));
  buffer[StrLen(input)] = 0;

  IN EFI_HANDLE        ImageHandle,
) {
  // Run with: fs0:X64\HelloWorld.efi A*222

  Print(L"UefiMain=0x%p\n", &UefiMain);
  CHAR16* wargv = GetArgv(ImageHandle);
  UINT32 wargv_len = StrLen(wargv);

  return EFI_SUCCESS;

The specific build command is

$ . ./ BaseTools
$ build -m MdeModulePkg/Application/HelloWorld/HelloWorld.inf -p MdeModulePkg/MdeModulePkg.dsc

And if you would like to test that this runs follow the QEMU debugging guide and use:

$ qemu-system-x86_64 -bios /usr/share/OVMF/OVMF_PURE_EFI.fd -display none -nodefaults -serial stdio -hda fat:Build/MdeModule/DEBUG_GCC5

The code above is a sythethetic stack-based buffer overflow example. It will auto-fill in the overwritten ret address for you. If you want to learn what is happening here please read Dhaval’s articles on Buffer Overflows. As a note, we could choose to make this more realistic (e.g., remove the auto-filled ret) by reading a file into the vulnerable stack variable.

The default edk2 build configuration will compile the overflow into the following flow, where the StrCpy logic is inlined:

Our goal is to copy 0x30 characters into the buffer, overflowing the expected 0x20, the 8 for the saved RBX, and 16 for RSP and RIP; at which point the final 8 will be filled in with the address of RunMe.

For some fast feedback we’ll print to ConsoleOut then reset the CPU using:

    mov $254, %al
    out %al, $100

If a console is not available then this functions well for blind-testing control of rip.

Because we are printing the location of UefiMain we can both confirm that each time the application is executed the address is constant and know what location to set a hardware breakpoint in System Debugger so we can single-step and watch the overflow.

For my UEFI build this location was 0x600BC69C, which means the .text is loaded to an offset of 0x600BB000 as this subroutine is 0x169C. From here we can add more breakpoints in System Debugger.

TP-Link ‘smart’ router proves to be anything but smart – just like its maker: Zero-day vuln dropped after silence

Original text by Thomas Claburn

TP-Link’s all-in-one SR20 Smart Home Router allows arbitrary command execution from a local network connection, according to a Google security researcher.

On Wednesday, 90 days after he informed TP-Link of the issue and received no response, Matthew Garrett, a well-known Google security engineer and open-source contributor, disclosed a proof-of-concept exploit to demonstrate a vulnerability affecting TP-Link’s router.

The 38-line script shows that you can execute any command you choose on the device with root privileges, without authentication. The SR20 was announced in 2016.

Via Twitter, Garrett explained that TP-Link hardware often incorporates TDDP, the TP-Link Device Debug Protocol, which has had multiple vulnerabilities in the past. Among them, version 1 did not require a password.

«The SR20 still exposes some version 1 commands, one of which (command 0x1f, request 0x01) appears to be for some sort of configuration validation,» he said. «You send it a filename, a semicolon and then an argument.»

Once it receives the command, says Garrett, the router responds to the requesting machine via TFTP, asks for the filename, imports it to a Lua interpreter, running as root, and sends the argument to the config_test() function within the imported file.

The Lua os.execute() method passes a command to be executed by an operating system shell. And since the interpreter is running as root, Garret explains, you have arbitrary command execution.

However, while TDDP listens on all interfaces, the default firewall prevents network access, says Garrett. This makes the issue less of a concern that remote code execution flaws identified in TP-Link 1GbE VPN routers in November.

Even so, vulnerability to a local attack could be exploited if an attacker manages to get a malicious download onto a machine connected to an SR20 router.

TP-Link did not immediately respond to a request for comment.

Garrett concluded his disclosure by urging TP-Link to provide a way to report security flaws and not to ship debug daemons on production firmware.

Insomni’Hack 2019 CTF – Perfectly Unbreakable Flag – 500

Original text by Phil

Challenge description

To our surprise, we found out that our challenge from last year has been counterfeited by another CTF.
Since we must protect our flag business as much as we can, we invested in the most secure technology around : the cloud™®©.
Since each device is uniquely fingerprinted, we are confident that our unclonable devices will be safe from those french knockoffs.

More info :
- If the board fails to connect to the cloud, perform a hard reset (ie. disconnect it completely before rebooting it)
- The cloud endpoint used to get the flag is /flag, in case you need to guess it

And a .tgz is given, containing the 3 firmwares of the 3 available boards:

$ ls -l
total 1376
-rwxrwxrwx 1 root root 287536 mars 27 2019 board-2.bin
-rwxrwxrwx 1 root root 287536 mars 27 2019 board-3.bin
-rwxrwxrwx 1 root root 287536 mars 27 2019 board-4.bin
-rwxrwxrwx 1 root root 545222 mars 22 18:17 firmwares-d1bd1fcbfb1fdef7678608460ed96b16074aae3f43ed052ebcc3e2724d7efc27.tgz
$ sha256sum board-*
aadc9e62ba75bda60b1412d0514bae00a28f51636c1291590e70c217bcf25a2f board-2.bin
27e7b7d39566bbdbd109a56e50f546681770ef3fad261118d64e1319ff0d53e7 board-3.bin
32682457545043f8611078d43549cf4414f9f0bd29700c1f2c42ad80d5013229 board-4.bin

Understanding what to do

As this challenge looks not trivial at all, I’ve spent 15 minutes on understanding the goal and the path to achieve the job. All the 3 boards are in free access on a desk beside the organisation team.

Picture of the board number 2

When you power-up the device using the black USB cable, it start running and show the activity on the network connector. As this device is a development board, the left secable part is a ST-Link V2 ready to handle the right part of the board, composed of the main MCU and a few components. Connecting a PC to the USB port and running it with the official ST-Link utility give you this trace:

19:53:18 : ST-LINK SN : 0669FF494849887767175629
19:53:18 : ST-LINK Firmware version : V2J29M18
19:53:18 : Connected via SWD.
19:53:18 : SWD Frequency = 4,0 MHz.
19:53:18 : Connection mode : Connect Under Reset.
19:53:18 : Debug in Low Power mode enabled.
19:53:18 : Device ID:0x419
19:53:18 : Device family :STM32F42xxx/F43xxx
19:53:18 : Can not read memory!
Disable Read Out Protection and retry.

The MCU is protected, but the ST-link is non altered and can be used.

Now let’s see if the virtual COM port (VCP) is mapped by the ST-Link for debug purpose. Just start a terminal and RESET the board to have a look at the boot sequence:

Starting mbed-os-example-tls/tls-client
Using Mbed OS 5.11.5
Successfully connected to perfectlyunbreakable-cloud.insomni.hack at port 443
Starting the TLS handshake…
Successfully completed the TLS handshake
Server certificate:
cert. version : 1
serial number : 29:98:FB:FA:5B:65:0A:2D:15:E0:A4:BF:9B:06:6C:0B:1D:72:C8:8A
issuer name : C=CH, ST=Geneva, O=Insomni'hack
subject name : C=CH, ST=Geneva, O=Insomni'hack, CN=perfectlyunbreakable-cloud.insomni.hack
issued on : 2019-03-14 11:00:24
expires on : 2020-07-26 11:00:24
signed using : ECDSA with SHA256
EC key size : 256 bits

Certificate verification passed
Established TLS connection to perfectlyunbreakable-cloud.insomni.hack
HTTP: Received 175 chars from server
HTTP: Received '200 OK' status … OK
HTTP: Received message:
HTTP/1.1 200 OK
Server: nginx
Date: Fri, 22 Mar 2019 18:11:52 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 20
Connection: keep-alive

Cloud connection OK.


At this point, nothing other is possible over the serial port, impossible to send command to the board.

The next check is to try to connect with a regular PC from LAN of the CTF to the URL https://perfectlyunbreakable-cloud.insomni.hack/ and see what happened:

No way to connect to the « secure cloud »

To summarize: The goal is to connect to the https://perfectlyunbreakable-cloud.insomni.hack/flag URL. I can deduce that only the official boards can do it because they own a client side certificate in their flash. So, the only way to connect to /flag with a regular browser is to steal the private certificate key from the flash of the MCU and import it to the browser.

Let’s start the reverse

Check the difference between all the 3 firmwares

As the authors gives you the 3 binary firmwares from the 3 running board, this looks too simple to spot the certificate by this way, but let’s try it.

The client side public certificate change …
… and a few bytes too

The public certificate is the first difference, and the 32 bytes at offset 0x080437B0 is the second one. The second one is the most interesting because it should be the —–BEGIN PRIVATE KEY—– but it was not the case.

Let’s the long reverse start

Now it’s time to reverse the 281KB STM32 firmware file… And guess what, just to be sure to maximise the complexity of the task, let’s use a newcomer: Ghidra!

The tool worth a look and from my previous tests, the ARM-thumb decompiler was fine on all the examples I’ve tried.


Loading the firmware and giving at this first stage the correct description to Ghidra is mandatory. The STM32 used for the challenge is a STM32F42xxx/F43xxx (according to the previous ST-Link trace). Checking in the reference guide for the ARM level instruction will point you Cortex-M4. And if you dig more, you’ll find it’s an ARMv7E ISA. The mistake I’ve done is to select in Ghidra the ARM v7 little endian target. The correct one is Cortex (thanks Balda for the correction):

Set the correct target

And do not forget to set the base address of the firmware:

0x08000000 came from reference guide


Now we need to find the public and private key in the firmware. For the pub cert chain, it’s trivial, just need to look for strings « BEGIN CERTIFICATE »:



But now the complex things start: where the f*ck is the private key… At this point you have no choice to understand how the HTTPS connection is done to the server. The first and winning idea is to take back the serial log and try to identify the SDK used. At the beginning « Mbed OS 5.11.5 » explicitly give you the answer. Then, you need to dig more for guessing how the TLS is done.

The interesting part is :

Starting the TLS handshake…
Successfully completed the TLS handshake

After a few minutes digging with Google, this PAGE give you nearly the same trace I’ve obtain through the serial interface. From this sample code found in the SDK, you can find your way in the firmware:

SDK: allocating the object « HelloHttpsClient »
Decompiled version

As I’ve never pay attention on how to reverse some C++ code in an embedded target, I was stuck by the pointer added to a method without parameter in the original source code. Ghidra is doing a good job, but you need to understand that the pointer renamed here « complexStruct » is the pointer to the current memory segment of the instance of the object.

Then, digging more in the TLS part is needed. According to the SDK, using a client private certificate measn you need to call the function « mbedtls_ssl_conf_own_cert ». By searching in the strings I found « mbedtls_ssl_conf_own_cert() returned -0x%04X » and a XREF. This code is setting up the certificate pub/priv key pair:

Generation and setup of the private key

Now, it’s time to study the function genPrivateKey() and see how it works:

Computing the private key

The funniest part of the challenge is here. This code is nothing more than a bitwise AND with 2 offset in memory. One in flash, OK, but the other one in a non initialized SRAM zone! Now it’s time to have a look at the hint given during the CTF:

Fri Mar 22 2019, 22:20:22 [Perfectly Unbreakable Flag Hint]
The title acronym means something else in the hardware community!

« PUF » acronym. What? Google point THIS page. My friend dok tells me, « I know what it is, it’s something you can’t clone because it use some physical unpredictable parameter ». But in the current case the PUF function is the SRAM at boot. 64 bytes are used as the private key. But, as there is some flipping bits in those 64 bytes during the powerup sequence, another 64 bytes table is used as mask for keeping only the stables states bits, and remove the flipping one. This tech needs to boot a huge number of time the board to monitor the states of the 8×64 bits and only keep the stable one. That’s a REALLY GOOD TRIX!


Now I need to dump the content of the SRAM3, forgotten during the first dump  . It’s quite easy, even with the protection fuse set. You just need to connect your PC, run the ST-Link utility and press « connect », then on the target hit RESET and at the very first moment of the boot you can dump the whole SRAM zone. Even if the debug port is closed.

With the memory dump and the flash dump, here is the code who compute and display the private key:

import base64
sram = "\x09\xE6\xF1\x20\x32\xE2\x38\xDD\xCF\x29\x27\x7F\x6F\xEB\x76\x34\x40\xC4\x44\xDC\xCA\xCD\x3B\x87\x0B\xAB\xE1\xB8\xE8\x80\x7B\x9B\x3B\xAA\xD5\x04\x61\xCA\xA2\x91\x66\x32\x49\xDF\xE5\x42\x98\xF5\x98\xB2\x37\x7E\x7E\xEB\xFD\x2E\xAB\xC1\x9F\x5A\xC0\xE3\xFF\xD9"
flash = "\x59\x3D\x32\xFE\x47\xA5\x4A\x85\x88\x35\x4E\x27\x63\x49\x37\xB6\xFF\x1B\xBE\xC2\xCE\x63\x95\xAB\x30\x3F\x77\x9D\x59\xD3\xE2\x75\xDD\xFF\x1E\x03\x2E\xF1\xEE\xE1\x52\xE8\xAA\x8B\x0E\x9D\xFA\xEA\x4E\x3D\x79\x0C\xD7\xEB\xBD\x7E\x73\x35\x9E\x5B\xBE\x5D\x42\xD7"
res = []
for x in range(len(sram)) :
res.append( ord(sram[x]) & ord(flash[x]) )
print("Private key = ",res)

Private key = [9, 36, 48, 32, 2, 160, 8, 133, 136, 33, 6, 39, 99, 73, 54, 52, 64, 0, 4, 192, 202, 65, 17, 131, 0, 43, 97, 152, 72, 128, 98, 17, 25, 170, 20, 0, 32, 192, 162, 129, 66, 32, 8, 139, 4, 0, 152, 224, 8, 48, 49, 12, 86, 235, 189, 46, 35, 1, 158, 90, 128, 65, 66, 209]

At this point it was 3H56. My first think was « shit, it miss me 10 minutes to generate the private key and solve the challenge ».


As it’s always a big deception to not finish a challenge in time, I continue at home to solve it. But I was wrong. It was far more complex to finish the reverse until the flag, and the 10 minutes changed to another 4 hours of job.

After obtaining the bits from SRAM who doesn’t flip, you need to reverse this:

Unknown hash function

And the funny stuff is for example:

no way to understand what’s running here…

This one doesn’t decompile, and the ASM view is not so clear. My guess is this an interrupt hook to an external crypto-engine who run in a few cycles a cryptographic function.

To help identifying the function, I’ve download an official TLS library from Mbed: mbedtls-2.16.0-apache.tgz. With this reference source code, the unknown function can be commented and is a little bit more readable:

a clean SHA256 code

If you think it’s trivial now, your right but with the solution on the eyes it’s more easy, believe me  . So the unknown part of the private key become:

import base64
import hashlib
from array import array

sram = "\x09\xE6\xF1\x20\x32\xE2\x38\xDD\xCF\x29\x27\x7F\x6F\xEB\x76\x34\x40\xC4\x44\xDC\xCA\xCD\x3B\x87\x0B\xAB\xE1\xB8\xE8\x80\x7B\x9B\x3B\xAA\xD5\x04\x61\xCA\xA2\x91\x66\x32\x49\xDF\xE5\x42\x98\xF5\x98\xB2\x37\x7E\x7E\xEB\xFD\x2E\xAB\xC1\x9F\x5A\xC0\xE3\xFF\xD9"
flash = "\x59\x3D\x32\xFE\x47\xA5\x4A\x85\x88\x35\x4E\x27\x63\x49\x37\xB6\xFF\x1B\xBE\xC2\xCE\x63\x95\xAB\x30\x3F\x77\x9D\x59\xD3\xE2\x75\xDD\xFF\x1E\x03\x2E\xF1\xEE\xE1\x52\xE8\xAA\x8B\x0E\x9D\xFA\xEA\x4E\x3D\x79\x0C\xD7\xEB\xBD\x7E\x73\x35\x9E\x5B\xBE\x5D\x42\xD7"

res = []

for x in range(len(sram)) :
res.append( chr(ord(sram[x]) & ord(flash[x])) )
res = array('B', map(ord,res)).tostring()

print("Private key = ",res)
print("sha256 = " , hashlib.sha256(res).hexdigest())

$ python
Private key = b"\t$0 \x02\xa0\x08\x85\x88!\x06'cI64@\x00\x04\xc0\xcaA\x11\x83\x00+a\x98H\x80b\x11\x19\xaa\x14\x00 \xc0\xa2\x81B \x08\x8b\x04\x00\x98\xe0\x0801\x0cV\xeb\xbd.#\x01\x9eZ\x80AB\xd1"
sha256 = 8e140886f96ef269e736cb1fe24ea12627df6971f32d6c15b6cbc2810af19382

Fake the board and grab the flag

Now it’s time to start a little bit of crypto. EDIT: no, not a little! I have something looking like the private key and the full chain of certificate. I need to craft a correct certificate, so I can deploy it and visit the /flag URL. If you wonder how I can do that after the CTF you’re right: I have asked to the creators of the challenge the Docker files to run it here and finish the work.

First, craft the private key. For this one you need to generate the ECC correct private + public key file in .pem format. I never found a regular way working because of a lack of knowledge in certificate / keys manipulation. Thanks to Sylvain for correct my silly Python code. The use an enhanced Python crypto lib is needed, I’ve used Pycryptodome.

$ pip install pycryptodome

$ cat
from Crypto.PublicKey import ECC

e=ECC.construct(curve="prime256v1", d=0x8e140886f96ef269e736cb1fe24ea12627df6971f32d6c15b6cbc2810af19382)

print e.export_key(format="PEM")

$ python2 > privateKey.pem
$ cat privateKey.pem

Now you need to concatenate the 2 public certificates found in the flash of the board in a file called « chain.pem ». And finally generate a single file with all the stuff to import it on a regular browser:

$ openssl pkcs12 -inkey privateKey.pem -in chain.pem -export -out personnal.pfx

$ openssl pkcs12 -info -in personnal.pfx
Enter Import Password:
MAC: sha1, Iteration 2048
MAC length: 20, salt length: 8
PKCS7 Encrypted data: pbeWithSHA1And40BitRC2-CBC, Iteration 2048
Certificate bag
Bag Attributes
localKeyID: 95 5D 33 B2 38 0B 4C CE FC 46 DD 1C 55 17 63 45 5A 7A 17 82
subject=C = CH, ST = Geneva, O = Insomni'hack, CN = board-2.insomni.hack
issuer=C = CH, ST = Geneva, O = Insomni'hack
Certificate bag
Bag Attributes:
subject=C = CH, ST = Geneva, O = Insomni'hack
issuer=C = CH, ST = Geneva, O = Insomni'hack
PKCS7 Data
Shrouded Keybag: pbeWithSHA1And3-KeyTripleDES-CBC, Iteration 2048
Bag Attributes
localKeyID: 95 5D 33 B2 38 0B 4C CE FC 46 DD 1C 55 17 63 45 5A 7A 17 82
Key Attributes:
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:

One fuckin’ thing to know: if you don’t set a password to your .pfx file, Firefox will fail silently to import it.

Another funny thing: at this point you don’t know if there is more computing on the 32 bytes used for generate the private key. The firmware is so huge, you can’t check all functions between the last key manipulation and the TCP_connect to the HTTPS port. You just need to try and pray…

Now you just need to connect to the super-secure cloud with the fake credz:

The extracted certificate roxx !!!

And now you just need to grab the flag:

The flag, hum

No, not exactly the flag …

Finish him


I was wondering the needs to this last step, who’ve made lost the flag to Marius (@nSinusR) from Tasteless (@TeamTasteless). Yes, Marius arrived during the CTF at this point at 3h55. It’s the difference between skilled teams and amateurs  . As we have access to the boards, we have the firmware, it would have been possible to patch the board to connect directly to the url https://perfectlyunbreakable-cloud.insomni.hack/flag instead of https://perfectlyunbreakable-cloud.insomni.hackduring the boot sequence. So the last step involve the private key you’ve used for generate the certificate as a proof of work. To remove the AES-CBC I’ve used Openssl:

$ hexdump -C flag.enc 
00000000 0f b8 b7 c7 53 8e 1e 20 93 ea 93 13 e3 08 9f 46 |….S.. …….F|
00000010 1e cb 13 8e 42 28 d0 46 52 39 27 28 09 15 2a cf |….B(.FR9'(..*.|
$ openssl enc -aes-256-cbc -d -in flag.enc -K '8e140886f96ef269e736cb1fe24ea12627df6971f32d6c15b6cbc2810af19382' -iv ' 00000000000000000000000000000000'


I personally go to Insomni’Hack CTF for one thing: the hardware challenges. This year 2 challenges were here for our pleasure. The first one from @_noskill of , intern at SCRT at the moment, were cool and a good warm-up (write-up from Sylvain of DUKS HERE). And this « monster » from Balda & Sylvain.

I must say this challenge occupy me during the whole CTF. I’ve learn a tech’ I’ve never seen before, the PUF concept is really funny and, I guess, used IRL. Solving a task close to a real project is far away more exiting, and it was the case here! Using Ghidra was a good experience, I’ll do it again and hope to forget ASAP IDA-PRO to focus only on this wonderful open-source tool.


A little regret on this one is the missing in the description of the « crypto » categorie. With this more accurate description I would not tried it alone, and I would asked for some helps to other members of the team at the very first moment of the CTF. And the complexity was too much for a 10 hours CTF, so the task wasn’t solved at 4h00 by anyone. To be honest, without the help of the conceptors, I’ll not be able to solve it, even afterwards (I guess I would ragequit() before the flag  ).

The troll


From the description: « To our surprise, we found out that our challenge from last year has been counterfeited by another CTF. » is well sent  . Last year I solved in 3 minutes the hardware challenge, because the flash read protection fuse on the STM32 was forgotten (write-up HERE). In November 2018 Balda got a kind word at GreHack CTF on the first hardware challenge :

"An Insomni'Hack 2018 tribute":
Was a 400 points at Insomni'hack and is only a 50 points at GreHack ... with the good tools ( Hello Baldanos  )

This year you win, so 1 – 1. See you the 15th of November for the next edition of GreHack  .

Credits & Greetings

Nice challenge by Baldanos (@Baldanos) and Sylvain (@Pelissier_S). Thanks for your time and the technical trix on Ghidra during the CTF. Big up guyz!


Thanks to Azox (@8008135_) for help me at … 3H25! Pretty sure that together we would solve it in time, bourricot  !

Thanks to Marius (@nSinusR) from Tasteless (@TeamTasteless) for review & suggestions on this write-up.


And also thanks to the SCRT team, especially Michael (@0xGrimmlin) for making things possible  . See you next year!

Write-up by Phil (@PagetPhil) 27/03/2019