Espressif ESP32: Bypassing Encrypted Secure Boot (CVE-2020-13629)

SP32: Bypassing Encrypted Secure Boot (CVE-2020-13629)

Original text by Raelize

We arrived at the last post about our Fault Injection research on the ESP32. Please read our previous posts as it provides context to the results described in this post.

During our Fault Injection research on the ESP32, we gradually took steps forward in order to identify the required vulnerabilities that allowed us to bypass Secure Boot and Flash Encryption with a single EM glitch. Moreover, we did not only achieve code execution, we also extracted the plain-text flash data from the chip.

Espressif requested a CVE for the attack described in this post: CVE-2020-13629. Please note, that the attack as described in this post, is only applicable to ESP32 silicon revision 0 and 1. The newer ESP32 V3 silicon supports functionality to disable the UART bootloader that we leveraged for the attack.

UART bootloader

The ESP32 implements an UART bootloader in its ROM code. This feature allows, among other functionality, to program the external flash. It’s not uncommon that such functionality is implemented in the ROM code as it’s quite robust as the code cannot get corrupt easily. If this functionality would be implemented by code stored in the external flash, any corruption of the flash may result in a bricked device.

Typically, this type of functionality is accessed by booting the chip in a special boot mode. The boot mode selection is often done using one or more external strap pin(s) which are set before resetting the chip. On the ESP32 it works exactly like this pin G0 which is exposed externally.

The UART bootloader supports many interesting commands that can be used to read/write memory, read/write registers and even execute a stub from SRAM.

Executing arbitrary code

The UART bootloader supports loading and executing arbitrary code using the load_ram command. The ESP32‘s SDK includes all the tooling required to compile the code that can be executed from SRAM. For example, the following code snippet will print SRAM CODE\n on the serial interface.

void __attribute__((noreturn)) call_start_cpu0()
{
    ets_printf("SRAM CODE\n");
    while (1);
}

The esptool.py tool, which is part of the ESP32‘s SDK, can be used to load the compiled binary into the SRAM after which it will be executed.

esptool.py --chip esp32 --no-stub --port COM3 load_ram code.bin

Interestingly, the UART bootloader cannot disabled and therefore always accessible, even when Secure Boot and Flash Encryption are enabled.

Additional measures

Obviously, if no additional security measures would be taken, leaving the UART bootloader always accessible would render Secure Boot and Flash Encryption likely useless. Therefore, Espressif implemented additional security measures which are enabled using dedicated eFuses.

These are security configuration bits implemented in special memory, often referred to as OTP memory, which can typically only change from 0 to 1. This guarantees, that once enabled, is enabled forever. The following OTP memory bits are used to disable specific functionality when the ESP32 is in the UART bootloader boot mode.

  • DISABLE_DL_ENCRYPT: disables flash encryption operation
  • DISABLE_DL_DECRYPT: disables transparent flash decryption
  • DISABLE_DL_CACHE: disables the entire MMU flash cache

The most relevant OTP memory bit is DISABLE_DL_DECRYPT as it disables the transparent decryption of the flash data.

If not set, it would be possible to simply access the plain-text flash data while the ESP32 is in its UART bootloader boot mode.

If set, any access to the flash, when the chip is in UART bootloader boot mode, will yield just the encrypted data. The Flash Encryption feature, which is fully implemented in hardware and transparent to the processor, is only enabled in when the ESP32 is in Normal boot mode.

The attacks described in this post have all these bits set to 1.

Persistent data in SRAM

The SRAM memory that’s used by the ESP32 is typical technology that’s used by many chips. It’s commonly used to the ROM‘s stack and executing the first bootloader from flash. It’s convenient to use at early boot as it typically require no configuration before it can be used.

We know from previous experience that the data stored in SRAM memory is persistent until it’s overwritten or the required power is removed from the physical cells. After a cold reset (i.e. power-cycle) of the chip, the SRAM will be reset to its default state. This often semi-random and unique per chip as the default value for each bit (i.e. 0 or 1) is different.

However, after a warm reset, where the entire chip is reset without removing the power, it may happen that the data stored in SRAM remains unaffected. This persistence of the data is visualized in the picture below.

We decided to figure out if this behavior holds up for the ESP32 as well. We identified that the hardware watchdog can be used to issue a warm reset from software. This watchdog can also be issued when the chip is in UART bootloader boot mode and therefore we can use it to reset the ESP32 back into Normal boot mode.

Using some test code, loaded and executed in SRAM using the UART bootloader, we determined that the data in SRAM is indeed persistent after issuing a warm reset using the watchdog. Effectively this means we can boot the ESP32 in Normal boot mode with the SRAM filled with controlled data.

But… how can we (ab)use this?

Road to failure

We envisioned that we may be able to leverage the persistence of data in SRAM across warm resets for an attack. The first attack we came up with is to fill the SRAM with code using the UART bootloader and issue a warm reset using the watchdog. Then, we inject a glitch while the ROM code is overwriting this code with the flash bootloader during a normal boot.

We got this ideas as during our previous experiments, where we turned data transfers into code execution, we noticed that for some experiments the chip started executing from the entry address before the bootloader was finished copying.

Sometimes you just need to try it…

Attack code

The code that we load into the SRAM using the UART bootloader is shown below.

#define a "addi a6, a6, 1;"
#define t a a a a a a a a a a
#define h t t t t t t t t t t
#define d h h h h h h h h h h

void __attribute__((noreturn)) call_start_cpu0() {
    uint8_t cmd;

    ets_printf("SRAM CODE\n");

    while (1) {

        cmd = 0;
        uart_rx_one_char(&cmd);

        if(cmd == 'A') {                                    // 1
            *(unsigned int *)(0x3ff4808c) = 0x4001f880;
            *(unsigned int *)(0x3ff48090) = 0x00003a98;
            *(unsigned int *)(0x3ff4808c) = 0xc001f880;
        }
    }

    asm volatile ( d );                                     // 2

    "movi a6, 0x40; slli a6, a6, 24;"                       // 3
    "movi a7, 0x00; slli a7, a7, 16;"
    "xor a6, a6, a7;"
    "movi a7, 0x7c; slli a7, a7, 8;"
    "xor a6, a6, a7;"
    "movi a7, 0xf8;"
    "xor a6, a6, a7;"

    "movi a10, 0x52; callx8  a6;" // R
    "movi a10, 0x61; callx8  a6;" // a            
    "movi a10, 0x65; callx8  a6;" // e               
    "movi a10, 0x6C; callx8  a6;" // l               
    "movi a10, 0x69; callx8  a6;" // i               
    "movi a10, 0x7A; callx8  a6;" // z               
    "movi a10, 0x65; callx8  a6;" // e               
    "movi a10, 0x21; callx8  a6;" // !               
    "movi a10, 0x0a; callx8  a6;" // \n               

    while(1);
}

To summarize, the above code implements the following:

  1. Command handler with a single command to perform a watchdog reset
  2. NOP-like padding using addi instructions
  3. Assembly for printing Raelize! on the serial interface

Please note, the listing’s numbers match the numbers in the code.

Timing

We target a reasonably small attack window at the start of F which is shown in the picture below. We know from previous experiments that during this moment the flash bootloader is copied.

The glitch must be injected before our code in SRAM is entirely overwritten by the valid flash bootloader.

Attack cycle

We took the following steps for each experiment to determine if the attack idea actually works. A successful glitch will print Raelize! on the serial interface.

  1. Set pin G0 to low and perform a cold reset to enter UART bootloader boot mode
  2. Use the load_ram command to execute our attack code from SRAM
  3. Send an A to the program to issue a warm reset into normal boot mode
  4. Inject a glitch while the flash bootloader is being copied by the ROM code

Results

After running these experiments for more than a day, resulting in more than 1 million experiments, we did not observe any successful glitch…

An unexpected result

Nonetheless, while analyzing the results, we noticed something unexpected.

The serial interface output for one of the experiments, which is shown below, indicated that the glitch caused an illegal instruction exception.

ets Jun  8 2016 00:22:57
rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0008,len:4
load:0x3fff000c,len:3220
load:0x40078000,len:4816
load:0x40080400,len:18640
entry 0x40080740
Fatal exception (0): IllegalInstruction
epc1=0x661b661b, epc2=0x00000000, epc3=0x00000000, 
excvaddr=0x00000000, depc=0x00000000

These type of exceptions happened quite often when glitches are injected in a chip. This was not different for the ESP32. For most the exceptions the PC register is set to a value that’s expected (i.e. a valid address). It does not happen often the PC register is set to such an interesting value.

The Illegal Instruction exception is caused as there is no valid instruction stored at the 0x661b661b address. We conclude this value must come from somewhere and that is cannot magically end up in the PC register.

We analyzed the code that we load into the SRAM in order to find an explanation. The binary code, of which a snippet is shown below, quickly gave us the answer we were looking for. The value 0x661b661b is easily identified in the above binary image. It actually represents two addi a6, a6, 1 instructions of which we implemented 1000 in our test code.

00000000  e9 02 02 10 28 04 08 40  ee 00 00 00 00 00 00 00  |....(..@........|
00000010  00 00 00 00 00 00 00 01  00 00 ff 3f 0c 00 00 00  |...........?....|
00000020  53 52 41 4d 20 43 4f 44  45 0a 00 00 00 04 08 40  |SRAM CODE......@|
00000030  50 09 00 00 00 00 ff 3f  04 04 fe 3f 4d 04 08 40  |P......?...?M..@|
00000040  00 04 fe 3f 8c 80 f4 3f  90 80 f4 3f 98 3a 00 00  |...?...?...?.:..|
00000050  80 f8 01 c0 54 7d 00 40  d0 92 00 40 36 61 00 a1  |....T}.@...@6a..|
00000060  f5 ff 81 fc ff e0 08 00  0c 08 82 41 00 ad 01 81  |...........A....|
00000070  fa ff e0 08 00 82 01 00  4c 19 97 98 1f 81 ef ff  |........L.......|
00000080  91 ee ff 89 09 91 ee ff  89 09 91 f0 ff 81 ee ff  |................|
00000090  99 08 91 ef ff 81 eb ff  99 08 86 f2 ff 5c a9 97  |.............\..|
000000a0  98 c5 1b 66 1b 66 1b 66  1b 66 1b 66 1b 66 3e 0c  |...f.f.f.f.f.f>.|
000000b0  1b 66 1b 66 1b 66 1b 66  1b 66 1b 66 1b 66 1b 66  |.f.f.f.f.f.f.f.f|
000000c0  1b 66 1b 66 1b 66 1b 66  1b 66 1b 66 1b 66 1b 66  |.f.f.f.f.f.f.f.f|
000000d0  1b 66 1b 66 1b 66 1b 66  1b 66 1b 66 1b 66 1b 66  |.f.f.f.f.f.f.f.f|
...
00000330  1b 66 1b 66 1b 66 1b 66  1b 66 1b 66 1b 66 1b 66  |.f.f.f.f.f.f.f.f|
00000340  1b 66 1b 66 1b 66 1b 66  1b 66 1b 66 1b 66 1b 66  |.f.f.f.f.f.f.f.f|
00000350  1b 66 1b 66 1b 66 1b 66  1b 66 1b 66 1b 66 1b 66  |.f.f.f.f.f.f.f.f|

We just use these instructions as NOPs in order to create a landing zone in a similar fashion a NOP-sled is often used in software exploits. We did not anticipate these instructions would end up in the PC register.

Of course, we did not mind either. We concluded that, we are able to load data from SRAM into the PC register when we inject a glitch while the flash bootloader is being copied by the ROM code .

We quickly realized, we now have all the ingredients to cook up an attack where we bypass Secure Boot and Flash Encryption using a single glitch. We reused some of the knowledge obtained during a previously described attack where we take control of the PC register.

Road to success

We reused most of the code that we previously loaded into SRAM using the UART bootloader. Only the payload (i.e. printing) that we intended to execute is removed as our strategy is now to set the PC register to an arbitrary value in order to take control.

#define a "addi a6, a6, 1;"
#define t a a a a a a a a a a
#define h t t t t t t t t t t
#define d h h h h h h h h h h

void __attribute__((noreturn)) call_start_cpu0() {
    uint8_t cmd;
   
    ets_printf("SRAM CODE\n");

    while (1) {

        cmd = 0;
        uart_rx_one_char(&cmd);

        if(cmd == 'A') {
            *(unsigned int *)(0x3ff4808c) = 0x4001f880;
            *(unsigned int *)(0x3ff48090) = 0x00003a98;
            *(unsigned int *)(0x3ff4808c) = 0xc001f880;
        }
    }

    asm volatile ( d );

    while(1);
}

After compiling the above code, we overwrite directly in the binary the addi instructions with the address pointer 0x4005a980. This address points to a function in the ROM code that prints something on the serial interface. This allows us to identify when we are successful.

We fixed the glitch parameters to that of the experiment that caused the Illegal Instruction exception. After a short while, we successfully identified several experiments during which the address pointer is loaded into the PC register. Effectively this provides us with control of the PC register and we can likely achieve arbitrary code execution.

Why does this work?

Good question. Not so easy to answer.

Unfortunately, we do not have a sound answer for you. We definitely did not anticipate that controlling the data at the destination could yield control of the PC register. We came up with a few possibilities, but we cannot say with full confidence if any of these is actually correct.

One explanation is that the glitch may corrupt both operands of the ldr instruction in order to load a value from the destination into the a0. This is similar as the previously described attack where we control PC indirectly by controlling the source data.

Moreover, it’s a possibility that the ROM code implements functionality that facilitates this attack. In other words, we may execute valid code within the ROM due to our glitch that causes the value from SRAM to be loaded into the PC register.

More thorough investigation is required in order to determine what exactly allows us to perform this attack. However, from an attacker’s perspective, it’s sufficient to realize how to get control of PC in order to build the exploit.

Extracting plain-text data

Even though we have control of the PC register, we are not yet able to extract the plain-text data from the flash. We decided to leverage the UART bootloader functionality to do so.

We decided to jump directly to the UART bootloader while the chip is in Normal boot mode. For this attack we overwrite the addi instructions in the code that we load into SRAM with address pointers to the start of the UART bootloader (0x0x40007a19).

The UART bootloader prints a string on the serial interface which is shown below. We can use this to identify if we are successful or not.

waiting for download\n"

Once we observe a successful experiment, we can simply use the esptool.py to issue a read_mem command in order to access plain-text flash data. The command below reads 4 bytes from the address where the external flash is mapped (0x3f400000).

esptool.py --no-stub --before no_reset --after no_reset read_mem 0x3f400000

Unfortunately, this did not work. For some reason the processor is replying with 0xbad00bad which is an indication we read from an unmapped page.

esptool.py v2.8
Serial port COM8
Connecting....
Detecting chip type... ESP32
Chip is ESP32D0WDQ6 (revision 1)
Crystal is 40MHz
MAC: 24:6f:28:24:75:08
Enabling default SPI flash mode...
0x3f400000 = 0xbad00bad
Staying in bootloader.

We noticed that there is quite some configuration done at the start of the UART bootloader. We assume it may affect the MMU as well.

Just to try something different, we decided to jump directly to the command handler of the UART bootloader itself (0x40007a4e). Once in the hander, we can send a raw read_mem command directly on the serial interface which is shown below.

target.write(b'\xc0\x00\x0a\x04\x00\x00\x00\x00\x00\x00\x00\x40\x3f\xc0')

Unfortunately, by jumping directly to the handler, the string that’s printed (i.e. waiting for download\n") is not printed anymore. Therefore, we cannot easily identify successful experiments. Therefore, we decided to simply always send the command, regardless if we are successful or not. We used a very short serial interface timeout in order to minimize the overhead of almost always hitting the timeout.

After a short while, we observed the first successful experiments!

Conclusion

In this post we described an attack on the ESP32 where we bypass its Secure Boot and Flash Encryption features using a single EM glitch. Moreover, we leveraged the vulnerability exploited by this attack to extract the plain-text data from the encrypted flash.

We can use FIRM to break down the attack in multiple comprehensible stages.

Interestingly, two weaknesses of the ESP32 facilitated this attack. First, the UART bootloader cannot be disabled and is always accessible. Second, the data loaded in SRAM is persistent across warm resets and can therefore be filled with arbitrary data using UART bootloader.

Espressif indicated in their advisory related to this attack that newer versions of the ESP32 include functionality to completely disable this feature.

Final thoughts

All standard embedded technologies are vulnerable to Fault Injection attacks. Therefore, it’s not surprising at all that the ESP32 is vulnerable as well. These type of chips are simply not made to be resilient against these type of attacks. However, and this is important, this does not mean that these attacks do not impose a risk.

Our research has shown that leveraging chip-level weaknesses for Fault Injection attack is very effective. We have not seen many public examples yet as most attack still focus on traditional approaches where the focus is mostly on bypassing just a check.

We believe the full potential of Fault Injection attacks is still unexplored. Most research until recently focused mostly on the injection method itself (i.e. ActivateInject and Glitch) compared to what can be accomplished due to a vulnerable chip (i.e. FaultExploit and Goal).

We are confident that creative usage of new and undefined fault models, will give rise to unforeseen attacks, where exciting exploitation strategies are used, for a wide variety of different goals.

Windows Processes

Origin text: Blue Team fundamentals Part Two: Windows Processes.

 

There is a lot of information to be gleaned from Windows processes. The thing with processes is that there are a lot of them, and it can seem massively overwhelming, However with a bit of patience and the aid of a book or three (which I will touch on at the end of this post) you can get really quite far under your own steam.

 

This obviously requires some form of logging of process trees in a multi host/network scenario, there are a good number of products out there that do this now, especially with buzzword bingo firing on all cylinders, if your endpoint product/agent declares itself ‘next gen’ it should do this. We however, will look at these processes from a single host perspective for the sake of this write-up. The rules and theories will be the same though.

First off we need to arm ourselves with ‘Process Explorer’, which is part of the Windows Sys Internals Suite which can be found at https://technet.microsoft.com/en-gb/sysinternals/bb842062

and when fired up looks a bit like this:

All those lovely processes!

So, lets take a look shall we?

System/Idle

These two are special. Why? Well they are not technically full processes. They are not running a user-mode executable as they are both created by ntoskrnl.exe (NT OS Kernel), which as the name implies means it is running in kernel mode. More information on the difference between user mode and kernel mode can be found here:

Some key features of these two processes are:

  • Neither should have a visible parent process
  • In the case of the idle process it should be operating one thread per CPU as seen here on my quad core machine.
  • The PID value for System is always 4:
  • There should only ever be one instance of System:

What can we look for:

  • If either process has a visible parent ID.
  • If there are more idle threads than CPU’s on the host.
  • If the PID for system is not 4.
  • If you have multiple instances of System.

Session Manager Subsystem/SMSS.EXE

smss.exe is executed during the startup of the OS and is the first user-mode process to be started by the kernel. It starts both kernel and user modes of the Win32 subsystem. This includes win32k.sys (kernel-mode), winsrv.dll (user-mode), and csrss.exe (user-mode). Any other subsystems listed in the following required registry value are also started:

HKLM\System\CurrentControlSet\Control\Session Manager\SubSystems

Required is the value that lists which subsystems should be loaded at boot time.

Windows lists the file specification of the Windows subsystem and we can see it contains %SystemRoot%\system32\csrss.exe

Specifics of this process:

  • It’s Parent will always be System
  • It’s user will always be NT AUTHORITY\SYSTEM
  • Runs from \systemroot\System32\smss.exe
  • all of the above can be seen below:
  • There should only ever be one instance SMSS.EXE running. This is actually very important because it is responsible for launching both WINLOGON.EXE, WININIT.EXE and CSRSS.EXE. Now you might be wondering why this is important? Well, after launching these SMSS.EXE exits and a new instance is started, meaning that when viewing either of these processes, they may have Parent PID (Process ID) attached but the process name should be unknown.
Unknown Parent Process
  • The OS Session is started under Session 0
  • The first user Session is started under session 1

What can we look for:

  • Parent/User is not SYSTEM
  • Seen running from outside of \%systemroot%\system32\
  • Either winlogon, wininit or csrss having a parent process (at all).
  • Unexpected subsystem registry entries.

Flying off on a tangent:

As we touched on Sessions I think it’s worth have a slightly more in depth look at them. If you launch WinObj.exe from sys internals (as admin) and browse to the /Sessions folder you will see something similar to this:

As we already said 0 is the OS session instance and 1 is the first user.

Taking a look at one of the OS session we can see a shared network drive:

Whilst taking a look at user session under WindowStations we see:

So what does this prove? Well preemptively jumping ahead and looking at explorer.exe and it’s associated handlers we can see:

This shows we can view the namespace instance for a specific process.

anyway…


Windows Logon Process/WINLOGON.EXE

 
  • Does not have a parent process
  • Runs as NT AUTHORITY\SYSTEM
  • Runs out of \Windows\System32
  • Can spawn child processes if alternate login devices are used such as card /biometric readers etc
  • Secure Authentication Sequence/SAS (Ctrl+Alt+Delete) is handled by winlogon
  • Loads Userinit from registry location:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon

which:

Specifies the programs that Winlogon runs when a user logs on. By default, Winlogon runs Userinit.exe, which runs logon scripts, reestablishes network connections, and then starts Explorer.exe, the Windows user interface.

https://technet.microsoft.com/en-gb/library/cc939862.aspx

  • The registry entry for userinit should be userinit.exe, (The comma is intended). This can be appended with other locations and can be utilised by malware to run things at logon.
  • The shell registry entry in the same location should be Explorer.exe (Windows Explorer). This can also be appended with other values allowing the running of malware at logon.

Registry entries that when modified will load content at logon.
  • As with SMSS.EXE, userinit.exe exits after it has run, meaning anything loaded from registry (explorer.exe, dodgy malwares) will not have a parent process. It also means userinit.exe will not be visible to you in Process Explorer.

What can we look for:

  • Shell and Userinit registry entries with undesirable values.
  • Having a parent process
  • Not running as NT AUTHORITY\SYSTEM
  • Runs out of a location that isn’t \Windows\System32

Explorer.exe / Windows Explorer

As Winlogon runs userinit which spawns explorer, let’s look at the explorer process next.

  • We have already covered the fact that userinit.exe exits after execution meaning no parent process.
  • Again as already covered, the registry entry lives under
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\Shell
  • The user will be the logged in account (that which was used to logon via the WINLOGON.EXE process).
  • It runs out of \%Systemroot%\Explorer.exe
Take my word for it, that was my username.
  • Explorer spawns child processes
Child processes spawned from explorer.exe

What can we look for:

  • Not running out of system root.
  • Assuming a genuine naming convention in use, running under context of an unrecognised user.
  • Making any TCP/IP connections.
No TCP/IP connections — This is good.

CSRSS.EXE / Client-Server Run

  • Runs as NT AUTHORITY\SYSTEM
  • Runs out of \%SystemRoot%\system32\csrss.exe
  • There is an instance loaded for each session, which loads three DLLs; basesrv.dll, winsrv.dll and csrsrv.dll. These dll’s support the creation and deletion of threads and processes (amongst other stuff).
  • It’s a session 0 process

What can we look for:

  • Running outside \system32
  • Running as non SYSTEM user

WININIT.EXE / Windows Initialisation Process

  • Runs as NT AUTHORITY\SYSTEM
  • Runs from %SystemRoot%\system32\wininit.exe
  • Creates a Window station (Winsta0) and two desktops (Winlogon and Default) for processes to run on in session 0
  • Creates Services.exe
  • Starts Lsass.exe
  • Starts Lsm.exe
  • Always a child of SMSS.EXE (We know this won’t have a Parent process due to smss termination after running)
  • Creates %windir%\temp
  • Runs within session 0

What can we look for:

  • Running outside system32
  • Running as non SYSTEM user
  • Child of an identifiable process.

  • Runs as NT AUTHORITY\SYSTEM
  • Run from %SystemRoot%\System32\services.exe
  • Child to WININIT.EXE
  • Spawns Multiple services which can be seen in registry location:
SYSTEM\CurrentControlSet\Services
  • Run from %SystemRoot%\system32\services.exe
  • Runs as NT AUTHORITY\SYSTEM
  • The process should spawn many generic svchost.exe which are responsible for running the services.

What can we look for:

  • Running outside \system32
  • Running as non SYSTEM user
  • Child of a process that is not WININIT.EXE

SVCHOST.EXE / Service Hosting Process

  • There will be multiple instances of SVCHOST running as per the following screen grabs:
tasklist.exe /svc on the left, compared to process explorer on the right and you can see the PIDS line up.
all the svchosts
an expanded svchost
  • Runs from %SystemRoot%\System32\svchost.exe
  • Can run as any of the following accounts: NT AUTHORITY\SYSTEM, LOCAL SERVICE, or NETWORK SERVICE
  • Always the child process of services.exe
  • It will always follow the command line convention of “svchost.exe -k [name]”
  • -k [name] values should exist within the following registry location:
Software\Microsoft\Windows NT\CurrentVersion\Svchost
Registry entries
Running under svchost context

What can we look for:

  • Processes not running under NT AUTHORITY\SYSTEM, LOCAL SERVICE, or NETWORK SERVICE
  • Processes not using the -k [name] convention
  • Parent not services.exe
  • Running from outside \system32

LSASS.EXE / Local Security Authority

  • Run as NT AUTHORITY\SYSTEM
  • Run from %SystemRoot%\System32\lsass.exe
  • Spawned by WININIT.EXE
  • There should never be more than one LSASS.EXE process.
  • It should never spawn any child processes.
  • Runs within session 0

What can we look for:

  • Not running as SYSTEM.
  • Due to the nature of LSASS it is a high value target for malware and frequently spoofed.
  • lsass not running out of %SystemRoot%\System32\ is highly likely to be malicious.
  • Anything that looks like it has been spawned by lsass is also likely malicious. In most cases this is a malware author using clever font techniques to obfuscate the file; Isass vs lsass for example.

LSM.EXE / Load Session Manager Service

  • Runs as NT AUTHORITY\SYSTEM
  • Runs from %systemroot%\System32\lsm.exe
  • Parent Process should always be wininit.exe
  • The Local Session Manager manages (← shocking) the state of terminal server sessions on the local machine. It sends requests to Smss through the ALPC port SmSsWinStationApiPort to start new sessions.
  • LSM.EXE should never spawn any child processes
  • Runs within session 0

What can we look for:

  • LSM spawning any child process
  • Running outside system32
  • Not running as SYSTEM
  • Parent process no wininit.exe

Своя маленькая операционная система

А что если хочется написать программку, для которой ничего не надо? Вставляем дискетку в компьютер, загружаемся с нее и …»Hello World»! Можно даже прокричать это приветствие из защищенного режима…
Сказано — сделано. С чего бы начать?.. Набраться знаний, конечно. Для этого очень хорошо полазить в исходниках Linux и Thix. Первая система всем хорошо знакома, вторая менее известна, но не менее полезна.

Подучились? Теперь займемся. Понятно, что первым делом надо написать загрузочный сектор для нашей мини-операционки (а ведь это будет именно мини-операционка!). Поскольку процессор грузится в 16-разрядном режиме, то для создания загрузочного сектора используется ассемблер и линковщик из пакета bin86. Можно, конечно, поискать еще что-нибудь, но оба наших примера используют именно его; и мы тоже пойдем по стопам учителей. Синтаксис этого ассемблера немного странноватый, совмещающий черты, характерные и для Intel и для AT&T, но после пары недель мучений можно привыкнуть.

Загрузочный сектор (boot.S)

Сознательно не буду приводить полных листингов программ. Так станут понятней основные идеи, да и вам будет намного приятней, если все напишете своими руками.

Для начала определимся с основными константами.
START_HEAD = 0 — Головка привода, которою будем использовать.
START_TRACK = 0 — Дорожка, откуда начнем чтение.
START_SECTOR = 2 — Сектор, начиная с которого будем считывать наше ядрышко.
SYSSIZE = 10 — Размер ядра в секторах (каждый сектор содержит 512 байт)
FLOPPY_ID = 0 — Идентификатор привода. 0 — для первого, 1 — для второго
HEADS = 2 — Количество головок привода.
SECTORS = 18 — Количество дорожек на дискете. Для формата 1.44 МБ это количество равно 18.

В процессе загрузки будет происходить следующее. Загрузчик BIOS считает первый сектор дискеты, положит его по адресу 0000:0x7c00 и передаст туда управление. Мы его получим и — для начала — переместим себя пониже по адресу 0000:0x600, перейдем туда и спокойно продолжим работу. Собственно вся наша работа будет состоять из загрузки ядра (сектора 2 — 12 первой дорожки дискеты) по адресу 0x100:0000, переходу в защищенный режим и скачку на первые строки ядра. В связи с этим еще несколько констант:
BOOTSEG = 0x7c00 — Сюда поместит загрузочный сектор BIOS.
INITSEG = 0x600 — Сюда его переместим мы.
SYSSEG = 0x100 — А здесь приятно расположится наше ядро.
DATA_ARB = 0x92 — Определитель сегмента данных для дескриптора
CODE_ARB = 0x9A — Определитель сегмента кода для дескриптора.

Первым делом произведем перемещение самих себя в более приемлемое место.

cli
xor ax, ax
mov ss, ax
mov sp, #BOOTSEG
mov si, sp
mov ds, ax
mov es, ax
sti
cld
mov di, #INITSEG
mov cx, #0x100
repnz
movsw
jmpi go, #0
;прыжок в новое местоположение загрузочного сектора на метку go
Теперь необходимо настроить как следует сегменты для данных (es, ds) и для стека. Неприятно, конечно, что все приходится делать вручную, но что поделаешь — ведь кроме нас и BIOS в памяти компьютера никого нет.

go:
mov ax, #0xF0
mov ss, ax
mov sp, ax ; Стек разместим как 0xF0:0xF0 = 0xFF0
mov ax, #0x60 ; Сегменты для данных ES и DS зададим в 0x60
mov ds, ax
mov es, ax
Наконец, можно вывести победное приветствие. Пусть мир узнает, что мы смогли загрузиться! Поскольку у нас есть все-таки целый BIOS, воспользуемся готовой функцией 0x13 прерывания 0x10. Можно, конечно, его презреть и написать напрямую в видеопамять, но у нас каждый байт команды на счету, а байт таких всего 512. Потратим их лучше на что-нибудь более полезное.

mov cx,#18
mov bp,#boot_msg
call write_message
Функция write_message выглядит следующим образом
write_message:
push bx
push ax
push cx
push dx
push cx
mov ah,#0x03 ; прочитаем текущее положение курсора,
; дабы не выводить сообщения где попало.
xor bh,bh
int 0x10
pop cx
mov bx,#0x0007 ; Параметры выводимых символов : видеостраница 0,
; атрибут 7 (серый на черном)
mov ax,#0x1301 ; Выводим строку и сдвигаем курсор.
int 0x10
pop dx
pop cx
pop ax
pop bx
ret
А сообщение так

boot_msg:
.byte 13,10
.ascii «Booting data …»
.byte 0
К этому времени на дисплее компьютера появится скромное «Booting data …». Это в принципе не хуже, чем «Hello World», но давайте добьемся чуть большего. Перейдем в защищенный режим и выведем этот «Hello» уже из программы, написанной на C.

Ядро 32-разрядное. Оно будет у нас размещаться отдельно от загрузочного сектора и собираться уже с помощью gcc и gas. Синтаксис ассемблера gas соответствует требованиям AT&T, так что тут все будет попроще. Но для начала нам нужно прочитать ядро. Опять воспользуемся готовой функцией 0x2 прерывания 0x13.

recalibrate:
mov ah, #0
mov dl, #FLOPPY_ID
int 0x13 ; проведем реинициализацию дисковода.
jc recalibrate
call read_track ; вызов функции чтения ядра
jnc next_work ; если во время чтения не произошло
; ничего плохого, то работаем дальше

bad_read:
; если чтение произошло неудачно — выводим сообщение об ошибке
mov bp,#error_read_msg
mov cx,7
call write_message
inf1: jmp inf1 ; и уходим в бесконечный цикл.
; Теперь нас спасет только ручная перезагрузка
Сама функция чтения предельно простая: долго и нудно заполняем параметры, а затем одним махом считываем ядро. Сложности начнутся, когда ядро перестанет помещаться в 17 секторах (то есть 8.5КБ); но это пока в будущем, а сейчас вполне достаточно такого молниеносного чтения

read_track:
pusha
push es
push ds
mov di, #SYSSEG ; Определяем
mov es, di ; адрес буфера для данных
xor bx, bx
mov ch, #START_TRACK ;дорожка 0
mov cl, #START_SECTOR ;начиная с сектора 2
mov dl, #FLOPPY_ID
mov dh, #START_HEAD
mov ah, #2
mov al, #SYSSIZE ;считать 10 секторов
int 0x13
pop ds
pop es
popa
ret
Вот и все. Ядро успешно прочитано, и можно вывести еще одно радостное сообщение на экран.

next_work:
call kill_motor ; останавливаем привод дисковода
mov bp,#load_msg ; выводим сообщение
mov cx,#4
call write_message
Вот содержимое сообщения
load_msg:
.ascii «done»
.byte 0
А вот функция остановки двигателя привода.
kill_motor:
push dx
push ax
mov dx,#0x3f2
xor al,al
out dx,al
pop ax
pop dx
ret
На данный момент на экране выведено «Booting data …done» и лампочка привода флоппи-дисков погашена. Все затихли и готовы к смертельному номеру — прыжку в защищенный режим.

Для начала надо включить адресную линию A20. Это в точности означает, что мы будем использовать 32-разрядную адресацию к данным.

mov al, #0xD1 ; команда записи для 8042
out #0x64, al
mov al, #0xDF ; включить A20
out #0x60, al
Выведем предупреждающее сообщение — о том, что переходим в защищенный режим. Пусть все знают, какие мы важные.

protected_mode:
mov bp,#loadp_msg
mov cx,#25
call write_message
Сообщение:

loadp_msg:
.byte 13,10
.ascii «Go to protected mode…»
.byte 0
Пока у нас еще жив BIOS, запомним позицию курсора и сохраним ее в известном месте (0000:0x8000 ). Ядро позже заберет все данные и будет их использовать для вывода на экран победного сообщения.

save_cursor:
mov ah,#0x03 ; читаем текущую позицию курсора
xor bh,bh
int 0x10
seg cs
mov [0x8000],dx ;сохраняем в специальном тайнике
Теперь внимание, запрещаем прерывания (нечего отвлекаться во время такой работы) и загружаем таблицу дескрипторов

cli
lgdt GDT_DESCRIPTOR ; загружаем описатель таблицы дескрипторов.
У нас таблица дескрипторов состоит из трех описателей: нулевой (всегда должен присутствовать), сегмента кода и сегмента данных .

align 4
.word 0
GDT_DESCRIPTOR: .word 3 * 8 — 1 ; размер таблицы дескрипторов
.long 0x600 + GDT ; местоположение таблицы дескрипторов
.align 2
GDT:
.long 0, 0 ; Номер 0: пустой дескриптор
.word 0xFFFF, 0 ; Номер 8: дескриптор кода
.byte 0, CODE_ARB, 0xC0, 0
.word 0xFFFF, 0 ; Номер 0x10: дескриптор данных
.byte 0, DATA_ARB, 0xCF, 0
Переход в защищенный режим может происходить минимум двумя способами, но обе ОС, выбранные нами для примера (Linux и Thix) используют для совместимости с 286 процессором команду lmsw. Мы будем действовать тем же способом

mov ax, #1
lmsw ax ; прощай реальный режим.
; Мы теперь находимся в защищенном режиме.
jmpi 0x1000, 8 ; Затяжной прыжок на 32-разрядное ядро.
Вот и вся работа загрузочного сектора – не мало, но и не много. Теперь с ним мы попрощаемся и направимся к ядру.

В конце ассемблерного файла полезно добавить следующую инструкцию.

org 511
end_boot: .byte 0
В результате скомпилированный код будет занимать ровно 512 байт, что очень удобно для подготовки образа загрузочного диска.

Первые вздохи ядра (head.S)

Ядро, к сожалению, опять начнется с ассемблерного кода. Но теперь его будет совсем немного.

Мы собственно зададим правильные значения сегментов для данных (ES, DS, FS, GS). Записав туда значение соответствующего дескриптора данных.

cld
cli
movl $(__KERNEL_DS),%eax
movl %ax,%ds
movl %ax,%es
movl %ax,%fs
movl %ax,%gs
Проверим, нормально ли включилась адресная линия A20 — простым тестом записи. Обнулим для чистоты эксперимента регистр флагов.

xorl %eax,%eax
1: incl %eax
movl %eax,0x000000
cmpl %eax,0x100000
je 1b
pushl $0
popfl
Вызовем долгожданную функцию, уже написанную на С.

call SYMBOL_NAME(start_my_kernel)
И больше нам тут делать нечего.

inf: jmp inf
Поговорим на языке высокого уровня (start.c)

Вот теперь мы вернулись к тому, с чего начинали рассказ. Почти вернулись, потому что printf() теперь надо делать вручную. Поскольку готовых прерываний уже нет, то будем использовать прямую запись в видеопамять. Для любопытных — почти весь код этой части, с незначительными изменениями, позаимствован из части ядра Linux, осуществляющей распаковку (/arch/i386/boot/compressed/*). Для сборки вам потребуется дополнительно определить такие макросы как inb(), outb(), inb_p(), outb_p(). Готовые определения проще всего одолжить из любой версии Linux.

Теперь, дабы не путаться со встроенными в glibc функциями, отменим их определение

#undef memcpy
Зададим несколько своих:
static void puts(const char *);
static char *vidmem = (char *)0xb8000; /*адрес видеопамяти*/
static int vidport; /*видеопорт*/
static int lines, cols; /*количество линий и строк на экран*/
static int curr_x,curr_y; /*текущее положение курсора */
И начнем, наконец, писать код на языке высокого уровня… правда, с небольшими ассемблерными вставками.

/* функция перевода курсора в положение (x,y).
* Работа ведется через ввод/вывод в видеопорт
*/
void gotoxy(int x, int y)
{
int pos;
pos = (x + cols * y) * 2;
outb_p(14, vidport);
outb_p(0xff & (pos >> 9), vidport+1);
outb_p(15, vidport);
outb_p(0xff & (pos >> 1), vidport+1);
}

/* функция прокручивания экрана. Работает,
* используя прямую запись в видеопамять
*/
static void scroll()
{
int i;
memcpy ( vidmem, vidmem + cols * 2, ( lines — 1 ) * cols * 2 );
for ( i = ( lines — 1 ) * cols * 2; i < lines * cols * 2; i += 2 ) vidmem[i] = ' '; } /* функция вывода строки на экран */ static void puts(const char *s) { int x,y; char c; x = curr_x; y = curr_y; while ( ( c = *s++ ) != '\0' ) { if ( c == '\n' ) { x = 0; if ( ++y >= lines ) {
scroll();
y—;
}
} else {
vidmem [ ( x + cols * y ) * 2 ] = c;
if ( ++x >= cols ) {
x = 0;
if ( ++y >= lines ) {
scroll();
y—;
}
}
}
}
gotoxy(x,y);
}
/* функция копирования из одной области
* памяти в другую. Заменитель стандартной функции glibc
*/
void* memcpy(void* __dest, __const void* __src,
unsigned int __n)
{
int i;
char *d = (char *)__dest, *s = (char *)__src;
for (i=0;i<__n;i++) d[i]="s[i];" }="}" /* функция, издающая долгий и протяжный звук. * Использует только ввод/вывод в порты поэтому * очень полезна для отладки */ make_sound() { __asm__(" movb $0xB6, %al\n\t outb %al, $0x43\n\t movb $0x0D, %al\n\t outb %al, $0x42\n\t movb $0x11, %al\n\t outb %al, $0x42\n\t inb $0x61, %al\n\t orb $3, %al\n\t outb %al, $0x61\n\t "); } /*А вот и основная функция*/ int start_my_kernel() { /*задаются основные параметры */ vidmem = (char *) 0xb8000; vidport = 0x3d4; lines = 25; cols = 80; /* считываются предусмотрительно * сохраненные координаты курсора */ curr_x=*(unsigned char *)(0x8000); curr_y=*(unsigned char *)(0x8001); /*выводится строка*/ puts("done\n"); /*уходим в бесконечный цикл*/ while(1); } Вот и вывели мы этот "Hello World" на экран. Сколько проделано работы, а на экране только две строчки Booting data ...done Go to proteсted mode ...done А что – плохо?! Закричала новая операционная система. Мир с радостью воспринял ее. Кто знает, может быть - это новый Linux ?... Подготовка загрузочного образа (floppy.img) Теперь подготовим загрузочный образ нашей системки.Для начала соберем загрузочный сектор. as86 -0 -a -o boot.o boot.S ld86 -0 -s -o boot.img boot.o Обрежем 32-битный заголовок и получим таким образом чистый двоичный код. dd if=boot.img of=boot.bin bs=32 skip=1 Соберем ядро gcc -traditional -c head.S -o head.o gcc -O2 -DSTDC_HEADERS -c start.c При компоновке НЕ ЗАБУДЬТЕ параметр "-T"! Он указывает, относительно какого смещения вести расчеты; в нашем случае, поскольку ядро грузится по адресy 0x1000, смещение соответствующее: ld -m elf_i386 -Ttext 0x1000 -e startup_32 head.o start.o -o head.img Отделим зерна от плевел, то есть чистый двоичный код от всяческих служебных заголовков и комментариев: objcopy -O binary -R .note -R .comment -S head.img head.bin И соединим воедино загрузочный сектор и ядро cat boot.bin head.bin >floppy.img
Образ готов. Записываем на дискетку (заготовьте несколько для экспериментов, я прикончил три штуки), перезагружаем компьютер и наслаждаемся…

cat floppy.img >/dev/fd0