CVE-2021-33913

2024-06-27

CVE-2021-33913 Analysis

CVE-2021-33913 is a heap-based buffer overflow in the SPF macro expansion process of the open source SPF library libspf2. According to the project website, libspf2 is used by systems such as Sendmail, Postfix, Exim, Zmailer, and MS Exchange. This vulnerability was discovered along with CVE-2021-33912 by security researcher Nathaniel Bennett, who provided some details in a blog post (https://nathanielbennett.com/blog/libspf2-cve-jan-2022-disclosure) along with a patch that fixes both issues (https://github.com/shevek/libspf2/commit/ee4719544d891734090c24406f2bef8935ab3cf9). In this article, I will demonstrate the root cause of the vulnerability and explore some avenues for exploitation.

What is SPF

To understand this vulnerability, it is important to understand some details about SPF. SPF stands for Sender Policy Framework and is an email authentication technique used to prevent unauthorized senders from sending email on behalf of your domain. By publishing an SPF record, which is just a DNS TXT record associated with a particular domain, you provide details that help a recipient determine whether or not an email received from that domain has come from a trustworthy source.

A typical SPF record looks something like this:

v=spf1 ip4:192.168.0.1/16 -all

The first part of this record indicates the SPF version, which is spf1 in this case. The second part provides an IPv4 range of allowed senders. The final part tells the recipient how to handle mail from a sender that is not in the approved list; here the ‘-all’ mechanism marks such mail as a hard fail, which typically means it is rejected.

SPF Macros

SPF macros are a feature of SPF that allows for dynamic policies. Essentially, variables are placed in the SPF record and are filled in per email with matching criteria when the message is handled by a receiving MTA. Macros enable services such as hosted SPF (https://www.proofpoint.com/us/resources/solution-briefs/proofpoint-hosted-spf-technical-overview), where an MTA operator can dynamically look up and populate the SPF record for a particular domain. This reduces exposure, as the person or organization who owns the associated domain does not need to publish their allowed senders in a public DNS record. This type of service also allows a very large list of allowed senders to exist for a particular domain without worrying about SPF record size limitations or messier workarounds such as referencing multiple TXT records from a single SPF record.

The SPF macro syntax consists of a % character as a prefix, followed by a macro character between two curly braces. For example, %{s} is a macro that would populate the sender of the email. For more details on the different types of macros and how to properly add them to records, see the following resource: https://www.jamieweb.net/blog/using-spf-macros-to-solve-the-operational-challenges-of-spf
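To make the syntax concrete, here is a minimal sketch in C that scans a record for %{...} macros and reports each macro letter. The function name next_spf_macro is hypothetical and is not part of libspf2; it only recognizes the basic shape described above and does not validate transformers or delimiters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libspf2): find the next "%{...}" macro
 * in an SPF record and report its macro letter through letter_out.
 * Returns a pointer just past the closing '}', or NULL if no macro is found.
 */
const char *next_spf_macro(const char *s, char *letter_out)
{
    assert(s != NULL && letter_out != NULL);

    while ((s = strstr(s, "%{")) != NULL) {
        const char *close = strchr(s + 2, '}');
        if (close == NULL)
            return NULL;            /* unterminated macro: stop scanning */
        if (isalpha((unsigned char)s[2])) {
            *letter_out = s[2];     /* e.g. 's' for sender, 'd' for domain */
            return close + 1;
        }
        s = close + 1;              /* "%{" not followed by a letter: skip */
    }
    return NULL;
}
```

Calling it repeatedly on a made-up record such as "v=spf1 exists:%{ir}.%{d}.example.com -all" would yield the letters i and then d before returning NULL.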

The Vulnerability

The initial report claims that this vulnerability is triggered by “an error during the domain label reversal/truncation phase of macro expansion. In the event that a macro is reversed, truncated and URL-encoded during expansion, the buffer of the macro output is erroneously sized based on the length of the leftmost label in the domain”. Further analysis shows that this initial claim is only partially correct. The macro output buffer is indeed sized based on the leftmost label in the domain, but the macro only needs to trigger the reversal and URL-encoding operations. Truncation is not required for the overflow to occur, although the overflow does still occur when truncation takes place.

My first step in understanding this vulnerability was to inspect the corresponding patch.

The patch itself is very simple: the len variable is replaced by a new variable, label_len. This replacement fixes the issue because label_len is only used within the reversal and truncation operations, whereas len is used in various other locations throughout the SPF_record_expand_data function.

As mentioned previously, the path required to trigger this vulnerability requires the SPF library to both reverse and URL-encode the domain that will be populated in the macro output. Let’s take a closer look at the reversal implementation to understand how the output buffer is being improperly sized.

while ( p_read >= var ) {
	if ( SPF_delim_valid(d, *p_read) ) {
		len = p_read_end - p_read - 1;
		memcpy( p_write, p_read + 1, len );
		p_write += len;
		*p_write++ = '.';

		p_read_end = p_read;
	}
	p_read--;
}

The above while loop performs the domain label reversal. p_read starts at the end of the domain string and var points to its beginning. The loop traverses the domain string from back to front, and each time it encounters a delimiter, it calls memcpy to copy the label that follows the delimiter (from p_read + 1 up to p_read_end) to p_write. It is important to note that before memcpy is called, len is overwritten with the length of that label, which is what limits the copy to just that portion of the domain string.

The while loop handles each of the labels after the first delimiter, but the next bit of code is needed to handle the first label.

if (p_read_end >= p_read) {
	len = p_read_end - p_read - 1;
	memcpy( p_write, p_read + 1, len );

	p_write += len;
	*p_write++ = '.';
}

At this point the domain string has been completely traversed, the reversed domain string has been written out through p_write, and len is equal to the length of the leftmost label.
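The effect on len can be reproduced in isolation. The following is a simplified stand-in for the reversal step, not the libspf2 source: it assumes '.' is the only delimiter, uses indices instead of pointers, and omits the trailing '.' that the original appends. The name reverse_labels is my own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for the reversal step (not the libspf2 source).
 * Reverses the '.'-separated labels of var into out and returns the final
 * value of len, mirroring how the original code reuses that variable.
 * out must hold at least strlen(var) + 1 bytes.
 */
size_t reverse_labels(const char *var, char *out)
{
    assert(var != NULL && out != NULL);

    size_t len = strlen(var);       /* starts out as the full string length */
    size_t end = len;               /* one past the end of the current label */
    char *p_write = out;

    for (size_t i = len; i-- > 0; ) {
        if (var[i] == '.') {
            len = end - i - 1;      /* len now means "length of this label" */
            memcpy(p_write, var + i + 1, len);
            p_write += len;
            *p_write++ = '.';
            end = i;
        }
    }

    /* Leftmost label: there is no delimiter in front of it. */
    len = end;
    memcpy(p_write, var, len);
    p_write += len;
    *p_write = '\0';

    return len;                     /* length of the leftmost label only */
}
```

For an input like "b.AAAA.example.com" this writes "com.example.AAAA.b" (18 characters) to out but returns 1, which is the value that later feeds the allocation size.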

The next part of this vulnerability involves the process for URL encoding the reversed domain string. We can see that memory is allocated for the url_var variable based on the size of len, which as we now know is set to the length of the leftmost label of the domain string. Then two pointers, p_read and p_write, are set to var (which holds the reversed domain string) and url_var (which the URL encoded string will be written to).

if (d->dv.url_encode) {
	url_var = malloc(len * 3 + 1);
	if (url_var == NULL) {
		if (munged_var)
			free(munged_var);
		return SPF_E_NO_MEMORY;
	}

	p_read = var;
	p_write = url_var;

Then a while loop walks p_read forward until a null character is reached, copying the string to p_write one byte at a time.

while ( *p_read != '\0' )
{
	if ( isalnum( (unsigned char)( *p_read  ) ) )
		*p_write++ = *p_read++;
	else
	{
		switch( *p_read )
		{
		case '-':
		case '_':
		case '.':
		case '!':
		case '~':
		case '*':
		case '\'':
		case '(':
		case ')':
			*p_write++ = *p_read++;
			break;

This loop is where the overflow takes place. Since the memory allocated for url_var is based on the length of a single label in the domain string, it is possible to craft a domain string whose total length exceeds the allocation, so the loop keeps writing past the end of the buffer.
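For contrast, here is a sketch of a URL-encoding routine whose buffer is sized from the full input string, which is the invariant the vulnerable code breaks. This is not the upstream fix (the patch keeps len intact by introducing label_len); the function name url_encode_alloc and the uppercase %XX escape format are my own assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libspf2): URL-encode var into a freshly
 * allocated buffer sized for the worst case of the WHOLE string, where
 * every input byte expands to three output bytes ("%XX").
 * The caller frees the result. Returns NULL on allocation failure.
 */
char *url_encode_alloc(const char *var)
{
    assert(var != NULL);

    size_t full_len = strlen(var);          /* whole string, not one label */
    char *url_var = malloc(full_len * 3 + 1);
    if (url_var == NULL)
        return NULL;

    char *p_write = url_var;
    for (const char *p_read = var; *p_read != '\0'; p_read++) {
        unsigned char c = (unsigned char)*p_read;
        if (isalnum(c) || strchr("-_.!~*'()", c) != NULL) {
            *p_write++ = (char)c;           /* safe character: copy as-is */
        } else {
            /* 3 characters plus the NUL that snprintf appends */
            snprintf(p_write, 4, "%%%02X", (unsigned int)c);
            p_write += 3;
        }
    }
    *p_write = '\0';
    return url_var;
}
```

With this sizing, the write pointer can never move past full_len * 3 bytes, regardless of how the labels are arranged.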

Example

To get a better idea of how this works, let’s look at a sample domain string.

b.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.example.com

In the above example, the leftmost label is a single character. That means when malloc is called during the URL encoding process, it is passed 4 as the allocation size, since the size of the allocation is (len * 3 + 1). On a 64-bit system this results in a 32-byte chunk, as that is the minimum chunk size for glibc malloc, and because the overall string length is 49 bytes, an overflow occurs.
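The same arithmetic can be checked for any candidate domain. The helpers below are hypothetical (none of them exist in libspf2) and assume '.' is the delimiter; they compare the size requested from malloc with the minimum number of characters the encoding loop will write when nothing needs escaping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the label before the first '.', or of the whole string. */
size_t leftmost_label_len(const char *domain)
{
    assert(domain != NULL);
    const char *dot = strchr(domain, '.');
    return dot != NULL ? (size_t)(dot - domain) : strlen(domain);
}

/* Size the vulnerable code requests: (len * 3 + 1) with len = leftmost label. */
size_t url_var_request(const char *domain)
{
    return leftmost_label_len(domain) * 3 + 1;
}

/* Lower bound on characters written: one per input byte, no escapes. */
size_t url_var_min_written(const char *domain)
{
    assert(domain != NULL);
    return strlen(domain);
}
```

For a domain shaped like the sample, such as "b.AAAA.example.com", the request is 4 bytes while at least 18 characters are written. Whether that corrupts a neighbouring chunk then depends on how far the write runs past the usable size of the chunk the allocator actually returns.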

When testing this out, I set a breakpoint directly after the URL encoding process and took a look at the heap layout.

Here we can see the chunk allocated for url_var is located at 0x55555579e6b0 and sized at 32 bytes (0x20) as expected. We can also see that the adjacent chunk at 0x55555579e6d0 has been overwritten.

Exploitability

I spent some time attempting to exploit this vulnerability, but was ultimately unsuccessful. I believe this would be exploitable with an older version of glibc, especially one that does not protect against corrupting the size of the top chunk. There is some variability in where on the heap this overflow takes place, which can essentially be controlled by the size of the leftmost label. I targeted multiple areas during my analysis based on the state of the tcache bins prior to the overflow. By crafting a domain string where (length of the leftmost label * 3 + 1) matched the size of a freed chunk in the tcache, I could move the overflow around, but unfortunately this did not lead to any useful overflows.
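To pick a label length that lands in a particular bin, it helps to map the request size to a chunk size. The sketch below assumes the usual 64-bit glibc layout (an 8-byte size field, 16-byte alignment, and a 32-byte minimum chunk); those constants are assumptions about the target and should be checked against the glibc build in use. The function names are my own.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Approximate glibc chunk size for a malloc request, ASSUMING a typical
 * 64-bit build: 8-byte size field, 16-byte alignment, 32-byte minimum.
 */
size_t glibc64_chunk_size(size_t request)
{
    const size_t size_sz = 8;       /* assumed size of the chunk header field */
    const size_t align_mask = 15;   /* assumed 16-byte alignment */
    const size_t min_size = 32;     /* assumed minimum chunk size */

    size_t padded = request + size_sz + align_mask;
    if (padded < min_size)
        return min_size;
    return padded & ~align_mask;
}

/* Chunk size the URL-encoding malloc would get for a given leftmost label. */
size_t chunk_size_for_label(size_t label_len)
{
    assert(label_len <= ((size_t)-1 - 1) / 3);  /* keep label_len * 3 + 1 from wrapping */
    return glibc64_chunk_size(label_len * 3 + 1);
}
```

Under these assumptions a one-character label gives a 32-byte chunk, an 8-character label gives 48 bytes, and a 14-character label gives 64 bytes.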

It is also possible that another mail server could be exploitable. I specifically chose Exim to test this vulnerability, but as mentioned earlier, libspf2 is used by multiple mail servers.

Finally, I also noticed that all of the available packages for libspf2 in the Ubuntu repositories are still serving the old vulnerable version of this library (including the dev and dbg packages).

Someone should probably do something about that I guess.