Wednesday, May 10, 2017

Proving missing ASLR on dropbox.com and box.com over the web for a $343 bounty :D

Overview

Cloud file storage providers such as Box and DropBox will typically thumbnail uploaded images for purposes of showing icons and previews. Predictably, both providers appear to use ImageMagick for thumbnailing. So what happens if we come knocking with the ImageMagick 1-day CESA-2017-0002?

CESA-2017-0002 is a vulnerability in the RLE image decoder, where the allocated render canvas memory is not initialized under some conditions. This leads to the server-generated thumbnail and preview being based on uninitialized memory. The pixels of the resulting preview can be used to reconstruct chunks of server memory.

The vulnerability itself is not particularly complicated, and will be fully described in a future post that describes a scenario where it has more bite.

Sandboxing and isolation

This vulnerability has interesting interactions with sandboxes and process boundaries. While sandboxes can do wonders to mitigate vulnerabilities such as remote code execution and filesystem leaks, they do little for bugs that leak memory content, such as this one.

On the other hand, good use of process boundaries can help against bugs that leak memory content. In a one-process-per-thumbnail model, the virtual address space of the process is only going to contain the attacker's data, and likely not the private data of anyone else.

In the case of Box and DropBox, I've not seen any indications of leaking anyone else's private data. It's likely that they are both using a one-process-per-thumbnail model. Certainly, one "easy" way to integrate ImageMagick would be to use the convert binary to run individual image transform jobs. This would give a one-process-per-thumbnail model.

Trying to get a bounty

Let's face it, getting a bug bounty is kind of cool. Given that we don't think we can leak someone else's data, might there be anything else in the address space that we could leak that is worthy of a bounty? Unfortunately, most bounty programs exclude minor leaks such as precise versioning information or configuration details.

But one idea does occur to us: since we're leaking the content of free'd memory chunks, we're very likely to see pointer values for things like malloc() freelist entries. The specific values of these pointers will tell us what level of ASLR exists in the process. If the level of ASLR is anything other than "full ASLR", would that be worth a bounty? Maybe. Let's proceed to try and leak some pointer values.

Exfiltrating bytes of memory

To exfiltrate bytes of memory, we will upload a color RLE file of 16x8 dimensions and download the resulting PNG file from the file preview panel. For both Box and DropBox, the preview has the following useful traits:
  • Produces a PNG download. PNG is a lossless format, so the PNG pixel values should correspond exactly with raw memory bytes.
  • Leaves image dimensions alone for smaller image sizes. This is useful because any downscaling would result in information loss.

We choose a 16x8 file based on trial and error. The dimensions of the input image determine what malloc() size is used for the canvas that is not initialized properly:

    pixel_info_length=image->columns*image->rows*
        MagickMax(number_planes_filled,4);
    pixels=(unsigned char *) GetVirtualMemoryBlob(pixel_info);

Different sizes here will result in the leak of different portions of the heap. 16x8 leads to a 512 byte allocation (16 * 8 * 4). Based purely upon testing rather than theory, this appears to be a good size that reliably leaks pointer values. But if we make the size too large, perhaps in the greedy hope of leaking tons of data, we'll end up with our allocation getting placed at the end of the heap. Obviously, there won't be any previous content in a brand new allocation at the end of the heap, and we'd leak a bunch of 0 bytes, which is not particularly impressive.
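As a quick sanity check of the 512 byte figure, the arithmetic from the ImageMagick excerpt above can be mirrored in a few lines of Python. This is only a sketch: the function name is made up for illustration, and the value of 3 filled planes for a color RLE file is my assumption rather than something the excerpt states.

```python
def canvas_alloc_size(columns, rows, number_planes_filled):
    """Mirror of: pixel_info_length = columns * rows * MagickMax(planes, 4)."""
    return columns * rows * max(number_planes_filled, 4)


# 16x8 image; 3 planes is an assumed value for a color (RGB) RLE file.
# Because of the MagickMax(..., 4), any plane count up to 4 gives the same size.
print(canvas_alloc_size(16, 8, 3))
```

Changing the dimensions changes the requested size, and therefore which previously freed chunk of the heap the allocator is likely to hand back.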

Observed leaked bytes were fairly consistent across runs, lending further evidence that Box and DropBox might just be using the convert binary, which would have a fresh heap state on each operation. This also suggests that if we really wanted to, we could carefully control the allocations and deallocations that our input file performs, in order to get the heap into a specific state to control exactly what was leaked.

Here are two examples of leaked images:

On the left is Box and on the right DropBox. The images have been scaled to 800% original size for clarity. Right away, we can visually see that our empty input canvas has resulted in a non-empty output canvas: leaked memory content! In order to turn the original downloaded PNG into a more digestible format, we can convert it to raw bytes like this:

convert box_16_8_rgb_rle.png out.rgb

Then, dumping the resulting out.rgb file (e.g. with od -t x1), we can look for pointers. First, Box:

0000000 88 27 81 cf cd 7f 00 00 88 27 81 cf cd 7f 00 00
0000020 00 39 18 02 00 00 00 00 00 39 18 02 00 00 00 00
0000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0000140 00 00 00 00 00 00 00 00 20 31 81 cf cd 7f 00 00
0000160 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00
0000200 00 00 00 00 00 00 00 00 f0 39 18 02 00 00 00 00
[...]

Bytes highlighted in orange, such as the first eight (88 27 81 cf cd 7f 00 00), appear to be little-endian pointers to "high" locations in x86_64 virtual memory; that first one decodes to 0x00007fcdcf812788. In fact, that's probably a pointer into the glibc static BSS, for the head of one of the freelist buckets. Running the test again, we get the different value 0x00007efdaac7d788, but the lower 12 bits (i.e. page offset) are the same, at 0x788.

Bytes highlighted in red, such as f0 39 18 02 00 00 00 00 at the end of the excerpt, appear to be pointers to "low" locations in virtual memory, here 0x00000000021839f0. Running the test again, we get 0x0000000001cc19f0.
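The page offset observation is easy to check mechanically. The sketch below assumes 4 KiB pages (so the page offset is the low 12 bits) and uses only the four pointer values quoted above; the helper names are made up for illustration.

```python
PAGE_MASK = 0xFFF  # low 12 bits: offset within a 4 KiB page (assumed page size)


def page_offset(ptr):
    """Return the offset of ptr within its page."""
    return ptr & PAGE_MASK


def same_page_offset(a, b):
    """True if two pointers share a page offset, as the same object would
    when its mapping is moved by a whole number of pages."""
    return page_offset(a) == page_offset(b)


# Values from two separate runs against Box, as quoted in the text.
runs_high = (0x00007FCDCF812788, 0x00007EFDAAC7D788)
runs_low = (0x00000000021839F0, 0x0000000001CC19F0)

for a, b in (runs_high, runs_low):
    print(hex(a), hex(b), hex(page_offset(a)), same_page_offset(a, b))
```

A matching page offset across runs is a hint that both values point at the same object and that only the base of the mapping has moved.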

For DropBox, we dump out the following values of interest, with the same rules for highlights, and similar results:

0000000 30 76 38 02 00 00 00 00 00 00 00 00 00 00 00 00
0000020 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00
0000040 00 00 00 00 00 00 00 00 00 07 2c c9 5a 7f 00 00
0000060 00 00 00 00 00 00 00 00 54 52 53 54 ff ff ff ff
0000100 70 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00
0000120 70 75 38 02 00 00 00 00 00 00 00 00 00 00 00 00
0000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0000220 00 00 00 00 00 00 00 00 61 02 00 00 00 00 00 00
0000240 d0 66 38 02 00 00 00 00 c0 75 38 02 00 00 00 00
[...]
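Picking pointers out of a dump by eye gets tedious, so here is a rough sketch of how it could be automated. It rebuilds the bytes from the od -t x1 text (expanding the * lines that od uses for repeated rows) and then sorts each aligned 8-byte little-endian word into "high" or "low" candidates. The function names and the numeric thresholds are my own heuristics for illustration, not part of any tool, and they will misclassify some values.

```python
import struct


def parse_od(text):
    """Rebuild bytes from `od -t x1` output, expanding `*` (repeated rows)."""
    data = bytearray()
    prev_row = b""
    pending_repeat = False
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "*":
            pending_repeat = True
            continue
        offset = int(parts[0], 8)  # od prints offsets in octal by default
        if pending_repeat and prev_row:
            while len(data) < offset:
                data += prev_row
        pending_repeat = False
        row = bytes(int(b, 16) for b in parts[1:])
        if row:
            data += row
            prev_row = row
    return bytes(data)


def candidate_pointers(data):
    """Sort aligned 8-byte little-endian words into 'high' and 'low' guesses."""
    usable = len(data) - len(data) % 8
    high, low = [], []
    for (word,) in struct.iter_unpack("<Q", data[:usable]):
        if 0x700000000000 <= word < 0x800000000000:  # heuristic: mmap region
            high.append(word)
        elif 0x400000 <= word < 0x80000000:  # heuristic: non-PIE image or brk heap
            low.append(word)
    return high, low


# The DropBox excerpt from above, without the trailing "[...]" line.
DROPBOX_DUMP = """\
0000000 30 76 38 02 00 00 00 00 00 00 00 00 00 00 00 00
0000020 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00
0000040 00 00 00 00 00 00 00 00 00 07 2c c9 5a 7f 00 00
0000060 00 00 00 00 00 00 00 00 54 52 53 54 ff ff ff ff
0000100 70 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00
0000120 70 75 38 02 00 00 00 00 00 00 00 00 00 00 00 00
0000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0000220 00 00 00 00 00 00 00 00 61 02 00 00 00 00 00 00
0000240 d0 66 38 02 00 00 00 00 c0 75 38 02 00 00 00 00
"""

high, low = candidate_pointers(parse_od(DROPBOX_DUMP))
print("high:", [hex(p) for p in high])
print("low: ", [hex(p) for p in low])
```

Small values such as 0x70 or 0x261 are deliberately ignored; they look more like sizes or chunk headers than addresses.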

ASLR determination

So, can we conclude what ASLR situation exists in the processes we dumped data from? Yes. At first glance, the ASLR may appear reasonable because all of the pointers move around between invocations. But there are at least four possible Linux ASLR setups that we should try to distinguish between. Here are examples of these four setups for Linux x86_64:

1) Position independent executable, no system ASLR (/proc/sys/kernel/randomize_va_space set to 0)

555555554000-555555580000 r-xp 00000000 fc:01 659830                     /usr/bin/curl
55555577f000-555555782000 r--p 0002b000 fc:01 659830                     /usr/bin/curl
555555782000-555555783000 rw-p 0002e000 fc:01 659830                     /usr/bin/curl
555555783000-555556f4b000 rw-p 00000000 00:00 0                          [heap]

2) Position independent executable, and system ASLR

55809480d000-558094839000 r-xp 00000000 fc:01 659830                     /usr/bin/curl
558094a38000-558094a3b000 r--p 0002b000 fc:01 659830                     /usr/bin/curl
558094a3b000-558094a3c000 rw-p 0002e000 fc:01 659830                     /usr/bin/curl
558095533000-558096cfb000 rw-p 00000000 00:00 0                          [heap]

3) Statically positioned executable, no system ASLR

00400000-00401000 r-xp 00000000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00600000-00601000 r--p 00000000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00601000-00602000 rw-p 00001000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00602000-00692000 rw-p 00000000 00:00 0                                  [heap]

4) Statically positioned executable, and system ASLR

00400000-00401000 r-xp 00000000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00600000-00601000 r--p 00000000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
00601000-00602000 rw-p 00001000 fc:01 931015                             /usr/lib/x86_64-linux-gnu/ImageMagick-6.8.9/bin-Q16/display
0242d000-024bd000 rw-p 00000000 00:00 0                                  [heap]


In both the Box and DropBox cases, the pointers we have leaked match case 4), statically positioned executable with system ASLR. We make this determination because the supposed heap pointers are low but slightly variable between runs.
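The reasoning can be written down as a small decision function. This is a sketch under stated assumptions: it takes what is believed to be the same heap pointer from two separate runs, treats anything below 4 GiB as "low" (my threshold, chosen to separate the 0x0060... and 0x5555... style layouts shown above), and treats any difference between runs as evidence of system ASLR. Two runs are thin evidence, so more samples would make the call more trustworthy.

```python
LOW_LIMIT = 1 << 32  # assumption: "low" means below 4 GiB


def classify_aslr(heap_ptr_run1, heap_ptr_run2):
    """Map a heap pointer seen in two runs to setup 1-4 from the list above."""
    position_independent = min(heap_ptr_run1, heap_ptr_run2) >= LOW_LIMIT
    system_aslr = heap_ptr_run1 != heap_ptr_run2
    if position_independent:
        return 2 if system_aslr else 1
    return 4 if system_aslr else 3


# The "low" pointer leaked from Box in two separate runs.
print(classify_aslr(0x00000000021839F0, 0x0000000001CC19F0))
```

For the Box values this returns 4, matching the conclusion above.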

Ergo, both Box and DropBox have a vulnerability: missing ASLR on the binary used to do the thumbnailing. Put another way, that binary was not compiled as position independent. It's not a particularly surprising vulnerability, since my Ubuntu 16.04 install has the same problem:

file -L /usr/bin/convert
/usr/bin/convert: ELF 64-bit LSB executable, x86-64, version 1 (SYSV)
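The same check can be done without the file utility by reading the e_type field of the ELF header, which sits at byte offset 16, right after the 16-byte e_ident array. ET_EXEC (2) means a fixed load address, while ET_DYN (3) is what position independent executables and shared objects use. The sketch below works on a header passed in as bytes, so the demonstration uses synthetic headers rather than a real binary; the function names are illustrative.

```python
import struct

ET_EXEC = 2  # statically positioned executable
ET_DYN = 3   # shared object or position independent executable


def elf_type(header):
    """Return e_type from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    endian = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def is_position_independent(header):
    return elf_type(header) == ET_DYN


def fake_header(e_type):
    """Build a minimal synthetic 64-bit little-endian header for the demo."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    return e_ident + struct.pack("<H", e_type)


print(is_position_independent(fake_header(ET_EXEC)))
print(is_position_independent(fake_header(ET_DYN)))
```

Against a real file, the input would be the first 18 bytes read in binary mode, for example from /usr/bin/convert on the Ubuntu install mentioned above.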

Box vs. DropBox

When the same vulnerability crops up in two different places, the opportunity arises to perform comparisons. Both Box and DropBox appeared to have a one-process-per-conversion model and both responded to the report promptly by performing the most important action, which was to firmly curtail the number of ImageMagick decoders on the attack surface. Further to my previous post about a likely ancient ImageMagick on Box, no evidence of ancient ImageMagick was found on DropBox.

Conclusions

Determining whether ASLR is correctly enabled or not on the server is usually opaque to web application security testing. But by finding a vector to leak the content of server memory, we can match up pointer values to the status of ASLR on the server.

For my trouble, DropBox awarded me a $343 bounty. Box does not have a bounty program at this time.

Lack of ASLR on the ImageMagick conversion process could be a useful foot in the door to a memory corruption attack. One-shot exploitation of image decoding is fairly hard, because you have to defeat ASLR and also land the corruption exploit within the context of a single image decode, where you typically don't have scripting available. If, however, the binary is at a fixed location due to missing ASLR, exploitation is a much more tractable problem.