Thursday, November 4, 2010

More PDF Decryption Enhancements for PDF Examiner

Got rid of an unsigned int issue when calculating the permissions value for some types of encrypted PDFs. If you had any issues decrypting malware PDFs, try resubmitting them to the PDF Examiner. The second update handles owner password string literals in octal.

Apparently, to get a 4-byte hex value of a PHP int you can't just go dechex($permissions); you'll need to go dechex( pow(2, 32)- pow(2, 32)+$permissions) to get the larger unsigned int range. Fun workarounds, but it's at least closer to C than Python ;).
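As an alternative to the dechex workaround, masking the value to its low 32 bits gives the same unsigned hex form. This is only a sketch: the helper name permissionsToHex is made up for illustration, and it assumes a 64-bit PHP build where 0xFFFFFFFF is still an int.

```php
<?php
// Hypothetical helper: format a signed PDF /P value as 8 hex digits.
function permissionsToHex(int $permissions): string
{
    // The mask keeps the low 32 bits, i.e. the two's-complement unsigned form.
    // Assumption: 64-bit PHP, so 0xFFFFFFFF does not overflow into a float.
    return sprintf('%08x', $permissions & 0xFFFFFFFF);
}

echo permissionsToHex(-3392), "\n"; // fffff2c0
```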

Unconfirmed Adobe PDF zeroday with this.printSeps

Reports are circulating on Twitter that a new Adobe PDF zero-day PoC was posted to Full Disclosure (Nov 3rd, 2010). The file xpl_pdf.bin (MD5 d000e74163e34fc65914676674776284) contains a small JavaScript heap spray and a call to this.printSeps, which in tests does crash Adobe. It's not clear whether this is further vulnerable to exploitation or which versions of the OS and Acrobat are affected. The exploit itself requires an Adobe version between 8 and 10.

A post from earlier this year (April 9th, 2010) on a Russian blog details the memory access error triggered by this.printSeps(), which is described as a denial of service bug. Interesting that this bug didn't pop up to a wider audience over the 7 months it was public.

Added initial detection for this potential exploit to PDF Examiner. You can analyze the file in PDF Examiner here. Bad JavaScript is available here.

Adobe PSIRT has reported they are investigating the issue. Mitigation advice has been posted here (such as disabling JavaScript in Acrobat).

VUPEN has reported that code execution is possible; a working PoC is still unpublished.

Update: The issue has been assigned CVE-2010-4091, and Adobe is reporting a patch will be available Nov 15, 2010.

Saturday, October 2, 2010

Hiding PDF Exploits by embedding PDF files in streams and Flash ROP heapsprays

Another interesting sample that we came across (a901141662b350cd2c7d91268eddbdce) highlights one of the neat features of our online PDF Examiner: detection and processing of streams which contain an embedded PDF file. It's quite easy now to put the exploits into an embedded PDF and compress or even encrypt the parent PDF file to avoid many AV products detecting the exploit code:

Object 3 has the embedded PDF file, which was extracted and processed automatically - it's linked to and shown to have the CVE-2010-2883 fontfile SING table description name overflow:

Now one of the very interesting things going on in this sample is that there's no JavaScript for the heapspray. We do see that the parent PDF has embedded Flash files in objects 1 and 2. We can download those two Flash files easily from within PDF Examiner by clicking save Obj to File.

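Now both Flash files have the CWS magic number that indicates they are compressed. Here's how we expand them using PHP:
function flashExplode($stream) {
    // "CWS" marks a zlib-compressed SWF; "FWS" is the uncompressed form.
    $magic = substr($stream, 0, 3);

    if ($magic == "CWS") {
        // Bytes 3-7: the version byte plus the 4-byte uncompressed length.
        $header = substr($stream, 3, 5);
        // The zlib data starts at offset 8; skip its 2-byte header for gzinflate.
        $content = substr($stream, 10);
        $uncompressed = gzinflate($content);
        return "FWS".$header.$uncompressed;
    } else {
        return $stream;
    }
}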

With the files uncompressed, here's a look at them:

Googling jit-egg.swf or funcXOR1 or Loadzz2 leads us to some PoC code by @asintsov at
This code is a ROP JIT-egg shellcode heapspray in Flash, so our sample is exploiting CVE-2010-2883 in an embedded PDF file and using Flash to do the heapspray. The shellcode will drop an executable and a clean PDF file, which are stored in the original PDF between the %%EOF and some tagged-on PDF junk streams.

Friday, October 1, 2010

PDF Slack Space

Another common way to hide an embedded executable file in a PDF is to include its content after the end-of-file marker %%EOF. We're now showing any content after the last %%EOF as "Slack Space", also marked in brown. Check out all the neat features of the PDF Examiner.
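Pulling out the slack space yourself is a short string operation. The sketch below is not PDF Examiner's code; pdfSlackSpace is a hypothetical helper that returns whatever follows the last %%EOF, minus the line ending that normally trails the marker.

```php
<?php
// Hypothetical helper: return the bytes that follow the last %%EOF marker.
function pdfSlackSpace(string $pdf): string
{
    $marker = '%%EOF';
    $pos = strrpos($pdf, $marker); // position of the LAST marker, if any
    if ($pos === false) {
        return '';
    }
    // Drop the CR/LF that usually follows the marker; keep everything else.
    return ltrim(substr($pdf, $pos + strlen($marker)), "\r\n");
}
```

An empty result means nothing was appended after the final marker.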

Wednesday, September 29, 2010

Trick for finding the embedded exe's in PDFs

One of the common traits of a lot of PDF malware is that the embedded executable is put into an object stream and marked with a compression filter such as FlateDecode, but the stream is rarely actually compressed. In the PDF Examiner online tool, we now mark objects whose raw stream doesn't inflate correctly in brown, to denote the potential inclusion of an executable attachment. In most cases the "fake" stream contains an XORed exe file or sometimes additional clean PDFs which are dropped at exploit time.
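The "does it actually inflate" test is easy to approximate. A minimal sketch, assuming the raw bytes between stream and endstream are already extracted; streamInflates is a hypothetical name, not a PDF Examiner function.

```php
<?php
// Hypothetical helper: does a stream tagged /FlateDecode really decompress?
function streamInflates(string $raw): bool
{
    // FlateDecode data is zlib-wrapped, so gzuncompress() is the matching call.
    // The @ silences the warning PHP raises on corrupt input.
    return @gzuncompress($raw) !== false;
}
```

A stream that claims FlateDecode but returns false here is a candidate for the brown marking described above.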

In the example below, you can see object 64 contains a stream which was marked as FlateDecode, but is listed in brown to denote that it did not contain a valid compressed stream. In the hex view we can see the pattern of a 256-byte XOR key showing through the executable's whitespace (you can then use the XOR key to statically extract the executable for analysis).
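Because an executable contains long runs of 0x00 bytes, and 0x00 XOR key byte equals the key byte, the key can often be guessed by taking the most common byte at each key position. The sketch below rests on that assumption (plaintext dominated by nulls); both function names are hypothetical.

```php
<?php
// Guess a repeating XOR key: the most frequent byte at each key position.
// Assumption: the hidden plaintext is dominated by 0x00 bytes.
function guessXorKey(string $data, int $keyLen = 256): string
{
    $key = '';
    $n = strlen($data);
    for ($i = 0; $i < $keyLen; $i++) {
        $counts = array_fill(0, 256, 0);
        for ($j = $i; $j < $n; $j += $keyLen) {
            $counts[ord($data[$j])]++;
        }
        // array_search returns the first byte value with the highest count.
        $key .= chr(array_search(max($counts), $counts));
    }
    return $key;
}

// XOR every byte with the repeating key (the same call encodes and decodes).
function xorDecode(string $data, string $key): string
{
    $out = '';
    $keyLen = strlen($key);
    for ($i = 0, $n = strlen($data); $i < $n; $i++) {
        $out .= $data[$i] ^ $key[$i % $keyLen];
    }
    return $out;
}
```

If the decoded output starts with MZ, the guess was probably right; if not, try a different key length or alignment.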

Monday, September 27, 2010

JS Stream Decryption Fix

Fixed a bug in the PDF Examiner with escaped characters in literal streams for encrypted documents. Encrypted PDFs with JS(...) blocks not in a compressed stream were affected.

Friday, September 24, 2010

PDF Malware Threat Overview - list of common vulnerabilities

We've created a new comprehensive Malware Tracker chart for the current state of PDF threats from Adobe Reader / Acrobat and embedded Flash exploits. Check out the chart here. We'll be keeping the page up to date with new threats as they develop and are patched. Links to analysis of real live malware samples in our PDF Examiner tool are also included.

Sunday, September 19, 2010

New shortcut to the PDF Examiner tool

Now you can submit a PDF for free online analysis and view all the objects at A new domain to use as a shortcut to our very popular online pdf dissector tool.

Sunday, September 12, 2010

PDF Examiner New Features

Added support for multiple objects of the same ID - objects will now be displayed by [object number].[generation number] @ file location bytes. This should enhance the way PDF files with duplicate objects are viewed. PDF Examiner
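To show how duplicate object IDs can be told apart by file position, here is a rough sketch that labels every "N G obj" header with its byte offset. It is a naive regex scan, not PDF Examiner's parser, and listObjects is a hypothetical name.

```php
<?php
// Hypothetical helper: label each object as "number.generation @ offset".
function listObjects(string $pdf): array
{
    $labels = [];
    preg_match_all('/(\d+)\s+(\d+)\s+obj\b/', $pdf, $matches,
        PREG_SET_ORDER | PREG_OFFSET_CAPTURE);
    foreach ($matches as $m) {
        // $m[0][1] is the byte offset of the whole match within the file.
        $labels[] = sprintf('%d.%d @ %d', $m[1][0], $m[2][0], $m[0][1]);
    }
    return $labels;
}
```

Two objects with the same number and generation then show up as separate entries with different offsets.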

Saturday, September 11, 2010

PDF Examiner New Features

Added a lot of enhancements for dealing with obfuscated JavaScript, including showing objects which may contain JavaScript but have no detected entities as orange. Check out the PDF Examiner.

Thursday, September 9, 2010

Visualizing Embedded Executables Teaser and PDF Updates

Since we generally like to tease about what we're working on next as we get too excited to wait for the public release, here's something we think is pretty neat. We decided to play around with visualization for some recent cryptanalysis work on some Microsoft Office .doc, Powerpoint and Excel files.

Take a look at the lines in the chart below - the green horizontal line represents a frequency plot of the top character occurrences over a 256-byte spectrum in an Office document which contained an embedded executable virus XORed with a one-byte key. The red line is even more interesting: it represents the same, but where a 256-byte XOR key was used to hide the malware. The blue scatter is the statistical analysis of a clean document with no malware. We thought it was pretty neat that when you visualize your cryptanalysis, the documents with malware came up with straight lines in a lot of cases, and clean documents look almost random. More to come in the form of blazing fast cryptanalysis and Office docs :)

In other updates, we updated the Malware Tracker PDF Examiner to detect the new unpatched zero-day embedded font file buffer overflow exploit CVE-2010-2883. PDF Examiner is here. We've already seen a new sample different from the original malware reported (contagiodump blog), and with the creation of the new Metasploit module and no patch yet, this exploit is going to be one of the worst. The exploit, while not requiring JavaScript to crash Acrobat, does still require JavaScript to load up the shellcode to do the bad stuff, so disabling JavaScript in Acrobat is recommended until a patch is released by Adobe.

Wednesday, September 1, 2010

PDF Examiner

Added a few updates to the PDF Examiner - checking object parameters for exploits - such as /Launch etc. Working on more encryption methods - if you have any Revision 1 or 3 samples, send them over to us. Bug fixes - check streams with no encoding methods for known exploits.
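A check on object parameters might look something like the sketch below. Only /Launch comes from the note above; the other keys in the default list, and the function name findSuspiciousKeys, are assumptions made for illustration. PDF names can hide characters as #xx hex escapes, so those are decoded before searching.

```php
<?php
// Hypothetical helper: report which watched keys appear in an object dictionary.
function findSuspiciousKeys(
    string $objectDict,
    array $keys = ['/Launch', '/JavaScript', '/OpenAction'] // assumed watch list
): array {
    // Decode #xx escapes first, so /La#75nch is seen as /Launch.
    $normalized = preg_replace_callback(
        '/#([0-9A-Fa-f]{2})/',
        function ($m) {
            return chr(hexdec($m[1]));
        },
        $objectDict
    );
    $found = [];
    foreach ($keys as $key) {
        if (strpos($normalized, $key) !== false) {
            $found[] = $key;
        }
    }
    return $found;
}
```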

Tuesday, August 31, 2010

PDF Revision 3 Encryption

I just wanted to give a quick shout out to i♥cabbages for their very useful post on PDF Revision 3 encryption and the mysterious unpublished algorithm. I'm currently working on bringing more of the PDF encryption methods into the PDF Examiner.

Currently Revision 4 AES V2 is working pretty well, just in the process of adding Revision 2 40-128 bit RC4 support and researching Revision 1 (40 bit RC4) and 3 (RC4+some XORing).

Thanks to those that provided samples, please flag any failures to me and I'll do my best to add them as well.

Monday, August 30, 2010

encrypted pdf part 2 - with the online pdf examiner and object dissector

A couple posts ago I talked about do-it-yourself AESV2 PDF decryption. Now it's time to get into the analysis of the PDF JavaScript payload. The free online PDF Examiner 1.0 is very helpful for parsing the PDF and locating the objects that have weird obfuscated JavaScript (you can use our PDF analysis tool here).

After uploading the PDF at, we get the following page which highlights that object 47 generation 0 has some javascript obfuscation going on:

In the left column, objects which have something bad detected in them show up as red, objects with streams of any sort of content show up as green, and the smaller xref and document info objects are grey and of minimal value for finding the exploits. As you can see below, when you click on the suspected bad object, we are presented with a hex view which clearly shows we've found a JavaScript block (remember, this would normally have been tricky to track down with other PDF parsers, as this is also AES V2 128-bit encrypted).

Now keeping with the on-the-go quick analysis we've designed these online tools for - you can click the View Obj Raw to see the decoded object's content for an easy copy-paste:

The javascript object isn't super pretty to look at:
Now Javascript in exploits is usually pretty messy, we can copy paste the above code over to which has a great online tool to clean up that messy js code.

Now here's where we can see there's all sorts of messy obfuscated code using some mathematical tricks to evade decoding. However, notice the eval in the last line of the code? We can save a lot of time by simply changing the eval to document.write and let the attacker's code work against them:

Then over to our PC, we can create a simple javascript html file to open in our favorite browser:
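The eval-to-document.write swap can also be scripted. A minimal PHP sketch, assuming the beautified JavaScript is already in a string; buildDeobfuscationPage is a hypothetical name, and only the last eval( is replaced, matching the sample described above.

```php
<?php
// Hypothetical helper: wrap attacker JS in an HTML page, with the final
// eval( swapped for document.write( so the decoded layer is printed, not run.
function buildDeobfuscationPage(string $js): string
{
    $pos = strrpos($js, 'eval(');
    if ($pos !== false) {
        $js = substr_replace($js, 'document.write(', $pos, strlen('eval('));
    }
    return "<html><body><script>\n" . $js . "\n</script></body></html>\n";
}
```

Save the returned string as an .html file and open it in the browser.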

Opening this in a web browser reveals the de-obfuscated javascript:
And over to the javascript beautifier again:

We can clearly see the potpourri of exploits we've been presented with:
-> CVE-2009-4324
util.printd("DAbRSENUPTBrlwPSTcwaybxlFnvNzcMRwJvG", new Date()) -> CVE-2008-2992
Collab.collectEmailInfo -> CVE-2007-5659
app.doc.Collab.getIcon -> CVE-2009-0927

The deobfuscation also revealed the shellcode. We're not going to get into that here, but will remind everyone that we have an online nasm viewer (with our own annotations) over at which also lets you add an XOR key to try unpacking the shellcode yourself.

That's all for now :)

Saturday, August 28, 2010

PDF dissector tool online

Check out our new PDF analysis platform Malware Tracker PDF Examiner 1.0 at Our new PDF dissector will process normal compressed or encrypted (AESV2) PDFs into objects for viewing, scan for known exploit CVE's or obfuscated javascript, and export decoded data to file. Upload and analyze PDFs on the go for free.

Wednesday, July 28, 2010

Analysing Encrypted PDF Files for malware

I recently came across a PDF 306d7e608a52121aa4508e9901e4072e which on Virus Total only has 7/42 (16.67%) detection.

The sample uses PDF Version 4 standard encryption (AES V2 with a 128-bit key), which leads us to wonder whether some AV vendors are not handling the decryption to peek inside at the PDF content.

PDFs can be encrypted with a key calculated from values in the PDF itself, so the file can still be opened and viewed; compliant PDF readers will disallow certain permissions such as copy/paste or printing (some documents can have a user password to open the file, but we won't get into brute forcing that type here). While obtaining the key is a relatively trivial calculation, the encryption level is quite good and will bypass typical NIDS and apparently some AV products.

One of the resources we can use to figure out the PDF encryption key generation is the ISO 32000-1 document on the PDF structure, which is freely available from Adobe.

For older PDF Version 1/2 encryption I found this resource. But this particular PDF has /V 4, the latest version, where its own crypt filter can be specified. In this case we also have /Standard and /AESV2, so the PDF spec has the solution:

ISO 32000-1 2008 Encryption Key Algorithm 2:

PDF uses a standard user (/U) password padding. Now the document will have some content in the /U user password field, but you should first try the standard padding as the full password as most documents do not require a user password to view them.

Step A:
PDF password padding / default user password in algorithm: 28BF4E5E4E758A4164004E56FFFA01082E2E00B6D0683E802F0CA9FE6453697A

Step B:
Build a string starting with 28BF4E5E4E758A4164004E56FFFA01082E2E00B6D0683E802F0CA9FE6453697A to use in an MD5 hash.

Step C:
Get the /O owner password. The original is 30E714592E1FF1FE5C285A601B413C71CAF3B989576B72AC51371CF782A7F32F0B; however, there's an escape character 5C (a backslash) that needs to be stripped, giving 30E714592E1FF1FE285A601B413C71CAF3B989576B72AC51371CF782A7F32F0B. Append that to the value above.
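Stripping the 5C by hand works for this sample, but literal strings can also carry other backslash escapes and octal codes. The sketch below follows the escape rules for PDF literal strings; the function name pdfUnescapeLiteral is hypothetical, and it expects the string content without the surrounding parentheses.

```php
<?php
// Hypothetical helper: undo backslash escapes in a PDF literal string.
function pdfUnescapeLiteral(string $s): string
{
    $map = ['n' => "\n", 'r' => "\r", 't' => "\t", 'b' => "\x08", 'f' => "\x0c"];
    return preg_replace_callback(
        '/\\\\([0-7]{1,3}|\r\n|.)/s',
        function ($m) use ($map) {
            $e = $m[1];
            if (preg_match('/^[0-7]+$/', $e)) {
                return chr(octdec($e) & 0xFF); // \ddd octal character code
            }
            if ($e === "\n" || $e === "\r" || $e === "\r\n") {
                return ''; // backslash + end-of-line is a line continuation
            }
            // Known letter escapes map to control bytes; anything else,
            // such as \( \) or \\, stands for the character itself.
            return $map[$e] ?? $e;
        },
        $s
    );
}

echo bin2hex(pdfUnescapeLiteral(hex2bin('FE5C285A'))), "\n"; // fe285a
```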

Step D:
"Convert the integer value of the P entry to a 32-bit unsigned binary number and pass these bytes to the MD5 hash function, low-order byte first."
The /P value is -3392. We're using PHP, which doesn't have an unsigned int, so we'll have to do the math ourselves instead of using sprintf:
pow(2, 32) + $p_value = 4294963904, or fffff2c0 in hex; set low-order byte first, we get c0f2ffff to append to our hashing string above.
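PHP's pack() can produce the low-order-first bytes directly, which avoids the manual byte swap. A small sketch; permissionsBytes is a hypothetical wrapper name.

```php
<?php
// Hypothetical helper: the 4 bytes of /P as fed to MD5, low-order byte first.
function permissionsBytes(int $p): string
{
    // 'V' packs a 32-bit little-endian integer; a negative value keeps its
    // two's-complement bit pattern.
    return pack('V', $p);
}

echo bin2hex(permissionsBytes(-3392)), "\n"; // c0f2ffff
```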

Step E:
Append the first field of the document ID to the hash string: ID[] - we will append F247C9CF56A1484FB2C4F8C200FA269F to our hash.

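Step F:
For revision 4 handlers, 4 bytes of FFFFFFFF are passed to the hash only if the document metadata is not being encrypted. Nothing is appended for this sample, as the final value in Step G shows.
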
Step G:
Our final value is
"28BF4E5E4E758A4164004E56FFFA01082E2E00B6D0683E802F0CA9FE6453697A30E714592E1FF1FE285A601B413C71CAF3B989576B72AC51371CF782A7F32F0Bc0f2ffffF247C9CF56A1484FB2C4F8C200FA269F" (converted to data) and hashed with md5 is 26612c183ad8994a9600ada07e1299f2.

Step H:
Hash the hash 50 times:

step h 0 md5(26612c183ad8994a9600ada07e1299f2) = e62c3762cfab0ba2d3bfb201f3c3d263
step h 1 md5(e62c3762cfab0ba2d3bfb201f3c3d263) = 954a6bc4674dd8d93f836d9ad0e8f5c9
step h 2 md5(954a6bc4674dd8d93f836d9ad0e8f5c9) = d6a528e6572312c3f5c9b5cbe12aa343
step h 3 md5(d6a528e6572312c3f5c9b5cbe12aa343) = 9b6d871ae2f747f6cc7491ac24302c78
step h 4 md5(9b6d871ae2f747f6cc7491ac24302c78) = 188a87ddb2b9b939ed9d101cc3c09a7e
step h 5 md5(188a87ddb2b9b939ed9d101cc3c09a7e) = f495bdd808b927fe35d644e8f2971bbc
step h 6 md5(f495bdd808b927fe35d644e8f2971bbc) = ec8d5f00274875134a6a100e0429cdff
step h 7 md5(ec8d5f00274875134a6a100e0429cdff) = 830844b7c7b6276c062854ad1edebfdc
step h 8 md5(830844b7c7b6276c062854ad1edebfdc) = d446741e4b5bca850bc9645a7ee3b045
step h 9 md5(d446741e4b5bca850bc9645a7ee3b045) = 73c86126cf89e96b105caf5de627a018
step h 10 md5(73c86126cf89e96b105caf5de627a018) = 192744b0bade31e7e46cc8f5e4697fb4
step h 11 md5(192744b0bade31e7e46cc8f5e4697fb4) = b04e410c2c5f327650053c942c4b8fe7
step h 12 md5(b04e410c2c5f327650053c942c4b8fe7) = 038e96e7289bddcfa4f8a96e0a3ca1f7
step h 13 md5(038e96e7289bddcfa4f8a96e0a3ca1f7) = fd0ff1f4284e7aa9e9b11ddaf7e69350
step h 14 md5(fd0ff1f4284e7aa9e9b11ddaf7e69350) = cf2f8e2c3a57260bde21a28adf1d7c05
step h 15 md5(cf2f8e2c3a57260bde21a28adf1d7c05) = 4e7a8c7572985f90d0e903a4d9dd957d
step h 16 md5(4e7a8c7572985f90d0e903a4d9dd957d) = 871eff58e5ebc751fba8ad222ba97181
step h 17 md5(871eff58e5ebc751fba8ad222ba97181) = 1bd8326a037a5c8f3d8624123d80d7d2
step h 18 md5(1bd8326a037a5c8f3d8624123d80d7d2) = 509f4173ba89b3d8a8b30bff75604ab5
step h 19 md5(509f4173ba89b3d8a8b30bff75604ab5) = 9c0153ee14f16a3ab2f6932a51ed347a
step h 20 md5(9c0153ee14f16a3ab2f6932a51ed347a) = 12931d398d6d1abd44e49207684fc840
step h 21 md5(12931d398d6d1abd44e49207684fc840) = 7b9e11e1ad4f1a42ccf6f5a5f604df2d
step h 22 md5(7b9e11e1ad4f1a42ccf6f5a5f604df2d) = 6a02459568ce254fb717983ef92b173d
step h 23 md5(6a02459568ce254fb717983ef92b173d) = 59a73d2798871876343d7eea5a723102
step h 24 md5(59a73d2798871876343d7eea5a723102) = 00b1aca101d3a7641b4f98c1f65c9ded
step h 25 md5(00b1aca101d3a7641b4f98c1f65c9ded) = 9ae87a76ba1af438af75ba2011727827
step h 26 md5(9ae87a76ba1af438af75ba2011727827) = 0a8090edc5aebc1521046d3f2c8913a7
step h 27 md5(0a8090edc5aebc1521046d3f2c8913a7) = 87c38f682ecaa9eb4bc4a9adb5d92552
step h 28 md5(87c38f682ecaa9eb4bc4a9adb5d92552) = fe140891a75402580507fc8f8db6ec17
step h 29 md5(fe140891a75402580507fc8f8db6ec17) = 2cf639b306532be03a1481ad104aa3e2
step h 30 md5(2cf639b306532be03a1481ad104aa3e2) = eace2ceeb9e7c8e5900cb3b1ba4e3904
step h 31 md5(eace2ceeb9e7c8e5900cb3b1ba4e3904) = 46695a7f26b3a1874e17d6bf9d044a1b
step h 32 md5(46695a7f26b3a1874e17d6bf9d044a1b) = 775aaaaf5d8ed731400e777d47a9f730
step h 33 md5(775aaaaf5d8ed731400e777d47a9f730) = 32c27b94921ef5f21c4538c6043a3635
step h 34 md5(32c27b94921ef5f21c4538c6043a3635) = 1710f9661da8b8c843bf20100b5cd58a
step h 35 md5(1710f9661da8b8c843bf20100b5cd58a) = 32dd4dfb99e429399540e460664343e3
step h 36 md5(32dd4dfb99e429399540e460664343e3) = 628d80b8dce1084b816dad348de26b4d
step h 37 md5(628d80b8dce1084b816dad348de26b4d) = 0d484724b53c014bf044e7ab5fac4414
step h 38 md5(0d484724b53c014bf044e7ab5fac4414) = aa561b8f3211da0994b790eba2eedd11
step h 39 md5(aa561b8f3211da0994b790eba2eedd11) = 0969b6eb986d13e5965e57b818320227
step h 40 md5(0969b6eb986d13e5965e57b818320227) = aca2a9b85c975ca6d8b2ff1c848e93a0
step h 41 md5(aca2a9b85c975ca6d8b2ff1c848e93a0) = 60f6b7bd36240d64ea47d507bbba0241
step h 42 md5(60f6b7bd36240d64ea47d507bbba0241) = 17eb81278c5287ec1f85e6b6a4d584c2
step h 43 md5(17eb81278c5287ec1f85e6b6a4d584c2) = ca5756e27675308b3de57cf1695e7c8a
step h 44 md5(ca5756e27675308b3de57cf1695e7c8a) = 7353248cb20ab14cea5eee3674a7eafb
step h 45 md5(7353248cb20ab14cea5eee3674a7eafb) = 0eac9698646535097bc8a5c66a02a921
step h 46 md5(0eac9698646535097bc8a5c66a02a921) = 210e8578482403a49e414904980ec15a
step h 47 md5(210e8578482403a49e414904980ec15a) = 7cf558c4d6af2b36cbc52b5381b98e85
step h 48 md5(7cf558c4d6af2b36cbc52b5381b98e85) = 53dede5bec03d50f3441b31c955f2121
step h 49 md5(53dede5bec03d50f3441b31c955f2121) = f5834bfee58498420a2128ea4edcd346
step h final hash f5834bfee58498420a2128ea4edcd346

Step I:
Use the full 128 bit length, so final key is f5834bfee58498420a2128ea4edcd346.
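Steps A through I can be collapsed into a few lines of PHP. This is a sketch under stated assumptions rather than the PDF Examiner code: pdfFileKey and its parameter names are hypothetical, the optional Step F is left out, and each md5 round hashes the raw 16 bytes rather than the hex text shown in the listing above.

```php
<?php
// Hypothetical helper for ISO 32000-1 Algorithm 2 (Step F omitted).
function pdfFileKey(
    string $paddedPassword, // 32-byte padding (or padded user password)
    string $ownerEntry,     // /O value, already unescaped, raw bytes
    int $permissions,       // /P value as a signed integer
    string $firstId,        // first element of the /ID array, raw bytes
    int $keyLen = 16,       // key length in bytes (128 bits here)
    int $revision = 4
): string {
    // Steps A-E and G: hash padding + /O + /P (low-order first) + ID.
    $hash = md5($paddedPassword . $ownerEntry . pack('V', $permissions) . $firstId, true);
    // Step H: revision 3 and later re-hash the first $keyLen bytes 50 times.
    if ($revision >= 3) {
        for ($i = 0; $i < 50; $i++) {
            $hash = md5(substr($hash, 0, $keyLen), true);
        }
    }
    // Step I: the key is the first $keyLen bytes of the final hash.
    return substr($hash, 0, $keyLen);
}

// The values quoted in the steps above.
$key = pdfFileKey(
    hex2bin('28BF4E5E4E758A4164004E56FFFA01082E2E00B6D0683E802F0CA9FE6453697A'),
    hex2bin('30E714592E1FF1FE285A601B413C71CAF3B989576B72AC51371CF782A7F32F0B'),
    -3392,
    hex2bin('F247C9CF56A1484FB2C4F8C200FA269F')
);
echo bin2hex($key), "\n";
```

The printed value can be compared against the key given in Step I.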

Using the key:

ISO 32000-1 Algorithm 1: Encryption of data using the RC4 or AES algorithms
Step A:
Looking at the image above, we get the object number of 47 and generation number 0.
As hex we get the object number as 3 bytes 000038 and the generation number as 2 bytes 0000; low-order byte first these become 380000 and 0000.

Step B:
md5 (f5834bfee58498420a2128ea4edcd346 + 3800000000 + 73416C54) (73416C54 is the fixed "sAlT" value that is appended for AES-encrypted objects)
Result is 9edcf26e53cf4d286c6aad5f0b4821dd
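The per-object key step can be sketched the same way. pdfObjectKey is a hypothetical name; note that the object number goes in as a decimal integer, and the bytes 38 00 00 used in Step B are what pack() produces for object number 56.

```php
<?php
// Hypothetical helper for ISO 32000-1 Algorithm 1, steps A and B.
function pdfObjectKey(string $fileKey, int $objNum, int $genNum, bool $aes = true): string
{
    $input = $fileKey
        . substr(pack('V', $objNum), 0, 3) // object number, 3 bytes, low-order first
        . pack('v', $genNum);              // generation number, 2 bytes, low-order first
    if ($aes) {
        $input .= 'sAlT'; // 73 41 6C 54, appended only for AES
    }
    // The object key is the first min(n + 5, 16) bytes of the MD5 output.
    return substr(md5($input, true), 0, min(strlen($fileKey) + 5, 16));
}

echo bin2hex(substr(pack('V', 56), 0, 3) . pack('v', 0) . 'sAlT'), "\n"; // 380000000073416c54
```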

Step C:
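Read in the object data between stream and endstream, and trim \x0a and \x0d. Here $d is that trimmed data and $key is the raw 16-byte result from Step B:
$decrypted = mcrypt_decrypt(MCRYPT_RIJNDAEL_128, $key,
    substr($d, 16),                      // ciphertext follows the IV
    MCRYPT_MODE_CBC, substr($d, 0, 16)); // first 16 bytes are the IV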

And if there's a Filter/FlateDecode:
gzinflate($decrypted) (you may need to chomp the first one or two bytes to get gzinflate to work, since FlateDecode data normally begins with a 2-byte zlib header.)
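The mcrypt extension is no longer bundled with current PHP releases, so the same step can be sketched with the OpenSSL extension instead. This swaps in openssl_decrypt for the mcrypt call; decryptAesV2Stream is a hypothetical name, and the sketch assumes the stream is laid out as a 16-byte IV followed by AES-128-CBC ciphertext.

```php
<?php
// Hypothetical helper: decrypt one AESV2 stream with its per-object key.
// Returns the plaintext, or false if decryption fails.
function decryptAesV2Stream(string $stream, string $objectKey)
{
    $iv = substr($stream, 0, 16);      // first 16 bytes are the IV
    $ciphertext = substr($stream, 16); // the rest is CBC ciphertext
    // OPENSSL_RAW_DATA: input is raw bytes, not base64. OpenSSL also removes
    // the block padding, which mcrypt_decrypt leaves in place.
    return openssl_decrypt($ciphertext, 'aes-128-cbc', $objectKey, OPENSSL_RAW_DATA, $iv);
}
```

If the object also has /Filter /FlateDecode, passing the result to gzuncompress() handles the zlib header without any manual byte chomping.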

And that's it, a quick and easy how-to on decrypting PDF malware.

Tuesday, July 13, 2010

New blog

Welcome to our new blog, hopefully we'll have some insight on malware analysis related topics and trends. For now, check out our web-based shellcode analysis and unpacker tool at