Tuesday, May 17, 2011

PDF Malware scoring with PDFExaminer

Today we're going to talk about how the PDFExaminer tool scores PDF malware. We currently rate each PDF as clean, suspicious, or malware using a simple additive scoring algorithm:

Use of JavaScript, per object: +1
JS obfuscation functions (eval, charCodeAt, etc.): +1
Suspicious strings/variables (exploit, jit, shellcode, etc.): +1
Flash (define object, Flash block): +1
CVE Exploit detected: +10
JBIG2Decode: +1

Clean = 0
Suspicious = 1-9
Malware = 10 or more

A single CVE exploit may be reported more than once. Our detection engine uses regex signatures, and some exploits are covered by two or more overlapping signatures so that new variants of known exploits are more likely to be detected.
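For readers who want to experiment with the scheme, here is a minimal Python sketch of the weights and thresholds listed above. It is not the PDFExaminer engine: the category keys are labels invented for this example, and it assumes every signature hit is counted, including repeated hits on the same exploit.

```python
# Weights and thresholds are copied from the list in this post.
# The dictionary keys are hypothetical labels, not PDFExaminer identifiers.
WEIGHTS = {
    "javascript": 1,         # use of JavaScript, counted per object
    "js_obfuscation": 1,     # eval, charCodeAt, etc.
    "suspicious_string": 1,  # exploit, jit, shellcode, etc.
    "flash": 1,              # define object, Flash block
    "jbig2decode": 1,
    "cve_exploit": 10,       # a detected CVE exploit
}


def score(hits):
    """Sum the weight of every hit (assumption: repeats count each time)."""
    return sum(WEIGHTS[hit] for hit in hits)


def rate(total):
    """Map a total score to the three ratings used in this post."""
    if total == 0:
        return "clean"
    if total < 10:
        return "suspicious"
    return "malware"
```

With these rules, `rate(score(["javascript", "js_obfuscation"]))` gives "suspicious" (score 2), while a single CVE hit is enough on its own to reach "malware".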

Sunday, May 15, 2011

PDFExaminer API

Our new API tool to submit PDFs from the command line and download reports is now available.

Feel free to recode in other languages. We'll post any user submissions here as well.

Upload a PDF and receive the report:
php mwtfile.php [filename] [email address for report]
mwtfile.php source.

Download a PDFExaminer report for a hash:
php mwtreport.php [hash] [report xml, text, json, php, is_malware, rating, severity]
mwtreport.php source.

Examples:
php mwtfile.php China\'s\ Charm\ diplomacy\ in\ BRICS\ Summit.pdf user@email.com

....
php mwtreport.php ae39b747e4fe72dce6e5cdc6d0314c02 xml

XML Report:
<?xml version="1.0"?>
<pdf><filename>China's Charm diplomacy in BRICS Summit.pdf</filename>
<size>411558</size>
<submitted>2011-04-21 14:46:36</submitted>
<md5>ae39b747e4fe72dce6e5cdc6d0314c02</md5>
<sha1>18306c34c5769f66573b725dce70a353ff549857</sha1>
<sha256>f4e861eec510a0d38ae8fa54b630fdda40011891d12925e0e74da39d9280ddd8</sha256>
<ssdeep>3072:qISKk2ZxVh/tj5focZCkMyM/1lKTHzteS8i:kMVh/tpNLzk+</ssdeep>
<engine>52</engine>
<content-type>PDF document, version 1.7</content-type>
<PDFExaminer>malwaretracker.com</PDFExaminer>
<encrypted>0</encrypted>
<is_malware>1</is_malware>
<severity>44</severity>
<rating>malware</rating>
<exploit><gen_id>0</gen_id>
<obj_id>2</obj_id>
<dup_id>10882</dup_id>
<exploittype>suspicious.flash Embedded Flash</exploittype>
</exploit>
<exploit><gen_id>0</gen_id>
<obj_id>2</obj_id>
<dup_id>10882</dup_id>
<exploittype>flash.exploit CVE-2011-0611</exploittype>
</exploit>
<exploit><gen_id>0</gen_id>
<obj_id>2</obj_id>
<dup_id>10882</dup_id>
<exploittype>suspicious.flash Embedded Flash define obj</exploittype>
</exploit>
<exploit><gen_id>0</gen_id>
<obj_id>2</obj_id>
<dup_id>10882</dup_id>
<exploittype>suspicious.string heap spray shellcode</exploittype>
</exploit>
<exploit><gen_id>0</gen_id>
<obj_id>2</obj_id>
<dup_id>10882</dup_id>
<exploittype>flash.suspicious jit_spray</exploittype>
</exploit>
<exploit><gen_id>0</gen_id>
<obj_id>26</obj_id>
<dup_id>9769</dup_id>
<exploittype>suspicious.flash Adobe Shockwave Flash in a PDF define obj type</exploittype>
</exploit>
<exploit><gen_id>0</gen_id>
<obj_id>30</obj_id>
<dup_id>9920</dup_id>
<exploittype>suspicious.flash Adobe Shockwave Flash in a PDF define obj type</exploittype>
</exploit>
</pdf>


The available report formats are text, xml, json, php (serialized hash), rating (malware, clean, or suspicious), severity (hit count), and is_malware (0 or 1). dup_id is the object's decimal location in the PDF file; it distinguishes duplicate objects and generations within the same file.
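If you are consuming the XML report from another language, a sketch like the following may help. It uses Python's standard `xml.etree.ElementTree` and relies only on element names that appear in the sample report above; the function name `summarize_report` and the trimmed `SAMPLE` string are made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample report above (two objects, three hits).
SAMPLE = (
    '<?xml version="1.0"?>'
    "<pdf><md5>ae39b747e4fe72dce6e5cdc6d0314c02</md5>"
    "<is_malware>1</is_malware><severity>44</severity>"
    "<rating>malware</rating>"
    "<exploit><gen_id>0</gen_id><obj_id>2</obj_id><dup_id>10882</dup_id>"
    "<exploittype>suspicious.flash Embedded Flash</exploittype></exploit>"
    "<exploit><gen_id>0</gen_id><obj_id>2</obj_id><dup_id>10882</dup_id>"
    "<exploittype>flash.exploit CVE-2011-0611</exploittype></exploit>"
    "<exploit><gen_id>0</gen_id><obj_id>26</obj_id><dup_id>9769</dup_id>"
    "<exploittype>suspicious.flash Adobe Shockwave Flash in a PDF define obj type"
    "</exploittype></exploit>"
    "</pdf>"
)


def summarize_report(xml_text):
    """Return the verdict fields plus hits grouped per object instance."""
    root = ET.fromstring(xml_text)
    hits = {}
    for exploit in root.findall("exploit"):
        # dup_id is part of the key so duplicate objects stay separate.
        key = (
            int(exploit.findtext("obj_id")),
            int(exploit.findtext("gen_id")),
            int(exploit.findtext("dup_id")),
        )
        hits.setdefault(key, []).append(exploit.findtext("exploittype"))
    return {
        "md5": root.findtext("md5"),
        "rating": root.findtext("rating"),
        "is_malware": root.findtext("is_malware") == "1",
        "severity": int(root.findtext("severity")),
        "hits": hits,
    }
```

Calling `summarize_report(SAMPLE)` returns a dictionary whose `hits` entry maps `(obj_id, gen_id, dup_id)` to the list of signature names reported for that object, so both hits on object 2 end up under the key `(2, 0, 10882)`.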

Tuesday, May 10, 2011

Server upgrade

We completed an upgrade to a brand new server with double the resources. Processing speed should be even better, and we are looking to release our PDFExaminer API tool very soon.

API Features for the Free online PDFExaminer
Submit a PDF for analysis via PHP or scripted web post
Extract reports in XML, Text, JSON, or PHP Serialize (Hash variable)

Monday, May 2, 2011

PDFExaminer: ObjStm handling

We've rolled out a number of new features today. One of the biggest is ObjStm handling: object streams are extracted and processed as separate objects. The malware severity rating now includes the count from embedded PDFs. Our parser has also been enhanced to better process extremely malformed PDFs.
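To illustrate what splitting an object stream involves, here is a rough Python sketch based on the ObjStm layout in the PDF specification, not on our parser. It assumes the stream has already been decoded (for example, inflated), that the caller supplies the /N and /First values from the stream dictionary, and that offsets are in increasing order, which malformed files may violate. The function name `split_objstm` is invented for this example.

```python
def split_objstm(decoded, n, first):
    """Split a decoded object stream into {object number: raw object bytes}.

    decoded: stream data after filters have been applied
    n:       value of /N (number of compressed objects)
    first:   value of /First (byte offset of the first object)
    """
    # The header holds N pairs of integers: object number, then the
    # object's byte offset relative to /First.
    header = decoded[:first].split()
    pairs = [(int(header[2 * i]), int(header[2 * i + 1])) for i in range(n)]

    body = decoded[first:]
    objects = {}
    for i, (number, offset) in enumerate(pairs):
        # Each object runs up to the start of the next one (assumes
        # increasing offsets); the last object runs to the end.
        end = pairs[i + 1][1] if i + 1 < n else len(body)
        objects[number] = body[offset:end].strip()
    return objects
```

For a decoded stream of `b"11 0 12 11 << /A 1 >> (hello)"` with /N 2 and /First 11, this returns object 11 as `b"<< /A 1 >>"` and object 12 as `b"(hello)"`, each of which can then be scanned like any other object.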

Coming soon, we'll be releasing an API to post PDFs for analysis and retrieve reporting in XML, PHP Serialize, JSON, or text.