Monday, February 20, 2012

Decrypting embedded encrypted executables

Beyond detecting embedded executables in documents, you will probably want to run the malware in your dynamic analysis sandbox or analyze it statically in IDA. The following simple script decrypts executables, embedded clean documents, and other content that has been obfuscated or encrypted. First you'll need the XOR key and ROL decode shift from a Cryptam report.

Download our cryptam_unxor.php script.

Sample usage:
dev:cryptam_test dev$ strings 34eba128caa21df52b7cec6ea1c80a91.virus|egrep This.program
dev:cryptam_test dev$ php cryptam_unxor.php 34eba128caa21df52b7cec6ea1c80a91.virus -xor 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff -rol 5
dev:cryptam_test dev$ strings 34eba128caa21df52b7cec6ea1c80a91.virus.out|egrep This.program
This program cannot be run in DOS mode.
This program cannot be run in DOS mode.
dev:cryptam_test dev$

Then use your favorite hex editor to grab the content from the MZ header onward.
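If you would rather script that last step than do it in a hex editor, here is a minimal sketch in Python. The function names `find_pe_offsets` and `carve_first_pe` are ours for illustration and are not part of cryptam_unxor.php. It assumes the decoded file contains a standard Windows PE, where the 32-bit field at offset 0x3C of the DOS header points to a "PE\0\0" signature, which helps skip stray "MZ" byte pairs.

```python
import struct

def find_pe_offsets(data):
    """Return offsets of 'MZ' headers whose e_lfanew field points at 'PE\\0\\0'."""
    offsets = []
    pos = data.find(b"MZ")
    while pos != -1:
        # e_lfanew is a little-endian 32-bit value at offset 0x3C of the DOS header.
        if pos + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            pe = pos + e_lfanew
            if data[pe:pe + 4] == b"PE\x00\x00":
                offsets.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return offsets

def carve_first_pe(data):
    """Return everything from the first validated MZ header onward."""
    offsets = find_pe_offsets(data)
    if not offsets:
        raise ValueError("no MZ/PE header found")
    return data[offsets[0]:]
```

In the sample run above the DOS stub string appears twice, so a decoded file may hold more than one executable; `find_pe_offsets` returns every candidate offset so each can be carved separately.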

The script (use the link above for a clean copy):

<?php
/*
 * v0.1
 * cryptam_unxor.php: Cryptam - command line script
 * unxor and unrol
 */

$key = '';
$rol = 0;
$outfile = '';

// accept a file as input, followed by option/value pairs
if ($argc >= 2 && $argc % 2 == 0 && is_file($argv[1])) {
    $outfile = $argv[1].".out";
    for ($i = 2; $i < $argc; $i += 2) {
        if ($argv[$i] == "-xor" && is_file($argv[$i+1]))
            $key = trim(file_get_contents($argv[$i+1])); // hex key stored in a file
        else if ($argv[$i] == "-xor" || $argv[$i] == "-key")
            $key = $argv[$i+1];
        else if ($argv[$i] == "-rol")
            $rol = (int)$argv[$i+1] % 8;
        else if ($argv[$i] == "-out")
            $outfile = $argv[$i+1];
    }
} else {
    echo "invalid number of arguments:\n";
    echo "php cryptam_unxor.php virus.doc -xor fe85aa -rol 3 -out file.out\n";
    exit(1);
}

$data = file_get_contents($argv[1]);

// undo the XOR first, then the bit rotation
if ($key != '') {
    $data = xorString($data, hex2str($key));
}

if ($rol != 0) {
    $data = cipherRol($data, $rol);
}

file_put_contents($outfile, $data);

// convert a hex string such as "fe85aa" to raw bytes
function hex2str($hex) {
    $str = '';
    for ($i = 0; $i < strlen($hex); $i += 2) {
        $str .= chr(hexdec(substr($hex, $i, 2)));
    }
    return $str;
}

// rotate every byte left by $x bits
function cipherRol($string, $x) {
    $newstring = '';
    for ($i = 0; $i < strlen($string); $i++) {
        $bin = str_pad(decbin(ord($string[$i])), 8, '0', STR_PAD_LEFT);
        $ro = substr($bin, $x).substr($bin, 0, $x);
        $newstring .= chr(bindec($ro));
    }
    return $newstring;
}

// XOR $data with $key, repeating the key as needed
function xorString($data, $key) {
    $key_len = strlen($key);
    $newdata = '';

    for ($i = 0; $i < strlen($data); $i++) {
        $rPos = $i % $key_len;
        if ($key_len == 1)
            $r = ord($data[$i]) ^ ord($key);
        else
            $r = ord($data[$i]) ^ ord($key[$rPos]);

        $newdata .= chr($r);
    }

    return $newdata;
}


Obfuscation and detection of embedded executables

We're going to talk a little bit more about how our new Cryptam system detects malware documents by identifying the encrypted executables. A document exploit needs to install an executable, and those executables are usually either obfuscated and embedded in the document or downloaded from a remote site.

In our experience, APT-type targeted email attacks, or "spear phishing" attacks, most commonly embed the executable trojan within the document exploit. Everything from plaintext to obfuscation using 1-1024 byte XOR keys, counters, and ROL/ROR bit rotation is commonly seen.
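To make those schemes concrete, here is a small Python sketch of the building blocks: a repeating XOR key, a counter key, and per-byte ROL/ROR. The helper names are hypothetical, and the order in which the layers are applied here (rotate, then XOR) is an assumption for illustration; real samples vary.

```python
def xor_repeating(data, key):
    """XOR data with key, repeating the key as often as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def rol8(byte, n):
    """Rotate an 8-bit value left by n bits."""
    n %= 8
    return ((byte << n) | (byte >> (8 - n))) & 0xFF

def ror8(byte, n):
    """Rotate an 8-bit value right by n bits (a left rotation by 8 - n)."""
    return rol8(byte, 8 - (n % 8))

# A 256 byte "counter" key: 00, 01, 02 ... FF
COUNTER_KEY = bytes(range(256))

def obfuscate(payload, key=COUNTER_KEY, shift=3):
    """Assumed attacker side: rotate each byte right, then XOR with the key."""
    rotated = bytes(ror8(b, shift) for b in payload)
    return xor_repeating(rotated, key)

def deobfuscate(blob, key=COUNTER_KEY, shift=3):
    """Analyst side: undo the XOR first, then rotate each byte left."""
    unxored = xor_repeating(blob, key)
    return bytes(rol8(b, shift) for b in unxored)
```

Because XOR is its own inverse and ROL undoes ROR, `deobfuscate(obfuscate(x))` returns `x` for any shift and key.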

Common AV typically fails to detect malware documents: the exploit shellcode is usually heavily packed, and XOR encryption creates a huge number of variants of any potential signature. As a result, AV detection usually ends up being hash based and lags behind the attacks, with new attacks getting only 10-20% detection on VirusTotal.

Our Cryptam system uses the entropy of the file content to ignore legitimate content and focus on the higher entropy sections. It then statistically calculates the key from the byte values that occur most often at each position in the document, all in a single read pass. Unlike the brute force approach other systems use, this method is extremely fast.
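As a rough illustration of the entropy step, the Python sketch below computes Shannon entropy per block and keeps only the blocks above a threshold. The 1024 byte block size and the 6.0 bits/byte threshold are assumptions chosen for the example, not Cryptam's actual settings, and the function names are ours.

```python
import math
from collections import Counter

def shannon_entropy(block):
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def high_entropy_blocks(data, block_size=1024, threshold=6.0):
    """Return (offset, entropy) for each block at or above the threshold."""
    hits = []
    for offset in range(0, len(data), block_size):
        h = shannon_entropy(data[offset:offset + block_size])
        if h >= threshold:
            hits.append((offset, h))
    return hits
```

Only the offsets returned by `high_entropy_blocks` would then need to be fed into the key calculation, which is what lets the bulk of a legitimate document be skipped.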

Cryptam can also be customized with exploit/shellcode signatures written as regexes, as well as embedded executable signatures - Windows/Mac/Linux executable traits or library references - which are scanned for using the calculated XOR key and variations of ROL.
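A hedged Python sketch of that last scan: given a calculated key, try all eight ROL values and look for a known plaintext signature in each decoded copy. The signature list, the function names, and the decode order (XOR first, then ROL, with the key aligned to the file offset) are assumptions made for this example rather than a description of Cryptam's internals.

```python
# Hypothetical signature list; a real scanner would carry many more traits.
SIGNATURES = [b"This program cannot be run in DOS mode"]

def decode(data, key, rol):
    """XOR each byte with the repeating key, then rotate it left by rol bits."""
    out = bytearray(len(data))
    for i, b in enumerate(data):
        b ^= key[i % len(key)]
        out[i] = ((b << rol) | (b >> (8 - rol))) & 0xFF
    return bytes(out)

def scan(data, key, signatures=SIGNATURES):
    """Return (rol, offset, signature) for every signature found under any ROL."""
    hits = []
    for rol in range(8):
        decoded = decode(data, key, rol)
        for sig in signatures:
            offset = decoded.find(sig)
            if offset != -1:
                hits.append((rol, offset, sig))
    return hits
```

A hit gives both confirmation that the key is right and the ROL value to pass to a decoder.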

Example dispersion and key detection of 256 byte key, 00-FF:

Example dispersion and key detection of 256 byte key, FF-00:

Example dispersion and key detection of 256 byte key, algorithmic:

The above graphs show the highest occurrence characters over 1024 bytes in red, with the 256 byte key overlaid in black. In most cases where a 256 byte key is used, the highest occurrences make it obvious, with the key pattern repeating 4 times over the 1024 bytes.

A clean document will appear as random noise:
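The repetition is also easy to check programmatically. The Python sketch below looks for the smallest period that divides the 1024 byte dispersion and matches at most positions; the function name and the 95% match tolerance are assumptions for illustration.

```python
def smallest_period(dispersion, min_match=0.95):
    """Return the smallest period dividing len(dispersion) that repeats
    at min_match or more of the positions; fall back to the full length."""
    n = len(dispersion)
    for period in range(1, n):
        if n % period:
            continue
        matches = sum(dispersion[i] == dispersion[i % period] for i in range(n))
        if matches / n >= min_match:
            return period
    return n
```

For a 256 byte key the result is 256, which is the "repeats 4 times over 1024 bytes" pattern visible in the graphs; for a noisy, clean document no short period should qualify.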

Wednesday, February 15, 2012

New malware document scanner tool released

We've recently released our malware document scanner tool called Cryptam (which stands for cryptanalysis of malware). This system scans document files such as MS Office (.doc/.ppt/.xls), PDF and other document formats for embedded executables, whether encrypted or not. As most embedded malware executables use varying lengths of XOR and ROL/ROR obfuscation to evade traditional A/V detection, we focus on the detection of the embedded executable rather than the exploit itself.

A typical Cryptam report visually shows three critical pieces of the cryptanalysis. The first graph shows the count for each ASCII character in the file; obvious single byte XOR keys can be seen here. The second graph is the entropy of the file: most documents other than PDFs have low entropy in their legitimate content, with only images or embedded executables showing as red high entropy sections. The third and final graphic is the XOR dispersion over 1024 bytes with the calculated key overlaid. We define the XOR dispersion as the highest occurrence character per position in the 1024 byte blocks in the file, so a 256 byte XOR key used on an embedded executable will produce a pattern which repeats 4 times over the 1024 bytes. If the dispersion graphic looks random, it's probably data and not an embedded executable. Sloping lines are typical of algorithmically generated encryption keys - the typical exploit shellcode is very small, and simple counters are commonly used as the XOR key.
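The dispersion definition translates into a few lines of Python. This is a sketch under stated assumptions: the position is taken as the file offset modulo 1024, and the function name `xor_dispersion` is ours. It also shows why the dispersion reveals the key: if the most common plaintext byte is 0x00 (as in the zero padding typical of executables), then the most common ciphertext byte at each position is the key byte itself.

```python
from collections import Counter

def xor_dispersion(data, width=1024):
    """Most frequent byte value at each position (offset modulo width),
    computed in a single pass over the data."""
    counters = [Counter() for _ in range(width)]
    for i, b in enumerate(data):
        counters[i % width][b] += 1
    # Positions never seen (data shorter than width) default to 0.
    return bytes(c.most_common(1)[0][0] if c else 0 for c in counters)
```

On a mostly-zero payload XORed with a 256 byte counter key, `xor_dispersion` returns the key repeated 4 times, which is the sloping line pattern described above.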

The main areas to check in the Cryptam report are the summary, for embedded executable signatures such as an XORed version of "This program cannot be run in DOS mode", and the key length, which is typically anywhere from 1 to 1024 bytes but most commonly 256 bytes in typical APT-type attacks. The system is also available as a command line scanner and in private web versions, like our PDFExaminer product.

The Cryptam document malware scanner is available online. Upcoming posts will release a few useful tools to unxor and unrol the executables using the keys Cryptam detects.