Chapter 18 Notes

to accompany Sikorski and Honig, Practical Malware Analysis, No Starch Press

Packing and Unpacking

In general, packed malware must be unpacked before it can be analyzed statically.

Packers have two main purposes: to shrink programs, and to thwart detection or analysis

To undo the packer's work, we must understand how it works
Packers take an executable file as input and produce another executable as output
Packers can pack all or part of the input file
Analysts often find it helpful to reconstruct the IMPORT section to understand functionality,
- so this is a good reason to pack that section!
- as well as a good reason to have tools that can rebuild the IMPORT section as needed

Unpacking stub is loaded and executed by OS - eventually
Code entry point (the AddressOfEntryPoint field in the PE optional header) points to unpacking stub
which then loads the rest of the program, perhaps in pieces, after seeing if it's safe to do so
Unpacking stub performs three steps:
1. unpacks original executable into memory
2. resolves all the imports
3. transfers execution to original entry point (OEP)
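
To see which section the entry point lands in (for a packed file, this is where the unpacking stub lives), the header can be read by hand. The following is a minimal sketch using only the Python standard library; the function name entry_point_info is made up for these notes, and it assumes a well-formed PE file with no error handling beyond two signature checks.

```python
import struct

def entry_point_info(data: bytes):
    """Return (entry point RVA, name of the section containing it or None)."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C of the DOS header) holds the offset of "PE\0\0".
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    coff = pe_off + 4                      # 20-byte COFF file header
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    opt = coff + 20                        # optional header starts here
    # AddressOfEntryPoint sits at offset 16 in both PE32 and PE32+.
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    sec_table = opt + opt_size             # section headers, 40 bytes each
    for i in range(num_sections):
        off = sec_table + 40 * i
        name = data[off:off + 8].rstrip(b"\0").decode("ascii", "replace")
        vsize, vaddr = struct.unpack_from("<II", data, off + 8)
        if vaddr <= entry_rva < vaddr + max(vsize, 1):
            return entry_rva, name
    return entry_rva, None
```

Usage would be something like entry_point_info(open("sample.exe", "rb").read()), where sample.exe is a placeholder name. An entry point that falls in a section other than the usual code section is a hint, not proof, that a stub runs first.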

What does the Windows Loader do?
- Its job is to read PE headers, allocate memory, and copy executable code into place
Sections can be created by the unpacking stub, or packing can be done inside sections of original programs.

Normally, the Windows Loader resolves Imports, i.e. determines which functions are needed and supplies the addresses
If Imports section is left unpacked, static analysis will indicate functionality, and less compression is obtained
If Imports section is packed, the unpacking stub has to resolve the imports.
Maybe it uses the (innocuous) LoadLibrary and GetProcAddress functions
Or the unpacking stub has to figure out all the addresses at run-time
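
The LoadLibrary/GetProcAddress approach amounts to a loop over a stored list of DLL and function names. The sketch below only models that loop in Python: fake_load_library and fake_get_proc_address are stand-ins invented for these notes (they are not the Windows API), and every address in FAKE_EXPORTS is made up.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for the export tables of loaded DLLs.
# The addresses are invented for illustration only.
FAKE_EXPORTS: Dict[str, Dict[str, int]] = {
    "kernel32.dll": {"CreateFileA": 0x10001000, "WriteFile": 0x10001040},
    "ws2_32.dll": {"connect": 0x20002000},
}

def fake_load_library(dll_name: str) -> Optional[str]:
    """Model of LoadLibrary: return a 'handle' (here the lowercased name)."""
    key = dll_name.lower()
    return key if key in FAKE_EXPORTS else None

def fake_get_proc_address(handle: str, func_name: str) -> Optional[int]:
    """Model of GetProcAddress: look the name up in the module's exports."""
    return FAKE_EXPORTS[handle].get(func_name)

def resolve_imports(
    wanted: List[Tuple[str, List[str]]],
    load_library: Callable[[str], Optional[str]] = fake_load_library,
    get_proc_address: Callable[[str, str], Optional[int]] = fake_get_proc_address,
) -> Dict[Tuple[str, str], int]:
    """Fill in an import address table the way an unpacking stub might."""
    iat: Dict[Tuple[str, str], int] = {}
    for dll_name, func_names in wanted:
        handle = load_library(dll_name)
        if handle is None:
            raise LookupError(f"cannot load {dll_name}")
        for func_name in func_names:
            address = get_proc_address(handle, func_name)
            if address is None:
                raise LookupError(f"{dll_name} does not export {func_name}")
            iat[(dll_name, func_name)] = address
    return iat
```

The point of the model: the only imports visible statically are the two lookup functions, while the list in wanted (which reveals functionality) can stay packed until run time.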
The "tail jump" is often something the analyst wants to find - from unpacker to real intended functionality
- The Tail Jump may be a jump instruction, or mess with the stack and use ret or call, or use an
  OS function like NtContinue

If a program is packed, then it will in general have higher entropy than the same binary left unpacked
Refer to Lyda paper (pdf)
A formula attributed to Shannon 1963: H(x) = - sum for i=1 to n of p(i)*log2(p(i)), where p(i) is the probability of the i-th symbol; for bytes (n=256) H is bounded within [0,8]
So consider a long string x of bytes in which all values are equally probable, i.e. p(i) is 1/256 for all n=256 possible byte values. Then
H(x) = - sum for i=1 to 256 of (1/256)*log2(1/256) = -256*(1/256)*(-8) = 8, meaning it really takes eight bits to store each byte of information in that string.
Hence x is incompressible
But if some byte values are more common than others, entropy goes down. A string of all 0xCC, for example, has entropy zero since
all of the p(i) are zero except for p(0xCC) = 1, and log2(1) = 0 (terms with p(i) = 0 are taken to contribute 0)
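
The two worked cases above can be checked with a few lines of Python. This is a minimal sketch of the byte-entropy formula only (the function name byte_entropy is chosen for these notes); it is not the tool or the windowing scheme from the Lyda paper.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)  # occurrences of each byte value present
    # -p*log2(p) written as p*log2(1/p); absent byte values contribute 0.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

uniform = bytes(range(256)) * 4      # every byte value equally common
constant = b"\xcc" * 1024            # a single repeated byte value
print(byte_entropy(uniform))         # 8.0
print(byte_entropy(constant))        # 0.0
```

Running the same function over each section of a PE file separately is usually more telling than one number for the whole file, since a small stub plus a large packed section can average out.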

PE Explorer (did we demo this earlier? I think so) has static unpacking plug-ins
"Automated dynamic unpackers run the executable and allow the unpacking stub to unpack the original executable code. Once the original executable is unpacked, the program is written to disk, and the unpacker reconstructs the original import table."
"The automated unpacking program must [run the malware! and thereby] determine where the unpacking stub ends and the original executable begins, which is difficult."
No good publicly available automated dynamic unpackers, although some work okay on some packers
This would be a good M.S. thesis topic!

Write a program to unpack the code, assuming you understand the packing algorithm
Sometimes, malware authors create their own packers
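
As a toy illustration of writing a static unpacker once the algorithm is understood, suppose (purely as an assumption for these notes) that a home-made packer stores the original length, then the zlib-compressed program XORed with a one-byte key. Neither toy_pack nor toy_unpack corresponds to any real packer's format.

```python
import struct
import zlib

def toy_pack(original: bytes, key: int = 0x5A) -> bytes:
    """Hypothetical packer: 4-byte length, then XOR-masked zlib data."""
    compressed = zlib.compress(original, 9)
    masked = bytes(b ^ key for b in compressed)
    return struct.pack("<I", len(original)) + masked

def toy_unpack(packed: bytes, key: int = 0x5A) -> bytes:
    """Static unpacker for toy_pack: undo each step in reverse order."""
    (original_len,) = struct.unpack_from("<I", packed, 0)
    compressed = bytes(b ^ key for b in packed[4:])
    original = zlib.decompress(compressed)
    if len(original) != original_len:
        raise ValueError("length mismatch: wrong key or different format")
    return original

program = b"\x90" * 200 + b"pretend this is code and data"
packed = toy_pack(program)
assert toy_unpack(packed) == program
```

The work in practice is all in the part assumed away here: reverse engineering the stub far enough to learn the layout, the algorithm, and the key.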
Run the program so that it unpacks itself, dump the process out of memory (how?) and edit the PE header as needed.
OllyDbg can be used for this, perhaps with the Find OEP plug-in
Then invoke the Dump Debugged Process plug-in, which can also be used with Immunity Debugger
OllyDbg can rebuild the Imports table, and fix the PE header to point to the OEP
If OllyDbg or Immunity can't cope, try Import Reconstructor (ImpRec)

A skill that develops with practice
General approach: use a debugger and set breakpoints
But not all malware is debugger-friendly, of course
Easiest: Find OEP by Section Hop, a feature of the OllyDump plug-in for OllyDbg
Manual strategy: find the tail jump. May be a JMP instruction, or a RET or whatever...
See Example 19-1. Some indicators:
- Jumps that go a long way are suspect, as in Figure 19-5
- Jumps that go to invalid instructions, as in Example 19-2
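
The "long jump" indicator can be checked mechanically for the simplest case, a 5-byte near JMP (opcode E9 followed by a signed 32-bit displacement), whose target is the address of the next instruction plus the displacement. The sketch below is a heuristic under that assumption only; the function names and the addresses in the example are made up for these notes, and a tail jump done with RET or an OS call will not be caught this way.

```python
import struct
from typing import Optional

def jmp_rel32_target(addr: int, code: bytes) -> Optional[int]:
    """Target of a near JMP (E9 rel32) located at addr, or None if not one."""
    if len(code) < 5 or code[0] != 0xE9:
        return None
    (rel,) = struct.unpack_from("<i", code, 1)   # signed displacement
    return (addr + 5 + rel) & 0xFFFFFFFF         # relative to next instruction

def leaves_section(addr: int, code: bytes, start: int, end: int) -> bool:
    """True if the jump at addr lands outside the section [start, end)."""
    target = jmp_rel32_target(addr, code)
    return target is not None and not (start <= target < end)

# Hypothetical layout: stub section at 0x409000-0x40A000, jump back to 0x401000.
jump_addr = 0x409F00
jump_bytes = b"\xE9" + struct.pack("<i", 0x401000 - (jump_addr + 5))
print(hex(jmp_rel32_target(jump_addr, jump_bytes)))            # 0x401000
print(leaves_section(jump_addr, jump_bytes, 0x409000, 0x40A000))  # True
```

A jump that leaves the stub's section is only a candidate; confirm by checking whether the bytes at the target look like real code once the stub has run.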
Another idea: set a read breakpoint on the stack, so as to tell when the unpacker finishes
Set breakpoints after each loop (may be tedious, and may require starting over and over and over)
Set a breakpoint on GetProcAddress, to see when unpacker is rebuilding the IMPORT table
Set a breakpoint on some function you know the unpacked program will call, e.g.
GetModuleHandleA or GetCommandLineA
Use the Run Trace option in OllyDbg

Name functions as you encounter them
But sometimes malware deliberately imports a lot of functions to defeat analysis