Chapter 18 Notes
to accompany Sikorski and Honig, Practical Malware Analysis, No Starch Press
Packing and Unpacking
In general, packed malware must be unpacked before it can be analyzed statically.
Packers have two main purposes: to shrink programs, and to thwart detection or analysis.
Packer Anatomy
- To undo the packer's work, we must understand how it works
- Packers take an executable file as input and produce another executable as output
- Packers can pack all or part of the input file
- Analysts often find it helpful to reconstruct the IMPORT section to understand functionality,
- so this is a good reason to pack that section!
- as well as a good reason to have tools that can rebuild the IMPORT section as needed
Unpacking Stub
- Unpacking stub is loaded and executed by OS - eventually
- Code entry point (a certain location in PE header - where?) points to unpacking stub
- which then loads the rest of the program, perhaps in pieces, after seeing if it's safe to do so
- Unpacking stub performs three steps:
- unpacks the original executable into memory
- resolves all the imports
- transfers execution to the original entry point (OEP)
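On the "where?" question above: the entry point is stored in the AddressOfEntryPoint field of the PE optional header, 16 bytes past the start of that header in both PE32 and PE32+ files. The sketch below reads that field from a file already loaded into memory. The function name read_entry_point is made up for these notes, and the code does only minimal validation, so treat it as an illustration rather than a robust parser.

```python
import struct


def read_entry_point(pe_bytes: bytes) -> int:
    """Return the AddressOfEntryPoint RVA from a PE image held in memory."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) is the file offset of "PE\0\0".
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional_header = e_lfanew + 4 + 20
    # AddressOfEntryPoint is 16 bytes into the optional header.
    (entry_rva,) = struct.unpack_from("<I", pe_bytes, optional_header + 16)
    return entry_rva
```

In a packed file this value is the address of the unpacking stub, not the OEP. It is a relative virtual address (RVA), so add the image base to get the address a debugger will show.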
Loading the Executable
- What does the Windows Loader do?
- Its job is to read PE headers, allocate memory, and copy executable code into place
- Sections can be created by the unpacking stub, or packing can be done inside sections of original programs.
Resolving Imports
- Normally, the Windows Loader resolves Imports, i.e. determines which functions are needed and supplies the addresses
- If Imports section is left unpacked, static analysis will indicate functionality, and less compression is obtained
- If Imports section is packed, the unpacking stub has to resolve the imports.
- Maybe it uses the (innocuous) LoadLibrary and GetProcAddress functions
- Or the unpacking stub has to figure out all the addresses at run-time
- The "tail jump" is often something the analyst wants to find - from unpacker to real intended functionality
- The Tail Jump may be a jump instruction, or mess with the stack and use ret or call, or use an OS function like NtContinue
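The three stub steps (unpack, resolve imports, transfer control) can be modeled with a toy that involves no real executable. Everything here is an assumption made for illustration: the "program" is a small JSON structure, zlib stands in for the packer's compression, a dictionary lookup stands in for GetProcAddress, and the names pack and run_stub are invented for these notes.

```python
import json
import zlib


def pack(program: dict) -> bytes:
    """Toy packer: serialize the program and compress it."""
    return zlib.compress(json.dumps(program).encode("utf-8"))


def run_stub(packed: bytes, library: dict) -> list:
    """Toy unpacking stub that follows the three steps from the notes."""
    # Step 1: unpack the original program into memory.
    program = json.loads(zlib.decompress(packed).decode("utf-8"))
    # Step 2: resolve imports by name (the dict lookup models GetProcAddress).
    import_table = {name: library[name] for name in program["imports"]}
    # Step 3: "tail jump": hand control to the original code.
    return [import_table[name](arg) for name, arg in program["code"]]
```

A possible use, with a hypothetical two-function "library":

```python
library = {"upper": str.upper, "length": len}
program = {"imports": ["upper", "length"],
           "code": [["upper", "abc"], ["length", "abcd"]]}
print(run_stub(pack(program), library))
```

The model also shows why static analysis struggles: until step 1 runs, the import names and the code exist only in compressed form.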
Unpacking Illustrated
- Figures 19-1 through 19-4 (whatever the chapter is)
- This seems to be one of the possible models
Indicators of Packed Programs
- Few imports, or just two: LoadLibrary and GetProcAddress
- IDA Pro recognizes only a small amount of code during automatic analysis (duh)
- Section names such as UPX0 (which suggests UPX of course)
- Abnormal section sizes, e.g. raw data 0 but virtual size not zero
- PEiD
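Two of the indicators above (packer-style section names, and raw size 0 with a nonzero virtual size) can be checked by walking the PE section table. The following is a minimal sketch, assuming the whole file is in memory. The function name suspicious_sections and the wording of the findings are invented here, and the name check only looks for a "UPX" prefix.

```python
import struct


def suspicious_sections(pe_bytes: bytes) -> list:
    """Return (section name, reason) pairs for two simple packing indicators."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4
    (num_sections,) = struct.unpack_from("<H", pe_bytes, coff + 2)
    (optional_size,) = struct.unpack_from("<H", pe_bytes, coff + 16)
    # The section table starts right after the optional header.
    table = coff + 20 + optional_size
    findings = []
    for i in range(num_sections):
        # Each 40-byte entry starts with Name, VirtualSize, VirtualAddress,
        # SizeOfRawData.
        raw_name, virtual_size, _virtual_addr, raw_size = struct.unpack_from(
            "<8sIII", pe_bytes, table + 40 * i)
        name = raw_name.rstrip(b"\x00").decode("ascii", errors="replace")
        if name.upper().startswith("UPX"):
            findings.append((name, "packer-style section name"))
        if raw_size == 0 and virtual_size > 0:
            findings.append((name, "raw size 0 but virtual size nonzero"))
    return findings
```

Neither check is conclusive on its own. An ordinary program can have an uninitialized-data section with raw size 0, and a packer can rename its sections.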
Entropy Calculation
- If a program is packed, then it will in general have higher entropy than the same binary left unpacked
- Refer to Lyda paper (pdf)
- A formula attributed to Shannon 1963: H(x) = - sum_{i=1..n} p(i) * log2(p(i)), where p(i) is the probability of the i-th symbol; for bytes (n = 256), H is bounded within [0, 8]
- So consider a long string x of bytes in which all values are equally probable, i.e. p(i) = 1/256 for each of the n = 256 possible byte values. Then
H(x) = - sum_{i=1..256} (1/256) * log2(1/256) = -256 * (1/256) * (-8) = 8, meaning it really takes eight bits to store each byte of information in that string
- Hence x is incompressible
- But if some byte values are more common than others, entropy goes down. A string of all 0xCC, for example, has entropy zero, since p(0xCC) = 1 contributes 1 * log2(1) = 0 and every other p(i) is zero (taking 0 * log2(0) as 0)
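The entropy formula above translates directly into a few lines of Python. This is a minimal sketch that estimates p(i) from byte frequencies in the input; the function name shannon_entropy is chosen for these notes, and returning 0.0 for empty input is a convention adopted here.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte, estimated from byte frequencies in data."""
    if not data:
        return 0.0  # convention chosen here for empty input
    total = len(data)
    counts = Counter(data)
    # Only byte values that actually occur contribute; absent values have p = 0.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())
```

The two cases worked out in the notes can be checked with it:

```python
print(shannon_entropy(bytes(range(256))))  # every byte value once: 8.0
print(shannon_entropy(b"\xcc" * 1000))     # a single repeated value: 0.0
```

In practice the function would be run on a whole file or on each section separately. Values close to 8 suggest compressed or encrypted content.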
Unpacking Options
- Automated static unpacking
- Automated dynamic unpacking
- Manual dynamic unpacking
Automated Unpacking
- PE Explorer (did we demo this earlier? I think so) has static unpacking plug-ins
- "Automated dynamic unpackers run the executable and allow the unpacking stub to unpack the original executable code. Once the original executable is unpacked, the program is written to disk, and the unpacker reconstructs the original import table."
- "The automated unpacking program must [run the malware! and thereby] determine where the unpacking stub ends and the original executable begins, which is difficult."
- No good publicly available automated dynamic unpackers, although some work okay on some packers
- This would be a good M.S. thesis topic!
Manual Unpacking
- Write a program to unpack the code, assuming you understand the packing algorithm
- Sometimes, malware authors create their own packers
- Run the program so that it unpacks itself, dump the process out of memory (how?) and edit the PE header as needed.
- OllyDbg can be used for this, perhaps with the Find OEP plug-in
- Then invoke the Dump Debugged Process plug-in, which can also be used with Immunity Debugger
- OllyDbg can rebuild the Imports table, and fix the PE header to point to the OEP
- If OllyDbg or Immunity can't cope, try Import Reconstructor (ImpRec)
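One of the PE header edits mentioned above, pointing the entry point at the OEP, amounts to overwriting one 4-byte field. The sketch below assumes the dump is already a valid PE file in memory and that the OEP was found as a virtual address in the debugger. The names va_to_rva and set_entry_point are invented for these notes. Real dumps usually also need the import table rebuilt and the section headers fixed, which this does not attempt.

```python
import struct


def va_to_rva(virtual_address: int, image_base: int) -> int:
    """Convert a debugger address into the RVA form the PE header stores."""
    return virtual_address - image_base


def set_entry_point(dump: bytes, oep_rva: int) -> bytes:
    """Return a copy of the dump with AddressOfEntryPoint set to oep_rva."""
    if dump[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", dump, 0x3C)
    if dump[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # AddressOfEntryPoint: 4-byte signature + 20-byte COFF header + 16 bytes
    # into the optional header.
    field_offset = e_lfanew + 4 + 20 + 16
    patched = bytearray(dump)
    struct.pack_into("<I", patched, field_offset, oep_rva)
    return bytes(patched)
```

For example, with a hypothetical image base of 0x400000 and an OEP seen at 0x401000, the value written would be va_to_rva(0x401000, 0x400000), i.e. 0x1000.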
Finding the Original Entry Point (OEP)
- A skill that develops with practice
- General approach: use a debugger and set breakpoints
- But not all malware is debugger-friendly, of course
- Easiest: Find OEP by Section Hop, an OllyDbg plug-in
- Manual strategy: find the tail jump. May be a JMP instruction, or a RET or whatever...
- See Example 19-1. Some indicators:
- Jumps that go a long way are suspect, as in Figure 19-5
- Jumps that go to invalid instructions, as in Example 19-2
- Another idea: set a read breakpoint on the stack, so as to tell when the unpacker finishes
- Set breakpoints after each loop (may be tedious, and may require starting over and over and over)
- Set a breakpoint on GetProcAddress, to see when unpacker is rebuilding the IMPORT table
- Set a breakpoint on some function you know the unpacked program will call, e.g. GetModuleHandleA or GetCommandLineA
- Use the Run Trace option in OllyDbg
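The "jumps that go a long way" indicator can be made concrete for the simplest case, a near JMP encoded as opcode E9 followed by a signed 32-bit displacement. The target is the address of the next instruction plus the displacement. The sketch below computes that target and asks whether it lands in a different section from the jump itself. The function names and the (name, start, end) section tuples are assumptions made for these notes; tail jumps built from RET, CALL, or NtContinue are not covered.

```python
import struct


def jmp_target(instr_addr: int, instr_bytes: bytes) -> int:
    """Target of a 32-bit near JMP (opcode E9 followed by a signed rel32)."""
    if len(instr_bytes) < 5 or instr_bytes[0] != 0xE9:
        raise ValueError("not an E9 rel32 jump")
    (rel32,) = struct.unpack_from("<i", instr_bytes, 1)
    # The displacement is relative to the end of the 5-byte instruction.
    return (instr_addr + 5 + rel32) & 0xFFFFFFFF


def section_of(addr: int, sections: list):
    """Name of the section containing addr, or None if there is none."""
    for name, start, end in sections:
        if start <= addr < end:
            return name
    return None


def hops_section(instr_addr: int, instr_bytes: bytes, sections: list) -> bool:
    """True if the jump leaves the section it is located in."""
    target = jmp_target(instr_addr, instr_bytes)
    return section_of(instr_addr, sections) != section_of(target, sections)
```

A possible use, with a hypothetical UPX-style layout in which the stub sits in UPX1 and the unpacked code lands in UPX0:

```python
sections = [("UPX0", 0x401000, 0x40E000), ("UPX1", 0x40E000, 0x412000)]
jump = b"\xE9" + struct.pack("<i", 0x401000 - (0x40F000 + 5))
print(hops_section(0x40F000, jump, sections))
```

A section hop is only a hint. The candidate still has to be confirmed by checking that the target looks like a plausible program start.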
Repair Import Table Manually
- Name functions as you encounter them
- But sometimes malware deliberately imports a lot of functions to defeat analysis
Dealing with Specific Packers
- UPX, PECompact, Aspack, Petite, WinUpack, Themida