Chapter 16 Notes
to accompany Sikorski and Honig, Practical Malware Analysis, No Starch Press
Anti-Disassembly
Used by malware authors to interfere with malware analysts.
Linear Disassembly
- Given an executable binary, find the place where execution is supposed to begin
- Figure out what opcode is there, and recreate the mnemonic and operands
- Experienced assembly programmers will memorize some opcodes
- Use the length of the decoded instruction to find where the next instruction begins
- Move on to the next instruction and repeat
- Advantages:
- often works with well-behaved code generated by compilers
- reasonably easy to implement, even in Python maybe
- Disadvantages:
- relatively easy to fool by insertion of dead code in the form of nonsense or malformed instructions
- for example: a two-byte JUMP skips over some dead code, but the dead code ends with the opcode of a multibyte instruction whose operands extend past the jump target
- Disassembler will dutifully continue, but misinterpret the bytes at the jump target as the dead code's operands...
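The linear approach above can be sketched in a few lines of Python. This is a toy, not a real disassembler: the OPCODES table is a hand-picked subset of one-byte x86 opcodes with fixed lengths, and the names linear_sweep and OPCODES are made up for illustration.

```python
# Toy subset of x86: opcode byte -> (mnemonic, total instruction length in bytes).
# Assumption: a real decoder needs the full opcode map, prefixes, ModR/M, etc.
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0x58: ("pop eax", 1),
    0xEB: ("jmp short", 2),
    0xE8: ("call", 5),
}

def linear_sweep(code, start=0):
    """Decode instructions back to back, never following branches."""
    listing = []
    offset = start
    while offset < len(code):
        # Unknown bytes are emitted as single data bytes ("db").
        mnemonic, length = OPCODES.get(code[offset], ("db", 1))
        listing.append((offset, mnemonic, code[offset:offset + length].hex()))
        offset += length
    return listing

# "jmp short +1" skips one rogue byte (0xE8); the real code starts at offset 3.
code = bytes([0xEB, 0x01, 0xE8, 0x58, 0x90, 0x90, 0x90, 0xC3])
for offset, mnemonic, raw in linear_sweep(code):
    print(f"{offset:04x}  {raw:<10}  {mnemonic}")
```

The sweep decodes the rogue 0xE8 as a five-byte call, so the pop eax at offset 3 (the real jump target) never shows up in the listing.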
Flow-Oriented Disassembly
- Given an executable binary, find the place where execution is supposed to begin
- Disassemble that instruction, then follow branches as best it can
- Builds a queue of locations at which it expects to find instructions to be disassembled
- Conditional branches add both TRUE and FALSE branch targets to the queue
- Advantages:
- isn't quite as easy to fool as linear disassembly
- Disadvantages:
- requires more knowledge of how instructions work
- can still be confused by branches that are never taken, but indicate flow of control into nonsense code
- such as explained in the rest of the chapter
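A minimal sketch of the queue-based approach, again in Python and again using a toy opcode table. The names flow_disassemble, rel8 and OPCODES are invented here, call targets are deliberately not followed to keep the example short, and real tools are considerably more careful about overlapping instructions.

```python
from collections import deque

# Toy subset of x86: opcode byte -> (mnemonic, total instruction length in bytes).
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0x58: ("pop eax", 1),
    0x74: ("jz", 2),
    0xEB: ("jmp", 2),
    0xE8: ("call", 5),
}

def rel8(byte):
    """Interpret one byte as a signed 8-bit displacement."""
    return byte - 256 if byte > 127 else byte

def flow_disassemble(code, entry=0):
    """Decode only the offsets that control flow can reach from entry."""
    queue = deque([entry])
    listing = {}
    while queue:
        offset = queue.popleft()
        while 0 <= offset < len(code) and offset not in listing:
            mnemonic, length = OPCODES.get(code[offset], ("db", 1))
            if offset + length > len(code):
                break  # truncated instruction, give up on this path
            listing[offset] = mnemonic
            next_offset = offset + length
            if mnemonic == "ret":
                break  # end of this path
            if mnemonic == "jmp":
                # Unconditional: only the target is reachable.
                queue.append(next_offset + rel8(code[offset + 1]))
                break
            if mnemonic == "jz":
                # Conditional: queue both the FALSE and the TRUE path.
                queue.append(next_offset)
                queue.append(next_offset + rel8(code[offset + 1]))
                break
            offset = next_offset
    return dict(sorted(listing.items()))

# Same bytes that fool a linear sweep: jmp +1, rogue 0xE8, then real code.
code = bytes([0xEB, 0x01, 0xE8, 0x58, 0x90, 0x90, 0x90, 0xC3])
for offset, mnemonic in flow_disassemble(code).items():
    print(f"{offset:04x}  {mnemonic}")
```

Because the jump is unconditional, the rogue byte at offset 2 is never queued and the real instructions at offset 3 onward are decoded correctly. Replacing the jmp with a conditional jump would queue offset 2 as well, which is the weakness the tricks below exploit.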
To Confuse a Disassembler: Some Tricks
Jump Instructions with the Same Target
- Back to back jumps to the same target, e.g. JZ followed by JNZ
- The false branch of the JNZ will never be executed, because of the JZ right before, but the disassembler doesn't know this, and it may try to disassemble dead, nonsensical code
- When the disassembler follows two paths of execution, only one of which is possible at run-time, it may produce two different interpretations of the same sequence of bytes
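To make the "two interpretations" point concrete, here is a small Python sketch that decodes the same bytes starting from the JNZ fall-through and from the shared jump target. The byte sequence is a made-up example, and decode_from and OPCODES are illustrative names rather than any tool's API.

```python
# Toy subset of x86: opcode byte -> (mnemonic, total instruction length in bytes).
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0x58: ("pop eax", 1),
    0x74: ("jz", 2),
    0x75: ("jnz", 2),
    0xE8: ("call", 5),
}

def decode_from(code, offset):
    """Decode straight-line from offset to the end of the buffer."""
    listing = []
    while offset < len(code):
        mnemonic, length = OPCODES.get(code[offset], ("db", 1))
        listing.append((offset, mnemonic))
        offset += length
    return listing

code = bytes([
    0x74, 0x03,                    # 0: jz  +3 -> offset 5
    0x75, 0x01,                    # 2: jnz +1 -> offset 5
    0xE8,                          # 4: rogue byte, never executed
    0x58, 0x90, 0x90, 0x90, 0xC3,  # 5: real code
])

# Displacements here are small and positive, so no sign handling is needed.
jz_target = 2 + code[1]
jnz_target = 4 + code[3]

bogus = decode_from(code, 4)          # what the JNZ false branch suggests
real = decode_from(code, jnz_target)  # what actually runs

print("both jumps land on", jz_target, jnz_target)
print("fall-through view:", bogus)
print("run-time view:    ", real)
```

The fall-through view swallows the real instructions as the operand of a call, while the run-time view shows pop eax followed by the rest of the code.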
Jump Instruction with a Constant Condition
- Similar to the previous
- conditional jump will have both TRUE and FALSE paths,
- but only one will ever be taken during execution
- leaving the other path to hit bogus instructions
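One common form of a constant condition is XOR EAX, EAX (which always sets the zero flag) followed immediately by JZ. The Python sketch below scans raw bytes for that one pattern. It is a naive heuristic written for these notes: find_constant_jz is a hypothetical helper, it only knows the 33 C0 encoding of XOR EAX, EAX, and a byte scan like this can also match inside operands or data.

```python
def rel8(byte):
    """Interpret one byte as a signed 8-bit displacement."""
    return byte - 256 if byte > 127 else byte

def find_constant_jz(code):
    """Find 'xor eax, eax' (33 C0) directly followed by 'jz rel8' (74 xx).

    Returns (jz_offset, target_offset) pairs. After xor eax, eax the zero
    flag is always set, so the jz is effectively an unconditional jump.
    """
    hits = []
    for i in range(len(code) - 3):
        if code[i:i + 2] == b"\x33\xC0" and code[i + 2] == 0x74:
            jz_offset = i + 2
            target = jz_offset + 2 + rel8(code[jz_offset + 1])
            hits.append((jz_offset, target))
    return hits

code = bytes([
    0x33, 0xC0,                    # 0: xor eax, eax  (ZF = 1)
    0x74, 0x01,                    # 2: jz +1 -> offset 5, always taken
    0xE8,                          # 4: rogue byte on the never-taken path
    0x58, 0x90, 0x90, 0x90, 0xC3,  # 5: real code
])

for jz_offset, target in find_constant_jz(code):
    print(f"jz at {jz_offset:#x} is always taken, real code at {target:#x}")
```

Knowing the jump is always taken tells the analyst that the byte at the fall-through address is data, not code.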
NOP-ing Out Instructions with IDA Pro or Ghidra
- "The simple anti-disassembly techniques we have discussed use a data byte placed strategically after a conditional jump instruction, with the idea that disassembly starting at this byte will prevent the real instruction that follows from being disassembled because the byte that is inserted is the opcode for a multibyte instruction." quoting from PMA
- This data byte is called a "rogue byte"
- 0xE8 is a good choice, since it's the opcode of a five-byte instruction (CALL with a 32-bit relative offset), as explained in Lab 15-1
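The effect of NOP-ing out a rogue byte can be shown on a raw byte buffer. The Python sketch below is standalone and does not use the IDA Pro or Ghidra scripting interfaces; nop_out, linear_sweep and OPCODES are names invented for this illustration, and the opcode table is a toy subset.

```python
# Toy subset of x86: opcode byte -> (mnemonic, total instruction length in bytes).
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0x58: ("pop eax", 1),
    0x74: ("jz", 2),
    0xE8: ("call", 5),
}

def linear_sweep(code):
    """Decode instructions back to back and return (offset, mnemonic) pairs."""
    listing, offset = [], 0
    while offset < len(code):
        mnemonic, length = OPCODES.get(code[offset], ("db", 1))
        listing.append((offset, mnemonic))
        offset += length
    return listing

def nop_out(code, offset, count=1):
    """Return a copy of code with count bytes at offset replaced by NOP (0x90)."""
    patched = bytearray(code)
    patched[offset:offset + count] = b"\x90" * count
    return bytes(patched)

# jz +1 skips a rogue 0xE8; the real code begins at offset 3.
code = bytes([0x74, 0x01, 0xE8, 0x58, 0x90, 0x90, 0x90, 0xC3])

before = linear_sweep(code)
after = linear_sweep(nop_out(code, 2))

print("before:", before)
print("after: ", after)
```

Before patching, the rogue byte hides pop eax inside a bogus call. After the single byte at offset 2 is replaced with 0x90, the listing lines up with what actually executes.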
Return Pointer Abuse
- another way to misrepresent the real flow of control
Misusing Structured Exception Handlers
- and still another way to misrepresent the real flow of control
Conclusions
- we need to see if Ghidra is susceptible to the same tricks as IDA Pro
- malware authors are always coming up with new tricks!