Chapter 16 Notes

to accompany Sikorski and Honig, Practical Malware Analysis, No Starch Press

Anti-Disassembly

Used by malware authors to interfere with malware analysts.

Linear Disassembly

Given an executable binary, find the place where execution is supposed to begin
Figure out what opcode is there, and recreate the mnemonic and operands
Experienced assembly programmers will have memorized some opcodes
Use the opcode to determine how long the instruction is,
then move on to the instruction that immediately follows it
Advantages:
- often works with well-behaved code generated by compilers
- reasonably easy to implement, even in Python maybe
Disadvantages:
- relatively easy to fool by insertion of dead code in the form of nonsense or malformed instructions
- for example: a two-byte JMP skips over four bytes of dead code, but the dead code begins with the opcode for an instruction longer than four bytes (such as the five-byte CALL, 0xE8).
- The disassembler will dutifully continue past the JMP, decode the dead code, and misinterpret the first byte of the jump target as part of the dead instruction's operands.
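
The linear procedure above, and the way it gets fooled, can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real disassembler: the OPCODES table is a hypothetical five-entry subset of x86 that maps a one-byte opcode to a mnemonic and a length, and the byte string is made up for illustration.

```python
# Toy table: opcode -> (mnemonic, total instruction length in bytes).
# Illustrative subset only; real x86 decoding is far more involved.
OPCODES = {
    0xEB: ("jmp short", 2),  # JMP rel8
    0xE8: ("call", 5),       # CALL rel32
    0x58: ("pop eax", 1),
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
}


def linear_sweep(code):
    """Decode instructions back to back, ignoring control flow entirely."""
    listing = []
    offset = 0
    while offset < len(code):
        # Unknown bytes are emitted as one-byte data ("db").
        name, length = OPCODES.get(code[offset], ("db", 1))
        listing.append((offset, name))
        offset += length  # trust the length and move on
    return listing


# jmp +4, then four dead bytes that begin with 0xE8, then the real code
# at offset 6: pop eax; ret
code = bytes([0xEB, 0x04, 0xE8, 0x11, 0x22, 0x33, 0x58, 0xC3])

for offset, name in linear_sweep(code):
    print(f"{offset:02x}: {name}")
```

The listing contains the JMP at 00, a bogus call at 02, and the ret at 07. The pop eax at offset 06, which is where the JMP actually lands, never appears, because the bogus call swallowed it as an operand byte.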

Flow-Oriented Disassembly

Given an executable binary, find the place where execution is supposed to begin
Disassemble that instruction, then follow branches as best it can
Builds a queue of locations at which it expects to find instructions to be disassembled
Conditional branches add both TRUE and FALSE branch targets to the queue
Advantages:
- isn't quite as easy to fool as linear disassembly
Disadvantages:
- requires more knowledge of how instructions work
- can still be confused by branches that are never taken, but indicate flow of control into nonsense code
- such as explained in the rest of the chapter
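
The queue-driven approach can be sketched the same way. Again this is a minimal illustration, not how any particular tool is implemented: the OPCODES table, the "kind" labels, and the helper names are all hypothetical, and the toy does not follow CALL targets.

```python
from collections import deque

# Toy table: opcode -> (mnemonic, length, kind). The kind says where
# execution can go next: "jump" = target only, "cond" = target and
# fall-through, "fall" = fall-through only, "stop" = nowhere.
OPCODES = {
    0xEB: ("jmp short", 2, "jump"),
    0x74: ("jz short", 2, "cond"),
    0x75: ("jnz short", 2, "cond"),
    0xE8: ("call", 5, "fall"),  # assumption: call target is not followed
    0x58: ("pop eax", 1, "fall"),
    0x90: ("nop", 1, "fall"),
    0xC3: ("ret", 1, "stop"),
}


def rel8(byte):
    """Interpret one byte as a signed 8-bit displacement."""
    return byte - 256 if byte > 127 else byte


def flow_disassemble(code, entry=0):
    """Decode only the locations that control flow can reach from entry."""
    listing = {}
    queue = deque([entry])
    while queue:
        offset = queue.popleft()
        if offset in listing or not 0 <= offset < len(code):
            continue
        name, length, kind = OPCODES.get(code[offset], ("db", 1, "stop"))
        if offset + length > len(code):
            name, length, kind = "db", 1, "stop"  # truncated instruction
        listing[offset] = name
        next_offset = offset + length
        if kind in ("jump", "cond"):
            # Branch target is relative to the end of the instruction.
            queue.append(next_offset + rel8(code[offset + 1]))
        if kind in ("fall", "cond"):
            queue.append(next_offset)
    return sorted(listing.items())


# jmp +4, four dead bytes beginning with 0xE8, then pop eax; ret
code = bytes([0xEB, 0x04, 0xE8, 0x11, 0x22, 0x33, 0x58, 0xC3])

for offset, name in flow_disassemble(code):
    print(f"{offset:02x}: {name}")
```

Because the unconditional JMP queues only its target, the dead bytes at offsets 02 to 05 are never decoded, and the listing shows the JMP, the pop eax at 06, and the ret at 07.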

Back-to-back jumps to the same target, e.g. JZ followed by JNZ
The false (fall-through) branch of the JNZ will never be executed, because the JZ right before it has already jumped whenever the zero flag is set, but the disassembler doesn't know this, and it may try to disassemble dead, nonsensical code
When the disassembler is made to follow two paths of execution, only one of which is possible at run-time, it may end up with two different interpretations of the same sequence of bytes

"The simple anti-disassembly techniques we have discussed use a data byte placed strategically after a conditional jump instruction, with the idea that disassembly starting at this byte will prevent the real instruction that follows from being disassembled because the byte that is inserted is the opcode for a multibyte instruction." quoting from PMA
This data byte is called a "rogue byte"
0xE8 is a good choice, since it's the opcode of a five-byte instruction (CALL with a 32-bit relative offset), as explained in Lab 15-1
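
The effect of a rogue 0xE8 byte after a JZ/JNZ pair can be illustrated with another toy in Python. The sketch rests on two assumptions that are design choices of the example, not claims about any specific tool: the fall-through (false) branch of a conditional jump is decoded first, and a byte already claimed by one instruction is never decoded again. The OPCODES table and the byte string are hypothetical.

```python
# Toy table: opcode -> (mnemonic, length, kind).
OPCODES = {
    0x74: ("jz short", 2, "cond"),
    0x75: ("jnz short", 2, "cond"),
    0xE8: ("call", 5, "fall"),  # assumption: call target is not followed
    0x58: ("pop eax", 1, "fall"),
    0x90: ("nop", 1, "fall"),
    0xC3: ("ret", 1, "stop"),
}


def disassemble(code, entry=0):
    """Flow-oriented toy: false branch first, each byte decoded only once."""
    listing = {}
    claimed = set()    # offsets already owned by a decoded instruction
    pending = [entry]  # stack, so the last location pushed is decoded first
    while pending:
        offset = pending.pop()
        if not 0 <= offset < len(code):
            continue
        name, length, kind = OPCODES.get(code[offset], ("db", 1, "stop"))
        span = range(offset, offset + length)
        if span.stop > len(code) or claimed.intersection(span):
            continue   # truncated, or overlaps an earlier interpretation
        claimed.update(span)
        listing[offset] = name
        if kind == "cond":
            disp = int.from_bytes(code[offset + 1:offset + 2], "little", signed=True)
            pending.append(span.stop + disp)  # true branch, decoded later
        if kind in ("fall", "cond"):
            pending.append(span.stop)         # false branch, decoded first
    return sorted(listing.items())


# jz +3 and jnz +1 both land on offset 5; the rogue byte 0xE8 sits at
# offset 4; the real code at offset 5 is pop eax; ret; nops pad the end.
code = bytes([0x74, 0x03, 0x75, 0x01, 0xE8, 0x58, 0xC3, 0x90, 0x90, 0x90])

for offset, name in disassemble(code):
    print(f"{offset:02x}: {name}")
```

At run-time one of the two jumps is always taken, so execution always reaches offset 05. The toy listing nevertheless shows a call at 04 that absorbs offsets 05 through 08, and the real pop eax and ret are hidden. Treating the byte at 04 as data and decoding again from 05 would recover them.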