The Intel 80x86 has many registers and named sub-registers.
Here are some that are used in assembly language programming
and debugging (the "dash number" gives the number of bits):

+---------------------------+   EAX extended accumulator
| EAX-32 +-----------------+|   (lower part of dividend)
|        |  AX-16          ||   (quotient after division)
|        |+--------+------+||   (lower part of product)
|        || AH-8   | AL-8 |||
|        |+--------+------+||
|        +-----------------+|
+---------------------------+

+---------------------------+   EBX extended base register
| EBX-32 +-----------------+|   (BX in DS segment)
|        |  BX-16          ||
|        |+--------+------+||
|        || BH-8   | BL-8 |||
|        |+--------+------+||
|        +-----------------+|
+---------------------------+

+---------------------------+   ECX extended counter
| ECX-32 +-----------------+|   (string and loop operations)
|        |  CX-16          ||   (CX is a 16 bit counter)
|        |+--------+------+||
|        || CH-8   | CL-8 |||
|        |+--------+------+||
|        +-----------------+|
+---------------------------+

+---------------------------+   EDX extended DX
| EDX-32 +-----------------+|   (I/O port address for IN and OUT)
|        |  DX-16          ||   (remainder after divide)
|        |+--------+------+||   (upper part of dividend)
|        || DH-8   | DL-8 |||   (upper part of product)
|        |+--------+------+||
|        +-----------------+|
+---------------------------+

+---------------------------+   ESP extended stack pointer
| ESP-32     +-------------+|   SP  stack pointer
|            |  SP-16      ||   (used by PUSH and POP)
|            +-------------+|
+---------------------------+

+---------------------------+   EBP extended base pointer
| EBP-32     +-------------+|   (by convention, the caller's stack frame)
|            |  BP-16      ||   (BP in SS segment)
|            +-------------+|
+---------------------------+

+---------------------------+   ESI extended source index
| ESI-32     +-------------+|   SI  source index
|            |  SI-16      ||   (in DS segment)
|            +-------------+|
+---------------------------+

+---------------------------+   EDI extended destination index
| EDI-32     +-------------+|
|            |  DI-16      ||   (DI in ES segment)
|            +-------------+|
+---------------------------+

+---------------------------+   EIP extended instruction pointer
| EIP-32     +-------------+|   IP  instruction pointer
|            |  IP-16      ||
|            +-------------+|
+---------------------------+

+---------------------------+   EFLAGS extended flags
| EFLAGS-32  +-------------+|   or just flags
|            |  FLAGS-16   ||   (not a register name!)
|            +-------------+|   (must use PUSHF and POPF)
+---------------------------+

For 32-bit "C" compatible programming, stop here.

+-------------+   CS code segment
| CS-16       |
+-------------+

+-------------+   SS stack segment
| SS-16       |
+-------------+

+-------------+   DS data segment
| DS-16       |   (current module)
+-------------+

+-------------+   ES data segment
| ES-16       |   (calling module, destination string)
+-------------+

+-------------+   FS heap segment
| FS-16       |
+-------------+

+-------------+   GS global segment
| GS-16       |   (shared)
+-------------+

There are also 80-bit floating point registers ST0 .. ST7
There are also 64-bit MMX registers MM0 .. MM7
There are also control registers CR0 .. CR4
There are also debug registers DR0 .. DR3, DR6, DR7
There are also test registers TR3 .. TR7

A dumb program to test register names is testreg.asm
Another dumb program to test al, ah, ax, eax is regeax.asm

The basic syntax for a line in NASM is:

label:  opcode  operand(s)  ; comment

The "label" is a case sensitive user name, followed by a colon.
The label is optional; when it is not present, indent the opcode.
The label should start in column one of the line.
The label may be on a line with nothing else, or with only a comment.

The "opcode" is not case sensitive and may be a machine instruction,
an assembler directive (pseudo operation) or a macro call.
Typically, all "opcode" fields are neatly lined up starting in the
same column. Use of "tab" is OK.
Machine instructions may be preceded by a "prefix" such as:
a16, a32, o16, o32, and others.

"operand(s)" depend on the choice of "opcode".
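The line layout and the register naming described above can be sketched together in one short program. This is only an illustration, not one of the course files: the file name regsketch.asm, the build commands, and the choice of a Linux ELF32 target linked through gcc are all assumptions.

```nasm
; regsketch.asm  (hypothetical name; not testreg.asm or regeax.asm)
; Assumed build on a 32-bit-capable Linux system:
;   nasm -f elf32 regsketch.asm
;   gcc -m32 -o regsketch regsketch.o
        global  main            ; directive in the opcode column, no label

        section .text
main:                           ; label in column one, followed by a colon
        mov     eax, 0x11223344 ; EAX = 11223344h
        mov     al, 0x55        ; only bits 0-7 change:   EAX = 11223355h
        mov     ah, 0x66        ; only bits 8-15 change:  EAX = 11226655h
        mov     ax, 0x7788      ; only bits 0-15 change:  EAX = 11227788h
        xor     eax, eax        ; return value 0 goes back in EAX
        ret                     ; return to the C startup code
```

Single-stepping this in a debugger and watching EAX after each MOV shows that AL, AH and AX are names for parts of EAX rather than separate registers.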
An operand may have several parts separated by commas.
The parts may be a combination of register names, constants,
memory references in brackets [ ], or empty.

Comments are optional, yet encouraged.
Everything from the semicolon to the end of the line is a comment,
ignored by the assembler.
The semicolon may be in column one, making the entire line a comment.

Sections or segments:

One specific assembler directive is the "section" or "SECTION" directive.
Four types of section are predefined for ELF format:

section .data     ; initialized data
                  ; writeable, not executable
                  ; default alignment 4 bytes

section .bss      ; uninitialized space for data
                  ; writeable, not executable
                  ; default alignment 4 bytes

section .rodata   ; initialized data
                  ; read only, not executable
                  ; default alignment 4 bytes

section .text     ; instructions (code)
                  ; not writeable, executable
                  ; default alignment 16 bytes

section other     ; any name other than .data, .bss,
                  ; .rodata, .text
                  ; your stuff
                  ; not executable, not writeable
                  ; default alignment 1 byte

A few comments on efficiency:

My experience is that a good assembly language programmer can make
a small (about 100 lines) "C" program more efficient than the gcc
compiler. But for larger programs, the compiler will be more
efficient. Exceptions are, for example, the SGI IRIX cc compiler,
which has super optimization for that specific machine.

For the Intel 80x86, here are some samples in nasm and from gcc
(different syntax, but you should be able to recognize the
instructions). Focus on the loop; the prologue and epilogue code
that a complete program needs has been omitted.
Note the test has "check" values at each end of the array.
There is no range testing in either "C" or assembly language.
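The course's own loop files are not reproduced here. As a rough sketch of the idea only, the following program places a "check" value at each end of an array, fills the array in a loop, and then tests whether the check values survived. The file name, labels, array length, check constant and build commands are all assumptions made for illustration.

```nasm
; loopsketch.asm  (hypothetical name; not the course's loopint.asm)
; Assumed build on a 32-bit-capable Linux system:
;   nasm -f elf32 loopsketch.asm
;   gcc -m32 -no-pie -o loopsketch loopsketch.o
        global  main            ; called by the C startup code

        section .data           ; initialized, writeable data
check1: dd      0x12345678      ; check value just below the array
array:  times 10 dd 0           ; ten 32-bit integers, all zero
check2: dd      0x12345678      ; check value just above the array

        section .text
main:
        mov     ecx, 10         ; ECX = number of elements left
        mov     edx, array      ; EDX = address of the current element
        mov     eax, 1          ; EAX = value to store
store:  mov     [edx], eax      ; array[i] = i + 1
        inc     eax             ; next value
        add     edx, 4          ; next 32-bit element
        loop    store           ; ECX = ECX - 1, jump if ECX is not zero

        xor     eax, eax        ; assume success: return 0
        cmp     dword [check1], 0x12345678
        jne     bad             ; lower check value was overwritten
        cmp     dword [check2], 0x12345678
        je      done            ; both check values are intact
bad:    mov     eax, 1          ; return 1: a store went out of range
done:   ret
```

Because nothing range-checks the stores, changing the count in ECX from 10 to 11 makes the last store land on check2, and the program then returns 1 instead of 0.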
A simple loop                        loopint.asm
Same code from gcc                   loopint.s
Hex machine code generated by nasm   loopint.lst

Most efficient loop                  loopint2.asm
Same code from gcc                   loopint2.s
Hex machine code generated by nasm   loopint2.lst

Speed considerations must take into account cache and virtual memory
performance, the number of bytes transferred from RAM, and clock cycles.
On modern computer architectures, predicting all of this from the
source code is almost impossible. For example, the Pentium 4
translates the 80x86 code into RISC pipeline code and is actually
executing instructions that are different from the assembly language.
Carefully benchmarking complete applications is about the only
conclusive measure of efficiency.