Lecture 2 Getting and using NASM

A 64-bit architecture, by definition, has 64-bit integer registers.
Here are sample programs and output to test for 64-bit capability in gcc:
  Get sizeof on types and variables:  big.c
  Output from  gcc -m64 big.c:  big.out
  malloc more than 4GB:  big_malloc.c
  Output:  big_malloc_mac.out
Newer operating systems and compilers:
  Get sizeof on types and variables:  big12.c
  Output from  gcc big12.c:  big12.out
To bring everyone into the 64-bit world, we will use all 64-bit programs.

A note about the Intel computer architecture and the
tradeoff between "upward compatibility" and "clinging to the past".
Intel built a   4-bit computer, the 4004.
Intel built an  8-bit, byte, computer, the 8008.
Intel built a  16-bit, word, computer, the 8086.
Intel built a  32-bit, double word, computer, the 80386.
Intel builds   64-bit, quad word, computers, x86-64.
The terms byte, word, double word, and quad word remain today
in the software we will write for modern 64-bit computers.
We use  -f elf64  with nasm for assembling and  -m64  with gcc for
compiling and linking.

Learning a new programming language is an orderly progression
of steps.
1) Find sample code that you can compile and run to get output
   This is typically  hello_world  or just hello
   Then more output, we use  "C" printf  printf1.asm  printf2.asm
2) Find sample code for defining data of the types supported
   We use  testdata_64.asm for NASM assembly language
3) Find sample code to do integer arithmetic
   We use  intarith_64.asm
4) Find sample code to do floating point arithmetic
   We use  fltarith_64.asm
5) Find sample code to write a function and call a function
   We use  fib_64.asm or test_factorial.asm
Along the way, you will see the structure of typical code,
the initialization and termination of typical programs,
and how to create "if" and "loop" constructs. Then you can
"cut-and-paste" existing code and modify it for your program.

Computer access for this course

NASM is installed on linux.gl.umbc.edu and can be used there.
From anywhere that you can reach the internet, log onto your
UMBC account using:

  ssh your-user-id@linux.gl.umbc.edu

and enter your password when prompted.

You should set up a directory for CMSC 313 and keep all your
course work in one directory, e.g.

  mkdir cs313   # only once
  cd cs313      # change to directory each time for CMSC 313

Copy over a sample program to your directory using:

  cp /afs/umbc.edu/users/s/q/squire/pub/download/hello_64.asm .

Assemble hello_64.asm using:

  nasm -f elf64 hello_64.asm

Link to create an executable using:

  gcc -m64 -o hello hello_64.o

Execute the program using:

  hello   or   ./hello

Assembly Language

Assembly language is written as lines, rather than statements.
A semicolon makes the rest of a line a comment. A line may be blank,
a comment, a machine instruction, or an assembler directive, called
a pseudo instruction. A line may start with an optional label,
which ends with a colon.

An assembly language program can run on a bare computer, can run
directly on an operating system, or can run using a compiler and
associated libraries. We will use a C compiler and libraries for
convenience.

A big difference between assembly language and compiler code is that
a label for a variable in assembly language is an address, while the
name of a variable in compiler code is the value.

Assembly language programmers are very frugal. They typically
minimize storage space and time.
e.g. the instructions

  xor rax,rax
  mov rax,0

do the same thing, zero register rax, yet the xor is shorter and
a little faster. (The xor also changes the flags; the mov does not.)
e.g. many variables are never stored in RAM; their values are kept
in registers.
I will avoid many of these "tricks" for a while.

(Files are available as hello.asm and hello_64.asm; the _64 is to
emphasize that we will use all 64-bit values and registers. Usually
there is also a C language file, e.g. hello.c )

First example hello_64.asm

Now look at the file hello_64.asm

; hello_64.asm    print a string using printf
; Assemble:       nasm -f elf64 -l hello_64.lst  hello_64.asm
; Link:           gcc -m64 -o hello  hello_64.o
; Run:            ./hello > hello.out
; Output:         cat hello.out

; Equivalent C code
; // hello.c
; #include <stdio.h>
; int main()
; {
;   char msg[] = "Hello world";
;   printf("%s\n",msg);
;   return 0;
; }

; Declare needed C functions
        extern  printf          ; the C function, to be called

        section .data           ; Data section, initialized variables
msg:    db "Hello world", 0     ; C string needs 0
fmt:    db "%s", 10, 0          ; The printf format, "\n", 0

        section .text           ; Code section.

        global main             ; the standard gcc entry point
main:                           ; the program label for the entry point
        push    rbp             ; set up stack frame, must be aligned

        mov     rdi,fmt         ; pass format, standard register rdi
        mov     rsi,msg         ; pass first parameter, standard register rsi
        mov     rax,0           ; no floating point arguments
                                ; or can be  xor  rax,rax
        call    printf          ; Call C function

        pop     rbp             ; restore stack

        mov     rax,0           ; normal, no error, return value
        ret                     ; return

Makefile_nasm

Now, to save yourself typing, download Makefile_nasm into your cs313
directory. There will be more sample files to download.

  cp /afs/umbc.edu/users/s/q/squire/pub/download/Makefile_nasm  Makefile
  make     # look in Makefile to see how to add more files to run

Type   make                    # to run Makefile, only changed stuff gets run
Type   make -f Makefile_nasm   # if you kept the original file name,
                               # only changed stuff gets run

Variable Data and Storage allocation, sections

There can be many types of data in the ".data" section:
Look at the file testdata_64.asm and see the results in testdata_64.lst

; testdata_64.asm   a program to demonstrate data types and values
; assemble: nasm -f elf64 -l testdata_64.lst  testdata_64.asm
; link:     gcc -m64 -o testdata_64  testdata_64.o
; run:      ./testdata_64
; Look at the list file, testdata_64.lst
; no output
; Note! nasm ignores the type of data and type of reserved
;       space when used as memory addresses.
;       You may have to use qualifiers BYTE, WORD, DWORD or QWORD

        section .data           ; data section
                                ; initialized, writeable

                                ; db for data byte, 8-bit
db01:   db  255,1,17            ; decimal values for bytes
db02:   db  0xff,0ABh           ; hexadecimal values for bytes
db03:   db  'a','b','c'         ; character values for bytes
db04:   db  "abc"               ; string value as bytes 'a','b','c'
db05:   db  'abc'               ; same as "abc" three bytes
db06:   db  "hello",13,10,0     ; "C" string including cr and lf

                                ; dw for data word, 16-bit
dw01:   dw  12345,-17,32        ; decimal values for words
dw02:   dw  0xFFFF,0abcdH       ; hexadecimal values for words
dw03:   dw  'a','ab','abc'      ; character values for words
dw04:   dw  "hello"             ; three words, 6-bytes allocated

                                ; dd for data double word, 32-bit
dd01:   dd  123456789,-7        ; decimal values for double words
dd02:   dd  0xFFFFFFFF          ; hexadecimal value for double words
dd03:   dd  'a'                 ; character value in double word
dd04:   dd  "hello"             ; string in two double words
dd05:   dd  13.27E30            ; floating point value 32-bit IEEE

                                ; dq for data quad word, 64-bit
dq01:   dq  123456789012,-7     ; decimal values for quad words
dq02:   dq  0xFFFFFFFFFFFFFFFF  ; hexadecimal value for quad words
dq03:   dq  'a'                 ; character value in quad word
dq04:   dq  "hello_world"       ; string in two quad words
dq05:   dq  13.27E300           ; floating point value 64-bit IEEE

                                ; dt for data ten bytes, 80-bit floating point
dt01:   dt  13.270E3000         ; floating point value 80-bit in register

        section .bss            ; reserve storage space
                                ; uninitialized, writeable

s01:    resb    10              ; 10 8-bit bytes reserved
s02:    resw    20              ; 20 16-bit words reserved
s03:    resd    30              ; 30 32-bit double words reserved
s04:    resq    40              ; 40 64-bit quad words reserved
s05:    resb    1               ; one more byte

        SECTION .text           ; code section
        global main             ; make label available to linker
main:                           ; standard gcc entry point

        push    rbp             ; initialize stack

        mov     al,[db01]       ; correct to load a byte
        mov     ah,[db01]       ; correct to load a byte
        mov     ax,[dw01]       ; correct to load a word
        mov     eax,[dd01]      ; correct to load a double word
        mov     rax,[dq01]      ; correct to load a quad word

        mov     al,BYTE [db01]  ; redundant, yet allowed
        mov     ax,[db01]       ; no warning, loads two bytes
        mov     eax,[dw01]      ; no warning, loads two words
        mov     rax,[dd01]      ; no warning, loads two double words

;       mov     ax,BYTE [db01]  ; error, size mismatch
;       mov     eax,WORD [dw01] ; error, size mismatch
;       mov     rax,WORD [dd01] ; error, size mismatch

;       push    BYTE [db01]     ; error, cannot push a byte
        push    WORD [dw01]     ; "push" needs to know size 2-byte
;       push    DWORD [dd01]    ; error, cannot push a 4-byte
        push    QWORD [dq01]    ; OK

;       push    eax             ; error, wrong size, need 64-bit
        push    rax

        fld     DWORD [dd05]    ; floating load 32-bit
        fld     QWORD [dq05]    ; floating load 64-bit

        mov     rbx,0           ; exit code, 0=normal
        mov     rax,1           ; exit command to kernel
        int     0x80            ; interrupt 80 hex, call kernel

; end testdata_64.asm

Part of testdata_64.lst follows, showing addresses and data values
in hexadecimal. Widen your browser window to see it.

     1                              ; testdata_64.asm   a program to demonstrate data types and values
     2                              ; assemble: nasm -f elf64 -l testdata_64.lst  testdata_64.asm
     3                              ; link:     gcc -m64 -o testdata_64  testdata_64.o
     4                              ; run:      ./testdata_64
     5                              ; Look at the list file, testdata_64.lst
     6                              ; no output
     7                              ; Note! nasm ignores the type of data and type of reserved
     8                              ;       space when used as memory addresses.
     9                              ;       You may have to use qualifiers BYTE, WORD, DWORD or QWORD
    10
    11                                      section .data           ; data section
    12                                                              ; initialized, writeable
    13
    14                                                              ; db for data byte, 8-bit
    15 00000000 FF0111              db01:   db  255,1,17            ; decimal values for bytes
    16 00000003 FFAB                db02:   db  0xff,0ABh           ; hexadecimal values for bytes
    17 00000005 616263              db03:   db  'a','b','c'         ; character values for bytes
    18 00000008 616263              db04:   db  "abc"               ; string value as bytes 'a','b','c'
    19 0000000B 616263              db05:   db  'abc'               ; same as "abc" three bytes
    20 0000000E 68656C6C6F0D0A00    db06:   db  "hello",13,10,0     ; "C" string including cr and lf
    21
    22                                                              ; dw for data word, 16-bit
    23 00000016 3930EFFF2000        dw01:   dw  12345,-17,32        ; decimal values for words
    24 0000001C FFFFCDAB            dw02:   dw  0xFFFF,0abcdH       ; hexadecimal values for words
    25 00000020 6100616261626300    dw03:   dw  'a','ab','abc'      ; character values for words
    26 00000028 68656C6C6F00        dw04:   dw  "hello"             ; three words, 6-bytes allocated
    27
    28                                                              ; dd for data double word, 32-bit
    29 0000002E 15CD5B07F9FFFFFF    dd01:   dd  123456789,-7        ; decimal values for double words
    30 00000036 FFFFFFFF            dd02:   dd  0xFFFFFFFF          ; hexadecimal value for double words
    31 0000003A 61000000            dd03:   dd  'a'                 ; character value in double word
    32 0000003E 68656C6C6F000000    dd04:   dd  "hello"             ; string in two double words
    33 00000046 AF7D2773            dd05:   dd  13.27E30            ; floating point value 32-bit IEEE
    34
    35                                                              ; dq for data quad word, 64-bit
    36 0000004A 141A99BE1C000000F9- dq01:   dq  123456789012,-7     ; decimal values for quad words
    36 00000053 FFFFFFFFFFFFFF
    37 0000005A FFFFFFFFFFFFFFFF    dq02:   dq  0xFFFFFFFFFFFFFFFF  ; hexadecimal value for quad words
    38 00000062 6100000000000000    dq03:   dq  'a'                 ; character value in quad word
    39 0000006A 68656C6C6F5F776F72- dq04:   dq  "hello_world"       ; string in two quad words
    39 00000073 6C640000000000
    40 0000007A C86BB752A7D0737E    dq05:   dq  13.27E300           ; floating point value 64-bit IEEE
    41
    42                                                              ; dt for data ten bytes, 80-bit floating point
    43 00000082 4011E5A59932D5B6F0- dt01:   dt  13.270E3000         ; floating point value 80-bit in register
    43 0000008B 66
    44
    45
    46                                      section .bss            ; reserve storage space
    47                                                              ; uninitialized, writeable
    48
    49 00000000                     s01:    resb    10              ; 10 8-bit bytes reserved
    50 0000000A                     s02:    resw    20              ; 20 16-bit words reserved
    51 00000032                     s03:    resd    30              ; 30 32-bit double words reserved
    52 000000AA                     s04:    resq    40              ; 40 64-bit quad words reserved
    53 000001EA                     s05:    resb    1               ; one more byte
    54
    55                                      SECTION .text           ; code section
    56                                      global main             ; make label available to linker
    57                              main:                           ; standard gcc entry point
    58
    59 00000000 55                          push    rbp             ; initialize stack
    60
    61 00000001 8A0425[00000000]            mov     al,[db01]       ; correct to load a byte
    62 00000008 8A2425[00000000]            mov     ah,[db01]       ; correct to load a byte
    63 0000000F 668B0425[16000000]          mov     ax,[dw01]       ; correct to load a word
    64 00000017 8B0425[2E000000]            mov     eax,[dd01]      ; correct to load a double word
    65 0000001E 488B0425[4A000000]          mov     rax,[dq01]      ; correct to load a quad word
    66
    67 00000026 8A0425[00000000]            mov     al,BYTE [db01]  ; redundant, yet allowed
    68 0000002D 668B0425[00000000]          mov     ax,[db01]       ; no warning, loads two bytes
    69 00000035 8B0425[16000000]            mov     eax,[dw01]      ; no warning, loads two words
    70 0000003C 488B0425[2E000000]          mov     rax,[dd01]      ; no warning, loads two double words
    71
    72                              ;       mov     ax,BYTE [db01]  ; error, size mismatch
    73                              ;       mov     eax,WORD [dw01] ; error, size mismatch
    74                              ;       mov     rax,WORD [dd01] ; error, size mismatch
    75
    76                              ;       push    BYTE [db01]     ; error, cannot push a byte
    77 00000044 66FF3425[16000000]          push    WORD [dw01]     ; "push" needs to know size 2-byte
    78                              ;       push    DWORD [dd01]    ; error, cannot push a 4-byte
    79 0000004C FF3425[4A000000]            push    QWORD [dq01]    ; OK
    80
    81                              ;       push    eax             ; error, wrong size, need 64-bit
    82 00000053 50                          push    rax
    83
    84 00000054 D90425[46000000]            fld     DWORD [dd05]    ; floating load 32-bit
    85 0000005B DD0425[7A000000]            fld     QWORD [dq05]    ; floating load 64-bit
    86
    87 00000062 BB00000000                  mov     rbx,0           ; exit code, 0=normal
    88 00000067 B801000000                  mov     rax,1           ; exit command to kernel
    89 0000006C CD80                        int     0x80            ; interrupt 80 hex, call kernel
    90
    91                              ; end testdata_64.asm

You do not see much without output; we keep it simple and use "C" printf.
  printf1_64.asm
Then the output:
  printf1_64.out

You cannot do much without arithmetic: add, sub, mul, div.
The example also shows the assembly language technique for a macro.
  intarith_64.asm
Then the output:
  intarith_64.out

Divide is not exactly the same as multiply. Extreme macro.
  div_test.asm
Then the output:
  div_test.out

The next lecture will cover Intel registers.