UMBC CMSC 313 -- Assembly Language Segment


NASM Overview

Introduction

First Step

OK, the assignment will be to write an assembly language program. What does that mean? Time to panic? Well, no. First, consider what you already know...that is why there are prerequisites for this class. Many things done when writing assembly language are identical to writing a C program. We need to create a source file using a text editor. It does not matter what editor you use; it only matters that the source be saved as ASCII text. You've done that in C/C++, so no problem! The filename can be whatever you like, but it is customary to use a file extension of .asm or .S.

Requirement

In this class, the rule for naming files is: the last four digits of your SSN, then prj, then the project number. If your SSN is 123-45-6789 and you are doing project 0, your filename is 6789prj0.asm.
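The naming rule can be sketched in C. This is only an illustration: the helper name project_filename and its error convention are made up here and are not part of the course materials.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the course): builds the project
 * filename from the last four characters of an SSN string and a
 * project number. Returns 0 on success, -1 if the SSN is too short
 * or the output buffer is too small. */
int project_filename(const char *ssn, int project, char *out, size_t out_size)
{
    size_t len = strlen(ssn);
    if (len < 4) {
        return -1;
    }
    /* The last four characters of "123-45-6789" are "6789". */
    int written = snprintf(out, out_size, "%sprj%d.asm", ssn + len - 4, project);
    if (written < 0 || (size_t)written >= out_size) {
        return -1;
    }
    return 0;
}
```

Calling project_filename("123-45-6789", 0, buf, sizeof buf) with a 32-byte buffer fills buf with 6789prj0.asm, matching the example above.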

Note

When you write Java code, you can compile it and run it anywhere. It is not that way when you write assembly language! When you write C code, using ANSI C, you can run the program anywhere, as long as you have a compiler for that system, probably without any change to the source code. It is not that way when you write assembly language!

In assembly language, your code must match the CPU, the operating system and its libraries, and the assembler you are using.

When things don't match, you have to modify the source to make it match!

We are using Intel 486/Pentium CPUs, Linux (kernel 2.4.x) which has the correct libraries, and NASM (which uses the Intel format).

The latest version of NASM is 0.98.38, which can be downloaded from either the Official NASM site at Sourceforge, or the web page for this course. This software is licensed under the Lesser General Public License, so you are free to copy it. The LGPL does not forbid charging for a copy, but whoever receives one is just as free to copy it and pass it along!

Do you need to have Linux on your computer with NASM installed? No, but you will be installing Linux in CMSC421, so you can get a jump on the process if you do it now. Otherwise, you must use the GL system for your assignments. (Personally, I think every Computer Science major needs to have two or three operating systems installed on his/her computer, but they don't let me make the rules.) Assistance is available from the Linux Users Group here at UMBC. They are an excellent source of information!

One issue to point out is that the standard assembler for Linux is as, which uses the AT&T format. NASM uses the Intel format, and the two are not compatible!

Step Two

Put something into the source file! Sounds easy...but what? Well, I am lazy. So I make a source file template, called template.asm (creative, aren't I?). So what is in that file? Check it out.

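Well, there are three sections in an assembly program: the .data section (initialized data), the .bss section (uninitialized data), and the .text section (your code, which is read-only).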

The read-only attribute is a constraint imposed on you by the operating system.

NOTE: The addresses inside each section start at zero! That means the addresses are not the physical address, but a relative address. The operating system loads each section at some address (unknown until the program is actually loaded into memory by the loader). To get the physical address, you must add the relative address to the start address of the section. (Of course, there is some magic done here, but don't worry about it until you get to CMSC421!)
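The arithmetic in that note is just one addition. Here is a minimal sketch in C, assuming 32-bit addresses; the function name loaded_address is made up for illustration, and the sketch ignores the "magic" mentioned above.

```c
#include <stdint.h>

/* Address of an item once its section has been loaded:
 * section start address (chosen by the loader) plus the
 * relative address the assembler assigned inside the section. */
uint32_t loaded_address(uint32_t section_start, uint32_t relative_address)
{
    return section_start + relative_address;
}
```

For example, if the loader happened to put a section at 0x08049000 (a made-up value) and a variable sits at relative address 0x10, then loaded_address(0x08049000, 0x10) gives 0x08049010.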

The template is actually a complete program. The only thing it does is successfully terminate! Well, a program should do one thing and do it well, right? Usually not quite that extreme.

I copy the template to the appropriate filename and then I write the program, building on the template...saving some work!

NASM is case sensitive, just like Linux!

Each assembly language instruction (not a macro) results in one machine instruction. You will be putting one instruction on a line. This is where you get the metric Lines of Code. The phrase has not had any meaning since the first high-level language, however! Instructions have up to four parts:

label: instruction operands ; comment

All four are optional; however, you can not have an operand without an instruction. In the old days, things had to be in certain columns or it was an error. Today, there is no such constraint.

Labels are used to implement program control and data structures. Valid characters in labels are letters, numbers, _, $, #, @, ~, ., and ?. In this class, the first character must be a letter.
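The label rules above can be written out as a small check in C. This is a sketch only: the function is_valid_label is hypothetical, and it enforces this class's first-character rule rather than anything NASM itself requires.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical checker for the label rules stated above:
 * the first character must be a letter (this class's rule), and
 * every later character must be a letter, a digit, or one of
 * _ $ # @ ~ . ? */
bool is_valid_label(const char *label)
{
    if (label == NULL || !isalpha((unsigned char)label[0])) {
        return false;
    }
    for (const char *p = label + 1; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;
        if (!isalnum(c) && strchr("_$#@~.?", c) == NULL) {
            return false;
        }
    }
    return true;
}
```

With this check, loop_1 and msg.len pass, while 1st (starts with a digit) and bad-label (contains a hyphen) do not.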

Instructions tell the computer to do something, and the operands are what it is to be done to or with. You have the same thing in C/C++:

int age;

age = 21;
In the first line you told the computer to reserve some memory for a variable. In the second line you told the computer to set the variable to the value 21. C/C++ is a little bit forgiving when you are not mindful of the data types involved in an instruction and will try to "fix" things for you using implied data conversions. The assembler is not that nice! You can not put a floating point value into a character; it does not fit and the assembler will not force it. You can not put a long int into a short int. More precisely, you can not put 4 bytes into 2 bytes! With each instruction, you must make sure that the source and destination are exactly the same size.

In addition to "real" machine instructions, NASM also supports a number of pseudo-instructions, such as those for reserving memory locations.
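To see what C's implied conversions do with mismatched sizes, here is a minimal sketch. It assumes unsigned fixed-width types so the result is well defined, and the function names are made up for illustration. C accepts both assignments without complaint by default; the equivalent size mismatch in assembly language is simply an error.

```c
#include <stdint.h>

/* Storing 4 bytes into a 2-byte unsigned variable: C quietly
 * keeps only the low 2 bytes and throws the rest away. */
uint16_t store_in_two_bytes(uint32_t four_bytes)
{
    uint16_t two_bytes = four_bytes;   /* implied conversion, no cast needed */
    return two_bytes;
}

/* Storing a floating point value into a char: C quietly drops
 * the fractional part. */
char store_in_char(float value)
{
    char c = value;                    /* implied conversion, no cast needed */
    return c;
}
```

For instance, store_in_two_bytes(0x12345678) returns 0x5678, and store_in_char(65.7f) returns 65, which is 'A' in ASCII.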

Operands are the source and/or destination for the instruction. When there are two operands, the first one is the destination and the second one is the source. Operands can be registers, addresses, constants or expressions. Along the way, there are also a couple of additional constraints. You can not have both the source and destination as addresses; the CPU is not built that way. Also, the destination can not be a constant! The destination must be what in C/C++ is called an "lvalue" (an effective memory address or a register). You can not use an operand without an instruction!

Comments in assembly language serve exactly the same purpose as they do in C/C++. In this class, if you fail to comment, you are planning to fail! You must submit a required set of comments as a bare minimum, and you can supplement it with any extra comments you wish. Your grade depends on a well-commented program.

Assembly

In C/C++, when you compile a program, you are actually running both the compiler and the linker. With NASM, it is two separate steps.

Running NASM

There is one important option for the NASM program that you must supply, and that is what format to use for the output file. You must specify "-f elf" and then give the name of the file(s) to assemble.

nasm -f elf hello.asm

That will produce a file hello.o. This file is an unlinked object file that can not be executed. It must be linked and turned into an executable.

There is another option of interest: the -l (listing) option, which writes a listing file showing the generated machine code next to your source.

nasm -f elf -l hello.lst hello.asm

Running The Linker

OK, so far, so good, but you still can not execute. Of course, that assumes that there were no assembly errors. If there were, you must go back and fix them and re-assemble...feels just like C/C++, doesn't it!

There are two ways to link. ld gives you a simple executable; it works, but there are no freebies!

ld hello.o

This results in a file a.out, and where have you heard of that before!

The good stuff comes when you use gcc instead.

gcc -o hello hello.o

First of all, the -o option lets us name the output file hello instead of a.out. Linking with gcc also lets you use the C library, so you do not have to rewrite all of the basic stuff, like printf. Unless you are told otherwise, in this class you can use the C library.

Makefile

It is often convenient to automate the process. While there is not a lot to be gained by using a Makefile when you only have one source file, it can still be useful. I use them because I like to use xemacs and can simply click on the "Compile" button and get a new executable! (Make sure you name it Makefile!) Then all you have to do is run the command make. It figures out which files have to be compiled and then relinks everything. This saves a little typing when you only have one file, but when you start having multiple files, it really saves time and effort!



©2004, Gary L. Burt