A Program

There is a book entitled Algorithms + Data Structures = Programs by Niklaus Wirth. Essentially, that title describes every program in every language. The code section implements the algorithm, and the data sections implement the data structures.

We need to have the assembler help us as much as possible, so we don't have to work as hard! The first trick is to use the include directive, just as you did in C/C++ and Java (different directive, same effect!). You can include each of the files you need one by one, or you can use the following:


; -------------------------------------------------------------------------
    include \masm32\include\masm32rt.inc
; -------------------------------------------------------------------------

That is an include that includes the rest!

Sections (Segments) of a program

Some of the sections of a program include:

.Model   The only model we will use is FLAT, because it gives us all 4GB of memory
.Stack   This is the amount of stack space you want to reserve. 1KB is the default
.Data    Actually, we can have .DATA, .DATA?, .CONST, .FARDATA, and .FARDATA?
         You can get away with using only .DATA
.Code    This is your algorithm, in syntactically correct assembly language instructions!

You can have more than one .DATA segment and more than one .CODE segment. The linker will sort it out for you.

.Model

When we are writing 32-bit code, the easiest way is to use the "FLAT" memory model, because the addressing is linear. It starts at address 00000000h and goes to 0FFFFFFFFh (simply 4GB!). The alternative is to use the segmented model, which is far more complicated!

Details About Using Hexadecimal Numbers

All numbers must begin with a symbol in the range of '0' to '9', otherwise the assembler does not know it is a number and treats it as a name! The easy way to handle that is to put a zero on the left side of the number, which does not change the value. Then we have the problem of what base the number is in. Unless we specify, it is decimal. To use a hexadecimal number, it must end with an 'h', for hex. Then there is the question, what case are the letters in hex? It does not matter, but the accepted style is to use uppercase. There is no penalty for lowercase, though.

(To use binary, use a 'b' after the number.)
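As a minimal sketch of these rules, the fragment below declares a few constants in each notation. It assumes MASM32 is installed in \masm32 and uses the masm32rt.inc include shown earlier; the variable names are hypothetical and chosen only for illustration.

```asm
; Sketch only: all labels here are made-up names for illustration.
    include \masm32\include\masm32rt.inc

    .data
      hexMax    dd 0FFFFFFFFh   ; hex: leading 0 needed because the first digit is a letter
      hexSmall  dd 7Fh          ; hex: already starts with a digit, no extra 0 required
      hexLower  dd 7fh          ; lowercase hex digits are accepted as well
      binBits   db 00001010b    ; binary: ends with 'b' (decimal 10)
      decCount  dd 255          ; no suffix: decimal

    .code
start:
    mov eax, hexMax             ; EAX = 0FFFFFFFFh
    exit                        ; MASM32 macro that ends the process
end start
```

Writing FFFFFFFFh without the leading zero would be read as a symbol name rather than a number, which is why the zero is added.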

.stack

The stack is a data structure we use for a number of purposes. It is very useful for us and we will talk about it later. So far, the default has been sufficient for our needs.

.data And .data?

In your previous programs in the different languages, you normally had to declare your variables. Declaring a variable involves reserving an appropriately sized memory location, giving it a name (or label or identifier), and possibly giving it an initial value. Use .data? for uninitialized variables and .data for initialized data.
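The following sketch shows one way the two data sections might be used together. It assumes the masm32rt.inc include from earlier; apart from nrStudents, the variable names are hypothetical.

```asm
; Sketch only: variable names other than nrStudents are made up for illustration.
    include \masm32\include\masm32rt.inc

    .data                       ; initialized data
      nrStudents  dd 25         ; 32-bit variable with an initial value
      passMark    db 60         ; 8-bit variable with an initial value

    .data?                      ; uninitialized data: only space is reserved
      totalScore  dd ?          ; '?' means "no initial value"
      scoreTable  dd 10 dup(?)  ; room for ten 32-bit values

    .code
start:
    mov eax, nrStudents         ; read an initialized variable
    mov totalScore, eax         ; write to an uninitialized one
    exit
end start
```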

Declaring data can be interesting. You must keep track of the constraints of your data: what are the minimum and maximum values, is it signed or unsigned, is it real or whole? There is no special way to designate that the variable is signed or unsigned. You, the programmer, must keep that straight! Also, today, the tendency is to make everything a double word, because it is actually faster. When we talk about type, what we really mean is how many bits are in the data. If you try to put one size of variable into a container (memory or register) of another size, the assembler will flag it as an error.

For the time being, we are only going to talk about integer data. Real numbers come later.

Intrinsic Whole Data Types

These are some of the built-in data types:
Size     Name (note 1)     Name (note 2)
8-bit    BYTE, sbyte       db
16-bit   WORD, sword       dw
32-bit   DWORD, sdword     dd
64-bit   QWORD             dq

Notes:

  1. This shows a TYPE, but it can also be used as an initializer. The TYPE must be used when the instruction could apply to any size, such as INC DWORD PTR [nrStudents]. Without the TYPE, this could refer to an 8-bit, 16-bit, or 32-bit memory location.
  2. This is an initializer, used in the .DATA and .DATA? segments
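A short sketch of both uses follows, assuming the masm32rt.inc include from earlier. The variable grade is hypothetical. Note that MASM writes the size override with the PTR operator.

```asm
; Sketch only: 'grade' is a made-up name for illustration.
    include \masm32\include\masm32rt.inc

    .data
      nrStudents  dd 0          ; 'dd' initializer: the assembler records the size as 32 bits
      grade       BYTE 0        ; the TYPE name used as an initializer

    .code
start:
    inc nrStudents              ; size is known from the declaration
    mov ebx, offset nrStudents  ; EBX now holds only an address, with no size attached
    inc dword ptr [ebx]         ; the TYPE is required: [ebx] alone could be any size
    inc grade                   ; 8-bit increment, size again taken from the declaration
    exit
end start
```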


.Code

An instruction or a line of code has up to four parts:
  1. label
  2. mnemonic
  3. operand(s)
  4. comment
All four are optional, depending on what you are doing. A line of code can consist of just a label, just a mnemonic, or just a comment. The operand(s) cannot be present without a mnemonic.
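The combinations described above can be sketched as follows. This assumes the masm32rt.inc include from earlier; the label countUp is hypothetical.

```asm
; Sketch only: 'countUp' is a made-up label for illustration.
    include \masm32\include\masm32rt.inc

    .code
start:                          ; a line with only a label (and this comment)
    ; a line with only a comment
    xor eax, eax                ; mnemonic, two operands, comment
countUp: inc eax                ; label, mnemonic, one operand, comment
    nop                         ; mnemonic with no operands
    exit
end start
```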

Labels

Labels are identifiers. They are used to identify data and variables or locations in the code. They allow us to refer to locations and data in a symbolic or meaningful way. That is why you should use care in selecting your labels. It also helps document your code.

Mnemonics

"Mnemonic" is defined by Merriam-Webster Online as "assisting or intended to assist memory". These are normally an abbreviation of the instruction in a form that is supposed to help you remember what the instruction is and does. MOV is for "Move", or transfer data from one point to another. JMP is to jump to a location in the program. Some people (including authors of books) incorrectly call them "opcodes", but opcodes are the numeric version that only the computer really understands. The MOV instruction can be a number of different opcodes, depending on the addressing mode, etc.

Operands

Operands are the extra information that the instruction needs to do its work. Of course, not all instructions use operands. Some instructions have one operand, others have two operands. Operands have no meaning without an instruction mnemonic, and cannot exist in isolation.

When an instruction has two operands, they are in the format of:

destination, source

although the first one can also be a source as well as a destination. The destination and the source must be the same size! You cannot work with data of two different types or sizes at the same time. You must first convert the smaller one to match the larger one.
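One common way to convert a smaller value to a larger size is with the MOVZX (zero-extend) and MOVSX (sign-extend) instructions. The sketch below assumes the masm32rt.inc include from earlier; the variable names are hypothetical.

```asm
; Sketch only: variable names are made up for illustration.
    include \masm32\include\masm32rt.inc

    .data
      smallVal   db 200         ; 8-bit unsigned value
      signedVal  sbyte -5       ; 8-bit signed value
      bigVal     dd 1000        ; 32-bit value

    .code
start:
    ; mov eax, smallVal         ; rejected by the assembler: 32-bit destination, 8-bit source
    movzx eax, smallVal         ; zero-extend 8 bits to 32 bits: EAX = 200
    add   eax, bigVal           ; both operands are now 32 bits: EAX = 1200
    movsx ebx, signedVal        ; sign-extend: EBX = -5 (0FFFFFFFBh)
    exit
end start
```

Which of the two to use depends on whether you, the programmer, are treating the value as unsigned or signed.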

If you want to put a copy of the data in the EAX register into EBX, it is:

mov EBX, EAX

EAX is the source and EBX is the destination.

If I want to add the values in the EAX and EBX registers together and store the result in EAX, it would be in the form of:

add EAX, EBX

NOTE: The destination must be one of the two registers holding the data. The ADD instruction does not allow you to put the result into a third register. You would have to do the addition and then use another instruction to move the sum to the third register.
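A minimal sketch of the two-step approach just described, assuming the masm32rt.inc include from earlier; the values 10 and 32 are arbitrary.

```asm
; Sketch only: the constants are arbitrary example values.
    include \masm32\include\masm32rt.inc

    .code
start:
    mov eax, 10                 ; first value
    mov ebx, 32                 ; second value
    add eax, ebx                ; EAX = EAX + EBX = 42; EAX is both a source and the destination
    mov ecx, eax                ; second instruction copies the sum into the third register
    exit
end start
```

If EAX must keep its original value, copy it to ECX first and then add EBX to ECX instead.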

Comments

The comments do not affect the performance of the program. They exist to communicate. This is where you tell the reader what you are doing. If others cannot understand your code, your program is worthless. Comments should add value to the code, not just repeat verbatim what the instruction does!

Bad Comment

inc EAX ;increment the EAX register

Good Comment

inc EAX ;increment the number of students processed

Notes

In the .code section, we must call the function to exit the process. If we do not, the computer will continue to execute whatever comes next in memory. This is not normally a good thing and usually crashes the program.

In some programs, there is a way to create what looks like one instruction that is really many instructions. These are called macros. We will be using ones that come with MASM32. In C and C++, you can create a macro with the "#define" directive.
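For comparison with "#define", here is a sketch of a user-written MASM macro. It assumes the masm32rt.inc include from earlier; the macro name bumpTwice is hypothetical and is not part of MASM32.

```asm
; Sketch only: 'bumpTwice' is a made-up macro, not one supplied by MASM32.
    include \masm32\include\masm32rt.inc

bumpTwice MACRO reg             ; 'reg' is replaced by whatever the caller supplies
    inc reg
    inc reg
ENDM

    .code
start:
    mov eax, 5
    bumpTwice eax               ; looks like one instruction, expands to two INCs: EAX = 7
    exit                        ; 'exit' is itself a macro supplied by MASM32
end start
```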

At other times, we will be using library functions to do some of our work, especially input and output!

Type Of Program

There are two types of programs, console and Windows. MASM32 comes with samples of each.

Console (hello.asm)

; -------------------------------------------------------------------------

;                 Build this with the "Project" menu using
;                       "Console Assemble and Link"

; -------------------------------------------------------------------------

    .486                                    ; create 32 bit code
    .model flat, stdcall                    ; 32 bit memory model
    option casemap :none                    ; case sensitive
 
    include \masm32\include\windows.inc     ; always first
    include \masm32\macros\macros.asm       ; MASM support macros

  ; -----------------------------------------------------------------
  ; include files that have MASM format prototypes for function calls
  ; -----------------------------------------------------------------
    include \masm32\include\masm32.inc
    include \masm32\include\gdi32.inc
    include \masm32\include\user32.inc
    include \masm32\include\kernel32.inc

  ; ------------------------------------------------
  ; Library files that have definitions for function
  ; exports and tested reliable prebuilt code.
  ; ------------------------------------------------
    includelib \masm32\lib\masm32.lib
    includelib \masm32\lib\gdi32.lib
    includelib \masm32\lib\user32.lib
    includelib \masm32\lib\kernel32.lib

    .code                       ; Tell MASM where the code starts

; -------------------------------------------------------------------------

start:                          ; The CODE entry point to the program

    print chr$("Hey, this actually works.",13,10)
    exit

; -------------------------------------------------------------------------

end start                       ; Tell MASM where the program ends

Batch File To Create A Console Program

@echo off

if not exist rsrc.rc goto over1
\masm32\bin\rc /v rsrc.rc
\masm32\bin\cvtres /machine:ix86 rsrc.res
:over1

if exist "hello.obj" del "hello.obj"
if exist "hello.exe" del "hello.exe"

\masm32\bin\ml /c /coff "hello.asm"
if errorlevel 1 goto errasm

if not exist rsrc.obj goto nores

\masm32\bin\Link /SUBSYSTEM:CONSOLE "hello.obj" rsrc.res
if errorlevel 1 goto errlink

dir "hello.*"
goto TheEnd

:nores
\masm32\bin\Link /SUBSYSTEM:CONSOLE "hello.obj"
if errorlevel 1 goto errlink
dir "hello.*"
goto TheEnd

:errlink
echo _
echo Link error
goto TheEnd

:errasm
echo _
echo Assembly Error
goto TheEnd

:TheEnd

pause

Windows Program (minimum.asm)

; #########################################################################

      .386
      .model flat, stdcall
      option casemap :none   ; case sensitive

; #########################################################################

      include \masm32\include\windows.inc
      include \masm32\include\user32.inc
      include \masm32\include\kernel32.inc

      includelib \masm32\lib\user32.lib
      includelib \masm32\lib\kernel32.lib

; #########################################################################

    .code

start:

    jmp @F
      szDlgTitle    db "Minimum MASM",0
      szMsg         db "  --- Assembler Pure and Simple ---  ",0
    @@:

    push MB_OK
    push offset szDlgTitle
    push offset szMsg
    push 0
    call MessageBox

    push 0
    call ExitProcess

    ; --------------------------------------------------------
    ; The following are the same function calls using MASM
    ; "invoke" syntax. It is clearer code, it is type checked
    ; against a function prototype and it is less error prone.
    ; --------------------------------------------------------

    ; invoke MessageBox,0,ADDR szMsg,ADDR szDlgTitle,MB_OK
    ; invoke ExitProcess,0

end start

Batch File To Create Windows Program

@echo off

if not exist rsrc.rc goto over1
\masm32\bin\rc /v rsrc.rc
\masm32\bin\cvtres /machine:ix86 rsrc.res
 :over1
 
if exist "minimum.obj" del "minimum.obj"
if exist "minimum.exe" del "minimum.exe"

\masm32\bin\ml /c /coff "minimum.asm"
if errorlevel 1 goto errasm

if not exist rsrc.obj goto nores

\masm32\bin\Link /SUBSYSTEM:WINDOWS "minimum.obj" rsrc.res
 if errorlevel 1 goto errlink

dir "minimum.*"
goto TheEnd

:nores
 \masm32\bin\Link /SUBSYSTEM:WINDOWS "minimum.obj"
 if errorlevel 1 goto errlink
dir "minimum.*"
goto TheEnd

:errlink
 echo _
echo Link error
goto TheEnd

:errasm
 echo _
echo Assembly Error
goto TheEnd

:TheEnd
 
pause