Monday, June 21, 2010

Step by step to MIPS assembly

In this article, I am going to introduce how to write MIPS assembly. All the examples can be assembled or compiled into actual executable programs and run on a MIPS machine running Linux. So, let's forget the simulator and come to the real world :-P. We start with the simplest possible program.


1         .data
2         .text
3         .global __start
4 __start:
5         li $4, 88
6         li $2, 4001
7         syscall
line 1: is an assembler directive which specifies that the data section starts here. It is empty since this example has no data.
line 2: specifies that the text section starts here.
line 3: makes __start globally visible to all other modules that are part of the executable, and to the GNU linker. The GNU linker requires the symbol __start to mark the first instruction to be executed in a program, so it must always be global (like _start on x86).

line 4: defines the label __start, the entry point of the program.
line 5: loads the immediate value 88 into $a0 (register $4).
line 6: loads the immediate value 4001 into $v0 (register $2).
line 7: traps into the OS to perform the system call.
line 5~7: together execute the system call whose number is stored in $v0. For the MIPS O32 ABI the numbers start from 4000, and 4001 is exit(). (See include/asm-mips/unistd.h in the Linux kernel source.) The status code is passed in $a0, and we can check it with 'echo $?'.
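
For comparison, the same exit system call can be issued from C. The sketch below is only a cross-check that can be run on any Linux host, not part of the MIPS example. It assumes a glibc-style syscall() wrapper and the SYS_exit constant from <sys/syscall.h> (whose numeric value differs between architectures), and the helper name exit_status_of is hypothetical. It forks a child that exits through the raw system call and returns the status the parent observes, which is the value 'echo $?' would print.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for syscall() on glibc */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the raw exit system call in a child process
 * and return the exit status seen by the parent (what `echo $?` shows).
 * Returns -1 if the child could not be created or did not exit normally. */
int exit_status_of(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */
    if (pid == 0) {
        /* Child: same idea as "li $4, code; li $2, 4001; syscall". */
        syscall(SYS_exit, (long)code);
        _exit(127);                     /* not reached */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling exit_status_of(88) corresponds to running the assembly program and then checking 'echo $?'.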

Assemble the source code with the cross assembler.
  $ mipsel-linux-uclibc-as -o exit1.o exit1.s
Link the object into an executable program.
  $ mipsel-linux-uclibc-ld -o exit1 exit1.o
Then, test the program.
  $ ./exit1
  $ echo $?
  88

OK. We have the skeleton of a basic assembly source file. Normally, we don't write something like 'li $2, 4001', which is unreadable. It is easier to use the definitions in header files and let the compiler driver do the preprocessing and linking. The same example calling exit() can be rewritten as follows.

 1 #include "asm/regdef.h"
 2 #include "asm/unistd.h"
 3 
 4         .data
 5         .text
 6         .global main
 7 main:
 8         .set noreorder
 9         .cpload        t9             # PIC ABI crap (function prologue)
10         .set reorder
11         li a0, 99
12         li v0, __NR_exit
13         syscall
line 1: we need the register name definitions.
line 2: we need the system call number definitions.
line 6~7: is just like main() in a C program. The toolchain's start-up code does the work between __start and main for us.
line 8~10: is the prologue of every function. Our program will be compiled as PIC (position-independent code) and linked with the C library, so we need the assembler to compute the correct value of the global pointer register ($gp). This is a must when we want to call into the dynamic C library. We are going to use the same prologue in the following examples.

Compile the source code with the cross compiler.
  $ mipsel-linux-uclibc-gcc -o exit2 exit2.S
Test the program.
  $ ./exit2
  $ echo $?
  99

Next, we can try making some calls in our program. The jal instruction lets us call a procedure. It automatically places the return address in $ra (GPR 31), so that we can easily return with 'jr ra'. The following example calculates the value of (2^3 + 5^2).

 1 # Illustrates the use of a function.
 2 # This program will compute the value of 2^3 + 5^2.
 3 
 4 #include "asm/regdef.h"
 5 #include "asm/unistd.h"
 6 
 7         .data
 8         .text
 9         .global main
10 main:
11         .set noreorder
12         .cpload        t9             # PIC ABI crap (function prologue)
13         .set reorder
14         subu sp, 4             # reserve 4 bytes on the stack
15         li a0, 2
16         li a1, 3
17         jal power              # calculate 2^3
18         nop
19         sw v0, 0(sp)           # save the value of 2^3 in the reserved slot
20         nop
21         li a0, 5
22         li a1, 2
23         jal power              # calculate 5^2
24         nop
25         lw a0, 0(sp)           # reload the value of 2^3
26         nop
27         addu sp, 4             # release the 4 bytes
28         add a0, v0             # add with 5^2
29         li v0, __NR_exit
30         syscall
31 
32         .type power, @function
33 power:
34         add v0, a0, zero       # v0 = a0
35         subu a1, 1
36         beqz a1, ans           # power is 1
37         nop
38 ploop:
39         mul v0, v0, a0         # v0 * a0
40         subu a1, 1             # decrease the power
41         bnez a1, ploop         # next power
42         nop
43 ans:
44         jr ra
45         nop
line 14: reserves 4 bytes on the stack for later use.
line 15~16: passes the arguments to the function power(). The first argument ($a0) is the base, and the second argument ($a1) is the exponent.
line 17: calls power().
line 18: is a nop for the branch delay slot of jal. (With .set reorder in effect the assembler fills delay slots on its own, so the explicit nop is redundant but harmless.)
line 19~20: when power() returns, the value of 2^3 is in $v0. We save it to the stack.
line 21~24: calls power() to calculate 5^2. The result is in $v0.
line 25~26: pops the saved value of 2^3 into $a0.
line 27: releases the stack space that we reserved.
line 28: adds $v0 (containing 5^2) to $a0 (containing 2^3).
line 29~30: calls exit(). The final answer is in $a0 and becomes the exit status.
line 32: marks the symbol power as a function name.
line 34: $v0 = $a0
line 35~37: if the exponent is equal to 1, the answer (in $v0) is equal to the base.
line 38~42: a loop that multiplies by the base (exponent - 1) times.
line 43~45: returns to the caller with the final answer in $v0.
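
To make the control flow of power() easier to follow, here is a C model that mirrors the assembly one instruction at a time. It is an illustrative sketch to be run on any host, not the code that runs on the target, and the function name power_model is hypothetical. Like the assembly, it assumes the exponent is at least 1: with an exponent of 0 the counter wraps around and the loop runs for about 2^32 iterations.

```c
#include <stdint.h>

/* Hypothetical C model of the assembly routine power().
 * base plays the role of a0, exponent of a1, result of v0.
 * Precondition (same as the assembly): exponent >= 1. */
uint32_t power_model(uint32_t base, uint32_t exponent)
{
    uint32_t result = base;        /* add v0, a0, zero */
    exponent -= 1;                 /* subu a1, 1       */
    if (exponent == 0)             /* beqz a1, ans     */
        return result;
    do {
        result *= base;            /* mul v0, v0, a0   */
        exponent -= 1;             /* subu a1, 1       */
    } while (exponent != 0);       /* bnez a1, ploop   */
    return result;                 /* ans: jr ra       */
}
```

With this model, power_model(2, 3) + power_model(5, 2) evaluates to 33, the value the assembly program reports as its exit status.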

Compile the source code with the cross compiler.
  $ mipsel-linux-uclibc-gcc -o power power.S
Check the value of 2^3 + 5^2.
  $ ./power
  $ echo $?
  33

Note that the previous example is a simple use of a function. It is recommended to follow the O32 ABI calling convention for better compatibility.

With reference to the previous example, you can write a hello-world program with the write system call. There is also an example provided on the Linux MIPS mailing list. How about a hello-world program that calls the C library's printf? Take a look at the following example.

 1 #include "asm/regdef.h"
 2 
 3         .globl main
 4 main:   .ent    main
 5         .frame sp, 32, ra
 6         .set   noreorder
 7         .cpload        t9
 8         .set   reorder
 9 
10         subu   sp, sp, 32
11         sw     ra, 28(sp)
12         .cprestore 24
13         la     a0, hello
14         jal    printf
15 
16         lw     ra, 28(sp)
17         addiu  sp, sp, 32
18         jr     ra
19         .end   main
20 
21         .data
22 hello:  .asciz  "Hello world\n"
line 4: marks the entry of main. The .ent pseudo-op is used by the debugger.
line 5: tells the assembler that the frame register is $sp, the frame size is 32 bytes, and the return register is $ra.
line 6~8: is the prologue described in the previous example.
line 10: reserves 32 bytes of stack. That includes space for local variables, saved registers, and the argument area. Please refer to the calling convention article for details.
line 11: saves the return address. When we call another function, $ra is overwritten, so it must be saved first.
line 12: saves $gp at offset 24 of the frame and makes the assembler reload $gp from that slot after each jal. Because the call may clobber $gp, it has to be restored afterwards. The idea is similar to saving $ra in line 11.
line 13: loads the address of the string "Hello world\n" into $a0 as the argument.
line 14: calls printf.
line 16: restores the return address.
line 17: releases the stack space that we reserved.
line 18: returns from main.
line 19: marks the end of main.
line 22: the "Hello world" string is stored in the data section.

Compile the source code with the cross compiler.
  $ mipsel-linux-uclibc-gcc -o printf printf.S
Test the hello-world program.
  $ ./printf
  Hello world
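
The write-based variant mentioned earlier needs no C library formatting at all. As a rough sketch in C, assuming a glibc-style syscall() wrapper and the SYS_write constant from <sys/syscall.h> (the helper name hello_via_write is hypothetical), the call takes the file descriptor, the buffer address, and the length. In an O32 assembly version those three values would go into $a0, $a1 and $a2, with __NR_write in $v0.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for syscall() on glibc */
#endif
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: print the greeting with the raw write system call.
 * Returns the number of bytes written, or -1 on error. */
long hello_via_write(int fd)
{
    static const char msg[] = "Hello world\n";
    /* Arguments map to a0 = fd, a1 = buffer, a2 = length on MIPS O32. */
    return syscall(SYS_write, (long)fd, msg, (long)strlen(msg));
}
```

Calling hello_via_write(1) prints the greeting on standard output.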

With these examples for reference, it should be easier to write assembly alongside C programs.

6 comments:

Laura said...

32 bytes or 32 bits?

Winfred said...

I think you're asking about lines 5 and 10 of the fourth example. Both are in bytes. When we access memory, it is usually byte aligned (or even word aligned).

Jure "JLP" Repinc said...

Thanks for this tutorial. I'm using it on an Emtec Gdium laptop with a MIPS CPU. To make some examples work I had to change asm/regdef.h into sys/regdef.h.

Сергей said...

Hello.
I think that you made a mistake with names of registers $4 and $2 in the first example. These registers have names $a0 and $v0 respectively. You use these right register names in your second example.

Сергей said...

I have a question about the third example. You reserve 4 bytes on the stack for a local variable (but you don't address it). In the next instructions you use the address 4(sp) for this local variable, but that is the address of the previous 4 bytes on the stack. Am I right?

Winfred said...

Yes. 4 bytes of stack space are reserved to store the value of 2^3. Thanks for correcting my typo in the alternate register names.