Many calling conventions exist for 32-bit x86 code. Three are in wide use: __cdecl, __stdcall, and __fastcall. We are going to make the compiler generate assembly for each of them in order to see how they work.
Here is a simple function in C:
int f(int a, int b) { return a + b; }
GCC uses __cdecl by default (it is also the default calling convention of the Microsoft C/C++ compiler), so marking a function as __cdecl explicitly gives the same result as compiling it without any extra compiler options or configuration.
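For anyone who wants to reproduce the listings in this post, here is a minimal sketch of one way to set it up. The CC_* macros and the f_* function names are my own illustrative names, not GCC features; they apply the attribute only when compiling for 32-bit x86 and expand to nothing elsewhere, where GCC ignores these attributes. The exact assembly you get will depend on your GCC version and flags, so treat the listings as representative rather than exact.

```c
#include <assert.h>

/* Hypothetical helper macros (illustrative, not part of GCC):
 * use the attribute on 32-bit x86 only, nothing on other targets. */
#if defined(__GNUC__) && defined(__i386__)
#  define CC_CDECL    __attribute__((cdecl))
#  define CC_STDCALL  __attribute__((stdcall))
#  define CC_FASTCALL __attribute__((fastcall))
#else
#  define CC_CDECL
#  define CC_STDCALL
#  define CC_FASTCALL
#endif

/* The same body under each of the three conventions.
 * To look at the assembly on a 32-bit capable toolchain, something like
 *     gcc -m32 -O0 -S conv.c
 * should work (conv.c is a placeholder file name). */
int CC_CDECL    f_cdecl(int a, int b)    { return a + b; }
int CC_STDCALL  f_stdcall(int a, int b)  { return a + b; }
int CC_FASTCALL f_fastcall(int a, int b) { return a + b; }

/* The convention changes how arguments travel, never the result. */
void check_same_result(void)
{
    assert(f_cdecl(3, 5) == 8);
    assert(f_stdcall(3, 5) == 8);
    assert(f_fastcall(3, 5) == 8);
}
```

The point of the guard is only to keep the file compiling cleanly on non-x86 or 64-bit targets; the interesting differences show up in the 32-bit assembly.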
Let's call the function using the __cdecl calling convention:
int __attribute__((cdecl)) f(int a, int b) { return a + b; }
void caller(void) { f(3, 5); }
The assembly generated by gcc for __cdecl would be:
1 f:
2 pushl %ebp
3 movl %esp, %ebp
4 movl 12(%ebp), %eax
5 addl 8(%ebp), %eax
6 popl %ebp
7 ret
8 caller:
9 pushl %ebp
10 movl %esp, %ebp
11 subl $8, %esp
12 movl $5, 4(%esp)
13 movl $3, (%esp)
14 call f
15 leave
16 ret
Calling a __cdecl function line by line:
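Place the arguments on the stack from right to left, so that the first argument ends up closest to the top of the stack. GCC does not use push instructions here; it stores into stack space it has already reserved. In this example, 5 is stored first, at 4(%esp), and then 3, at (%esp).
12 movl $5, 4(%esp)
13 movl $3, (%esp)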
Call the function f. The processor pushes the address of the next instruction (the current content of %eip) onto the stack as the return address, then jumps to f.
14 call f
Now that we are inside f, a new local stack frame is needed. We push the current %ebp (which belongs to the caller's frame) onto the stack, then set %ebp to the current top of the stack (%esp).
2 pushl %ebp
3 movl %esp, %ebp
Once %ebp is set, the arguments of f are addressed as 8(%ebp) and 12(%ebp). They are summed into %eax, which carries the return value back to the caller. Note that 0(%ebp) holds the old (caller's) base pointer, and 4(%ebp) holds the return address (the old instruction pointer).
4 movl 12(%ebp), %eax
5 addl 8(%ebp), %eax
Restore the saved base pointer %ebp.
6 popl %ebp
Return from the function. The ret instruction pops the return address from the stack into %eip, so execution continues in the caller.
7 ret
Note that lines 9 and 10 are identical to lines 2 and 3. This is the function prologue used by every function. Line 11 reserves stack space for the arguments of function f. Lines 9 to 11 together do the same job as the instruction enter $8, $0. The compiler uses three instructions instead of ENTER for performance.
9 pushl %ebp
10 movl %esp, %ebp
11 subl $8, %esp
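The LEAVE instruction sets %esp to %ebp and then pops %ebp. It is the function epilogue. Function f never moves %esp after its prologue, so it does not need to set %esp to %ebp; popping %ebp is enough. In caller, restoring %esp also discards the 8 bytes reserved for the arguments, which is the caller's stack cleanup. LEAVE is a single one-byte instruction, and the compiler uses it in place of the equivalent two-instruction sequence.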
15 leave
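The walk-through above can be replayed on a toy model of the stack. This is only a sketch under my own assumptions: the Machine struct, the helper names, and the placeholder return address 0x1234 are all made up for illustration, and the model tracks only %esp, %ebp, %eax and a small word-addressed memory. The comments map each step to the listing's line numbers.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 64 };
#define RETURN_ADDR 0x1234u /* placeholder; a real call pushes the next %eip */

/* Illustrative model, not a real emulator. */
typedef struct {
    uint32_t mem[STACK_WORDS]; /* mem[i] is the word at byte address 4*i */
    uint32_t esp, ebp;
} Machine;

static void store32(Machine *m, uint32_t addr, uint32_t v) { m->mem[addr / 4] = v; }
static uint32_t load32(const Machine *m, uint32_t addr) { return m->mem[addr / 4]; }
static void push32(Machine *m, uint32_t v) { m->esp -= 4; store32(m, m->esp, v); }
static uint32_t pop32(Machine *m) { uint32_t v = load32(m, m->esp); m->esp += 4; return v; }

/* enter $size, $0  ==  pushl %ebp; movl %esp, %ebp; subl $size, %esp */
static void enter(Machine *m, uint32_t size)
{
    push32(m, m->ebp);
    m->ebp = m->esp;
    m->esp -= size;
}

/* leave  ==  movl %ebp, %esp; popl %ebp */
static void leave(Machine *m)
{
    m->esp = m->ebp;
    m->ebp = pop32(m);
}

/* Replays caller and f from the __cdecl listing and returns %eax. */
uint32_t model_cdecl_call(uint32_t a, uint32_t b)
{
    Machine m = { {0}, 4 * STACK_WORDS, 4 * STACK_WORDS };
    const uint32_t esp0 = m.esp;
    const uint32_t ebp0 = m.ebp;

    enter(&m, 8);                 /* caller, lines 9-11 */
    store32(&m, m.esp + 4, b);    /* line 12 */
    store32(&m, m.esp, a);        /* line 13 */
    push32(&m, RETURN_ADDR);      /* line 14: call f */

    enter(&m, 0);                 /* f, lines 2-3 */
    assert(load32(&m, m.ebp + 4) == RETURN_ADDR); /* return address */
    assert(load32(&m, m.ebp + 8) == a);           /* first argument */
    assert(load32(&m, m.ebp + 12) == b);          /* second argument */

    uint32_t eax = load32(&m, m.ebp + 12);        /* line 4 */
    eax += load32(&m, m.ebp + 8);                 /* line 5 */
    m.ebp = pop32(&m);                            /* line 6 */
    uint32_t ret_to = pop32(&m);                  /* line 7: ret */
    assert(ret_to == RETURN_ADDR);

    leave(&m);                    /* caller, line 15: also drops the args */
    assert(m.esp == esp0 && m.ebp == ebp0);       /* stack is balanced */
    return eax;
}
```

Calling model_cdecl_call(3, 5) walks through the same steps as the listing; the assertions inside check the frame layout described above, including that the caller's leave is what finally releases the argument slots.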
As we can see, the main characteristics of the __cdecl calling convention are:
- Arguments are passed from right to left, and placed on the stack.
- Stack cleanup is performed by the caller.
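One consequence of the right-to-left order is that the first argument always sits at the same offset from %ebp, no matter how many arguments follow it. The sketch below checks this on a small array used as a stack. The function names and the placeholder return address are my own, and it assumes every argument occupies one 4-byte slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_WORDS = 32 };

/* Byte offset from %ebp of argument i (0-based) in a __cdecl frame,
 * assuming 4-byte arguments: 8(%ebp), 12(%ebp), 16(%ebp), ... */
size_t cdecl_arg_offset(size_t i)
{
    return 8 + 4 * i;
}

/* Pushes the arguments right to left, then a return address and the
 * saved %ebp, and asserts that argument i lands at cdecl_arg_offset(i). */
void check_cdecl_layout(const uint32_t *args, size_t n)
{
    uint32_t stack[MAX_WORDS];
    size_t sp = MAX_WORDS;          /* word index of the top; grows down */

    assert(n + 2 <= MAX_WORDS);
    for (size_t i = n; i > 0; i--)
        stack[--sp] = args[i - 1];  /* last argument first */
    stack[--sp] = 0xdeadbeefu;      /* return address (placeholder value) */
    stack[--sp] = 0;                /* saved caller %ebp */
    size_t bp = sp;                 /* movl %esp, %ebp */

    for (size_t i = 0; i < n; i++)
        assert(stack[bp + cdecl_arg_offset(i) / 4] == args[i]);
}
```

For the two-argument example, this gives 8(%ebp) for a and 12(%ebp) for b, matching lines 4 and 5 of the listing.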
How about calling the function with __stdcall?
int __attribute__((stdcall)) f(int a, int b) { return a + b; }
void test(void) { f(3, 5); }
The assembly for __stdcall would be:
1 f:
2 pushl %ebp
3 movl %esp, %ebp
4 movl 12(%ebp), %eax
5 addl 8(%ebp), %eax
6 popl %ebp
7 ret $8
8 test:
9 pushl %ebp
10 movl %esp, %ebp
11 subl $8, %esp
12 movl $5, 4(%esp)
13 movl $3, (%esp)
14 call f
15 subl $8, %esp
16 leave
17 ret
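The only difference from __cdecl in function f is that it returns with "ret $8" (line 7), which pops the return address and then removes the 8 bytes of arguments from the stack, so the callee cleans up after itself. When control comes back to test, %esp is therefore 8 bytes higher than it was before the call. GCC emits "subl $8, %esp" (line 15) to move %esp back down and re-reserve the argument area of the frame. In this small example the leave instruction that follows would restore %esp from %ebp anyway.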
As we can see, the main characteristics of the __stdcall calling convention are:
- Arguments are passed from right to left, and placed on the stack.
- Stack cleanup is performed by the called function.
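The effect of "ret" versus "ret $8" on %esp is plain arithmetic, sketched below. The function names and the example address 0x1000 are illustrative assumptions of mine, and the model assumes the callee has left %esp balanced by the time it reaches its ret.

```c
#include <assert.h>
#include <stdint.h>

/* %esp after a near "ret $imm": the 4-byte return address is popped,
 * then imm more bytes are discarded. Plain "ret" is the imm == 0 case. */
uint32_t esp_after_ret(uint32_t esp_at_ret, uint32_t imm)
{
    return esp_at_ret + 4 + imm;
}

/* How far %esp has moved, as seen by the caller, between the moment
 * just before "call" and the moment just after the callee returns. */
int32_t caller_esp_drift(uint32_t esp_before_call, uint32_t ret_imm)
{
    uint32_t esp_at_ret = esp_before_call - 4; /* call pushed the return address */
    return (int32_t)(esp_after_ret(esp_at_ret, ret_imm) - esp_before_call);
}

void check_ret_models(void)
{
    /* __cdecl: f ends with "ret", the arguments are still on the stack
     * and the caller is responsible for them. */
    assert(caller_esp_drift(0x1000, 0) == 0);

    /* __stdcall: f ends with "ret $8", the 8 bytes of arguments are gone,
     * which is the shift that "subl $8, %esp" on line 15 undoes. */
    assert(caller_esp_drift(0x1000, 8) == 8);
}
```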
And how about calling the function with __fastcall?
int __attribute__((fastcall)) f(int a, int b) { return a + b; }
void test(void) { f(3, 5); }
The assembly generated by gcc for __fastcall would be:
1 f:
2 pushl %ebp
3 movl %esp, %ebp
4 subl $8, %esp
5 movl %ecx, -4(%ebp)
6 movl %edx, -8(%ebp)
7 movl -8(%ebp), %eax
8 addl -4(%ebp), %eax
9 leave
10 ret
11 test:
12 pushl %ebp
13 movl %esp, %ebp
14 movl $5, %edx
15 movl $3, %ecx
16 call f
17 popl %ebp
18 ret
__fastcall indicates that the arguments should be placed in registers, rather than on the stack, whenever possible. The first argument, 3, is placed in %ecx (line 15) and the second argument, 5, in %edx (line 14). Function f copies both arguments into its own stack frame (lines 5 and 6) and computes the sum from those copies.
What happens when a __fastcall function takes more than two arguments?
int __attribute__((fastcall)) f(int a, int b, int c) { return a + b + c; }
void test(void) { f(3, 5, 7); }

1 f:
2 pushl %ebp
3 movl %esp, %ebp
4 subl $8, %esp
5 movl %ecx, -4(%ebp)
6 movl %edx, -8(%ebp)
7 movl -8(%ebp), %eax
8 addl -4(%ebp), %eax
9 addl 8(%ebp), %eax
10 leave
11 ret $4
12 test:
13 pushl %ebp
14 movl %esp, %ebp
15 subl $4, %esp
16 movl $7, (%esp)
17 movl $5, %edx
18 movl $3, %ecx
19 call f
20 subl $4, %esp
21 leave
22 ret
The third argument, 7, is stored on the stack (line 16), and function f uses "ret $4" (line 11) to clean it off the stack. Therefore, we may conclude that the main characteristics of the __fastcall calling convention are:
- The first two function arguments that require 32 bits or less are placed into registers ECX and EDX.
- The rest of them are pushed on the stack from right to left.
- If any stack-based arguments were present, the callee cleans them off the stack.
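These three rules can be written down as a small helper, shown below as a sketch. The enum and function names are my own, and the code assumes every argument is a 32-bit integer, which is the only case examined in this post.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative names, not from any compiler header. */
typedef enum { IN_ECX, IN_EDX, ON_STACK } ArgLocation;

/* Where argument i (0-based) of a __fastcall function travels,
 * assuming all arguments are 32-bit integers. */
ArgLocation fastcall_location(size_t i)
{
    if (i == 0) return IN_ECX;
    if (i == 1) return IN_EDX;
    return ON_STACK;
}

/* Immediate of the callee's "ret $N": 4 bytes per stack-passed argument,
 * or 0 (a plain "ret") when everything fits in registers. */
size_t fastcall_ret_imm(size_t nargs)
{
    return nargs > 2 ? 4 * (nargs - 2) : 0;
}

void check_fastcall_model(void)
{
    assert(fastcall_location(0) == IN_ECX);   /* 3 in %ecx */
    assert(fastcall_location(1) == IN_EDX);   /* 5 in %edx */
    assert(fastcall_location(2) == ON_STACK); /* 7 on the stack */
    assert(fastcall_ret_imm(2) == 0);         /* two-argument f: "ret" */
    assert(fastcall_ret_imm(3) == 4);         /* three-argument f: "ret $4" */
}
```

Both results agree with the two listings: the two-argument f ends with a plain ret, and the three-argument f ends with ret $4.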