Why doesn't Linux follow Unix syscall conventions?

I'm teaching myself Linux assembly language and I've come across an interesting difference between BSD and Linux. In Unix, you push your syscall parameters onto the stack before calling an 80h interrupt; by contrast, in Linux, you pass your parameters in registers.
Does anyone know what the rationale was for the Linux developers to go with registers instead of the stack?
Note: The System Calls page of the FreeBSD Developer's Handbook details this difference well, but it does not explain the rationale.

The syscall convention is different because the standard function calling sequence is different. I'm assuming you're talking about the difference between the x86-32 calling convention and the AMD64 calling convention. You can check out the AMD64 ABI here.
But if you want to get to the point quickly check this post. Basically it's about speed. By changing the calling convention and using registers instead of the stack you can shave off instructions in the prologue and the epilogue of a call.
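To see the kernel side of this from C, here is a minimal sketch that assumes a Linux system with glibc. raw_write and raw_write_demo are made-up names for illustration. The generic syscall() wrapper takes its arguments by the ordinary C calling convention and then puts them where the kernel expects them before trapping, which on Linux means registers.

```c
#define _GNU_SOURCE          /* exposes syscall() in <unistd.h> on glibc */
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>     /* SYS_write */
#include <unistd.h>          /* syscall(), STDOUT_FILENO */

/* Hypothetical helper: invoke write(2) by number. The C library wrapper
 * moves fd, text and the length into the registers the kernel reads. */
long raw_write(int fd, const char *text)
{
    return syscall(SYS_write, fd, text, strlen(text));
}

/* Hypothetical self-check: on success the kernel returns the byte count. */
void raw_write_demo(void)
{
    assert(raw_write(STDOUT_FILENO, "hello\n") == 6);
}
```

Disassembling the C library's syscall() on your own machine shows the register shuffling for your particular architecture.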

You can use some registers with 32-bit code as well. There are several calling conventions for 32-bit code: cdecl, stdcall, pascal and fastcall. Windows and Linux use the same calling conventions for 32-bit code. With fastcall (__attribute__((fastcall)) in GCC) the first two integer parameters (three with some compilers) are passed in registers. The other calling conventions use the stack.
For 64-bit code Windows and Linux use different calling conventions. Linux can pass up to 14 arguments in registers (six integer and eight floating-point), whereas Windows passes only four. Using registers can make the code faster. That could be part of the reason some 64-bit code with many function calls runs on the order of 10% faster than the same 32-bit code.
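As a rough illustration of the fastcall attribute mentioned above, here is a sketch that assumes GCC or a compatible compiler. The FASTCALL macro and weighted_sum are names invented for the example. The attribute only means something on 32-bit x86, so the macro expands to nothing everywhere else.

```c
#include <assert.h>

/* fastcall exists only for 32-bit x86; elsewhere use the platform default. */
#if defined(__GNUC__) && defined(__i386__)
#  define FASTCALL __attribute__((fastcall))
#else
#  define FASTCALL
#endif

/* Hypothetical function. With GCC on i386, a and b arrive in ECX and EDX
 * and c is passed on the stack. */
FASTCALL int weighted_sum(int a, int b, int c)
{
    return a + 2 * b + 3 * c;
}

/* Hypothetical self-check: the result is the same whatever the convention. */
void weighted_sum_demo(void)
{
    assert(weighted_sum(1, 2, 3) == 14);
}
```

The C source of the caller does not change. Only the generated code for passing a and b differs.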

Related

Calling convention to use for max. portability between x86 systems

I am working on a set of self-contained x86 assembly routines that I would like to make available to C programs on systems below:
Linux 64-bit only
Windows 32-bit and 64-bit
(Good to have ultimately, Mac 64-bit, but this is not clear as Apple appear to be on their way to drop x86 in favour of ARM)
I use LLVM in some other capacity already and it is almost certain that I would use clang rather than gcc although I can envisage a situation of someone's wanting to compile the whole of it using gcc. The assembler will be NASM.
I develop both the routines and a C library that exposes them to users, i.e. everything is under my control and I can design everything as needed.
I expect that some users will actually use C++ but they will still link to the C library - that is, not with the assembly routines directly.
As I am new to assembly, I am in the process of discovering a wonderful maze of various calling conventions spread across systems, compilers, vendors, calling variants and languages. I cannot say that it does not make for interesting reads sometimes but I cannot say either that it is not confusing to beginners.
My take after reading up on it all is that at the end of the day I can simply start with cdecl for maximum portability in the initial version and then think about special casing to cover other conventions if needs arise - depending on what the routines actually do I may make things faster by using other conventions in specific cases.
But initially, as I would like to have something that works correctly and then optimise it even further - is it correct to say that settling on cdecl will offer maximum portability across the systems that I listed? Thank you.
x86-64 Linux and MacOS both use the x86-64 System V ABI. Windows uses its own calling convention. None of these x86-64 platforms call it "cdecl".
The normal approach is for your library to use the standard calling convention for the target platform, which means different asm for each one. One way to handle this is with asm macros to adapt the tops of your functions for different calling conventions. Or to parameterize register names like ARG1 instead of hard-coding RDI, but that gets very complicated very fast if your functions are more than trivial pointer increments, or if you ever use a register for something other than a function arg.
On 32-bit Windows you have a choice of multiple conventions; fastcall / vectorcall are the two that suck the least. On every other x86 32 and 64-bit platform, there's one standard calling convention. It'll be easier for people to use your library if you follow it.
Agner Fog's calling convention guide has some more detailed suggestions for dealing with portability of hand-written asm. https://www.agner.org/optimize/
You could in theory use x86-64 System V everywhere, but then on Windows MSVC would be unable to emit calls to your code. (GNU C compatible compilers like gcc, clang, and ICC could use __attribute__((sysv_abi)) in the prototypes on Windows where their default calling convention is what MS names x64 fastcall).
I guess you could use x86-64 fastcall everywhere and use __attribute__((ms_abi)) in your prototypes for non-MSVC compilers. But that may cost some performance overhead, especially if you want to use all the XMM regs. (xmm6..15 are call-preserved in x64 fastcall). But beware of compiler bugs; using non-default calling conventions is not nearly as well tested.
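A sketch of what the "one convention everywhere" prototypes might look like, assuming a GNU C compatible compiler on x86-64. MYLIB_ABI and mylib_sum4 are hypothetical names, and the C body only stands in for the hand-written asm routine. MSVC would see an empty macro, since x64 fastcall is already its default.

```c
#include <assert.h>

/* Pin the Windows x64 convention on GNU C compatible x86-64 compilers. */
#if defined(__GNUC__) && defined(__x86_64__)
#  define MYLIB_ABI __attribute__((ms_abi))
#else
#  define MYLIB_ABI
#endif

/* Prototype as it might appear in the library header. */
MYLIB_ABI long mylib_sum4(long a, long b, long c, long d);

/* C stand-in for the asm routine: with ms_abi the four arguments arrive
 * in RCX, RDX, R8 and R9 instead of RDI, RSI, RDX and RCX. */
MYLIB_ABI long mylib_sum4(long a, long b, long c, long d)
{
    return a + b + c + d;
}

/* Hypothetical self-check: callers need no source changes. */
void mylib_sum4_demo(void)
{
    assert(mylib_sum4(1, 2, 3, 4) == 10);
}
```

Swapping ms_abi for sysv_abi gives the opposite arrangement, with the same caveat about less-tested compiler paths.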
If all your functions have 4 or fewer total register args, it's not too bad a calling convention in most respects. Otherwise more register args are usually more efficient. See also: Why does Windows64 use a different calling convention from all other OSes on x86-64?
32-bit and 64-bit are obviously vastly different; none of the standard calling conventions are compatible between 32 and 64-bit code, and your code will usually need to be pretty different anyway.
The only real similarity is between 32-bit Windows fastcall and the standard 64-bit Windows calling convention (which MS also calls fastcall), but 32-bit fastcall only passes the first 2 args in regs, and is callee-pops stack args. 64-bit fastcall passes the first 4 args in regs, starting with the same 2 but then using r8 and r9 which only exist in 64-bit mode.

Overflowing Buffers on the Stack

I'm reading the Shellcoder's Handbook and trying to follow along with their simple example of overflowing buffers on the stack, but I'm stuck.
I'm running GCC on Windows, and before a function call, instead of pushing on the stack like the book says it should, it just moves the values into registers and then makes the call. The book is running Linux, I'd assume. Does it use a different calling method than Windows? How would I get the Linux behavior?
Also, when a program accepts user input, how do I input data into the program such that it shows in gdb?
It looks like your book is assuming the cdecl calling convention on an IA32 platform, but that your compiler is using a different calling convention that puts parameters in registers. Are you using an AMD64 platform by any chance? The standard for AMD64 is to put the first n arguments in registers and only additional arguments on the stack (Windows only uses four registers for parameters; every other common platform uses six).
More information on calling conventions: https://en.wikipedia.org/wiki/X86_calling_conventions
If you add a bunch of additional function parameters before the ones that you care about, you should get the last ones on the stack. Alternately, if you compile as 32-bit instead of 64-bit, you might get what you're looking for.
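The padding trick could look roughly like this. last_two and its parameter names are made up for the example. Under the six-register AMD64 convention the first six integer arguments use up the registers, so the last two should be passed on the stack. On 64-bit Windows everything after the fourth argument is.

```c
#include <assert.h>

/* Hypothetical function: p1..p6 exist only to use up the argument
 * registers, leaving the last two parameters for the stack. */
long last_two(long p1, long p2, long p3, long p4, long p5, long p6,
              long on_stack_1, long on_stack_2)
{
    (void)p1; (void)p2; (void)p3; (void)p4; (void)p5; (void)p6;
    return on_stack_1 * 10 + on_stack_2;
}

/* Hypothetical self-check for the arithmetic. */
void last_two_demo(void)
{
    assert(last_two(0, 0, 0, 0, 0, 0, 4, 2) == 42);
}
```

Compile with gcc -O0 -S and look at the call site to check where each argument really goes on your setup. With optimization enabled the call may be inlined away.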

Assuming a calling convention when combining C and x86 Assembly

I have some assembly routines that are called by and take arguments from C functions. Right now, I'm assuming those arguments are passed on the stack in cdecl order. Is that a fair assumption to make?
Would a compiler (GCC) detect this and make sure the arguments are passed correctly, or should I manually go and declare them cdecl? If so, will that attribute still hold if I specify a higher optimisation level?
Calling conventions mean much more than just argument ordering. There is a good pdf explaining all the details, written by Agner Fog: Calling conventions for different C++ compilers and operating systems.
This is a matter of the ABI for the platform you're writing code for. Almost all platforms follow the Unix System V ABI for C calling convention and other ABI issues, which includes both a general ABI (gABI) document detailing the common ABI characteristics across all CPU architectures, and a processor-specific ABI (psABI) document specific to the particular CPU architecture/family. When it comes to x86, this matches what you refer to as "cdecl". So from a practical standpoint, x86 assembly meant to be called from C should be written to assume "cdecl". Basically the only exception to the universality of this calling convention is Windows API functions, which use their own nonstandard "stdcall" calling convention due to legacy Win16 dll thunk compatibility issues; nonetheless, the "default" calling convention on x86 Windows is still "cdecl".
A more important concern when writing asm to be called from C is whether symbol names should be prefixed with an underscore or not. This varies widely between platforms, with the general trend being that ELF-based platforms don't use the prefix, and most other platforms do...
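If you build with GCC or clang, one way to find out what prefix your target uses is the predefined macro __USER_LABEL_PREFIX__. A small sketch follows. c_symbol_prefix and the STRINGIFY helpers are names invented here, and the fallback for other compilers is only an assumption.

```c
#include <assert.h>
#include <string.h>

#define STRINGIFY_RAW(x) #x
#define STRINGIFY(x) STRINGIFY_RAW(x)

/* Hypothetical helper: returns the prefix the compiler puts in front of
 * C symbol names, "" on typical ELF targets and "_" on many others. */
const char *c_symbol_prefix(void)
{
#ifdef __USER_LABEL_PREFIX__
    return STRINGIFY(__USER_LABEL_PREFIX__);
#else
    return "";   /* assumption: unknown compiler, guess no prefix */
#endif
}

/* Hypothetical self-check: the prefix is empty or a single character. */
void c_symbol_prefix_demo(void)
{
    assert(strlen(c_symbol_prefix()) <= 1);
}
```

The asm side then has to declare its global labels with the same prefix.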
The quick and dirty way to do it is create a dummy C function that matches the asm function you want to implement, do a few things in the dummy C function with the passed in parameters so you can tell them apart, compile then disassemble. Not foolproof but works often.
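A dummy function for this could be as small as the sketch below. abi_probe and the seen_* globals are hypothetical names. Storing each parameter into its own volatile global keeps the stores in the output and makes it easy to see which register or stack slot each one comes from.

```c
#include <assert.h>

/* volatile so the compiler cannot drop or merge the stores */
volatile int seen_first, seen_second, seen_third;

/* Hypothetical probe with the same signature as the planned asm routine.
 * Each parameter ends up in its own global, so the disassembly shows
 * one clearly separate store per argument. */
void abi_probe(int first, int second, int third)
{
    seen_first  = first;
    seen_second = second;
    seen_third  = third;
}

/* Hypothetical self-check that the probe behaves as plain C. */
void abi_probe_demo(void)
{
    abi_probe(1, 2, 3);
    assert(seen_first == 1 && seen_second == 2 && seen_third == 3);
}
```

Something like gcc -O1 -S probe.c then leaves the assembly in probe.s for inspection.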

C register calling conventions

Where can I find documentation about which registers assembly code must preserve when a C function is called?
What you want is your system's C Application Binary Interface. Google for "C ABI" and your architecture, and you'll find it. For example, here is one for sparc and here is the relevant bit for AVRs.
This is called the ABI (Application Binary Interface). Where do you find it? Depends on your architecture and operating system.
For example: Google for ABI x86_64 linux if you want to find the calling conventions for a 64-bit Linux system.
Dr. Agner Fog's optimization manuals contain a nifty side-by-side listing of all the common system and compiler conventions (ABIs), for both 32 and 64 bits. They also contain a lot of other useful information. You can get them here: http://www.agner.org/optimize/
There are a couple of calling conventions in use, but the most common is CDECL. The arguments are pushed on the stack in the order described by that link, and the "scratch" registers available to the callee are %eax, %ecx, and %edx. Any other register the callee uses must be preserved, typically by saving it on the stack and restoring it before returning.
But, as other people have pointed out, this is only one of many conventions. Check the documentation for the system you're programming for.
Take a look at these links:
Using Win32 calling conventions
Intel x86 Function-call Conventions - Assembly View
Hope this helps.
It's pretty much architecture specific. Have a look at wikipedia's explanation for starters.
http://en.wikipedia.org/wiki/Calling_convention

Does C have a standard ABI?

From a discussion somewhere else:
C++ has no standard ABI (Application Binary Interface)
But neither does C, right?
On any given platform it pretty much does. It wouldn't be useful as the lingua franca for inter-language communication if it lacked one.
What's your take on this?
C defines no ABI. In fact, it bends over backwards to avoid defining an ABI. Those who, like me, have spent most of their programming lives programming in C on 16/32/64 bit architectures with 8 bit bytes, 2's complement arithmetic and flat address spaces will usually be quite surprised on reading the convoluted language of the current C standard.
For example, read the stuff about pointers. The standard doesn't say anything so simple as "a pointer is an address" for that would be making an assumption about the ABI. In particular, it allows for pointers being in different address spaces and having varying width.
An ABI is a mapping from the execution model of the language to a particular machine/operating system/compiler combination. It makes no sense to define one in the language specification because that runs the risk of excluding C implementations on some architectures.
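To make that concrete, here is a small sketch of things the language leaves to the implementation. struct sample and the function names are invented for the example. The asserts check only what the standard guarantees. The actual numbers come from the ABI of whatever platform you compile on.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical struct: padding after tag is decided by the ABI. */
struct sample {
    char tag;
    long value;
};

/* Where the ABI placed value inside the struct. */
size_t sample_value_offset(void)
{
    return offsetof(struct sample, value);
}

/* Width of a data pointer in bits on this implementation. */
size_t data_pointer_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}

/* Hypothetical self-check limited to what the language promises. */
void layout_demo(void)
{
    assert(sample_value_offset() >= sizeof(char));
    assert(sizeof(struct sample) >= sample_value_offset() + sizeof(long));
    assert(data_pointer_bits() >= CHAR_BIT);
}
```

Printing these values on two different targets is a quick way to see two ABIs disagree about the same source.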
C has no standard ABI in principle, but in practice, this rarely matters: You do what your OS-vendor does.
Take the calling conventions on x86 Windows, for example: The Windows API uses the so-called 'standard' calling convention (stdcall). Thus, any compiler which wants to interface with the OS needs to implement it. However, stdcall doesn't support all C90 language features (eg calling functions without prototypes, variadic functions). As Microsoft provided a C compiler, a second calling convention was necessary, called the 'C' calling convention (cdecl). Most C compilers on Windows use this as their default calling convention, and thus are interoperable.
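For illustration, this is the kind of function that needs a caller-cleans-up convention such as cdecl: only the caller knows how many arguments it passed. sum_ints is a hypothetical name, and the sketch is plain standard C.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic function: adds up count int arguments. The callee
 * cannot know the total argument size in advance, so it cannot be the one
 * that pops the arguments off the stack. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}

/* Hypothetical self-check with different argument counts. */
void sum_ints_demo(void)
{
    assert(sum_ints(3, 1, 2, 3) == 6);
    assert(sum_ints(0) == 0);
}
```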
In principle, the same could have happened with C++, but as the C++ ABI (including the calling convention) is necessarily far more elaborate, compiler vendors did not agree on a single ABI, but could still interoperate by falling back to extern "C".
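The usual way to set up that fallback in a header shared by C and C++ looks something like this sketch. mylib_add is a made-up function. A C compiler skips both guarded lines, while a C++ compiler sees the extern "C" block and uses the platform's C linkage and naming for the declarations inside it.

```c
#include <assert.h>

#ifdef __cplusplus
extern "C" {          /* only seen by C++ compilers */
#endif

/* Hypothetical library function exported with C linkage. */
int mylib_add(int a, int b);

#ifdef __cplusplus
}
#endif

/* Definition, normally in a separate C source file. */
int mylib_add(int a, int b)
{
    return a + b;
}

/* Hypothetical self-check. */
void mylib_add_demo(void)
{
    assert(mylib_add(2, 3) == 5);
}
```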
The ABI for C is platform specific - it covers issues such as register allocation and calling conventions, which are obviously specific to a particular processor. Here are some examples:
The ARM ABI (includes C++)
The PowerPC Embedded ABI
The several ABIs of x86
x86 has had many calling conventions, with extensions under Windows to declare which one is used. Platform ABIs for embedded Linux have also changed over time, leading to incompatible user space. See some history of the ARM Linux port here, which shows the problems in the transition to a newer ABI.
"Although several attempts have been made at defining a single ABI for a given architecture across multiple operating systems (Particularly for i386 on Unix Systems), the efforts have not met with such success. Instead, operating systems tend to define their own ABIs ..."
Quoting ... Linux System Programming page 4.
An ABI, even for C, has parts which are quite platform independent, parts which depend on the processor (which registers should be saved, which are used for passing parameters, ...) and parts which depend on the OS. The OS-dependent parts involve more or less the same factors as the processor-dependent ones, since some choices are not imposed by the architecture but are the result of trade-offs. In addition, some OSes have a language-independent notion of exceptions, so a compiler for any language has to generate the right thing to handle those. Handling of threads may also impose things on the ABI: if a register points to TLS, you can't use it for anything else.
In theory, every compiler may have its own ABI. But usually, for a given processor/OS pair, the ABI is fixed by the OS vendor, which often also provides a C compiler and common libraries that use that ABI, and competitors prefer to be compatible. (I wouldn't be surprised if there are exceptions for some OSes for which C isn't a major programming language.)
But the OS vendor may switch ABIs for one reason or another. For one, new versions of processors may have features that you want to use in the ABI; for instance, some have asked for a 32-bit ABI for x86_64 that allows use of all the registers. During the migration phase, which may last a very long time, you may have to handle two ABIs.
"neither does C, right?" Right.
"On any given platform it pretty much does. It wouldn't be useful as the lingua franca for inter-language communication if it lacked one." "Pretty much" might refer to architecture-specific defaults chosen by C compiler vendors being adopted by other languages. So if Keil's ARM C compiler uses left-to-right, little-endian parameter ordering, the stack to pass arguments, and some predetermined register for the return value, then extern "C" from other compilers will assume compatibility with that scheme.
While such an agreement may be considered part of an ABI, unlike a managed execution context such as the JVM or a browser sandbox, it is far from being a complete standard ABI by itself.
C does not have a standard ABI. This is easily illustrated by all the calling conventions (cdecl, fastcall and stdcall) that are used out there. Each is a different ABI.
There's no standard ABI because C has always been about maximum runtime performance and the ABI with the highest performance depends on the underlying hardware. As a result, the ABI may use only stack or prefer registers for passing function call arguments and return values as needed for any given hardware.
For example, even amd64 (a.k.a. x86-64) has two calling conventions: Microsoft x64 and the System V AMD64 ABI. The former passes the first 4 arguments in registers and the rest on the stack. The latter passes the first 6 arguments in registers and the rest on the stack. I have no idea why Microsoft created a non-compatible calling convention for amd64 hardware. For all I know, the Microsoft variant has slightly worse performance and was created later.
For more information, see https://en.wikipedia.org/wiki/X86_calling_conventions
Prior to the C89 Standard, C compilers for many platforms used essentially the same ABI, save for variations in data sizes. For machines whose stack grows downward, code which calls a function would push the arguments on the stack in order from right to left and then call the function (pushing the return address in the process). A called function would leave its arguments on the stack, and the caller would at its leisure adjust the stack pointer to remove them [or, on some architectures, might adjust the stacked values in place]. While <stdarg.h> made it unnecessary for most programs to rely upon that convention, it remained in use for many years because it was simple and worked pretty well. While there was no "official" document establishing that as a cross-platform "standard", most compilers targeting machines with downward-growing stacks worked that way, leading to a greater level of consistency than exists today.

Resources