I have a question regarding the assembler. I was thinking of how the C function that takes multiple parameters as an argument is transformed into assembly. So my question is, is there a subroutine in assembly that takes arguments as a parameter to operate?
The code might look something like this:
Call label1, R16.
Where R16 is the subroutine input parameter.
If that's not the case then that means that EACH time the C function is called, it gets assembled into a subroutine with the parameters related to the specific call being substituted automatically in it. That basically means that whenever a C function is called, the compiler transforms it into an inline function which am sure is not the case either :D
So which is right?
Thanks alot! :)
The compiler uses a "calling convention" which can be specific to that one compiler for that one target architecture (x86, arm, mips, pdp-11, etc). For architectures with "plenty" of general purpose registers, the calling convention often starts with passing parameters in registers, and then uses the stack, for architectures with not a lot of registers the stack is primarily if not completely used for parameter passing and the return.
The calling convention is a set of rules, such that if everyone follows the rules you can compile functions into objects and link them with other objects and they will be able to call each others functions or call themselves.
So it is a bit of a hybrid of what you were assuming. The code built for that function is in some respects custom to that function as the number and type of parameters dictate what registers or how much stack is consumed and how. At the same time all functions conform to the same formula so they look more alike than different.
On an arm for example you might have three integers being passed in to a function, they would for all the arm calling conventions I have seen (generally you find that even though it could vary across compilers it often doesnt or in the case of arm and mips and some others they try to dictate the convention for everyone rather than the compiler folks trying to do it) the first parameter in the C function would come in in r0, the second in r1 and third in r2. If the first parameter were a 64 bit integer though then r0 and r1 are used for that first parameter and r2 gets the second and r3 the third, after r3 you use the stack, ordering of parameters on the stack is also dictated by the convention. So when a caller or a callee's code is compiled using the same C prototype then both sides know exactly where to find the parameters and construct the assembly language to do that.
There might be some minimal options in some instruction sets, but in general that is not the case.
Some assemblers have macros though that mimic procedural calls (usually with only a few registrable basetypes).
And no, only in the case of inline functions a new function is generated with the parametrised with the parameters substituted.
A compiler doesn't generate code for a procedure by textual substitution of parameters, but by putting all relevant parameters in registers or on the stack in a fixed regime called the "calling convention".
The code that calculates and loads the parameters (in registers or on stack) is generated for each invocation, and the procedure/function remains unmodified and loads the parameters from where it knows it can find them
Related
Normally, when function A calls function B, function A will save all caller-saved registers (that are live) before doing the call to function B. In my (somewhat esoteric) usecase function B will be invoked through a function pointer and can also possibly overwrite callee-saved registers without saving them. That means A also has to save the callee-saved registers before invoking B. I looked up the possibility of using a custom calling convention for the function pointer but that solution didn't pan out. I'm using gcc (or clang) and compiling for ARM, but if at all possible I'd like to avoid inline assembly or compiler-specific constructs to make the code as general as possible. Anyone have a good suggestion? :-)
Currently I have a small function which gets called very very very often (looped multiple times), taking one argument. Thus, it's a good case for a __fastcall.
I wonder though.
Is there a difference between these two syntaxes:
void __fastcall func(CTarget *pCt);
and
void func(register CTarget *pCt);
After all, those two syntaxes basically tell the compiler to pass the argument in registers right?
Thanks!
__fastcall defines a particular convention.
It was first added by Microsoft to define a convention in which the first two arguments that fit in the ECX and EDX registers are placed in them (on x86, on x86-64 the keyword is ignored though the convention that is used already makes an even heavier use of registers anyway).
Some other compilers also have a __fastcall or fastcall. GCC's is much as Microsofts. Borland uses EAX, EDX & ECX.
Watcom recognises the keyword for compatibility, but ignores it and uses EAX, EDX, EBX & ECX regardless. Indeed, it was the belief that this convention was behind Watcom beating Microsoft on several benchmarks a long time ago that led to the invention of __fastcall in the first place. (So MS could produce a similar effect, while the default would remain compatible with older code).
_mregparam can also be used with some compilers to change the number of registers used (some builds of the Linux kernel are on Intel or GCC but with _mregparam 3 so as to result in a similar result as that of __fastcall on Borland.
It's worth noting that the state of the art having moved on in many regards, (the caching that happens in CPUs being particularly relevant) __fastcall may in fact be slower than some other conventions in some cases.
None of the above is standard.
Meanwhile, register is a standard keyword originally defined as "please put this in a register if possible" but more generally meaning "The address of this automatic variable or parameter will never be used. Please make use of this in optimising, in whatever way you can". This may mean en-registering the value, it may be ignored, or it may be used in some other compiler optimisation (e.g. the fact that the address cannot be taken means certain types of aliasing error can't happen with certain optimisations).
As a rule, it's largely ignored because compilers can tell if you took an address or not and just use that information (or indeed have a memory location, copy into a register for a bunch or work, then copy back before the address is used). Conversely, it may be ignored in function signatures just to allow conventions to remain conventions (especially if exported, then it would either have to be ignored, or have to be considered part of the signature; as a rule, it's ignored by most compilers).
And all of this becomes irrelevant if the compiler decides to inline, as there is then no real "argument passing" at all.
register is enforced, so it can serve as an assertion that you won't take the address; any attempt to do so is then a compile error.
Visual Studio 2012 Microsoft documentation regarding the register keyword:
The compiler does not accept user requests for register variables; instead, it makes its own register choices when global register-allocation optimization (/Oe option) is on. However, all other semantics associated with the register keyword are honored.
Visual Studio 2012 Microsoft documentation regarding the __fastcall keyword:
The __fastcall calling convention specifies that arguments to functions are to be passed in registers, when possible. The following list shows the implementation of this calling convention.
You can still have a look at the assembler code created by the compiler to check what actually happens.
register is essentially meaningless in modern C/C++. Compilers ignore it, putting whichever variables in registers they want (and note that a given variable will often be in a register some of the time, and in the stack some of the time, during the function's execution). It has some minor utility in hinting non-aliasing, but using restrict (or a given compiler's equivalent to restrict) is a better way to achieve that.
__fastcall does improve performance slightly, though not as much as you'd expect. If you have a small function which is called often, the number one thing to do to improve performance is to inline it.
In short, it depends on your architecture and your compiler.
The main difference between these two syntaxes is that register is standardized and __fastcall isn't, but they are both calling conventions.
The default calling convention in C is the cdecl, where parameters are pushed into the stack in reverse order, and return value is stored on EAX register. Every data register can be used in the function, before the call they are caller-saved.
There is another convention, the fastcall, which is indicated by the register keyword. It passes arguments into EAX, ECX and EDX registers (the remaining args are pushed into the stack).
And __fastcall keyword isn't conventionned, it totaly depends on your compiler. With cl (Visual Studio), it seems to store the four first arguments of your function to registers, except on x86-64 and ARM archs. With gcc, the two first arguments are stored on register, regardless of the arch.
But keep in mind that compilers are able by themselves to optimize your code to greatly improve its speed. And I bet that for your function there is a better way to optimize your code.
But you need to disable optimisation to use these keywords (volatile as well). Which is a thing I totaly not recommend.
I am using a bootloader program which is in Assembly and I am calling a C function frequently to SEND and RECEIVE a Character at a time. The controller I am using seems to have just 3 general purpose registers which it uses frequently. Apart from that I am storing some bytes in fixed RAM locations.
SO, my question is:
Will C function overwrite these RAM location, which were defined in Assembly?
I am doing PUSH and PULL of the concerned registers before going and after coming from these C functions.
If I understand your question correctly, you are concerned about the RAM locations used in your assembly module overlapping with some variable declared in a C module. You can examine the list file output by your linker to determine if this is the case. The linker list file will show all of the RAM addresses used by your C modules which you can compare to the fixed RAM locations used in the assembly module.
Note that if your linker does not produce a list file automatically, you will have to read through your linker's documentation to find the right command line option to do so.
As long as you are keeping the previous values on the stack when doing the c calls you should be fine. Just make sure that you are pushing onto stack before the call and popping off the stack after returning.
It all depends on the C calling convention that the C code was compiled in. Calling convention is how the caller and callee will communicate with regards to passing data into the function and returning values afterwards. This includes who wil do stuff like back up registers onto the stack before/after calling, will it be necessary to prep the registers before calling the C function, can you guarantee that the registers will return the way they were, etc.
You'll need to find out how the C code was compiled (with what Calling Convention setting). Note that this is also architecture specific. A summary of the different calling conventions and a description of what each entails can be found at Wikipedia here:
http://en.wikipedia.org/wiki/Calling_convention
http://en.wikipedia.org/wiki/X86_calling_conventions
On x86, cdecl and stdcall are the most popular conventions. cdecl means your ASM code should do the cleanup, while stdcall says the function being called is responsible for it. If you have the source code for the C function, I would suggest passing the necessary flags to the compiler to make it a "Callee cleanup" convention (usually stdcall, but safecall and fastcall are also options) which means you can safely call the C function without worrying about register corruption.
It come across to me that function like printf() have not limited the number of parameters.
But when debugging program on Solaris, I noticed it will push at most 5 parameters into stack, common register will be used if there are more than 5 parameters.
So what will happen if even common register is not enough in function like printf ? Did compiler do something for me ?
The behaviour is controlled by the ABI for the platform. If there are more parameters than fit in the registers, then they will be handled in a different way. There isn't a simple upper limit on the number of arguments that can be passed, so the compiler and the ABI define a mechanism that works on the hardware in question. What works on SPARC does not necessarily work on, for example, Intel IA32.
Normally platforms where the ABI uses registers for argument passing switch to a different calling convention for variadic functions, whereby everything is passed on the stack. This is why the C standard assigned undefined behavior to calling a variadic function without a prototype; without a prototype, on such platforms the compiler will generate an incorrect call.
It should be noted that some platforms use more complicated (uselessly complicated, I would say) methods of passing arguments to variadic functions, such as constructing a sort of linked list and passing a hidden pointer to that list, which the implementation of va_start is then somehow able to obtain. As a programmer, you should just treat the whole stdarg.h stuff as a black box that does what's expected, and pray that you never have to see the gorey details of some of the uglier implementations...
In this SO thread, Brian Postow suggested a solution involving fake anonymous functions:
make a comp(L) function that returns the version of comp for arrays of length L... that way L becomes a parameter, not a global
How do I implement such a function?
See the answer I just posted to that question. You can use the callback(3) library to generate new functions at runtime. It's not standards compliant, since it involves lots of ugly platform-specific hacks, but it does work on a large number of systems.
The library takes care of allocating memory, making sure that memory is executable, and flushing the instruction cache if necessary, in order to ensure that code which is dynamically generated (i.e. the closure) is executable. It essentially generates stubs of code that might look like this on x86:
pop %ecx
push $THUNK
push %ecx
jmp $function
THUNK:
.long $parameter
And then returns the address of the first instruction. What this stub does is stores the the return address into ECX (a scratch register in the x86 calling convention), pushes an extra parameter onto the stack (a pointer to a thunk), and then re-pushes the return address. Then, it jumps to the actual function. This results in the function getting fooled into thinking it has an extra parameter, which is the hidden context of the closure.
It's actually more complicated than that (the actual function called at the end of the stub is __vacall_r, not the function itself, and __vacall_r() handles more implementation details), but that's the basic principle.
I don't believe you can do that with C99 -- there's no partial application or closure facility available unless you start manually generating machine code at runtime.
Apple's recently proposed blocks would work, though you need compiler support for that. Here's a brief overview of blocks. I have no idea when/if any vendor outside apple will support them.
It is not possible to generate ordinary functions during run-time in either C or C++. What Brian suggested is based on a big "if": "...if you can fake anonymous functions...". And the answer to that "if" is: no, you can't. (Although it is not clear what he meant by "fake".)
(In C++ it is possible to generate function-like objects at run time, but not ordinary functions.)
The above applies to standard C and C++ languages. Particular implementations can support various implementation-provided extensions and/or manually-implemented hacks, like "closures", "delegates" and similar stuff. Nothing of that, of course, have anything to do with standard C/C++ languages.