Mixing Assembly language and C programs

I am using a bootloader written in assembly that frequently calls a C function to send and receive one character at a time. The controller I am using seems to have just three general-purpose registers, which it uses frequently. Apart from that, I am storing some bytes in fixed RAM locations.
So, my question is:
Will the C function overwrite these RAM locations, which were defined in assembly?
I PUSH and PULL the registers concerned before and after calling these C functions.

If I understand your question correctly, you are concerned about the RAM locations used in your assembly module overlapping with some variable declared in a C module. You can examine the list file output by your linker to determine if this is the case. The linker list file will show all of the RAM addresses used by your C modules which you can compare to the fixed RAM locations used in the assembly module.
Note that if your linker does not produce a list file automatically, you will have to read through your linker's documentation to find the right command line option to do so.
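The comparison against the list file can also be written down as a small check. This is only a sketch: the addresses `ASM_RAM_START` and `ASM_RAM_SIZE` are made-up placeholders for whatever your assembly module really uses, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed RAM block used by the assembly bootloader.
   Replace with the addresses your assembly module really uses. */
#define ASM_RAM_START 0x0080u
#define ASM_RAM_SIZE  0x0010u

/* True when [a, a + a_len) and [b, b + b_len) share at least one address. */
bool ranges_overlap(uint32_t a, uint32_t a_len, uint32_t b, uint32_t b_len)
{
    return a < b + b_len && b < a + a_len;
}

/* Check one entry (address and size) copied by hand from the linker list file. */
bool clashes_with_asm_ram(uint32_t c_addr, uint32_t c_size)
{
    assert(c_size > 0u); /* an object in the list file always has a size */
    return ranges_overlap(c_addr, c_size, ASM_RAM_START, ASM_RAM_SIZE);
}
```

Any C variable for which `clashes_with_asm_ram` returns true would be sharing bytes with the assembly module's fixed locations.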

As long as you are keeping the previous register values on the stack when making the C calls, you should be fine. Just make sure that you push onto the stack before the call and pop off the stack after returning.

It all depends on the calling convention the C code was compiled with. A calling convention defines how the caller and callee communicate when passing data into a function and returning values afterwards. This includes who backs up registers onto the stack before or after the call, whether registers must be prepared before calling the C function, whether registers are guaranteed to come back unchanged, and so on.
You'll need to find out how the C code was compiled (with which calling convention setting). Note that this is also architecture-specific. A summary of the different calling conventions and a description of what each entails can be found on Wikipedia:
http://en.wikipedia.org/wiki/Calling_convention
http://en.wikipedia.org/wiki/X86_calling_conventions
On x86, cdecl and stdcall are the most popular conventions. cdecl means your ASM code (the caller) must remove the arguments from the stack after the call, while stdcall makes the called function responsible for that. If you have the source code for the C function, you could pass the necessary flags to the compiler to make it use a callee-cleanup convention (usually stdcall, but safecall and fastcall are also options), so your ASM code need not adjust the stack afterwards. Note that callee cleanup covers the stack arguments only; which registers the function may overwrite is specified separately by each convention, so check that before relying on register contents.

Related

C startup code is only written in assembly confusion

I understand that the C startup code initializes the C runtime environment: it initializes static variables, sets up the stack pointer, etc., and finally branches to main().
They say that this can only be written in assembly language as it's platform-specific. However, can't this still be written in C and compiled for the specific platform?
Function calls of course would not be possible because we "more than likely" don't have the stack pointer set up at that stage. I still can't see any other major reasons. Thanks in advance.
Startup code can be written in C only if:
The implementation provides all the intrinsic functions needed to set up hardware features that cannot be set using standard C.
The implementation provides a mechanism for placing fragments of code and data in a specific place and in a specific order (gcc support for ld linker scripts, for example).
If both conditions are met, you can write the startup code in C.
I use my own startup code written in C (instead of one provided by the chip vendors) for Cortex-M microcontrollers as ARM provides CMSIS header files with all needed inline assembly functions and gcc based toolchain gives me full memory layout control.
Most of the problem with writing early startup code in C is, in fact, the absence of a properly structured stack. It's worse than just not being able to make function calls. All of a C compiler's generated machine code assumes the existence of a stack, pointed to by the ABI-specified register, that can be used for scratch storage at any time. Changing this assumption would be so much work as to amount to a complete second "back end" for the compiler—way more work than continuing to write early startup code by hand in assembly.
Early bootstrap code, bringing up the machine from power-on, also has to do a bunch of special operations that can't usually be accessed from C, like configuring interrupts and virtual memory. And it may have to deal with the code not having been loaded at the address it was linked for, or the relocation table not having been processed, or other similar problems; these also break pervasive assumptions made by the C compiler (e.g. that it can inject a call to memcpy whenever it wants).
Despite all that, most of a user mode C library's startup code will, in fact, be written in C, for exactly the reason you are thinking. Nobody wants to write more code in assembly, over and over for each supported ISA, than absolutely necessary.
A minimal C runtime environment requires a stack, and a jump to a start address. Setting the stack pointer on most architectures requires assembly code. Once a stack is available it is possible to run code generated from C source.
ARM Cortex-M devices load the stack pointer and start address from the vector table on reset, so can in fact boot directly into code generated from C source.
On other architectures, the minimal assembly required is to set a stack pointer and jump to the start address. Thereafter it is possible to write other start-up tasks in C (or even C++). Such startup code is responsible for establishing the full C runtime, so it must not assume static initialisation or library initialisation (no heap or filesystem, for example), which are things that must be done by the startup code.
In that sense you can run code generated from C source, but the environment is not strictly conforming until main() has been called, so there are some constraints.
Even where assembly code is used, it need not be the whole start-up code that is in assembly.
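As a rough illustration of the static-initialisation step that such C startup code has to perform itself, here is a minimal sketch. The function names and the idea of passing the region bounds as pointers are assumptions made so the code can be tried on a host; on a real target the bounds would normally be symbols exported by the linker script, and the routine would end by calling main().

```c
#include <stdint.h>

/* Copy the initial values of initialised statics from their load
   address (typically flash) to their run address in RAM. */
void copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = *load++;
    }
}

/* Zero the region that holds statics without an explicit initialiser. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}

/* Hypothetical runtime setup: uses only plain loops, so it does not
   depend on static data or on the C library being ready. */
void runtime_init(const uint32_t *data_load, uint32_t *data_start,
                  const uint32_t *data_end, uint32_t *bss_start,
                  const uint32_t *bss_end)
{
    copy_data(data_load, data_start, data_end);
    zero_bss(bss_start, bss_end);
}
```

The loops are written out by hand rather than calling memcpy or memset, in keeping with the constraint that the library may not be usable yet, although a compiler is still free to turn such loops into library calls unless told otherwise.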

Can we add `-fcall-used-REG` for specific functions on gcc?

Can we tell gcc, via a function attribute, that specific functions don't need to save/restore some callee-saved registers?
We can do this with -fcall-used-REG for a whole file, but then every function in the file is affected.
I wrote assembly code that saves some callee-saved registers (r12 on x86_64, for example) and want to call some C functions from it. The called functions don't need to save/restore those registers because the asm code already does, so saving them again is pure overhead (the called functions are small, so the overhead of the prologue/epilogue code is huge).
It might be possible with pragma/attribute optimize, but really you should just put the functions in their own files. These functions have to be entirely self-contained since they're using a non-default ABI. Putting functions in their own files is a good habit to get into anyway.

C and assembly how can it work?

I am wondering how mixing C and assembly can be possible, given that compilers generate code in different ways. For example, many C compilers pass arguments in registers rather than pushing them onto the stack when making a function call; the called function then moves those registers into the appropriate memory locations. So what happens if you write assembly code, or link with an object file created by a different compiler, that calls the C function but pushes the arguments onto the stack instead of setting the registers?
My guess is that the C compiler's assembly output handles this in such a clever way that it makes no difference and still works, but I can't be sure; looking at the assembly code, it doesn't appear that it would work.
Can anyone answer my question? I am writing a compiler and need to know this so that I don't make any mistakes should I want to link with a C module in the future.
The conventions that are used for calling functions are part of what's called the "application binary interface" (ABI). If this interface is specified, then all code that follows the specification can be linked together.
There is no standard ABI for C. However, most popular platforms have one prevailing C compiler that effectively produces a de-facto standard ABI (e.g. there's one for Windows, one for Linux on x86 (32 and 64 bit), one for Linux on ARM, etc.). ABIs may specify a large number of separate "calling conventions", and your C compiler will typically let you specify the desired convention at the point of function declaration using some vendor extension.
Conversely, if there is no documented ABI for your C compiler, or for an existing bit of object code, then you cannot in general link (or otherwise interact) with it successfully.
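As one example of such a vendor extension, GCC and Clang accept a `stdcall` function attribute on 32-bit x86. The sketch below is only illustrative: the macro name `CALLEE_CLEANUP` and the function `add_bytes` are made up, and on any other target the macro expands to nothing so the platform's default convention applies.

```c
/* GCC and Clang accept the stdcall attribute on 32-bit x86 only;
   elsewhere this macro is empty and the default convention is used. */
#if defined(__GNUC__) && defined(__i386__)
#define CALLEE_CLEANUP __attribute__((stdcall))
#else
#define CALLEE_CLEANUP
#endif

/* The convention is part of the declaration, so every caller that sees
   this prototype passes arguments the way the function expects. */
CALLEE_CLEANUP int add_bytes(int a, int b);

/* With stdcall, the function itself pops its stack arguments on return. */
CALLEE_CLEANUP int add_bytes(int a, int b)
{
    return a + b;
}
```

Hand-written assembly, or code from another compiler, can then call `add_bytes` as long as it follows the same documented convention.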

From Compiler to assembler

I have a question regarding the assembler. I was thinking about how a C function that takes multiple parameters is transformed into assembly. So my question is: can a subroutine in assembly take arguments as parameters to operate on?
The code might look something like this:
Call label1, R16.
Where R16 is the subroutine input parameter.
If that's not the case, then EACH time the C function is called, it gets assembled into a subroutine with the parameters of that specific call substituted into it automatically. That basically means that whenever a C function is called, the compiler transforms it into an inline function, which I'm sure is not the case either :D
So which is right?
Thanks a lot! :)
The compiler uses a "calling convention", which can be specific to that one compiler for that one target architecture (x86, ARM, MIPS, PDP-11, etc.). For architectures with "plenty" of general-purpose registers, the calling convention often starts by passing parameters in registers and then falls back to the stack. For architectures with few registers, the stack is used primarily, if not exclusively, for parameter passing and the return.
The calling convention is a set of rules such that, if everyone follows them, you can compile functions into objects and link them with other objects, and they will be able to call each other's functions or call themselves.
So it is a bit of a hybrid of what you were assuming. The code built for that function is in some respects custom to that function as the number and type of parameters dictate what registers or how much stack is consumed and how. At the same time all functions conform to the same formula so they look more alike than different.
On ARM, for example, you might have three integers being passed to a function. For all the ARM calling conventions I have seen, the first parameter of the C function arrives in r0, the second in r1 and the third in r2. (A convention could vary across compilers, but in practice it often doesn't; for ARM, MIPS and some others, the architecture vendor tries to dictate the convention for everyone rather than leaving it to the compiler writers.) If the first parameter were a 64-bit integer, though, r0 and r1 would hold that first parameter, r2 the second and r3 the third. After r3 you use the stack, and the ordering of parameters on the stack is also dictated by the convention. So when a caller's and a callee's code are compiled from the same C prototype, both sides know exactly where to find the parameters and construct the assembly language accordingly.
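The two cases just described correspond to prototypes like the following. The function names are made up for illustration, and the register assignments in the comments are the ones from the paragraph above, not something the C source itself controls.

```c
#include <stdint.h>

/* Three 32-bit integers: under the ARM convention described above,
   a arrives in r0, b in r1 and c in r2. */
int32_t sum3(int32_t a, int32_t b, int32_t c)
{
    return a + b + c;
}

/* A 64-bit first parameter occupies the r0/r1 pair, so b moves to r2
   and c to r3. Any further parameter would be passed on the stack. */
int64_t sum_wide(int64_t a, int32_t b, int32_t c)
{
    return a + b + c;
}
```

Compiling a file like this for an ARM target and reading the generated assembly is a quick way to confirm which registers your particular toolchain uses.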
There might be some minimal options in some instruction sets, but in general that is not the case.
Some assemblers do have macros that mimic procedure calls (usually with only a few register-sized base types).
And no: only in the case of inline functions is new code generated with the parameters substituted in.
A compiler doesn't generate code for a procedure by textual substitution of parameters, but by putting all relevant parameters in registers or on the stack according to a fixed scheme called the "calling convention".
The code that calculates and loads the parameters (in registers or on the stack) is generated for each invocation, while the procedure/function itself remains unmodified and loads the parameters from where it knows it can find them.
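To make that split concrete, here is a toy model in C that treats the registers as an array. It is not how any real compiler is implemented: the `Cpu` struct, the doubling operation and the choice of r16 as both input and result register are all assumptions, borrowed loosely from the `Call label1, R16` example in the question.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 32

/* Toy register file standing in for the CPU's registers. */
typedef struct {
    uint8_t r[NUM_REGS];
} Cpu;

/* The subroutine: generated once. It always reads its input from r16
   and leaves its result in r16 (the toy "calling convention"). */
void label1(Cpu *cpu)
{
    cpu->r[16] = (uint8_t)(cpu->r[16] * 2u);
}

/* What the compiler emits at each call site: load the parameter into
   the agreed register, call, then pick up the result. */
uint8_t call_label1(Cpu *cpu, uint8_t arg)
{
    assert(cpu != 0);
    cpu->r[16] = arg;   /* per-call code: place the parameter */
    label1(cpu);        /* the same unmodified subroutine every time */
    return cpu->r[16];  /* per-call code: fetch the result */
}
```

Calling `call_label1` with different arguments changes only what is loaded into r16 before the call; `label1` itself is never regenerated.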

Overflowing Buffers on the Stack

I'm reading the Shellcoder's Handbook and trying to follow along with their simple example of overflowing buffers on the stack, but I'm stuck.
I'm running GCC on Windows, and before a function call, instead of pushing values onto the stack as the book says it should, it just moves them into registers and then makes the call. The book is running Linux, I'd assume; does it use a different calling method than Windows? How would I get the Linux behavior?
Also, when a program accepts user input, how do I feed data to the program so that it shows up in gdb?
It looks like your book is assuming the cdecl calling convention on an IA32 platform, but that your compiler is using a different calling convention that puts parameters in registers. Are you using an AMD64 platform by any chance? The standard for AMD64 is to put the first n arguments in registers and only additional arguments on the stack (Windows only uses four registers for parameters; every other common platform uses six).
More information on calling conventions: https://en.wikipedia.org/wiki/X86_calling_conventions
If you add a bunch of additional function parameters before the ones that you care about, you should get the last ones on the stack. Alternately, if you compile as 32-bit instead of 64-bit, you might get what you're looking for.
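The padding trick could look something like this. The function name and the parameter count of eight are arbitrary choices for the sketch; the register counts in the comment are the ones given above (six on most 64-bit platforms, four on 64-bit Windows).

```c
/* Six padding parameters come first so that the two parameters of
   interest no longer fit in registers. Where six registers are used for
   integer arguments, target1 and target2 go on the stack; where only
   four are used, pad5 and pad6 go on the stack as well. */
int pad_then_target(int pad1, int pad2, int pad3, int pad4, int pad5,
                    int pad6, int target1, int target2)
{
    /* Padding values are deliberately unused. */
    (void)pad1; (void)pad2; (void)pad3;
    (void)pad4; (void)pad5; (void)pad6;
    return target1 + target2;
}
```

When inspecting the call in a debugger, build without optimization; an optimizing compiler may inline the call and remove the argument passing altogether.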

Resources