I would like to do some "inline" assembly programming on SPARC and I am wondering how I can do that with register passing.
Best to explain my issue with a small example:
int main()
{
int a = 5;
int b = 6;
int res;
res = asm_addition(a, b);
printf("Result: %d\n", res);
return(0);
}
// My assembler addition
.global asm_addition
.align 4
add rs1, rs2, rd
restore
Does anyone know which registers I have to use so that the values a and b will be added? Finally, which register do I need to specify for rd so that the result will then be printed out by the printf statement that follows the assembly routine?
Thanks so much for some input!
The calling convention might depend on the OS. I presume Solaris. Search for "System V Application Binary Interface SPARC"; the PDF is easy to find.
Full inline assembler documentation is buried somewhere in the SunStudio PDFs and not so easy to find. Officially it is also accessible via man -s 1 inline, though on my system I have to open the file manually. In the man page, look for "Coding Conventions for SPARC Systems".
On Solaris the parameters are passed via registers %o0 to %o5, and then on the stack. If the called function is a leaf function (i.e. it doesn't call another function), the register window is not moved forward and the function accesses the parameters directly via %o0 to %o5. If the register window is moved, the function accesses the parameters via the %i0 to %i5 registers. The return value goes the same way: via %i0 in the callee, which becomes %o0 in the caller.
Floating-point parameters are handled via the FP registers, but for those you will have to read the document Dummy00001 pointed to.
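As a rough sketch of the leaf-function case described above, assuming GCC on a SPARC target (the top-level asm block and the section directive are my assumptions, not something from the question), the routine can add %o0 and %o1 and leave the sum in %o0. The non-SPARC branch is only a fallback so the snippet still builds elsewhere.

```c
#if defined(__sparc__) && defined(__GNUC__)
/* Leaf routine: no save/restore, so the arguments are still in %o0 and %o1.
   The add sits in the delay slot of retl and leaves the sum in %o0,
   which is where the caller expects the return value. */
__asm__(
    ".section \".text\"\n"
    ".align 4\n"
    ".global asm_addition\n"
    "asm_addition:\n"
    "    retl\n"
    "    add %o0, %o1, %o0\n"
);
int asm_addition(int a, int b);
#else
/* Fallback (assumption) so the sketch still builds on non-SPARC hosts. */
int asm_addition(int a, int b)
{
    return a + b;
}
#endif
```

From C this would be called as res = asm_addition(a, b); so that the value returned in %o0 ends up in res.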
I've been stuck for a while on how to set up a callback when an exception occurs.
I have this test code:
void main()
{
long * bad = (long*)0x0A000000; //Invalid address
//When the following line gets executed
//it causes an error and the debugger sends me to an assembly file.
*bad = 123456789;
}
The assembly file that I am sent to looks like this (a fragment of the real file):
.macro DEFAULT_ISR_HANDLER name=
.thumb_func
.weak \name
\name:
1: b 1b /* endless loop */
.endm
DEFAULT_ISR_HANDLER SRC_IRQHandler /*Debugger stops on this line*/
As I understand DEFAULT_ISR_HANDLER is a macro that defines an endless loop.
What I want to do is define my own function in a C file that gets called when an exception occurs, instead of what is defined in the DEFAULT_ISR_HANDLER macro.
My question is: how would I define a macro, in that assembly file, that calls a specific C function?
Hopefully I explained myself. Any information or direction around this topic is appreciated.
In case it's relevant I am using GCC ARM compiler v5.4_2016q3
Thanks,
Isaac
EDIT
I am using a Cortex-M3.
I have only now realized that I was talking about processor exceptions. According to the datasheet there is a list of 16 exception types.
Apparently, the way it works is that all the exception types are redirected to the macro, which in turn defines a thumb function containing an endless loop (according to DEFAULT_ISR_HANDLER in the code above).
What I would like to do is define my own function in a C file, for convenience, so that every time any type of processor exception occurs, I can control how to proceed.
You have two options:
Just define a C function with the void SRC_IRQHandler(void) signature and since the macro is defining the default handler as weak, your function will override the default handler in the linking stage.
There should be a place in your project where SRC_IRQHandler is placed in what is called a Vector Table in the Cortex-M3 architecture. You can replace the name of this function with your own C function and your function will be called when this interrupt (exception) happens.
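A minimal sketch of the first option, assuming the handler name must match the weak symbol exactly. The counter is a hypothetical addition, there only so the handler has an observable effect.

```c
#include <stdint.h>

/* Hypothetical counter, not part of the original project. */
static volatile uint32_t src_irq_count;

/* Strong definition: at link time it replaces the weak SRC_IRQHandler
   emitted by the DEFAULT_ISR_HANDLER macro. */
void SRC_IRQHandler(void)
{
    src_irq_count++;
    /* On real hardware you would typically log, reset, or spin here. */
}
```

If the file is compiled as C++, the function would also need extern "C" so that the symbol name is not mangled.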
The Cortex-M family in general has well over 16 exceptions: there are those, plus as many interrupts as are implemented by that core (32, 64, 128, 256). But it is all fundamentally the same. The way the Cortex-M family works is that the hardware performs the EABI call for you, if you will: it preserves some of the registers and then starts execution at the address called out in the vector table, done in such a way that you can have the address of a normally compiled C function directly in the table. Historically you needed to wrap that function with some code to preserve and restore the state, and instruction sets often have a special return-from-interrupt instruction, but the Cortex-M does it a bit differently.
Knowing that, the next question is how you get that address into the table, and that depends on your code, build system, etc. The handlers might be set up to point to an address in RAM; maybe you are running on an RTOS and there is a function you call at run time to register a function for an exception, and the RTOS then changes the code or some data value in RAM that is tied into its handler, which essentially wraps around yours. Or you are making the vector table in assembly or with some other tool-specific mechanism (although assembly is there, works, and is easy), and you simply count down the right number of entries (or add a hundred more entries so you can count down to the right entry) and place the name of your C function there.
It is a good idea to disassemble the result, or check it some other way, before running, to confirm that you have placed the handler address at the right physical address for that interrupt/exception.
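To illustrate the "count down to the right entry" idea, here is a host-side sketch of a table of handler pointers with a small check. The entry order (initial stack pointer, Reset, NMI, HardFault) is the standard Cortex-M layout. The handler bodies, the table name and the handler_is_at helper are hypothetical, and a real table would also need to be placed at the right address by the linker script.

```c
#include <stddef.h>

typedef void (*isr_t)(void);

/* Hypothetical empty handlers, just so the table has something to point at. */
static void Reset_Handler(void) {}
static void NMI_Handler(void) {}
static void HardFault_Handler(void) {}

/* Positions of the first entries in a Cortex-M vector table. */
enum {
    VEC_INITIAL_SP = 0,
    VEC_RESET      = 1,
    VEC_NMI        = 2,
    VEC_HARDFAULT  = 3,
    VEC_COUNT
};

static const isr_t vector_table[VEC_COUNT] = {
    [VEC_INITIAL_SP] = NULL,            /* a real table holds the initial SP here */
    [VEC_RESET]      = Reset_Handler,
    [VEC_NMI]        = NMI_Handler,
    [VEC_HARDFAULT]  = HardFault_Handler,
};

/* Returns 1 if the given handler sits at the given table index. */
int handler_is_at(size_t index, isr_t expected)
{
    return index < VEC_COUNT && vector_table[index] == expected;
}
```

A check like this only confirms the order inside the table; disassembling the final image is still what tells you the table itself landed at the right address.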
I have a pointer to a __stdcall function in C and in both x86 and x64 assembly what I'd like to do is have an asm function that I can use to jump to that function.
For example take the windows API function MessageBoxW
void *fn = (void *)GetProcAddress(GetModuleHandle("user32.dll"), "MessageBoxW");
Then in C I'll have a call to the ASM, like
void foo()
{
MessageBoxW_asmstub(NULL, L"test", L"test", MB_OK);
}
Assume fn is global. Then in assembly I'd like to have a function that just forwards to MessageBoxW, not calling it. In other words I want MessageBoxW to clean up the variables passed to MessageBoxW_asmstub and then return to foo
jump (fn) ?
I don't know how to do this.
Assuming that MessageBoxW_asmstub is declared to the C compiler as having the correct calling convention (i.e. __stdcall for x86; for x64 there is thankfully only one calling convention), then as the comment from Ross Ridge said, this is as simple as jumping to the target function which will then return directly to the caller. Since you have an indirect reference (i.e. fn refers to a pointer to the target), you probably need another load instruction (although my knowledge of x86 is limited here -- I wouldn't be at all surprised if there is some double-indirect form of jmp). You can use any volatile registers in the calling convention to do this, e.g. for x64 you might use something along the lines of:
extern fn:qword
MessageBoxW_asmstub:
mov rax, fn
jmp rax
BTW, if you use a debugger to step through calls to delay-loaded DLL imports, you'll probably see a similar pattern used in the linker-generated stub functions.
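For comparison, a plain C forwarder can sometimes get you the same effect without any assembly: with optimization enabled, compilers often turn a call in tail position into a jump, although nothing guarantees it. The sketch below is an illustration under that assumption, not the asm stub itself. The typedef, the MessageBoxW_cstub name and the fake target are all hypothetical, and the calling convention is left out so it builds anywhere (on 32-bit Windows both the typedef and the stub would need __stdcall).

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical pointer type mirroring the shape of MessageBoxW. */
typedef int (*msgbox_fn)(void *hwnd, const wchar_t *text,
                         const wchar_t *caption, unsigned type);

/* Global pointer to the target, as in the question. */
static msgbox_fn fn;

/* Forwarder: the call is in tail position, so an optimizing compiler
   may emit a jump to *fn instead of a call followed by a return. */
int MessageBoxW_cstub(void *hwnd, const wchar_t *text,
                      const wchar_t *caption, unsigned type)
{
    return fn(hwnd, text, caption, type);
}

/* Hypothetical stand-in target so the sketch can be tried off Windows:
   it just returns the length of the text argument. */
static int fake_message_box(void *hwnd, const wchar_t *text,
                            const wchar_t *caption, unsigned type)
{
    (void)hwnd;
    (void)caption;
    (void)type;
    return (int)wcslen(text);
}
```

Checking the generated code in a disassembler is the only way to confirm whether a jump was actually emitted for a given compiler and optimization level.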
So I made a very simple C program to study how C works on the inside. It has just 1 line in the main() excluding return 0:
system("cls");
If I use OllyDbg to analyze this program, it shows something like this (the text after the semicolons is comments generated by OllyDbg):
MOV DWORD PTR SS:[ESP],test_1.004030EC ; ||ASCII "cls"
CALL <JMP.&msvcrt.system> ; |\system
Can someone explain what this means, and if I want to change the "cls" called in the system() to another command, where is the "cls" stored? And how do I modify it?
You are using 32 bit Windows system, with its corresponding ABI (the assumptions used when functions are called).
MOV DWORD PTR SS:[ESP],test_1.004030EC
This has the same effect as a push 4030ech instruction: it simply stores the address of the string cls on the stack.
This is the way parameters are passed to functions, and it tells us that the string cls is at address 4030ech.
CALL <JMP.&msvcrt.system> ; |\system
This is the call to the system function from the CRT.
The JMP in the name is due to how linking works by default with Visual Studio compilers and linkers.
So those two lines are simply passing the address of the string to the system function.
If you want to modify it, you need to check whether it is in a writable section (I think it is not) by checking the PE sections; your debugger may have a tool for that. Or you could just try the following anyway:
Inspect the memory at 4030ech, where you will see the string, and try editing it (how to do this is debugger dependent).
Note: I use the TASM notation for hex numbers, i.e. 123h means 0x123 in C notation.
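If you control the source, one way to sidestep the read-only question is to keep the command in a non-const char array, which compilers typically place in writable data, rather than in a string literal. This is only a sketch: the buffer size and the set_command helper are made up for illustration.

```c
#include <string.h>

/* Writable buffer (typically placed in a writable data section),
   unlike the string literal passed directly to system(). */
static char command[16] = "cls";

/* Hypothetical helper: overwrite the stored command, truncating if needed,
   and return a pointer to the buffer. */
const char *set_command(const char *new_command)
{
    strncpy(command, new_command, sizeof command - 1);
    command[sizeof command - 1] = '\0';
    return command;
}
```

The program would then call something like system(set_command("dir")), and the debugger would show the address of command being stored on the stack instead of the address of the literal.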
While trying to make my own alternative to the stdarg.h macros for variable-argument functions, a.k.a. functions with an unknown number of arguments, I tried to understand the way the arguments are stored in memory.
Here is a MWE :
#include <stdio.h>
void foo(int num, int bar1, int bar2)
{
printf("%p %p %p %p\n", &foo, &num, &bar1, &bar2);
}
int main ()
{
int i, j;
i = 3;
j = -5;
foo(2, i, j);
return 0;
}
I understand without any problem that the function's address is not in the same place as the arguments' addresses.
But the latter aren't always organized in the same way.
On an x86_32 architecture (mingw32), I get this kind of result:
004013B0 0028FEF0 0028FEF4 0028FEF8
which means that the addresses are in the same order as the arguments.
BUT when I run it on x86_64, the output this time is:
0x400536 0x7fff53b5f03c 0x7fff53b5f038 0x7fff53b5f034
Where the addresses are obviously in reverse order w.r.t. the arguments.
Therefore my question is (tl;dr) :
Are the arguments' addresses architecture dependent, or also compiler dependent?
It is compiler dependent. Compiler vendors naturally have to obey the rules of the CPU architecture. A compiler normally obeys the platform ABI as well, at least for code that could potentially interoperate with code produced by another compiler. The platform ABI is a specification of calling conventions, linking semantics and much more, for a given platform.
E.g. compilers on Linux and other Unix-like operating systems adhere to the System V Application Binary Interface, and you'll find in chapter 3.2.3 how parameters are passed to functions (arguments passed in registers are passed left to right, and arguments passed in memory (on the stack) are passed from right to left). On Windows, the rules are documented here.
They're ABI dependent. In cases where it doesn't matter (functions that will only be called in a known way), it's entirely compiler dependent, and that usually means using registers, which don't have an address (those arguments will have an address if you ask for that address, giving the appearance that everything has an address). Functions that get inlined don't even really have arguments anymore, so the question of what their addresses are is moot, though again they will appear to exist and have an address when you force that to happen.
Arguments may not be stored in memory at all, but passed via registers; however the language requires an address to be returned for any symbol operand of &, so your observation may be a result of you actually attempting the observation and the compiler has simply copied the values to those addresses in order that they are addressable.
It might be interesting to see what happens if you request the addresses in a different order than they were passed, for example:
printf("%p %p %p %p\n", &num, &bar1, &bar2, &foo) ;
You may or may not get the same result; the point is that the addresses you observed may be an artefact of the observation rather than of the passing. Certainly in the ARM ABI, the first four arguments to a function are passed in registers R0, R1, R2 and R3, and thereafter are passed via the stack.
On x86_64 you get the arguments in a "weird" order because they are not actually passed to the function in any memory at all. They are passed in cpu registers. By taking their address you actually force the compiler to generate code that will store the arguments in memory (on the stack in your case) so that you can take the address of them.
You can't implement stdarg macros without interacting with the compiler. In gcc the stdarg macros just wrap a builtin construct because there is no way for you to know where the arguments might be by the time you need them (the compiler might have reused the registers for something). The builtin stdarg support in gcc can significantly change code generation for functions that use them so that the arguments are available at all. I presume the same goes for other compilers.
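For reference, this is roughly what using the standard stdarg.h macros looks like; the compiler takes care of finding the arguments wherever the ABI put them. The function name sum_ints is just an example.

```c
#include <stdarg.h>

/* Sums `count` int arguments passed after the fixed parameter. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);              /* start after the last named parameter */
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);     /* fetch the next argument as an int */
    }
    va_end(args);

    return total;
}
```

Called as sum_ints(3, 2, 3, -5), it adds the three trailing values regardless of whether they arrived in registers or on the stack.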
I wish to know which one is responsible for cleanup of the stack
Suppose you have a function fun, let's say like this:
var = fun(int x, int y, float z, char c);
When fun gets called, it goes onto the stack along with its parameters. When the function returns, who is responsible for cleaning up the stack: the function itself, or the "var" that will hold the return value?
One more thing, can anyone explain the concepts of calling conventions?
You referred to the answer yourself: calling conventions.
A calling convention is similar to a contract. It decides the following things:
Who is responsible for cleaning up the parameters.
How and in which order the parameters are passed to the called function.
Where the return value is stored.
There are many different calling conventions, depending on the platform and the programming environment. Two common calling conventions on the x86 platforms are:
stdcall
The parameters are passed onto the stack from right to left. The called function cleans up the stack.
cdecl
The parameters are passed onto the stack from right to left. The calling function cleans up the stack.
In both cases the return value is in the EAX register (or ST0 for floating point values)
Many programming languages for the x86 platform allow you to specify the calling convention, for example:
Delphi
function MyFunc(x: Integer): Integer; stdcall;
Microsoft C/C++
int __stdcall myFunc(int x)
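A portable sketch of the same idea in C, assuming either MSVC or GCC targeting 32-bit x86. The CC_STDCALL and CC_CDECL macro names are made up for this example, and on other targets they expand to nothing, because the attributes are ignored or unavailable there.

```c
/* Hypothetical macros that select the convention keyword per compiler. */
#if defined(_MSC_VER) && defined(_M_IX86)
#define CC_STDCALL __stdcall
#define CC_CDECL   __cdecl
#elif defined(__GNUC__) && defined(__i386__)
#define CC_STDCALL __attribute__((stdcall))
#define CC_CDECL   __attribute__((cdecl))
#else
#define CC_STDCALL
#define CC_CDECL
#endif

/* Callee cleans up the stack (on 32-bit x86). */
int CC_STDCALL add_stdcall(int a, int b)
{
    return a + b;
}

/* Caller cleans up the stack (on 32-bit x86). */
int CC_CDECL add_cdecl(int a, int b)
{
    return a + b;
}
```

The C source that calls these looks identical either way; the difference only shows up in the generated code, where the stack adjustment happens either in the callee's return or after the call in the caller.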
Some usage notes:
When creating a simple application it's rarely necessary to change or to know about the calling convention, but there are two typical cases where you need to concern yourself with calling conventions:
When calling external libraries, Win32 API for example: You have to use compatible calling conventions, otherwise the stack might get corrupted.
When creating inline assembler code: You have to know in which registers and where on the stack you find the variables.
For further details I recommend these Wikipedia articles:
Calling convention
x86 calling conventions
A calling convention determines, among other things, who cleans up the stack: the caller or the callee.
Calling conventions can differ in:
where parameters and return values are placed (in registers; on the call stack; a mix of both)
the order in which parameters are passed (or parts of a single parameter)
how the task of setting up and cleaning up a function call is divided between the caller and the callee
which registers that may be directly used by the callee may sometimes also be included
Architectures almost always have more than one possible calling convention.
By the time that line is complete var will hold the value returned by fun() and any memory on the stack used by fun will be gone: "push", "pop" all tidy.
Calling conventions: everything that the compiler organises so that fun can do its work. Consider those parameters x, y, z. What order do they get pushed onto the stack (indeed do they get passed via the stack)? Doesn't matter so long as the caller and callee agree! It's a convention.