I want to retrieve the parameters from the program stack by their consecutive relative positions in memory, but it fails. Any idea what is wrong with the following snippet of code?
#include <stdio.h>
int __cdecl sum(int num, ...) {
int* p = &num + 1;
int ret = 0;
while(num--) {
ret += *p++;
}
return ret;
}
int __cdecl main(int argc, char** argv) {
printf("%d\n", sum(3, 1, 2, 3)); // wrong result!
return 0;
}
Compiled with the following command:
clang sum.c -g -O0 -o sum
This depends so much on the architecture that no one answer can tell you how to do this, and the advice to use <stdarg.h> is overwhelmingly the proper way to solve the problem.
Even on a given machine, compiler options themselves can change the layout of the stack frame (not to mention the difference between 32-bit and 64-bit code generation).
However, if you're doing this simply because you're curious and are trying to investigate the architecture - a fine goal in my book - then the best way to noodle around with it is to write some code in C, compile it to assembler, and look at the asm code to see how the parameters are passed.
// stack.c
#include <stdio.h>
int sum(int count, ...)
{
int sum = 0;
// stuff
return sum;
}
int main()
{
printf("%d\n", sum(1, 1));
printf("%d\n", sum(2, 1, 2));
printf("%d\n", sum(3, 1, 2, 3));
printf("%d\n", sum(4, 1, 2, 3, 4));
printf("%d\n", sum(10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
return 0;
}
On Linux, you can use either clang -S stack.c or gcc -S stack.c to produce a readable stack.s assembler output file, and with Visual C on Windows, cl /Fa stack.c will produce stack.asm.
Now you can look at the different flavors of how the data is passed. When I try this on my Windows system, the 32-bit compiler does a routine push/push/push prior to the call (a layout you could plausibly walk through yourself), while the 64-bit compiler does all kinds of things with registers that you won't manage yourself.
So, unless you're just curious about assembly language and architectures, I urge you to use <stdarg.h> facilities to get this done.
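For reference, a minimal sketch of what the <stdarg.h> approach could look like for the sum function in the question. The name sum_ints is hypothetical, and the sketch assumes the first argument gives the number of int arguments that follow:

```c
#include <stdarg.h>

/* sum_ints is a hypothetical name; count is the number of int arguments that follow. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);           /* position args at the first variadic argument */
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);  /* fetch the next int, wherever the ABI put it */
    va_end(args);

    return total;
}
```

Because va_arg hides whether an argument arrived in a register or on the stack, the same source should behave the same in 32-bit and 64-bit builds.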
EDIT responding to the OP's comment:
I declare __cdecl as a suffix of each function, it should follow the cdecl calling convention, meaning the parameters will be passed into the stack from right to left, and also the caller has responsibility to clean the stack afterward.
This is a common implementation, but this is not what cdecl means; it means "use the calling convention compatible with how C does things", and though passing arguments on the stack is one of them, it's not the only way.
There is no requirement that arguments be passed left or right.
There is no requirement that the stack grows down.
There is no requirement that arguments be passed on the stack at all.
On my 64-bit system, the assembly language for the 10-parameter call looks like:
; Line 20
mov DWORD PTR [rsp+80], 10
mov DWORD PTR [rsp+72], 9
mov DWORD PTR [rsp+64], 8
mov DWORD PTR [rsp+56], 7
mov DWORD PTR [rsp+48], 6
mov DWORD PTR [rsp+40], 5
mov DWORD PTR [rsp+32], 4
mov r9d, 3
mov r8d, 2
mov edx, 1
mov ecx, 10
call sum
mov edx, eax
lea rcx, OFFSET FLAT:$SG8911
call printf
The first four parameters are passed in registers (ecx, edx, r8d, r9d); the rest are stored on the stack.
What cdecl really means is the caller has to clean up the parameters (however passed) after the function returns rather than the callee doing so. Callee-cleanup is a bit more efficient, but that does not support variable-length parameters.
This is platform dependent.
But there are some standard library facilities for C that will help you.
Have a look at <stdarg.h>.
There you will find a type and a few macros to help with decoding:
va_start Initializes the va_list
va_end Tidies up the va_list
va_arg Returns the current argument and moves to the next one in the va_list
va_list Keeps track of the current spot.
#include <stdarg.h>

int __cdecl sum(int num, ...)
{
    int loop;
    int total = 0;
    va_list vl;
    va_start(vl, num);              // vl now refers to the first argument after num
    for (loop = 0; loop < num; ++loop)
    {
        total += va_arg(vl, int);   // Get the current argument and move on.
    }
    va_end(vl);                     // Tidy up
    return total;
}
I tried to set an external variable and get its value afterwards, but the value I got was not correct. Do external variables always need to be volatile when compiled with gcc?
The code is as follows (updated with the complete code; the previous access to the memory address 0x00100000 has been changed to the variable another):
main.c
extern unsigned int ex_var;
extern unsigned int another;
int main ()
{
ex_var = 56326;
unsigned int *pos=&ex_var+16;
for (int i = 0; i < 6; i++ )
{
*pos++ = 1;
}
another = ex_var;
}
another.c
unsigned int ex_var; // the address of this variable is set to right before valid_address
unsigned int valid_address[1024]; // to indicate valid memory address
unsigned int another;
And the value set to another is not 56326.
Note: another.c may look strange; it is there to indicate that the memory region after ex_var is valid. For the actual running example on bare metal, please refer to this post.
Here is the disassembly of main.c. It is compiled with x86-64 gcc 12.2 with -O1 option:
main:
mov eax, OFFSET FLAT:ex_var+64
.L2:
add rax, 4
mov DWORD PTR [rax-4], 1
cmp rax, OFFSET FLAT:ex_var+88
jne .L2
mov eax, DWORD PTR ex_var[rip]
mov DWORD PTR another[rip], eax
mov eax, 0
ret
It can be seen that the code that sets the external variable ex_var has been optimized out.
I tried several versions of gcc, including x86-64 gcc, x86 gcc, arm64 gcc, and arm gcc, and it seems that all tested gcc versions above 8.x have this issue. Note that optimization level -O1 or above is needed to reproduce it.
The code can be found at this link at Compiler Explorer.
Update:
This bug in the above code is not related to external references.
Here is another example code that has the same bug. Note that it should be compiled with -O1 or above. You can try it at Compiler Explorer.
#include <stdio.h>
unsigned int var;
// In an embedded environment, linker scripts (LD files) can be used to place valid_address right after var
volatile unsigned int valid_address[1024];
int main ()
{
var = 56326;
unsigned int *ttb=&var;
ttb += 16;
for (int i = 0; i < 8; i++ )
{
*ttb++ = 1;
}
valid_address[0] = var;
printf("Value is: %d", valid_address[0]);
}
If you compile this code with gcc like
gcc -O1 main1.c
and execute this code, you might get the following output:
Value is: 0
Which is not correct.
The calculation &ex_var+16 is not defined by the C standard (because it only defines pointer arithmetic within an object, including to the address just beyond its end) and the assignment *pos++ = 1 is not defined by the C standard (because, for the purposes of the standard, pos does not point to an object). When there is behavior not defined by the C standard on a code path, the standard does not define any behavior on the code path.
You can make the behavior defined, to the extent the compiler can see, by declaring ex_var as an array of unknown size, so that the address calculation and the assignments would be defined if this translation unit were linked with another that defined ex_var to be an array of sufficient size:
extern unsigned int ex_var[];
int main ()
{
ex_var[0] = 56326;
unsigned int *pos = ex_var+16;
for (int i = 0; i < 6; i++ )
{
*pos++ = 1;
}
*(volatile unsigned int*)(0x00100000) = ex_var[0];
}
(Note that *(volatile unsigned int*)(0x00100000) = remains not defined by the C standard, but GCC is intended for some use in bare-metal environments and appears to work with this. Additional compilation switches might be necessary to ensure it is defined for GCC’s purposes.)
This yields assembly that sets ex_var[0] and uses it in the assignment to 0x00100000:
main:
mov DWORD PTR ex_var[rip], 56326
…
mov eax, DWORD PTR ex_var[rip]
mov DWORD PTR ds:1048576, eax
mov eax, 0
ret
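A minimal sketch of the same idea inside a single translation unit, assuming the variable and the memory after it can be declared together as one array. The names region and fill_after_first are hypothetical:

```c
/* Hypothetical layout: element 0 plays the role of ex_var, and the rest
   is the valid memory after it, all inside one array object. */
static unsigned int region[1 + 1024];

/* Writes past element 0 but stays inside the array, so the pointer
   arithmetic and the stores are defined by the C standard. */
unsigned int fill_after_first(void)
{
    region[0] = 56326;
    unsigned int *pos = region + 16;
    for (int i = 0; i < 6; i++)
    {
        *pos++ = 1;
    }
    return region[0];   /* the store to region[0] must be preserved */
}
```

Since every access stays within one object, the compiler has no licence to drop the first store, whatever the optimization level.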
In a C program, there is a swap function that takes a parameter called x. I expect changing x inside swap to change the value it returns to main.
When I pass a variable as the argument, I get the result I want, but when I pass an integer literal directly, the program produces random output.
#include <stdio.h>
int swap (int x) {
x = 20;
}
int main(void){
int y = 100;
int a = swap(y);
printf ("Value: %d", a);
return 0;
}
Output of this code: 100 (As I wanted)
But this code:
#include <stdio.h>
int swap (int x) {
x = 20;
}
int main(void){
int a = swap(100);
printf ("Value: %d", a);
return 0;
}
returns random values such as Value: 779964766 or Value: 1727975774.
In both programs I pass an integer value to the function, even the same value, so why are the outputs different?
First of all, C functions are call-by-value: the int x arg in the function is a copy. Modifying it doesn't modify the caller's copy of whatever they passed, so your swap makes zero sense.
Second, you're using the return value of the function, but you don't have a return statement. In C (unlike C++), it's not undefined behaviour for execution to fall off the end of a non-void function (for historical reasons, from before void existed, when function return types defaulted to int). But it is still undefined behaviour for the caller to use a return value when the function didn't return one.
In this case, returning 100 was the effect of the undefined behaviour (of using the return value of a function where execution falls off the end without a return statement). This is a coincidence of how GCC compiles in debug mode (-O0):
GCC -O0 likes to evaluate non-constant expressions in the return-value register, e.g. EAX/RAX on x86-64. (This is actually true for GCC across architectures, not just x86-64). This actually gets abused on codegolf.SE answers; apparently some people would rather golf in gcc -O0 as a language than ANSI C. See this "C golfing tips" answer and the comments on it, and this SO Q&A about why i=j inside a function puts a value in RAX. Note that it only works when GCC has to load a value into registers, not just do a memory-destination increment like add dword ptr [rbp-4], 1 for x++ or whatever.
In your case (with your code compiled by GCC10.2 on the Godbolt compiler explorer)
int y=100; stores 100 directly to stack memory (the way GCC compiles your code).
int a = swap(y); loads y into EAX (for no apparent reason), then copies to EDI to pass as an arg to swap. Since GCC's asm for swap doesn't touch EAX, after the call, EAX=y, so effectively the function returns y.
But if you call it with swap(100), GCC doesn't end up putting 100 into EAX while setting up the args.
The way GCC compiles your swap, the asm doesn't touch EAX, so whatever main left there is treated as the return value.
main:
...
mov DWORD PTR [rbp-4], 100 # y=100
mov eax, DWORD PTR [rbp-4] # load y into EAX
mov edi, eax # copy it to EDI (first arg-passing reg)
call swap # swap(y)
mov DWORD PTR [rbp-8], eax # a = EAX as the retval = y
...
But with your other main:
main:
... # nothing that touches EAX
mov edi, 100
call swap
mov DWORD PTR [rbp-4], eax # a = whatever garbage was there on entry to main
...
(The later ... reloads a as an arg for printf, matching the ISO C semantics because GCC -O0 compiles each C statement to a separate block of asm; thus the later ones aren't affected by the earlier UB (unlike in the general case with optimization enabled), so do just print whatever's in a's memory location.)
The swap function compiles like this (again, GCC10.2 -O0):
swap:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-4], 20
nop
pop rbp
ret
Keep in mind none of this has anything to do with valid portable C. This (using garbage left in memory or registers) is one of the kinds of things you see in practice from C that invokes undefined behaviour, but certainly not the only thing. See also What Every C Programmer Should Know About Undefined Behavior from the LLVM blog.
This answer is just answering the literal question of what exactly happened in asm. (I'm assuming un-optimized GCC because that easily explains the result, and x86-64 because that's a common ISA, especially when people forget to mention any ISA.)
Other compilers are different, and GCC will be different if you enable optimization.
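To make the call-by-value point concrete, here is a small sketch with well-defined behaviour. All four function names are hypothetical and exist only for this illustration:

```c
/* Receives a copy of the caller's value; the assignment changes only the copy. */
void set_copy(int x)
{
    x = 20;
    (void)x;   /* silence "set but not used" warnings */
}

/* Receives the address of the caller's variable and writes through it. */
void set_through_pointer(int *x)
{
    *x = 20;
}

/* Returns the caller-side value after a by-value call. */
int value_after_set_copy(void)
{
    int y = 100;
    set_copy(y);
    return y;      /* still 100: only the copy was modified */
}

/* Returns the caller-side value after a call through a pointer. */
int value_after_set_through_pointer(void)
{
    int y = 100;
    set_through_pointer(&y);
    return y;      /* now 20: the callee wrote to y itself */
}
```

Neither helper relies on whatever happens to be left in a register, so the results do not depend on the compiler or the optimization level.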
You need to either return a value or use a pointer.
Using a return value:
#include <stdio.h>
int swap (int x) {
return 20;
}
int main(void){
int a = swap(100);
printf ("Value: %d", a);
return 0;
}
Using a pointer:
#include <stdio.h>
void swap (int* x) {
(*x) = 20;
}
int main(void){
int a;
swap(&a);
printf ("Value: %d", a);
return 0;
}
I have an assembly application for Linux x64 where I pass arguments to the functions via registers, thus I'm using a certain calling convention, in this case fastcall. Now I want to call a C function from the assembly application which, say, expects 10 arguments. Do I have to switch to cdecl for that and pass the arguments via the stack, regardless of the fact that everywhere else in my application I'm passing them via registers? Is it allowed to mix calling conventions in one application?
I assume that by fastcall, you mean the amd64 calling convention used by the SysV ABI (i.e. what Linux uses) where the first few arguments are passed in rdi, rsi, and rdx.
The ABI is slightly complicated, the following is a simplification. You might want to read the specification for details.
Generally speaking, the first few (leftmost) integer or pointer arguments are placed into the registers rdi, rsi, rdx, rcx, r8, and r9. Floating point arguments are passed in xmm0 to xmm7. If the register space is exhausted, additional arguments are passed through the stack from right to left. For example, to call a function with 10 integer arguments:
foo(a, b, c, d, e, f, g, h, i, k);
you would need code like this:
mov $a,%edi
mov $b,%esi
mov $c,%edx
mov $d,%ecx
mov $e,%r8d
mov $f,%r9d
push $k
push $i
push $h
push $g
call foo
add $32,%rsp
For your concrete example, of getnameinfo:
int getnameinfo(
const struct sockaddr *sa,
socklen_t salen,
char *host,
size_t hostlen,
char *serv,
size_t servlen,
int flags);
You would pass sa in rdi, salen in rsi, host in rdx, hostlen in rcx, serv in r8, servlen in r9 and flags on the stack.
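One way to check the register/stack split described above is to let the compiler generate the call and read its output. The sketch below assumes an x86-64 System V target; foo10 and call_foo10 are hypothetical names used only for inspection:

```c
/* foo10 is a hypothetical 10-argument function used only to inspect the generated call. */
long foo10(long a, long b, long c, long d, long e,
           long f, long g, long h, long i, long k)
{
    return a + b + c + d + e + f + g + h + i + k;
}

/* On x86-64 System V, a..f are expected in rdi, rsi, rdx, rcx, r8, r9,
   and g..k are expected on the stack. */
long call_foo10(void)
{
    return foo10(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
}
```

Compiling this with gcc -S and without optimization (so the call is not inlined) should show the argument setup in call_foo10, which can then be mirrored in hand-written assembly.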
Yes of course. Calling convention is applied on per-function basis. This is a perfectly valid application:
int __stdcall func1()
{
return(1);
}
int __fastcall func2()
{
return(2);
}
int __cdecl main(void)
{
func1();
func2();
return(0);
}
You can, but you don't need to.
__attribute__((fastcall)) only asks for the first two parameters to be passed in registers - everything else will automatically be passed on the stack anyhow, just like with cdecl. This is done in order to not limit the number of parameters that can be given to a function by choosing a certain calling convention.
In your example with 10 parameters for a function that is called with the fastcall calling convention, the first two parameters will be passed in registers, the remaining 8 automatically on the stack, just like with standard calling convention.
As you have chosen to use fastcall for all your other functions, I do not see a reason why you'd want to change this for one specific function.
I am a beginner at assembly, and I am curious to know what the stack frame looks like here, so that I can access the argument by understanding rather than by rote.
P.S.: the assembly function is process
#include <stdio.h>
# define MAX_LEN 120 // Maximal line size
extern int process(char*);
int main(void) {
char buf[MAX_LEN];
int str_len = 0;
printf("Enter a string:");
fgets(buf, MAX_LEN, stdin);
str_len = process(buf);
So, I know that when I want to access the process function's argument, which is in assembly, I have to do the following:
push ebp
mov ebp, esp ; now ebp is pointing to the same address as esp
pushad
mov ebx, dword [ebp+8]
Now I also would like someone to correct me on things I think are correct:
At the start, esp is pointing to the return address of the function, and [esp+8] is the slot in the stack under it, which is the function's argument
Since the function process has one argument and no inner declarations (not sure about the declarations) then the stack frame, from high to low, is 8 bytes for the argument, 8 bytes for the return address.
Thank you.
There's no way to tell other than by means of a debugger. You are using ia32 conventions (ebp, esp) instead of x64 (rbp, rsp), but expecting int / addresses to be 64-bit. It's possible, but not likely.
Compile the program (gcc -O -g foo.c), then run with gdb a.out
#include <stdio.h>
int process(char* a) { printf("%p", (void*)a); return 0; }
int main()
{
process((char *)0xabcd1234);
}
Break at process; run; disassemble; inspect registers values and dump the stack.
- break process
- run
- disassemble
- info frame
- info args
- info registers
- x/32x $sp - 16 // dump 32 words of the stack, starting 16 bytes below the stack pointer
Then add more parameters, a second subroutine or local variables with known values. Single step to the printf routine. What does the stack look like there?
You can also use gdb as calculator: what is the difference in between sp and rax ?
It's print $sp - $rax if you ever want to know.
Tickle your compiler to produce assembler output (on Unixy systems usually with the -S flag). Play around with debugging/non-debugging flags; the extra hints for the debugger might help in referring back to the source. Don't give optimization flags; the reorganizing done by the compiler can lead to thorough confusion. Add a simple function call to your code to see how it is set up and torn down too.
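As a starting point for that kind of experiment, here is a sketch with easily recognizable constants, so arguments and locals stand out in a stack dump or in the -S output. The function names and the constants are arbitrary choices for illustration:

```c
/* Hypothetical callee: two arguments and one local with a recognizable value. */
int inner(int a, int b)
{
    volatile int marker = 0x1234;   /* volatile forces the local to live in memory */
    return a + b + marker;
}

/* Hypothetical caller: passes recognizable constants and keeps its own local. */
int outer(void)
{
    volatile int local = 0x5678;
    return inner(0x1111, 0x2222) + local;
}
```

Call outer from main, break in inner under gdb, and look for 0x1111, 0x2222, 0x1234 and 0x5678 in the registers and in the stack dump.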
I was reading some questions and answers on here and kept coming across this suggestion, but I noticed no one ever explained exactly what you need to do, on Windows, using Intel and the GCC compiler. Commented below is exactly what I am trying to do.
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
//assembly code begin
/*
push x into stack; < Need Help
x=y; < With This
pop stack into y; < Please
*/
//assembly code end
printf("x=%d,y=%d",x,y);
getchar();
return 0;
}
You can't just push/pop safely from inline asm if it's going to be portable to systems with a red-zone. That includes every non-Windows x86-64 platform. (There's no way to tell gcc you want to clobber it.) Well, you could add rsp, -128 first to skip past the red-zone before pushing/popping anything, then restore it later. But then you can't use an "m" constraint, because the compiler might use RSP-relative addressing with offsets that assume RSP hasn't been modified.
But really this is a ridiculous thing to be doing in inline asm.
Here's how you use inline-asm to swap two C variables:
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
asm("" // no actual instructions.
: "=r"(y), "=r"(x) // request both outputs in the compiler's choice of register
: "0"(x), "1"(y) // matching constraints: request each input in the same register as the other output
);
// apparently "=m" doesn't compile: you can't use a matching constraint on a memory operand
printf("x=%d,y=%d\n",x,y);
// getchar(); // Set up your terminal not to close after the program exits if you want similar behaviour: don't embed it into your programs
return 0;
}
gcc -O3 output (targeting the x86-64 System V ABI, not Windows) from the Godbolt compiler explorer:
.section .rodata
.LC0:
.string "x=%d,y=%d"
.section .text
main:
sub rsp, 8
mov edi, OFFSET FLAT:.LC0
xor eax, eax
mov edx, 1
mov esi, 2
#APP
# 8 "/tmp/gcc-explorer-compiler116814-16347-5i3lz1/example.cpp" 1
# I used "\n" instead of just "" so we could see exactly where our inline-asm code ended up.
# 0 "" 2
#NO_APP
call printf
xor eax, eax
add rsp, 8
ret
C variables are a high level concept; it doesn't cost anything to decide that the same registers now logically hold different named variables, instead of swapping the register contents without changing the varname->register mapping.
When hand-writing asm, use comments to keep track of the current logical meaning of different registers, or parts of a vector register.
The inline-asm didn't lead to any extra instructions outside the inline-asm block either, so it's perfectly efficient in this case. Still, the compiler can't see through it, and doesn't know that the values are still 1 and 2, so further constant-propagation would be defeated. https://gcc.gnu.org/wiki/DontUseInlineAsm
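For comparison, a plain C swap that the compiler can see through. This is a sketch with hypothetical helper names, not something from the question:

```c
/* swap_ints is a hypothetical helper name. */
static inline void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Returns x * 10 + y after the swap, so the result encodes both values. */
int swapped_digits(void)
{
    int x = 1;
    int y = 2;
    swap_ints(&x, &y);
    return x * 10 + y;
}
```

With optimization enabled, a compiler is free to fold swapped_digits down to a constant, which is the kind of constant-propagation that an inline-asm block would block.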
#include <stdio.h>
int main()
{
int x=1;
int y=2;
printf("x::%d,y::%d\n",x,y);
__asm__( "movl %1, %%eax;"
"movl %%eax, %0;"
:"=r"(y)
:"r"(x)
:"%eax"
);
printf("x::%d,y::%d\n",x,y);
return 0;
}
/* Load x to eax
Load eax to y */
This copies x into y through EAX; exchanging the two values can be done in the same way. Note that the clobber list tells GCC that EAX is modified. For educational purposes this is fine, but I find it more suitable to leave micro-optimizations to the compiler.
You can use extended inline assembly. It is a compiler feature which allows you to write assembly instructions within your C code. A good reference for inline gcc assembly is available here.
The following code swaps the values of x and y using push and pop instructions.
( compiled and tested using gcc on x86_64 )
This is only safe if compiled with -mno-red-zone, or if you subtract 128 from RSP before pushing anything. It will happen to work without problems in some functions: testing with one set of surrounding code is not sufficient to verify the correctness of something you did with GNU C inline asm.
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
asm volatile (
"pushq %%rax\n" /* Push x into the stack */
"movq %%rbx, %%rax\n" /* Copy y into x */
"popq %%rbx\n" /* Pop x into y */
: "=b"(y), "=a"(x) /* OUTPUT values */
: "a"(x), "b"(y) /* INPUT values */
: /*No need for the clobber list, since the compiler knows
which registers have been modified */
);
printf("x=%d,y=%d",x,y);
getchar();
return 0;
}
Result x=2 y=1, as you expected.
The Intel compiler works in a similar way; I think you just have to change the keyword asm to __asm__. You can find info about inline assembly for the Intel compiler here.