I'm interested in systems programming and want to see how structs are implemented in assembly and how they are linked.
I've written three short .c files, with same-named structs in different files, compiled them and linked them together, but I can't understand the output.
I believed that a struct is just a contiguous block of memory in assembly, in the data segment. But I can't access the values after the first int member, either by using pointers or by having each function use its own file's offsets. Can someone explain the output?
I tried changing the data types and using chars to avoid padding or endianness problems. I also tried using a char* pointer to print the entire structure, which surprisingly gives me weird values (not random, since they are the same on every execution).
#include <stdio.h>
struct S1
{
int i1;
int i2;
int i3;
int i4;
int i5;
int i6;
};
int main()
{
struct S1 s;
s.i1 = 5;
s.i2 = 10;
s.i3 = 15;
s.i4 = 16;
func1(s); //Implicit calls, no declarations needed
// because linker will know where to find defs
func2(s);
}
#include <stdio.h>
struct S1
{
int i1;
int i2;
float i3;
};
void func1(struct S1 s)
{
printf("In func1 : %d %d %f %lu\n", s.i1, s.i2, s.i3, sizeof(s));
};
#include <stdio.h>
struct S3
{
double i1;
int i2;
};
void func2(struct S3 s)
{
printf("In func2 : %lf %d %lu \n", s.i1, s.i2, sizeof(s));
};
I'm getting outputs:
In func1 : 1 0 0.000000 12
In func2 : 0.000000 1 16
I expected the values I assigned in main to be printed.
As GCC 9.2 compiles your file with main, we see these instructions for the call to func1:
mov rdx, QWORD PTR [rbp-16]
mov rax, QWORD PTR [rbp-8]
mov rdi, rdx
mov rsi, rax
mov eax, 0
call func1
Note that the compiler has loaded data into general registers—rsi, rdi, and so on. Compare this to the instructions in func1:
mov rdx, rdi
movd eax, xmm0
mov QWORD PTR [rbp-16], rdx
mov DWORD PTR [rbp-8], eax
movss xmm0, DWORD PTR [rbp-8]
In particular, note the movss instruction. That is attempting to retrieve the float i3 member from the xmm0 register. But, as we have seen, the calling routine did not put anything in the xmm0 register.
The compiler has a specification for how arguments are passed between routines. This is called an Application Binary Interface. (The ABI is shared by software intended to be compatible on a particular platform and is often recommended by the processor manufacturer.) For small structures, at least in this case, the compiler does not pass them by pointing to them in memory or reproducing their exact layout. Instead, the members are passed individually, as if they were separate arguments.
Because of this, your code is not studying how structures are laid out in memory. It is studying how structures are passed in function calls. And part of the answer is that members are sometimes passed individually, and where they are passed depends in part on their types. Because main is passing only integer data, it uses registers for integer arguments. Because func1 is expecting some floating-point data, it looks in a register for that. The result is that func1 never gets the data that is passed for i3 and i4.
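If the goal is to look at how a struct is laid out in memory rather than how it is passed, one option is to inspect a single definition with offsetof and pass a pointer instead of the struct itself. The following is a minimal sketch under those assumptions; print_layout and read_i3 are hypothetical names that do not appear in the question, and the offsets printed depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One definition, meant to be shared by every file that uses the struct. */
struct S1 {
    int   i1;
    int   i2;
    float i3;
};

/* The first member always sits at offset 0. The other offsets are
   ABI-dependent, so they are printed below rather than assumed. */
static_assert(offsetof(struct S1, i1) == 0, "first member starts the struct");

/* Print the size of the struct and the byte offset of each member. */
void print_layout(void)
{
    printf("sizeof(struct S1) = %zu\n", sizeof(struct S1));
    printf("offset of i1 = %zu\n", offsetof(struct S1, i1));
    printf("offset of i2 = %zu\n", offsetof(struct S1, i2));
    printf("offset of i3 = %zu\n", offsetof(struct S1, i3));
}

/* Pointer parameter: the callee reads the member from memory
   at its offset from the address it was given. */
float read_i3(const struct S1 *s)
{
    return s->i3;
}
```

With a pointer argument only the address travels in a register, so caller and callee both find i3 at the same offset from it, provided they were compiled with the same definition.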
While Eric's answer is good, I'd like to highlight one additional misunderstanding your question contains:
func1(s); //Implicit calls, no declarations needed
The comment is incorrect. C always requires declarations of functions. It does allow you to declare it without a prototype, as:
void func1();
However, to call it, the promoted type of the arguments must match the actual type in the function's definition (even if that definition is in another translation unit). If it does not match, the behavior is undefined.
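A sketch of what a matching declaration could look like, assuming the struct definition and the prototype are moved into one shared header (shown inline here so that the example is a single file). func1_sum and call_func1 are made-up names for illustration only.

```c
#include <assert.h>

/* Would normally live in a shared header included by every file that uses it. */
struct S1 {
    int   i1;
    int   i2;
    float i3;
};

/* Prototype: the compiler now knows the parameter type at every call site. */
float func1_sum(struct S1 s);

/* Caller: builds the struct and passes it by value. */
float call_func1(void)
{
    struct S1 s = { 5, 10, 15.0f };
    float total = func1_sum(s);
    assert(total == 30.0f);   /* 5 + 10 + 15, exactly representable as float */
    return total;
}

/* Definition: same struct type as the prototype, so caller and callee
   agree on which registers or stack slots carry each member. */
float func1_sum(struct S1 s)
{
    return (float)s.i1 + (float)s.i2 + s.i3;
}
```

Because both sides see the same type, the compiler is free to pass the members however the ABI says, and the values still arrive intact.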
I'm pretty new to C, and I know that static functions can only be used within the same object file.
Something that still confuses me though, is how if I hover over a call to printf in my IDE it tells me that printf is a static function, when I can perfectly use printf in multiple object files without any problems?
Why is that?
Edit: I'm using Visual Studio Code the library is stdio.h and compiling using GCC
#include <stdio.h>
int main()
{
printf("Hello, world!");
return 0;
}
Hovering over printf would give that hint
Edit 2: if static inline functions are different from inline functions how so? I don't see how making a function inline would change the fact that it's only accessible from the same translation unit
Edit 3: per request, here's the definition of printf in the stdio.h header
__mingw_ovr
__attribute__((__format__ (gnu_printf, 1, 2))) __MINGW_ATTRIB_NONNULL(1)
int printf (const char *__format, ...)
{
int __retval;
__builtin_va_list __local_argv; __builtin_va_start( __local_argv, __format );
__retval = __mingw_vfprintf( stdout, __format, __local_argv );
__builtin_va_end( __local_argv );
return __retval;
}
screenshot of the printf definition in my IDE
Thanks for all the updates. Your IDE is showing you implementation details of your C library that you're not supposed to have to worry about. That is arguably a bug in your IDE but there's nothing wrong with the C library.
You are correct to think that a function declared static is visible only within the translation unit where it is defined. And that is true whether or not the function is also marked inline. However, this particular static inline function is defined inside stdio.h, so every translation unit that includes stdio.h can see it. (There is a separate copy of the inline in each TU. This is technically a conformance violation since, as chux surmises, it means that &printf in one translation unit will not compare equal to &printf in another. However, this is very unlikely to cause problems for a program that isn't an ISO C conformance tester.)
As "Adrian Mole" surmised in the comments on the question, the purpose of this inline function is, more or less, to rewrite
printf("%s %d %p", string, integer, pointer);
into
fprintf(stdout, "%s %d %p", string, integer, pointer);
(All the stuff with __builtin_va_start and __mingw_vfprintf is because of limitations in how variadic functions work in C. The effect is the same as what I showed above, but the generated assembly language will not be nearly as tidy.)
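For reference, the same forwarding pattern can be written with the standard va_list macros instead of the compiler builtins. This is only a sketch of the idea, not the MinGW source; my_printf is a hypothetical name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper that forwards its variable arguments to vfprintf. */
int my_printf(const char *format, ...)
{
    assert(format != NULL);

    va_list args;
    va_start(args, format);                        /* start after the last named parameter */
    int written = vfprintf(stdout, format, args);  /* the v* variant accepts a va_list */
    va_end(args);
    return written;                                /* character count, or negative on error */
}
```

A call such as my_printf("%s %d\n", "abc", 42) then behaves like fprintf(stdout, "%s %d\n", "abc", 42). A variadic function cannot hand its "..." directly to another variadic function, which is why the va_list detour is needed.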
Update 2022-12-01: It's worse than "the generated assembly language will not be nearly as tidy". I experimented with all of the x86 compilers supported by godbolt.org and none of them will inline a function that takes a variable number of arguments, even if you try to force it. Most silently ignore the force-inlining directive; GCC gets one bonus point for actually saying it refuses to do this:
test.c:4:50: error: function ‘xprintf’ can never be inlined
because it uses variable argument lists
In consequence, every program compiled against this version of MinGW libc, in which printf is called from more than one .c file, will have multiple copies of the following glob of assembly embedded in its binary. This is bad just because of cache pollution; it would be better to have a single copy in the C library proper -- which is exactly what printf normally is.
printf:
mov QWORD PTR [rsp+8], rcx
mov QWORD PTR [rsp+16], rdx
mov QWORD PTR [rsp+24], r8
mov QWORD PTR [rsp+32], r9
push rbx
push rsi
push rdi
sub rsp, 48
mov rdi, rcx
lea rsi, QWORD PTR f$[rsp+8]
mov ecx, 1
call __acrt_iob_func
mov rbx, rax
call __local_stdio_printf_options
xor r9d, r9d
mov QWORD PTR [rsp+32], rsi
mov r8, rdi
mov rdx, rbx
mov rcx, QWORD PTR [rax]
call __stdio_common_vfprintf
add rsp, 48
pop rdi
pop rsi
pop rbx
ret 0
(This is probably not exactly the assembly you would get: it is what godbolt's "x64 MSVC 19.latest" produces with optimization option /O2, since godbolt doesn't seem to have any MinGW compilers. But you get the idea.)
__mingw_ovr is normally defined as
#define __mingw_ovr static __attribute__ ((__unused__)) __inline__ __cdecl
It appears that this definition of printf is a violation of the C standard. The standard says
7.1.2 Standard headers
6 Any declaration of a library function shall have external linkage
x86 Function Attributes in the GCC documentation says this:
On 32-bit and 64-bit x86 targets, you can use an ABI attribute to indicate which calling convention should be used for a function. The ms_abi attribute tells the compiler to use the Microsoft ABI, while the sysv_abi attribute tells the compiler to use the System V ELF ABI, which is used on GNU/Linux and other systems. The default is to use the Microsoft ABI when targeting Windows. On all other systems, the default is the System V ELF ABI.
But consider this C code:
#include <assert.h>
#ifdef _MSC_VER
#define MS_ABI
#else
#define MS_ABI __attribute__((__ms_abi__))
#endif
typedef struct {
void *x, *y;
} foo;
static_assert(sizeof(foo) == 8, "foo must be an 8-byte structure");
foo MS_ABI f(void *x, void *y) {
foo rv;
rv.x = x;
rv.y = y;
return rv;
}
gcc -O2 -m32 compiles it to this:
f:
mov eax, DWORD PTR [esp+4]
mov edx, DWORD PTR [esp+8]
mov DWORD PTR [eax], edx
mov edx, DWORD PTR [esp+12]
mov DWORD PTR [eax+4], edx
ret
But cl /O2 compiles it to this:
_x$ = 8 ; size = 4
_y$ = 12 ; size = 4
_f PROC ; COMDAT
mov eax, DWORD PTR _x$[esp-4]
mov edx, DWORD PTR _y$[esp-4]
ret 0
_f ENDP
These are clearly using incompatible calling conventions. Argument Passing and Naming Conventions on MSDN says this:
Return values are also widened to 32 bits and returned in the EAX register, except for 8-byte structures, which are returned in the EDX:EAX register pair. Larger structures are returned in the EAX register as pointers to hidden return structures.
Which means MSVC is correct. So why is GCC using the pointer-to-hidden-return-structure approach even though the return value is an 8-byte structure? Is this a bug in GCC, or am I not allowed to use ms_abi like I think I am?
That seems buggy to me. i386 SysV only returns int64_t in registers, not structs of the same size, but perhaps GCC forgot to take that into account for ms_abi. Same problem with gcc -mabi=ms (docs).
Even -mabi=ms doesn't in general change struct layouts in cases where that differs, or for x86-64 make long a 32-bit type. But your struct does have the same layout in both ABIs, so you'd expect it to get returned the way an MSVC caller wants. But that's not happening.
This is IIRC not the first time I've heard of bugs in __attribute__((ms_abi)) or -mabi=ms. But you are using it correctly; in 64-bit code that would make the difference to which arg-passing registers it looked in.
Clang makes the same asm as GCC, but that's not significant because it warns that the 'ms_abi' calling convention is not supported for this target. This is the asm we'd expect from i386 SysV. (And Godbolt doesn't have clang-cl.)
So thanks for reporting it to GCC as bug #105932.
(Funny that when it chooses to use SSE or AVX, it loads both dword stack args separately and shuffles them together, instead of a movq load. I guess that avoids likely store-forwarding stalls if the caller didn't use a single 64-bit store to write the args, though.)
I am learning C and wrote the following code:
#include <stdio.h>
int main()
{
double a = 2.5;
say(a);
}
void say(int num)
{
printf("%u\n", num);
}
When I compile this program, the compiler gives following warnings:
test.c: In function ‘main’:
test.c:6:2: warning: implicit declaration of function ‘say’ [-Wimplicit-function-declaration]
6 | say(a);
| ^~~
test.c: At top level:
test.c:9:6: warning: conflicting types for ‘say’
9 | void say(int num)
| ^~~
test.c:6:2: note: previous implicit declaration of ‘say’ was here
6 | say(a);
| ^~~
Running the program unexpectedly leads to a 1 being printed. From my limited understanding, because I did not add a function prototype for the compiler, the compiler implicitly creates one from the function call on line 6, expecting a double as a parameter and warns me about this implicit declaration. But later I define the function with a parameter of type int. The compiler gives me two warnings about the type mismatch.
I expect argument coercion, meaning the double will be converted to an integer. But in that case, the output should be 2, and not a 1. What exactly is going on here?
What exactly is going on here?
From the C standard perspective it's undefined behavior.
What exactly is going on here?
I am assuming you are using the x86_64 architecture. The psABI-x86_64 standard defines how arguments are passed to functions on that architecture. The first double argument is passed in the %xmm0 register, and the first integer argument is passed in the %edi register.
Your compiler most probably produces:
main:
push rbp
mov rbp, rsp
sub rsp, 16
movsd xmm0, QWORD PTR .LC0[rip]
movsd QWORD PTR [rbp-8], xmm0
mov rax, QWORD PTR [rbp-8]
movq xmm0, rax ; set xmm0 to the value of double
mov eax, 1 ; AL = number of vector registers used; needed because the unprototyped callee might be variadic
call say
mov eax, 0
leave
ret
say:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], edi ; read %edi
mov eax, DWORD PTR [rbp-4]
mov esi, eax ; pass value in %edi as argument to printf
mov edi, OFFSET FLAT:.LC1
mov eax, 0
call printf
nop
leave
ret
I.e., main sets %xmm0 to the value of the double, yet say() reads from the %edi register, which your code never sets. Because there is a leftover value of 1 in %edi, most probably from crt0 or similar, your code prints 1.
Edit: The leftover value actually comes from main's arguments. main is really int main(int argc, char *argv[]); because your program is passed no arguments, argc is set to 1 by the startup code, which means that the leftover value in %edi is 1.
Well, you can for example "manually" set the %edi value to some value by calling a function that takes int before calling say. The following code prints the value that I put in func() call.
#include <stdio.h>
int func(int a) { return a; } // returns a value so that func itself is well-formed
int main() {
func(50); // set %edi to 50
double a = 2.5;
say(a);
}
void say(int num) {
printf("%u\n", num); // will print '50', the leftover in %edi
}
I expect argument coercion
If you had declared the function properly, that's what you'd get. But as you correctly pointed out, you didn't declare the function and got an implicit declaration that takes a double as an argument. So when the compiler sees the function call it sees a function call where the argument is a double and the function takes a double. Therefore it has no reason to coerce anything. It just generates the usual code for calling a function with a double as an argument.
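A minimal sketch of the properly declared version, assuming the prototype is placed before main. Two things differ from the question's code and are my own changes for illustration: say returns the value it printed (so the result can be checked), and it uses %d because num is an int. call_say is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Evaluated at compile time: converting 2.5 to int discards the fraction. */
static_assert((int)2.5 == 2, "double-to-int conversion truncates toward zero");

int say(int num);   /* prototype visible before the call */

int call_say(void)
{
    double a = 2.5;
    return say(a);  /* a is converted to int because the prototype says so */
}

int say(int num)
{
    printf("%d\n", num);
    return num;
}
```

With the prototype in scope the compiler inserts the conversion at the call site and places the resulting int in the integer argument register, so the callee reads the value it expects.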
What exactly is going on here?
In terms of the C language, it's undefined behaviour and that's it.
In terms of implementation, what's likely happening is that, as I said, the compiler will generate the usual code for calling a function with a double. On a 64-bit x86 architecture using the usual calling conventions, this will mean putting the value 2.5 into the XMM0 register and then calling the function. The function itself will assume that the argument is an int, so it will read its value from the EDI register (or ECX using Microsoft's calling convention), which is the register used to pass the first integer argument. So the argument is written into one register and then read from a totally different register, so you'll get whatever happened to be in that register.
Still, what exactly would qualify it as [undefined behaviour]?
The fact that you (implicitly) declared the function using one type, but then defined it using another. If the declaration and definition of a function don't match, that causes undefined behaviour.
mov rax,QWORD PTR [rbp-0x10]
mov eax,DWORD PTR [rax]
add eax,0x1
mov DWORD PTR [rbp-0x14], eax
The following lines are written in C and compiled with GCC in a GNU/Linux environment.
The assembly code above is for int b = *a + 1;.
...
int a = 5;
int* ptr = &a;
int b = *a + 1;
Dereferencing what's at the address of a and adding 1 to that. After that, storing the result in a new variable.
What I don't understand is the second line in that assembly code. Does it mean that I cut the QWORD to get a DWORD (one part of the QWORD) and store that into eax?
Since the code is only a few lines long, I would love for it to be broken down step by step, just to confirm that I'm on the right track and to figure out what that second line does. Thank you.
What I don't understand is the second line in that assembly code. Does it mean that I cut the QWORD to get a DWORD (one part of the QWORD) and store that into eax?
No, the 2nd line dereferences it. There's no splitting up of a qword into two dword halves. (Writing EAX zeros the upper 32 bits of RAX).
It just happens to use the same register that it was using for the pointer, because it doesn't need the pointer anymore.
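To connect the four instructions back to C, here is a small sketch that assumes the source was meant to read *ptr and that [rbp-0x10] is the stack slot holding ptr (both are assumptions; the question does not show the full listing). deref_plus_one is a hypothetical name.

```c
#include <assert.h>

int deref_plus_one(void)
{
    int a = 5;
    int *ptr = &a;      /* ptr holds an address, typically 8 bytes on x86-64 */

    /* mov rax, QWORD PTR [rbp-0x10]  : load the 8-byte pointer from its stack slot
       mov eax, DWORD PTR [rax]       : load the 4-byte int stored at that address
       add eax, 0x1                   : add 1 to the loaded int
       mov DWORD PTR [rbp-0x14], eax  : store the result into b's stack slot */
    int b = *ptr + 1;

    assert(b == 6);
    return b;
}
```

The QWORD and the DWORD are two different objects: the first is the pointer itself, the second is the int it points to.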
Compile with optimizations enabled; it's much easier to see what's happening if gcc isn't storing/reloading all the time. (How to remove "noise" from GCC/clang assembly output?)
int foo(int *ptr) {
return *ptr + 1;
}
mov eax, DWORD PTR [rdi]
add eax, 1
ret
int a = 5;
int* ptr = &a;
int b = *a + 1;
Your example will not compile at all as written: a is an int, not a pointer, so it cannot be the operand of unary *. And even once the integer value (in this case 5) is converted to a pointer, dereferencing it is undefined behaviour.
To make it work you need to cast it first.
int b = *(int *)a + 1;
https://godbolt.org/g/Yo8dd1
Explanation of your assembly code:
line 1: loads rax with the value of a (in this case 5)
line 2: dereferences this value (reads from address 5, so you will probably get a segmentation fault). This code loads from the stack only because you use the -O0 option.
This question is about the difference between volatile and extern variables, and about compiler optimization.
An extern variable is defined in the main file and used in one other source file, like this:
ExternTest.cpp:
short ExtGlobal;
void Fun();
int _tmain(int argc, _TCHAR* argv[])
{
ExtGlobal=1000;
while (ExtGlobal < 2000)
{
Fun();
}
return 0;
}
Source1.cpp:
extern short ExtGlobal;
void Fun()
{
ExtGlobal++;
}
The assembly generated for this by VS2012 is below:
ExternTest.cpp assembly for accessing the external variable
ExtGlobal=1000;
013913EE mov eax,3E8h
013913F3 mov word ptr ds:[01398130h],ax
while (ExtGlobal < 2000)
013913F9 movsx eax,word ptr ds:[1398130h]
01391400 cmp eax,7D0h
01391405 jge wmain+3Eh (0139140Eh)
Source1.cpp assembly for modifying the extern variable
ExtGlobal++;
0139145E mov ax,word ptr ds:[01398130h]
01391464 add ax,1
01391468 mov word ptr ds:[01398130h],ax
From the above assembly, every access to the variable "ExtGlobal" in the while loop reads the value from the corresponding address. If I add volatile to the external variable, the same assembly code is generated. So volatile usage across two different threads and extern variable usage across two different functions appear to be the same.
Asking about extern and volatile is like asking about peanuts and gorillas. They're completely unrelated.
extern is used simply to tell the compiler, "Hey, don't expect to find the definition of this symbol in this C file. Let the linker fix it up at the end."
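As a rough single-file sketch of that declaration/definition split (in the question the two halves live in separate source files; they are merged here only so the example stands alone, and run_loop is a hypothetical stand-in for the question's main):

```c
#include <assert.h>

extern short ExtGlobal;   /* declaration only: no storage is reserved here */

/* Uses the variable knowing only its declaration, as Source1.cpp does. */
void Fun(void)
{
    ExtGlobal++;
}

short ExtGlobal;          /* the one definition: this is where the storage lives */

/* Runs the question's loop and returns how many times Fun was called. */
int run_loop(void)
{
    int calls = 0;

    ExtGlobal = 1000;
    while (ExtGlobal < 2000) {
        Fun();
        calls++;
    }
    assert(ExtGlobal == 2000);
    return calls;
}
```

Nothing here asks the compiler to re-read ExtGlobal on every iteration. When Fun is defined in a different file, the compiler simply cannot see what the call modifies, so it has to assume ExtGlobal may have changed and reloads it.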
volatile essentially tells the compiler, "Never trust the value of this variable. Even if you just stored a value from a register to that memory location, don't re-use the value in the register - make sure to re-read it from memory."
If you want to see that volatile causes different code to be generated, write a series of reads/writes from the variable.
For example, compiling this code in cygwin, with gcc -O1 -c,
int i;
void foo() {
i = 4;
i += 2;
i -= 1;
}
generates the following assembly:
_foo proc near
mov dword ptr ds:_i, 5
retn
_foo endp
Note that the compiler knew what the result would be, so it just went ahead and optimized it.
Now, adding volatile to int i generates the following:
public _foo
_foo proc near
mov dword ptr ds:_i, 4
mov eax, dword ptr ds:_i
add eax, 2
mov dword ptr ds:_i, eax
mov eax, dword ptr ds:_i
sub eax, 1
mov dword ptr ds:_i, eax
retn
_foo endp
The compiler never trusts the value of i, and always re-loads it from memory.
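For anyone who wants to reproduce the comparison, here is a sketch that puts both variants in one file so the two listings can be compared from a single compile (for example with gcc -O1 -S). The names plain_i, volatile_i, foo_plain, foo_volatile, and check_both are mine, not from the original example.

```c
#include <assert.h>

int plain_i;
volatile int volatile_i;

/* The compiler may fold these three statements into a single store of 5. */
void foo_plain(void)
{
    plain_i = 4;
    plain_i += 2;
    plain_i -= 1;
}

/* Every read and write of a volatile object must actually be performed,
   so each statement becomes its own load and/or store. */
void foo_volatile(void)
{
    volatile_i = 4;
    volatile_i += 2;
    volatile_i -= 1;
}

/* Both versions leave the same final value; only the generated code differs. */
void check_both(void)
{
    foo_plain();
    foo_volatile();
    assert(plain_i == 5);
    assert(volatile_i == 5);
}
```

The observable result is identical in a single-threaded program, which is why the difference only shows up in the assembly.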