I have a main.c file
int boyut(const char* string);
char greeting[6] = {"Helle"};
int main(){
greeting[5] = 0x00;
int a = boyut(greeting);
return 0;
}
int boyut(const char* string){
int len=0;
while(string[len]){
len++;
}
return len;
}
I compile it with GCC command gcc -Wall -m32 -nostdlib main.c -o main.o
When I check disassembly, I see the variable greeting is placed in .data segment. And before calling boyut it's not pushed into stack. Inside the boyut function, it acts like variable greeting is in stack segment. So that variable actually not being accessed inside the function. Why is it generating a code like this? How can I correct this?
Disassembly of section .text:
080480f8 <main>:
80480f8: 55 push ebp
80480f9: 89 e5 mov ebp,esp
80480fb: 83 ec 18 sub esp,0x18
80480fe: c6 05 05 a0 04 08 00 mov BYTE PTR ds:0x804a005,0x0
8048105: 83 ec 0c sub esp,0xc
8048108: 68 00 a0 04 08 push 0x804a000
804810d: e8 0d 00 00 00 call 804811f <boyut>
8048112: 83 c4 10 add esp,0x10
8048115: 89 45 f4 mov DWORD PTR [ebp-0xc],eax
8048118: b8 00 00 00 00 mov eax,0x0
804811d: c9 leave
804811e: c3 ret
0804811f <boyut>:
804811f: 55 push ebp
8048120: 89 e5 mov ebp,esp
8048122: 83 ec 10 sub esp,0x10
8048125: c7 45 fc 00 00 00 00 mov DWORD PTR [ebp-0x4],0x0
804812c: eb 04 jmp 8048132 <boyut+0x13>
804812e: 83 45 fc 01 add DWORD PTR [ebp-0x4],0x1
8048132: 8b 55 fc mov edx,DWORD PTR [ebp-0x4]
8048135: 8b 45 08 mov eax,DWORD PTR [ebp+0x8]
8048138: 01 d0 add eax,edx
804813a: 0f b6 00 movzx eax,BYTE PTR [eax]
804813d: 84 c0 test al,al
804813f: 75 ed jne 804812e <boyut+0xf>
8048141: 8b 45 fc mov eax,DWORD PTR [ebp-0x4]
8048144: c9 leave
8048145: c3 ret
main.o: file format elf32-i386
Contents of section .data:
804a000 48656c6c 6500 Helle.
The function boyut is declared like this:
int boyut(const char* string);
That means: boyut takes a pointer to char and returns an int. And indeed, the compiler pushes a point to char on the stack. This pointer points to the beginning of greeting. This happens, because in C, an array is implicitly converted to a pointer to its first element under most circumstances.
If you want to pass an array to a function so it is copied to the function, you have to wrap the array into a structure and pass that.
Related
I have made a function in C which is pretty straightforward, it uses strlen() from <string.h> to return the length of a char* variable:
int length(char *str) {
return strlen(str);
}
Here is the corresponding x86_64 assembly from objdump -M intel -d a.out:
00000000000011a8 <length>:
11a8: f3 0f 1e fa endbr64
11ac: 55 push rbp
11ad: 48 89 e5 mov rbp,rsp
11b0: 48 83 ec 10 sub rsp,0x10
11b4: 48 89 7d f8 mov QWORD PTR [rbp-0x8],rdi
11b8: 48 8b 45 f8 mov rax,QWORD PTR [rbp-0x8]
11bc: 48 89 c7 mov rdi,rax
11bf: e8 ac fe ff ff call 1070 <strlen#plt>
11c4: c9 leave
11c5: c3 ret
Here is my current understanding of the code (please correct me if anything seems wrong):
00000000000011a8 <length>:
11a8: f3 0f 1e fa endbr64
11ac: 55 push rbp // stack setup, old rbp of previous frame pushed
11ad: 48 89 e5 mov rbp,rsp // rbp and rsp point to same place
11b0: 48 83 ec 10 sub rsp,0x10 // space is made for arguments
11b4: 48 89 7d f8 mov QWORD PTR [rbp-0x8],rdi // rdi stores argument and is moved into the space made on the line 11b0
11b8: 48 8b 45 f8 mov rax,QWORD PTR [rbp-0x8] // value at memory address rbp-0x8 aka argument is stored in rax
11bc: 48 89 c7 mov rdi,rax // move the value into rdi for function call
11bf: e8 ac fe ff ff call 1070 <strlen#plt> // strlen() is called
11c4: c9 leave // stack clear up
11c5: c3 ret // return address popped and control flow resumes
If anything above is incorrect please correct me, secondly how does call 1070 <strlen#plt> return a value? because the strlen() function returns the length of a string and i would have thought that something would have been moved into the rax register (which i believe is commonly used for return values). But nothing is moved into rax and it does not show a value returned in the assembly.
Lastly here is the code at address 1070 (from call 1070 strlen#plt)
0000000000001070 <strlen#plt>:
1070: f3 0f 1e fa endbr64
1074: f2 ff 25 45 2f 00 00 bnd jmp QWORD PTR [rip+0x2f45] # 3fc0 <strlen#GLIBC_2.2.5>
107b: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
how does call 1070 strlen#plt return a value?
The strlen puts its result into rax register, which conveniently is also where your length() function should put its return value.
Under optimization your length() could be compiled into a single instruction: jmp strlen -- the parameter is already in rdi, and the return value will be in rax.
P.S.
Lastly here is the code at address 1070
That isn't the actual code of strlen. This is a "PLT jump stub". To understand what that is, you could read this blog post.
Also, from that small address, you can see this is a PIE executable: those are just offsets from the image base address; the runtime address will be something like 0x55...
I've created the following function in c as a demonstration/small riddle about how the stack works in c:
#include "stdio.h"
int* func(int i)
{
int j = 3;
j += i;
return &j;
}
int main()
{
int *tmp = func(4);
printf("%d\n", *tmp);
func(5);
printf("%d\n", *tmp);
}
It's obviously undefined behavior and the compiler also produces a warning about that. However unfortunately the compilation didn't quite work out. For some reason gcc replaces the returned pointer by NULL (see line 6d6).
00000000000006aa <func>:
6aa: 55 push %rbp
6ab: 48 89 e5 mov %rsp,%rbp
6ae: 48 83 ec 20 sub $0x20,%rsp
6b2: 89 7d ec mov %edi,-0x14(%rbp)
6b5: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
6bc: 00 00
6be: 48 89 45 f8 mov %rax,-0x8(%rbp)
6c2: 31 c0 xor %eax,%eax
6c4: c7 45 f4 03 00 00 00 movl $0x3,-0xc(%rbp)
6cb: 8b 55 f4 mov -0xc(%rbp),%edx
6ce: 8b 45 ec mov -0x14(%rbp),%eax
6d1: 01 d0 add %edx,%eax
6d3: 89 45 f4 mov %eax,-0xc(%rbp)
6d6: b8 00 00 00 00 mov $0x0,%eax
6db: 48 8b 4d f8 mov -0x8(%rbp),%rcx
6df: 64 48 33 0c 25 28 00 xor %fs:0x28,%rcx
6e6: 00 00
6e8: 74 05 je 6ef <func+0x45>
6ea: e8 81 fe ff ff callq 570 <__stack_chk_fail#plt>
6ef: c9 leaveq
6f0: c3 retq
This is the disassembly of the binary compiled with gcc version 7.5.0 and the -O0-flag; no other flags were used. This behavior makes the entire code pointless, since it's supposed to show how the stack behaves across function-calls. Is there any way to achieve a more literal compilation of this code with a at least somewhat up-to-date version of gcc?
And just for the sake of curiosity: what's the point of changing the behavior of the code like this in the first place?
Putting the return value in a pointer variable seems to change the behavior of the compiler and it generates the assembly code that returns a pointer to stack:
int* func(int i) {
int j = 3;
j += i;
int *p = &j;
return p;
}
Is there a difference between declaring a variable first and then assigning a value or directly declaring and assigning a value in the compiled function? Does the compiled function do the same work? e.g, does it still read the parameters, declare variables and then assign value or is there a difference between the two examples in the compiled versions?
example:
void foo(u32 value) {
u32 extvalue = NULL;
extvalue = value;
}
compared with
void foo(u32 value) {
u32 extvalue = value;
}
I am under the impression that there is no difference between those two functions if you look at the compiled code, e.g they will look the same and i will not be able to tell which is which.
it depends on the compiler & the optimization level of course.
A dumb compiler/low optimization level when it sees:
u32 extvalue = NULL;
extvalue = value;
could set to NULL then to value in the next line.
Since extvalue isn't used in-between, the NULL initialization is useless and most compilers directly set to value as an easy optimization
Note that declaring a variable isn't really an instruction per se. The compiler just allocates auto memory to store this variable.
I've tested a simple code with and without assignment and the result is diff
erent when using gcc compiler 6.2.1 with -O0 (don't optimize anything) flag:
#include <stdio.h>
void foo(int value) {
int extvalue = 0;
extvalue = value;
printf("%d",extvalue);
}
disassembled:
Disassembly of section .text:
00000000 <_foo>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 28 sub $0x28,%esp
6: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%ebp) <=== here we see the init
d: 8b 45 08 mov 0x8(%ebp),%eax
10: 89 45 f4 mov %eax,-0xc(%ebp)
13: 8b 45 f4 mov -0xc(%ebp),%eax
16: 89 44 24 04 mov %eax,0x4(%esp)
1a: c7 04 24 00 00 00 00 movl $0x0,(%esp)
21: e8 00 00 00 00 call 26 <_foo+0x26>
26: c9 leave
27: c3 ret
now:
void foo(int value) {
int extvalue;
extvalue = value;
printf("%d",extvalue);
}
disassembled:
Disassembly of section .text:
00000000 <_foo>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 28 sub $0x28,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax
9: 89 45 f4 mov %eax,-0xc(%ebp)
c: 8b 45 f4 mov -0xc(%ebp),%eax
f: 89 44 24 04 mov %eax,0x4(%esp)
13: c7 04 24 00 00 00 00 movl $0x0,(%esp)
1a: e8 00 00 00 00 call 1f <_foo+0x1f>
1f: c9 leave
20: c3 ret
21: 90 nop
22: 90 nop
23: 90 nop
the 0 init has disappeared. The compiler didn't optimize the initialization in that case.
If I switch to -O2 (good optimization level) the code is then identical in both cases, compiler found that the initialization wasn't necessary (still, silent, no warnings):
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 ec 18 sub $0x18,%esp
6: 8b 45 08 mov 0x8(%ebp),%eax
9: c7 04 24 00 00 00 00 movl $0x0,(%esp)
10: 89 44 24 04 mov %eax,0x4(%esp)
14: e8 00 00 00 00 call 19 <_foo+0x19>
19: c9 leave
1a: c3 ret
I tried these functions in godbolt:
void foo(uint32_t value)
{
uint32_t extvalue = NULL;
extvalue = value;
}
void bar(uint32_t value)
{
uint32_t extvalue = value;
}
I ported to the actual type uint32_t rather than u32 which is not standard. The resulting non-optimized assembly generated by x86-64 GCC 6.3 is:
foo(unsigned int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-20], edi
mov DWORD PTR [rbp-4], 0
mov eax, DWORD PTR [rbp-20]
mov DWORD PTR [rbp-4], eax
nop
pop rbp
ret
bar(unsigned int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-20], edi
mov eax, DWORD PTR [rbp-20]
mov DWORD PTR [rbp-4], eax
nop
pop rbp
ret
So clearly the non-optimized code retains the (weird, as pointed out by others since it's not written to a pointer) NULL assignment, which is of course pointless.
I'd vote for the second one since it's shorter (less to hold in one's head when reading the code), and never allow/recommend the pointless setting to NULL before overwriting with the proper value. I would consider that a bug, since you're saying/doing something you don't mean.
I failed to find a flag that controls the named return value optimization for C language. For C++ it seems to be -fno-elide-constructors.
The source code implementing it is here but since it is a middle-end, no front end information is spoiled even in comments. The manual section did not exactly help either. However disassembling shows that as it is turned off on O0 and enabled on O1 it must be one of the following:
-fauto-inc-dec
-fcprop-registers
-fdce
-fdefer-pop
-fdelayed-branch
-fdse
-fguess-branch-probability
-fif-conversion2
-fif-conversion
-finline-small-functions
-fipa-pure-const
-fipa-reference
-fmerge-constants
-fsplit-wide-types
-ftree-builtin-call-dce
-ftree-ccp
-ftree-ch
-ftree-copyrename
-ftree-dce
-ftree-dominator-opts
-ftree-dse
-ftree-fre
-ftree-sra
-ftree-ter
-funit-at-a-time
C code:
struct p {
long x;
long y;
long z;
};
__attribute__((noinline))
struct p f(void) {
struct p copy;
copy.x = 1;
copy.y = 2;
copy.z = 3;
return copy;
}
int main(int argc, char** argv) {
volatile struct p inst = f();
return 0;
}
Compiled with O0 we see that the 'copy' structure is naively allocated on stack:
00000000004004b6 <f>:
4004b6: 55 push rbp
4004b7: 48 89 e5 mov rbp,rsp
4004ba: 48 89 7d d8 mov QWORD PTR [rbp-0x28],rdi
4004be: 48 c7 45 e0 01 00 00 mov QWORD PTR [rbp-0x20],0x1
4004c5: 00
4004c6: 48 c7 45 e8 02 00 00 mov QWORD PTR [rbp-0x18],0x2
4004cd: 00
4004ce: 48 c7 45 f0 03 00 00 mov QWORD PTR [rbp-0x10],0x3
4004d5: 00
4004d6: 48 8b 45 d8 mov rax,QWORD PTR [rbp-0x28]
4004da: 48 8b 55 e0 mov rdx,QWORD PTR [rbp-0x20]
4004de: 48 89 10 mov QWORD PTR [rax],rdx
4004e1: 48 8b 55 e8 mov rdx,QWORD PTR [rbp-0x18]
4004e5: 48 89 50 08 mov QWORD PTR [rax+0x8],rdx
4004e9: 48 8b 55 f0 mov rdx,QWORD PTR [rbp-0x10]
4004ed: 48 89 50 10 mov QWORD PTR [rax+0x10],rdx
4004f1: 48 8b 45 d8 mov rax,QWORD PTR [rbp-0x28]
4004f5: 5d pop rbp
4004f6: c3 ret
Compiled with O1 it is not allocated but a pointer is passed as an implicit argument
00000000004004b6 <f>:
4004b6: 48 89 f8 mov rax,rdi
4004b9: 48 c7 07 01 00 00 00 mov QWORD PTR [rdi],0x1
4004c0: 48 c7 47 08 02 00 00 mov QWORD PTR [rdi+0x8],0x2
4004c7: 00
4004c8: 48 c7 47 10 03 00 00 mov QWORD PTR [rdi+0x10],0x3
4004cf: 00
4004d0: c3 ret
The closest thing to that in GCC (i.e. a switch for copy elision) is -fcprop-registers. Copy elision doesn't exist in C, but this is the most similar feature to that. From the man page:
After register allocation and post-register allocation instruction
splitting, we perform a copy-propagation pass to try to reduce
scheduling dependencies and occasionally eliminate the copy.
Enabled at levels -O, -O2, -O3, -Os.
Suppose I have two files:
file1.c- contains global definition of an int array of size 10 named "array[10]".
file2.c- contains an int pointer named "extern int *array", here I am trying to link this pointer to array.
But when I check the address of array in file1.c and pointer value in file2.c, they are both different. Why it is happening?
That doesn't work, in file2.c, you need
extern int array[];
since arrays and pointers are not the same thing. Both declarations must have the compatible types, and int* is not compatible with int[N].
What actually happens is not specified, the programme is ill-formed with extern int *array;, but probably, the first sizeof(int*) bytes of the array are interpreted as an address.
extern1.c
#include <stdio.h>
extern int *array;
int test();
int main(int argc, char *argv[])
{
printf ("in main: array address = %x\n", array);
test();
return 0;
}
extern2.c
int array[10] = {1, 2, 3};
int test()
{
printf ("in test: array address = %x\n", array);
return 0;
}
The output:
in main: array address = 1
in test: array address = 804a040
And the assemble code:
08048404 <main>:
8048404: 55 push %ebp
8048405: 89 e5 mov %esp,%ebp
8048407: 83 e4 f0 and $0xfffffff0,%esp
804840a: 83 ec 10 sub $0x10,%esp
804840d: 8b 15 40 a0 04 08 mov 0x804a040,%edx <--------- this (1)
8048413: b8 20 85 04 08 mov $0x8048520,%eax
8048418: 89 54 24 04 mov %edx,0x4(%esp)
804841c: 89 04 24 mov %eax,(%esp)
804841f: e8 dc fe ff ff call 8048300 <printf#plt>
8048424: e8 07 00 00 00 call 8048430 <test>
8048429: b8 00 00 00 00 mov $0x0,%eax
804842e: c9 leave
804842f: c3 ret
08048430 <test>:
8048430: 55 push %ebp
8048431: 89 e5 mov %esp,%ebp
8048433: 83 ec 18 sub $0x18,%esp
8048436: c7 44 24 04 40 a0 04 movl $0x804a040,0x4(%esp) <------- this (2)
804843d: 08
804843e: c7 04 24 3d 85 04 08 movl $0x804853d,(%esp)
8048445: e8 b6 fe ff ff call 8048300 <printf#plt>
804844a: b8 00 00 00 00 mov $0x0,%eax
804844f: c9 leave
8048450: c3 ret
Pay attention to the <------- in the assemble code. You can see in main function the array is array[0] and in test function the array is the address.