I'm trying to get a better understanding of how compilers produce code for expressions with undefined behaviour, e.g. for the following code:
int main()
{
int i = 5;
i = i++;
return 0;
}
This is the assembler code generated by gcc 4.8.2 (Optimisation is off -O0 and I’ve inserted my own line numbers for reference purposes):
(gdb) disassemble main
Dump of assembler code for function main:
(1) 0x0000000000000000 <+0>: push %rbp
(2) 0x0000000000000001 <+1>: mov %rsp,%rbp
(3) 0x0000000000000004 <+4>: movl $0x5,-0x4(%rbp)
(4) 0x000000000000000b <+11>: mov -0x4(%rbp),%eax
(5) 0x000000000000000e <+14>: lea 0x1(%rax),%edx
(6) 0x0000000000000011 <+17>: mov %edx,-0x4(%rbp)
(7) 0x0000000000000014 <+20>: mov %eax,-0x4(%rbp)
(8) 0x0000000000000017 <+23>: mov $0x0,%eax
(9) 0x000000000000001c <+28>: pop %rbp
(10) 0x000000000000001d <+29>: retq
End of assembler dump.
Executing this code leaves i at 5 (verified with a printf() statement), i.e. i never appears to be incremented. I understand that different compilers will evaluate/compile undefined expressions in different ways and this may just be the way that gcc does it, i.e. I could get a different result with a different compiler.
With respect to the assembler code, as I understand:
Lines 1-2: ignoring these, they set up the stack/base pointers etc.
Lines 3-4: this is how the value of 5 is assigned to i.
Can anyone explain what is happening on lines 5-6? It looks as if i will ultimately be reassigned the value of 5 (line 7), but is the increment operation (required for the post-increment i++) simply abandoned/skipped by the compiler in this case?
These three lines contain your answer:
lea 0x1(%rax),%edx
mov %edx,-0x4(%rbp)
mov %eax,-0x4(%rbp)
The increment operation isn't skipped. lea is the increment, taking the value from %rax and storing the incremented value in %edx. %edx is stored but then overwritten by the next line which uses the original value from %eax.
The key to understanding this code is to know how lea works. It stands for load effective address, so while it looks like a pointer dereference, it actually just does the math needed to get the final address of [whatever], and then keeps the address instead of the value at that address. This means it can be used for any mathematical expression that can be expressed efficiently using addressing modes, as an alternative to mathematical opcodes. It's frequently used as a way to get a multiply and add into a single instruction for this reason. In particular, in this case it's used to increment the value and move the result to a different register in one instruction, where inc would instead overwrite it in place.
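To make the "address math without the load" idea concrete, here is a minimal sketch in C. The helper names (lea_like, increment_into_new_reg, times_five) are made up for illustration, and the code only models the arithmetic of an addressing mode; it is not what any compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the arithmetic an x86-64 addressing mode performs,
   disp(base, index, scale) -> base + index * scale + disp.
   lea keeps this number instead of loading from that address. */
static uint64_t lea_like(uint64_t base, uint64_t index, uint64_t scale, int64_t disp)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint64_t)disp;
}

/* lea 0x1(%rax),%edx : copy-and-increment. The source register is untouched
   and the 32-bit destination keeps only the low 32 bits of the result. */
static uint32_t increment_into_new_reg(uint32_t eax)
{
    return (uint32_t)lea_like(eax, 0, 1, 1);
}

/* lea (%rax,%rax,4),%rdx : x + x*4, i.e. a multiply by 5 in one instruction. */
static uint64_t times_five(uint64_t x)
{
    return lea_like(x, x, 4, 0);
}
```

With these definitions, increment_into_new_reg(5) gives 6 while the "register" passed in still holds 5, which is exactly why the later mov %eax,-0x4(%rbp) can write the old value back.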
Lines 5-6 are the i++. The lea 0x1(%rax),%edx is i + 1 and mov %edx,-0x4(%rbp) writes that back to i. However, line 7, the mov %eax,-0x4(%rbp), writes the original value back into i. The code looks like:
(4) eax = i
(5) edx = i + 1
(6) i = edx
(7) i = eax
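The four steps above can be replayed in well-defined C with explicit temporaries. This is only a sketch: the function names are hypothetical, and the second function shows an ordering that a different compiler would be equally entitled to pick, since the original expression has undefined behaviour.

```c
#include <assert.h>
#include <limits.h>

/* Mirrors steps (4)-(7) of the gcc 4.8.2 -O0 listing. */
static int replay_gcc_sequence(int i)
{
    assert(i < INT_MAX);   /* keep i + 1 free of signed overflow */
    int eax = i;           /* (4) mov -0x4(%rbp),%eax */
    int edx = eax + 1;     /* (5) lea 0x1(%rax),%edx  */
    i = edx;               /* (6) mov %edx,-0x4(%rbp) : the increment lands */
    i = eax;               /* (7) mov %eax,-0x4(%rbp) : the old value overwrites it */
    return i;
}

/* The opposite ordering of the two stores (an assumption, not observed output). */
static int replay_other_order(int i)
{
    assert(i < INT_MAX);
    int old = i;
    i = old;               /* the assignment's store first */
    i = old + 1;           /* the increment's store last */
    return i;
}
```

Starting from 5, the first version ends with 5 and the second with 6, so neither result says anything about what i = i++ "should" do.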
I'm doing some basic disassembly and have noticed that the buffer is being given additional space for some reason, although the tutorial I am following uses the same code and its buffer gets exactly the expected 500 chars. Why is this?
My code:
#include <stdio.h>
#include <string.h>
int main (int argc, char** argv){
char buffer[500];
strcpy(buffer, argv[1]);
return 0;
}
Compiled with GCC, the disassembled code is:
0x0000000000001139 <+0>: push %rbp
0x000000000000113a <+1>: mov %rsp,%rbp
0x000000000000113d <+4>: sub $0x210,%rsp
0x0000000000001144 <+11>: mov %edi,-0x204(%rbp)
0x000000000000114a <+17>: mov %rsi,-0x210(%rbp)
0x0000000000001151 <+24>: mov -0x210(%rbp),%rax
0x0000000000001158 <+31>: add $0x8,%rax
0x000000000000115c <+35>: mov (%rax),%rdx
0x000000000000115f <+38>: lea -0x200(%rbp),%rax
0x0000000000001166 <+45>: mov %rdx,%rsi
0x0000000000001169 <+48>: mov %rax,%rdi
0x000000000000116c <+51>: call 0x1030 <strcpy@plt>
0x0000000000001171 <+56>: mov $0x0,%eax
0x0000000000001176 <+61>: leave
0x0000000000001177 <+62>: ret
However, this video https://www.youtube.com/watch?v=1S0aBV-Waeo clearly only has 500 bytes assigned.
Why is this the case? The only difference I can see is that one is 32-bit and the other (mine) is x86-64.
500 is not a multiple of 16.
The x86-64 ABI (application binary interface) requires the stack pointer to be a multiple of 16 whenever a call instruction is about to happen. (Since call pushes an 8-byte return address, this means the stack pointer is always congruent to 8, mod 16, when control reaches the first instruction of a called function.) For the code shown, it is convenient for the compiler to achieve this requirement by increasing the value it uses in the sub instruction, making it be a multiple of 16.
The x86-32 ABI did not make this requirement, so there was no reason for the compiler used in the video to increase the size of the stack frame.
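As a rough illustration of the rounding, here is a small C sketch that reproduces the 0x210 from the unoptimized listing. The split into a "buffer area" and a "spill area" for argc and argv is my reading of the offsets in that listing (-0x200, -0x204, -0x210), not GCC's actual layout algorithm, and the function names are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Frame size consistent with the -O0 listing: the padded buffer plus the
   spill slots for argc (4 bytes) and argv (8 bytes), each rounded to 16. */
static size_t o0_frame_size(size_t buffer_bytes)
{
    size_t buffer_area = align_up(buffer_bytes, 16);  /* 500 -> 512 (0x200) */
    size_t spill_area  = align_up(4 + 8, 16);         /* 12  -> 16          */
    return buffer_area + spill_area;                  /* 528 (0x210)        */
}
```

For a 500-byte buffer this yields 528, matching sub $0x210,%rsp.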
Note that you appear to have compiled your code without optimization. I get this at -O2:
0x0000000000000000 <+0>: sub $0x208,%rsp
0x0000000000000007 <+7>: mov 0x8(%rsi),%rsi
0x000000000000000b <+11>: mov %rsp,%rdi
0x000000000000000e <+14>: call <strcpy@PLT>
0x0000000000000013 <+19>: xor %eax,%eax
0x0000000000000015 <+21>: add $0x208,%rsp
0x000000000000001c <+28>: ret
The stack adjustment is still somewhat larger than the size of the array, but not as big as what you had, and no longer a multiple of 16; the difference is that with optimization on, the frame pointer is eliminated, so %rbp does not need to be saved and restored, and so the stack pointer is not a multiple of 16 at the point of the sub instruction.
(Incidentally, there is no requirement anywhere for a stack frame to be as small as possible. "Quality of implementation" dictates that it should be as small as possible, but for various reasons it's quite common for the compiler to miss that target. In my optimized code dump, I don't see any reason why the immediate operand to sub and add couldn't have been 0x1f8 (504).)
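The alignment rule can be checked with a few lines of C. This is a sketch under the assumption stated in the answer (the stack pointer is 8 bytes past a 16-byte boundary on function entry); call_site_aligned is a hypothetical helper, not part of any toolchain.

```c
#include <assert.h>

/* On entry the stack pointer sits 8 bytes below a 16-byte boundary, because
   call just pushed the return address. Returns 1 if, after pushing
   pushed_bytes and subtracting frame_bytes, it is a multiple of 16 again. */
static int call_site_aligned(unsigned pushed_bytes, unsigned frame_bytes)
{
    assert(pushed_bytes % 8 == 0);
    return (8u + pushed_bytes + frame_bytes) % 16u == 0;
}
```

With %rbp pushed (8 bytes) and a 0x210 frame the check passes; with no push, both 0x208 and the smaller 0x1f8 pass; a bare 500-byte adjustment fails in either case.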
I have an exam coming up, and I'm struggling with assembly. I have written some simple C code, gotten its assembly code, and am now trying to comment the assembly code as practice. The C code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char const *argv[])
{
int x = 10;
char const* y = argv[1];
printf("%s\n",y );
return 0;
}
Its assembly code:
0x00000000000006a0 <+0>: push %rbp # Creating stack
0x00000000000006a1 <+1>: mov %rsp,%rbp # Saving base of stack into base pointer register
0x00000000000006a4 <+4>: sub $0x20,%rsp # Allocate 32 bytes of space on the stack
0x00000000000006a8 <+8>: mov %edi,-0x14(%rbp) # First argument stored in stackframe
0x00000000000006ab <+11>: mov %rsi,-0x20(%rbp) # Second argument stored in stackframe
0x00000000000006af <+15>: movl $0xa,-0xc(%rbp) # Value 10 stored in x's address in the stackframe
0x00000000000006b6 <+22>: mov -0x20(%rbp),%rax # Second argument stored in return value register
0x00000000000006ba <+26>: mov 0x8(%rax),%rax # ??
0x00000000000006be <+30>: mov %rax,-0x8(%rbp) # ??
0x00000000000006c2 <+34>: mov -0x8(%rbp),%rax # ??
0x00000000000006c6 <+38>: mov %rax,%rdi # Return value copied to 1st argument register - why??
0x00000000000006c9 <+41>: callq 0x560 # printf??
0x00000000000006ce <+46>: mov $0x0,%eax # Value 0 is copied to return register
0x00000000000006d3 <+51>: leaveq # Destroying stackframe
0x00000000000006d4 <+52>: retq # Popping return address, and setting instruction pointer equal to it
Can a friendly soul help me out wherever I have "??" (meaning I don't understand what is happening or I'm unsure)?
0x00000000000006ba <+26>: mov 0x8(%rax),%rax # get argv[1] to rax
0x00000000000006be <+30>: mov %rax,-0x8(%rbp) # move argv[1] to local variable
0x00000000000006c2 <+34>: mov -0x8(%rbp),%rax # move local variable to rax (for move to rdi)
0x00000000000006c6 <+38>: mov %rax,%rdi # now rdi has argv[1]
0x00000000000006c9 <+41>: callq 0x560 # it is puts (optimized)
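To see why the offset is 0x8, here is a hedged C sketch of what mov 0x8(%rax),%rax does when %rax holds argv: it reads the pointer stored one pointer-width past the start of the array. The function name is hypothetical, and the 8-byte figure assumes x86-64, where sizeof(char *) is 8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read the pointer stored sizeof(char *) bytes past argv, using an explicit
   byte offset the way the instruction does. Equivalent to argv[1]. */
static const char *load_second_pointer(const char *const *argv)
{
    const unsigned char *base = (const unsigned char *)argv;
    const char *result;

    assert(argv != NULL);
    memcpy(&result, base + sizeof(char *), sizeof result);  /* byte-offset load */
    return result;
}
```

For an array such as {"prog", "hello", NULL}, the function returns the same pointer as argv[1].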
I will try to make a guess:
mov -0x20(%rbp),%rax # load argv (the pointer to the argument array) into rax
mov 0x8(%rax),%rax # load argv[1] into rax
mov %rax,-0x8(%rbp) # store argv[1] (which now is in rax) into y
mov -0x8(%rbp),%rax # put y back into rax (which might look dumb, but possibly it has its reasons)
mov %rax,%rdi # copy y to rdi, possibly to prepare the context for the printf
When you deal with assembler, please specify which architecture you are using. An Intel processor might use a different set of instructions from an ARM one; the same instructions might behave differently or rely on different assumptions. As you might know, optimisations change the sequence of assembler instructions generated by the compiler, so you might want to specify whether you are using them as well (looks like not?) and which compiler you are using, as each one has its own policy for generating assembler.
Maybe we will never know why the compiler prepares the context for printf by copying from rax; it could be the compiler's choice or an obligation imposed by the specific architecture. For all those annoying reasons, most people prefer to use a "high level language" such as C, so that the set of instructions is always right, although it might look very dumb to a human (as we know, computers are dumb by design) and is not always the best choice; that's why there are still many compilers around.
I can give you two more tips:
Your IDE must have a way to interleave assembler instructions with C code, and to single-step within the assembler. Try to find it and explore it yourself.
The IDE should also have a function to explore the memory of your program. If you find it, try to enter the 0x560 address and look where it leads you. It is very likely that it will be the entry point of your printf.
I hope that my answer will help you work it out, good luck
While debugging one of the assembly code examples, I found following piece of information:
(gdb) x /10i 0x4005c4
0x4005c4: push %rbp
0x4005c5: mov %rsp,%rbp
0x4005c8: sub $0xa0,%rsp
0x4005cf: mov %fs:0x28,%rax
0x4005d8: mov %rax,-0x8(%rbp)
0x4005dc: xor %eax,%eax
0x4005de: movabs $0x6673646c6a6b3432,%rax
0x4005e8: mov %rax,-0x40(%rbp)
0x4005ec: movl $0x323339,-0x38(%rbp)
0x4005f3: movl $0x553059,-0x90(%rbp)
As per my understanding movabs should not be used, it seems like it was introduced intentionally. Am I right in my understanding?
What should be the equivalent MOV command to replace it?
As a direct copy from this question: https://reverseengineering.stackexchange.com/questions/2627/what-is-the-meaning-of-movabs-in-gas-x86-att-syntax
[...] The movabs instruction to load arbitrary 64-bit
constant into register and to load/store integer register from/to
arbitrary constant 64-bit address is available.
http://www.ucw.cz/~hubicka/papers/amd64/node1.html
It does exactly what you'd expect from it - it puts the immediate into the register.
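One way to see what that immediate is for is to write its bytes out in the order they land in memory. The sketch below is my own illustration (imm_to_text is a hypothetical helper); it extracts bytes lowest-first, which is how x86-64 stores a value, and treats them as characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the low nbytes of imm to out, lowest byte first (little-endian),
   then NUL-terminate. out must have room for nbytes + 1 characters. */
static void imm_to_text(uint64_t imm, size_t nbytes, char *out)
{
    assert(nbytes <= 8);
    for (size_t k = 0; k < nbytes; k++)
        out[k] = (char)((imm >> (8 * k)) & 0xff);
    out[nbytes] = '\0';
}
```

Decoding 0x6673646c6a6b3432 this way gives "24kjldsf", and the movl immediate 0x323339 stored 8 bytes later gives "932", so the code appears to be initializing a character array on the stack. That reading is an inference from the listing, not something the quoted answer states.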
There's a series of problems in SPOJ about creating a function in a single line with some constraints. I've already solved the easy, medium and hard ones, but for the impossible one I keep getting Wrong Answer.
To sum it up, the problem requests to fill in the code of the return statement such that if x is 1, the return value should be 2. For other x values, it should return 3. The constraint is that the letter 'x' can't be used, and no more code can be added; one can only code that return statement. Clearly, to solve this, one must create a hack.
So I've used gcc's built in way to get the stack frame, and then decreased the pointer to get a pointer to the first parameter. Other than that, the statement is just a normal comparison.
On my machine it works fine, but on the cluster (Intel Pentium G860) used by the online judge it doesn't work, probably due to a different calling convention. I'm not sure I understood the processor's ABI (I'm not sure if the stack frame pointer is saved on the stack or only in a register), or even if I'm reading the correct ABI.
The question is: what would be the correct way to get the first parameter of a function using the stack?
My code is (it must be formatted this way, otherwise it's not accepted):
#include <stdio.h>
int count(int x){
return (*(((int*)__builtin_frame_address(0))-1) == 1) ? 2 : 3;
}
int main(i){
for(i=1;i%1000001;i++)
printf("%d %d\n",i,count(i));
return 0;
}
The question is: what would be the correct way to get the first
parameter of a function using the stack?
There is no portable way. You must assume a specific compiler, its settings and ABI, along with the calling conventions.
The gcc compiler is likely to "lay down" an int local variable at a -0x4 offset (assuming that sizeof(int) == 4). You can observe this with the most basic definition of count:
4 {
0x00000000004004c4 <+0>: push %rbp
0x00000000004004c5 <+1>: mov %rsp,%rbp
0x00000000004004c8 <+4>: mov %edi,-0x4(%rbp)
5 return x == 1 ? 2 : 3;
0x00000000004004cb <+7>: cmpl $0x1,-0x4(%rbp)
0x00000000004004cf <+11>: jne 0x4004d8 <count+20>
0x00000000004004d1 <+13>: mov $0x2,%eax
0x00000000004004d6 <+18>: jmp 0x4004dd <count+25>
0x00000000004004d8 <+20>: mov $0x3,%eax
6 }
0x00000000004004dd <+25>: leaveq
0x00000000004004de <+26>: retq
You may also see that the %edi register holds the first parameter. This is the case for the AMD64 ABI (%edi is also not preserved between calls).
Now, with that knowledge, you might write something like:
int count(int x)
{
return *((int*)(__builtin_frame_address(0) - sizeof(int))) == 1 ? 2 : 3;
}
which can be obfuscated as:
return *((int*)(__builtin_frame_address(0)-sizeof(int)))==1?2:3;
However, the catch is that an optimizing compiler may enthusiastically assume that, since x is not referenced in count, it can simply skip storing it to the stack. For example, it produces the following assembly with the -O flag:
4 {
0x00000000004004c4 <+0>: push %rbp
0x00000000004004c5 <+1>: mov %rsp,%rbp
5 return *((int*)(__builtin_frame_address(0)-sizeof(int)))==1?2:3;
0x00000000004004c8 <+4>: cmpl $0x1,-0x4(%rbp)
0x00000000004004cc <+8>: setne %al
0x00000000004004cf <+11>: movzbl %al,%eax
0x00000000004004d2 <+14>: add $0x2,%eax
6 }
0x00000000004004d5 <+17>: leaveq
0x00000000004004d6 <+18>: retq
As you can see, the mov %edi,-0x4(%rbp) instruction is now missing, thus the only way (1) would be to access the value of x from the %edi register:
int count(int x)
{
return ({register int edi asm("edi");edi==1?2:3;});
}
but this method lacks the ability to "obfuscate", as whitespace is needed for the declaration of the variable that holds the value of %edi.
1) Not necessarily. Even if the compiler decides to skip the mov from register to stack, it is still possible to "force" it to do so with the inline assembly asm("mov %edi,-0x4(%rbp)");. Beware though, the compiler may have its revenge, sooner or later.
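As a side note on the -O listing above, the setne / movzbl / add $0x2 sequence is a branch-free form of the ternary. A minimal C sketch of the same computation follows; the function names are hypothetical, and this does not solve the SPOJ constraint, it only shows what the optimized instructions compute.

```c
#include <assert.h>

/* setne %al ; movzbl %al,%eax ; add $0x2,%eax  ==>  2 + (value != 1) */
static int two_or_three(int value)
{
    return 2 + (value != 1);
}

/* Checks that the branch-free form agrees with the ternary on [lo, hi). */
static void check_same_as_ternary(int lo, int hi)
{
    for (int v = lo; v < hi; v++)
        assert(two_or_three(v) == (v == 1 ? 2 : 3));
}
```

The comparison yields 0 or 1 in C, so adding it to 2 gives 2 when the value is 1 and 3 otherwise.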
The C standard does NOT require a stack in any implementation, so your problem really doesn't make sense in general.
In the context of gcc, the behavior differs between x86 and x86-64 (and any other target).
On x86, parameters reside on the stack, but on x86-64 the first 6 parameters (including the implicit ones) reside in registers, so basically you can't do the hack you describe.
If you want to hack the code, you need to specify the platform you want to run on; otherwise there is no point in answering your question.
I've written a piece of C code and I've disassembled it as well as read the registers to understand how the program works in assembly.
int test(char *this){
char sum_buf[6];
strncpy(sum_buf,this,32);
return 0;
}
The piece of my code that I've been examining is the test function. When I disassemble my test function I get ...
0x00000000004005c0 <+12>: mov %fs:0x28,%rax
=> 0x00000000004005c9 <+21>: mov %rax,-0x8(%rbp)
... stuff ..
0x00000000004005f0 <+60>: xor %fs:0x28,%rdx
0x00000000004005f9 <+69>: je 0x400600 <test+76>
0x00000000004005fb <+71>: callq 0x4004a0 <__stack_chk_fail@plt>
0x0000000000400600 <+76>: leaveq
0x0000000000400601 <+77>: retq
What I would like to know is what mov %fs:0x28,%rax is really doing?
Both the FS and GS registers can be used as base-pointer addresses in order to access special operating system data-structures. So what you're seeing is a value loaded at an offset from the value held in the FS register, and not bit manipulation of the contents of the FS register.
Specifically, on Linux FS:0x28 holds a special sentinel stack-guard value, and the code is performing a stack-guard check. If you look further in your code, you'll see that the value at FS:0x28 is stored on the stack, and later the stored copy is read back and XORed with the original value at FS:0x28. If the two values are equal, the XOR yields zero and sets the zero flag, so the je jumps past the failure call to the end of the test routine (test+76) and the function returns normally. Otherwise execution falls through to the call to __stack_chk_fail, a special function that reports that the stack was somehow corrupted and the sentinel value stored on the stack was changed.
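The check can be imitated safely in plain C by simulating the frame as a byte array. This is only a sketch: the layout (an 8-byte buffer slot directly below the guard copy), the guard value used in the example, and the name guard_survives are all assumptions made for illustration, and a real frame is laid out by the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BUF_BYTES = 8, GUARD_BYTES = 8, FRAME_BYTES = BUF_BYTES + GUARD_BYTES };

/* Simulated frame: [ buffer | guard copy ]. Copies n bytes from src into the
   buffer (possibly running into the guard), then repeats the epilogue check.
   Returns 1 if the guard survived, 0 where __stack_chk_fail would be called. */
static int guard_survives(const void *src, size_t n, uint64_t guard)
{
    unsigned char frame[FRAME_BYTES];
    uint64_t saved;

    assert(n <= FRAME_BYTES);                        /* stay inside the simulation */
    memset(frame, 0, sizeof frame);
    memcpy(frame + BUF_BYTES, &guard, GUARD_BYTES);  /* prologue: mov %rax,-0x8(%rbp) */
    memcpy(frame, src, n);                           /* the copy that may overflow */
    memcpy(&saved, frame + BUF_BYTES, GUARD_BYTES);  /* epilogue: reload the copy */
    return (saved ^ guard) == 0;                     /* xor %fs:0x28,%rdx ; je */
}
```

Copying 6 bytes leaves the guard intact, while copying 16 bytes of 'A' overwrites it, which is the situation the strncpy(sum_buf, this, 32) in the question can create.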
If using GCC, this can be disabled with:
-fno-stack-protector
glibc:
uintptr_t stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);
# ifdef THREAD_SET_STACK_GUARD
THREAD_SET_STACK_GUARD (stack_chk_guard);
The _dl_random value comes from the kernel.
Looking at http://www.imada.sdu.dk/Courses/DM18/Litteratur/IntelnATT.htm, I think %fs:0x28 is actually an offset of 0x28 (40) bytes from the address in %fs. So I think it's loading a full register's worth (8 bytes) from location %fs + 0x28 into %rax.