core file memory analysis verification - c

An application is generating a core file.
Here is the core info.
The error stack was obtained and the objective is to check the data content within the myStruct variable as this is what gets passed into myFunction when the core occurs.
(gdb) where
#0 0x000000000041bba1 in myFunction (myStruct=0x7ffff9dd0c20) at myTest.c:344
:
:
:
From the above I obtained the address of myStruct at 0x7ffff9dd0c20 and dumped out 200 words.
(gdb) x/200x 0x7ffff9dd0c20
0x7ffff9dd0c20: 0x01938640 0x00000000 0x00001c34 0x000002c8
0x7ffff9dd0c30: 0x00000400 0x00000000 0x01939760 0x00000000
0x7ffff9dd0c40: 0x00000014 0x00000000 0x00000000 0x00000000
0x7ffff9dd0c50: 0x00000000 0x0005000c 0x00000000 0x000d0000
0x7ffff9dd0c60: 0x00000004 0x00000000 0x00000000 0x00000000
0x7ffff9dd0c70: 0x00000000 0x00000000 0x00040000 0x00000000
0x7ffff9dd0c80: 0x0001000c 0x00000000 0x00000000 0x00000000
0x7ffff9dd0c90: 0x00000000 0x00000000 0x00000009 0x00000000
0x7ffff9dd0ca0: 0x00000000 0x00000000 0x410d999a 0x418d999a
0x7ffff9dd0cb0: 0x40b80000 0x41380000 0x000010cc 0x00000000
0x7ffff9dd0cc0: 0x0192edd0 0x00000000 0x00000a30 0x00000158
:
:
:
Now I want to verify I am reading the output of the data correctly.
Here is the structure information .
typedef struct
{
char *a;
unsigned int b;
unsigned int c;
int d;
} structA;
typedef struct
{
structA e;
char *f;
} myStruct; <----------This is what gets passed in and what I am trying to examine.
Working on Linux, a program was run to verify datatype size.
Size of char* : 8
Size of int : 4
Knowing the address is 0x7ffff9dd0c20 and the structure is myStruct, I assume
I start with analyzing structA since it is first.
Since char *a is the first part of structA and a char * is 2 words do I look at this :
0x01938640 0x00000000
Next there are essentially 3 ints (2 unsigned int and 1 int) which I viewed as these from the output :
0x00001c34
0x000002c8
0x00000400
This takes me back into structB and to the char *f attribute which should be 2 words.
Does that meaan this this the proper address :
0x00000000 0x01939760
I copied the memory info here again to make it easier to reference.
(gdb) x/200x 0x7ffff9dd0c20
0x7ffff9dd0c20: 0x01938640 0x00000000 0x00001c34 0x000002c8
0x7ffff9dd0c30: 0x00000400 0x00000000 0x01939760 0x00000000
0x7ffff9dd0c40: 0x00000014 0x00000000 0x00000000 0x00000000
0x7ffff9dd0c50: 0x00000000 0x0005000c 0x00000000 0x000d0000
0x7ffff9dd0c60: 0x00000004 0x00000000 0x00000000 0x00000000
0x7ffff9dd0c70: 0x00000000 0x00000000 0x00040000 0x00000000
0x7ffff9dd0c80: 0x0001000c 0x00000000 0x00000000 0x00000000
0x7ffff9dd0c90: 0x00000000 0x00000000 0x00000009 0x00000000
0x7ffff9dd0ca0: 0x00000000 0x00000000 0x410d999a 0x418d999a
0x7ffff9dd0cb0: 0x40b80000 0x41380000 0x000010cc 0x00000000
0x7ffff9dd0cc0: 0x0192edd0 0x00000000 0x00000a30 0x00000158
Is that the proper way to examine and parse out the data?
Here is the definition of one more structure.
typedef struct
{
short w;
unsigned long x;
short y;
short z;
} otherStruct;
This is what is expected to be stored in f.
char *f;
Within gdb, I tried the following :
p (otherStruct *)0x1939760
and it prints out :
$12 = (otherStruct *) 0x1939760
The strange part is when I originally printed out the data, it showed *f as the following which doesn't look anything like the structure :
f = 0x1939760 "\315\314\274#"

You're reading the data the hard way. gdb knows the struct definitions, so you can tell it to print the structure directly using the p command.
For example, given this code using your structs:
int main()
{
myStruct s = { { "aaa", 4, 5, 6 }, "bbb" };
printf("hello\n");
}
Running the code under gdb:
(gdb) start
Temporary breakpoint 1 at 0x400535: file x1.c, line 19.
Starting program: /home/dbush/./x1
Temporary breakpoint 1, main () at x1.c:19
19 myStruct s = { { "aaa", 4, 5, 6 }, "bbb" };
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64
(gdb) step
20 printf("hello\n");
(gdb) p s
$1 = {e = {a = 0x400600 "aaa", b = 4, c = 5, d = 6}, f = 0x400604 "bbb"}
(gdb)
With regard to the field f actually containing a otherStruct *, you can print what this contains by casting f and dereferencing the result:
p *(otherStruct *)myStruct->f
That being said, you failed to take into account padding within the structures. Because structA contains a char * which is 8 bytes in size, the stuct must be aligned on an 8 byte boundary. Looking at the layout, this means 4 bytes of padding are at the end of the struct.
So the 6th 32-bit word in your dump (0x00000000) is actually that padding. So the bytes making up f are actually 0x01939760 0x00000000.
Looking at the dump of the sample code above:
(gdb) p s
$1 = {e = {a = 0x400600 "aaa", b = 4, c = 5, d = 6}, f = 0x400604 "bbb"}
(gdb) p &s
$5 = (myStruct *) 0x7fffffffde50
(gdb) x/20x 0x7fffffffde50
0x7fffffffde50: 0x00400600 0x00000000 0x00000004 0x00000005
0x7fffffffde60: 0x00000006 0x00007fff 0x00400604 0x00000000
0x7fffffffde70: 0x00000000 0x00000000 0xf7a2f505 0x00007fff
0x7fffffffde80: 0x00000000 0x00000000 0xffffdf58 0x00007fff
0x7fffffffde90: 0x00000000 0x00000001 0x0040052d 0x00000000
You can see that the values in the fields of e match up with the raw dump.
Next you see the padding bytes which in this case have the value 0x00007fff. The next 8 bytes after that match the value of f.

Related

Difficulties Understanding Format String Exploitation

I am reading a book, Hacking: The Art of Exploitation 2nd Edition, and I'm at the chapter of format string vulnerability. I read the chapter multiple times but I'm unable to clearly understand it, even with some googling.
So, in the book there is this vulnerable code:
char text[1024];
...
strcpy(text, argv[1]);
printf("The right way to print user-controlled input:\n");
printf("%s", text);
printf("\nThe wrong way to print user-controlled input:\n");
printf(text);
Then after compiling,
reader#hacking:~/booksrc $ ./fmt_vuln $(perl -e 'print "%08x."x40')
The right way to print user-controlled input:
%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.
%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.
%08x.%08x.
The wrong way to print user-controlled input:
bffff320.b7fe75fc.00000000.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252
e78.252e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.2
52e7838.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.30252e78.252e78
38.2e783830.78383025.3830252e.30252e78.252e7838.2e783830.78383025.3830252e.
The bytes 0x25, 0x30, 0x38, 0x78, and 0x2e seem to be repeating a lot.
reader#hacking:~/booksrc $ printf "\x25\x30\x38\x78\x2e\n"
%08x.
First, why is that value repeating itself?
As you can see, they’re the memory for the format string itself. Because
the format function will always be on the highest stack frame, as long as the
format string has been stored anywhere on the stack, it will be located below
the current frame pointer (at a higher memory address).
But it seems to me this contradicts what he previously wrote and the way stack frames are organized
When this printf() function is called (as with any function), the arguments are pushed to the stack in reverse order.
So, shouldn't the format string be at a lower memory address since it is the first argument? And where is the format string stored?
reader#hacking:~/booksrc $ ./fmt_vuln AAAA%08x.%08x.%08x.%08x
The right way to print user-controlled input:
AAAA%08x.%08x.%08x.%08x
The wrong way to print user-controlled input:
AAAAbffff3d0.b7fe75fc.00000000.41414141
Here again, why is AAAA repeated in 41414141. From what I understand, the printf function prints AAAA first, then when it sees the first %08x, it gets a value from a memory address in the preceding stack frame, then does the same with the second %08x, thus the value of the second is located in a memory address higher than the first one, and finally returns to the value of AAAA located in a lower memory address, in the stack frame of printf function.
I debugged the first example with $(perl -e 'print "%08x."x40') as argument. I run: Linux 5.3.0-40-generic, 18.04.1-Ubuntu, x86_64
(gdb) run $(perl -e 'print "%08x." x 40')
Starting program: /home/kuro/fmt_vuln $(perl -e 'print "%08x." x 40')
The right way to print user-controlled input:
%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.
The wrong way to print user-controlled input:
07a51260.4b3eb8c0.4b10e154.00000000.4b16c3a0.9d357fc8.9d357b10.78383025.30252e78.2e783830.3830252e.252e7838.78383025.30252e78.2e783830.3830252e.252e7838.78383025.30252e78.2e783830.3830252e.252e7838.78383025.30252e78.2e783830.3830252e.252e7838.78383025.30252e78.2e783830.3830252e.252e7838.4b618d00.4b5fd000.00000000.9d357c80.00000000.00000000.00000000.4b3ef6f0.
Breakpoint 1, main (argc=2, argv=0x7ffd9d357fc8) at fmt_vuln.c:19
19 printf("[*] test_val # 0x%08x = %d 0x%08x\n", &test_val, test_val, test_val);
(gdb) x/-100xw $rsp
0x7ffd9d357940: 0x00000400 0x00000000 0x4b07c1aa 0x00007fb8
0x7ffd9d357950: 0x00000016 0x00000000 0x00000003 0x00000000
0x7ffd9d357960: 0x00000001 0x00000000 0x00002190 0x000003e8
0x7ffd9d357970: 0x00000005 0x00000000 0x00008800 0x00000000
0x7ffd9d357980: 0x00000000 0x00000000 0x00000400 0x00000000
0x7ffd9d357990: 0x00000000 0x00000000 0x5e970730 0x00000000
0x7ffd9d3579a0: 0x65336234 0x30663666 0x90890300 0x79e57be9
0x7ffd9d3579b0: 0x1cd79dbf 0x00000000 0x00000000 0x00000000
0x7ffd9d3579c0: 0x05cec660 0x000055ef 0x9d357fc0 0x00007ffd
0x7ffd9d3579d0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffd9d3579e0: 0x9d357ee0 0x00007ffd 0x4b062f26 0x00007fb8
0x7ffd9d3579f0: 0x00000030 0x00000030 0x9d357be8 0x00007ffd
0x7ffd9d357a00: 0x9d357a10 0x00007ffd 0x90890300 0x79e57be9
0x7ffd9d357a10: 0x4b3ea760 0x00007fb8 0x07a51260 0x000055ef
0x7ffd9d357a20: 0x4b3eb8c0 0x00007fb8 0x4b0891bd 0x00007fb8
0x7ffd9d357a30: 0x00000000 0x00000000 0x4b3ea760 0x00007fb8
0x7ffd9d357a40: 0x00000d68 0x00000000 0x00000169 0x00000000
0x7ffd9d357a50: 0x07a51260 0x000055ef 0x4b08af51 0x00007fb8
0x7ffd9d357a60: 0x4b3e62a0 0x00007fb8 0x4b3ea760 0x00007fb8
0x7ffd9d357a70: 0x0000000a 0x00000000 0x05cec660 0x000055ef
0x7ffd9d357a80: 0x9d357fc0 0x00007ffd 0x00000000 0x00000000
0x7ffd9d357a90: 0x00000000 0x00000000 0x4b08b403 0x00007fb8
0x7ffd9d357aa0: 0x4b3ea760 0x00007fb8 0x9d357ee0 0x00007ffd
0x7ffd9d357ab0: 0x05cec660 0x000055ef 0x4b0808f5 0x00007fb8
0x7ffd9d357ac0: 0x00000000 0x00000000 0x05cec824 0x000055ef
(gdb) x/100xw $rsp
0x7ffd9d357ad0: 0x9d357fc8 0x00007ffd 0x9d357b10 0x00000002
0x7ffd9d357ae0: 0x78383025 0x3830252e 0x30252e78 0x252e7838
0x7ffd9d357af0: 0x2e783830 0x78383025 0x3830252e 0x30252e78
0x7ffd9d357b00: 0x252e7838 0x2e783830 0x78383025 0x3830252e
0x7ffd9d357b10: 0x30252e78 0x252e7838 0x2e783830 0x78383025
0x7ffd9d357b20: 0x3830252e 0x30252e78 0x252e7838 0x2e783830
0x7ffd9d357b30: 0x78383025 0x3830252e 0x30252e78 0x252e7838
0x7ffd9d357b40: 0x2e783830 0x78383025 0x3830252e 0x30252e78
0x7ffd9d357b50: 0x252e7838 0x2e783830 0x78383025 0x3830252e
0x7ffd9d357b60: 0x30252e78 0x252e7838 0x2e783830 0x78383025
0x7ffd9d357b70: 0x3830252e 0x30252e78 0x252e7838 0x2e783830
0x7ffd9d357b80: 0x78383025 0x3830252e 0x30252e78 0x252e7838
0x7ffd9d357b90: 0x2e783830 0x78383025 0x3830252e 0x30252e78
0x7ffd9d357ba0: 0x252e7838 0x2e783830 0x4b618d00 0x00007fb8
0x7ffd9d357bb0: 0x4b5fd000 0x00007fb8 0x00000000 0x00000000
0x7ffd9d357bc0: 0x9d357c80 0x00007ffd 0x00000000 0x00000000
0x7ffd9d357bd0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7ffd9d357be0: 0x4b3ef6f0 0x00007fb8 0x4b6184c8 0x00007fb8
0x7ffd9d357bf0: 0x9d357c80 0x00007ffd 0x4b3ef000 0x00007fb8
0x7ffd9d357c00: 0x4b3ef914 0x00007fb8 0x4b3ef3c0 0x00007fb8
0x7ffd9d357c10: 0x4b617048 0x00007fb8 0x00000000 0x00000000
0x7ffd9d357c20: 0x00000000 0x00000000 0x4b6179f0 0x00007fb8
0x7ffd9d357c30: 0x4b0030e8 0x00007fb8 0x00000000 0x00000000
0x7ffd9d357c40: 0x4b3efa00 0x00007fb8 0x00000480 0x00000000
0x7ffd9d357c50: 0x00000027 0x00000000 0x00000000 0x00000000
The values, that appear before "%08x." in the Wrong way output, appear in lower addresses than "%08x." values. Why? The format string is supposed to be at the top of the stack.
The values, that appear after the "%08x." values in the Wrong way output, appear in higher addresses than"%08x." values. So in the preceding stack.
Why is it like this? Shouldn't the output begin from the format string values, or after?
Also, in the book, it doesn't print values after "%08x." values. But some are printed in my case. And some values in the output don't even figure in the stack, like 4b16c3a0.
I have to recommend against what you're doing. You're focussing on security vulnerabilities in C without a strong understanding of the language itself. That's an exercise in frustration. As evidence, I offer that every question you're posing about the exercise is answered by understanding printf(3), not stack vulnerabilities.
The output of your perl line (the contents of argv[1]) starts with, %08x.%08x.%08x.%08x.%08x. Thats a format string. Each %08x is looking for a further printf argument, an integer to print in hex representation. Normally, you might do something like,
int a = 'B';
printf( "%02x\n", a );
which produces 42 much faster than the computer in the Hitchhiker's Guide to the Galaxy.
What you've done is pass a long format string with zero arguments. printf(3) can't know how many arguments it was passed; it has to infer them from the format string. Your format string tells printf to print a long list of integers. Since none were provided, it looks for them "up the stack" (wherever they should have been). You print nonsense because the contents of those memory locations is unpredictable. Or, at any rate, weren't defined by you.
In the "good" case, the format string is "%s", declaring one argument of type string, which you provided. That works much better, yes.
Most compilers nowadays take special care with printf. They can produce warnings if the format string isn't a compile-time constant, and they can verify that each argument is of the correct type for its corresponding format specifier. The whole chapter in your book can thus be made moot simply by using the compiler's capabilities and paying attention to its diagnostics.

How to parse DW_OP_call_frame_cfa?

I am trying to get the variables location from a C (x86_64) code using libdwarf, but GCC even with optimizations off (-O0) generates the frame base relative from DW_OP_call_frame_cfa. In turn, clang uses DW_OP_reg6, which points to rbp in x86_64 ABI.
I have read the DWARF-4 Standard but I can not figure out how to get the actual address, relative from the stack or base pointer. I also encountered a similar question What do I do with DW_OP_call_frame_cfa here, but without success.
A simple code like:
int main()
{
int va = 1;
int vb = 2;
va = va + vb;
return (va);
}
will produce the following output in dwarfdump:
< 1><0x0000002d> DW_TAG_subprogram
DW_AT_external yes(1)
DW_AT_name main
DW_AT_decl_file 0x00000001 file.c
DW_AT_decl_line 0x00000001
DW_AT_type <0x00000069>
DW_AT_low_pc 0x00400620
DW_AT_high_pc <offset-from-lowpc>29
DW_AT_frame_base len 0x0001: 9c: DW_OP_call_frame_cfa
DW_AT_GNU_all_call_sites yes(1)
DW_AT_sibling <0x00000069>
< 2><0x0000004e> DW_TAG_variable
DW_AT_name va
DW_AT_decl_file 0x00000001 file.c
DW_AT_decl_line 0x00000003
DW_AT_type <0x00000069>
DW_AT_location len 0x0002: 916c: DW_OP_fbreg -20
< 2><0x0000005b> DW_TAG_variable
DW_AT_name vb
DW_AT_decl_file 0x00000001 file.c
DW_AT_decl_line 0x00000004
DW_AT_type <0x00000069>
DW_AT_location len 0x0002: 9168: DW_OP_fbreg -24
and the following information in readelf -wf file
00000070 000000000000001c 00000044 FDE cie=00000030 pc=0000000000400620..000000000040063d
DW_CFA_advance_loc: 1 to 0000000000400621
DW_CFA_def_cfa_offset: 16
DW_CFA_offset: r6 (rbp) at cfa-16
DW_CFA_advance_loc: 3 to 0000000000400624
DW_CFA_def_cfa_register: r6 (rbp)
DW_CFA_advance_loc: 24 to 000000000040063c
DW_CFA_def_cfa: r7 (rsp) ofs 8
DW_CFA_nop
DW_CFA_nop
DW_CFA_nop
from what I read, I have to implement a minimal stack machine, but I do not even know how to parse this manually.
Is there a way to force GCC to not use the DW_OP_call_frame_cfa and use a simple register or do I really need to interpret the frame information? and if so, how?
I am not really familiar with DWARF or debug info formats in general, but to aswer just the
Is there a way to force GCC to not use the DW_OP_call_frame_cfa
part, I've found out from grepping for DW_OP_call_frame_cfa in GCC sources that the -gdwarf-2 gcc switch makes it produce this instead:
DW_AT_frame_base <loclist at offset 0x00000000 with 4 entries follows>
[ 0]< offset pair low-off : 0x00000000 addr 0x00001119 high-off 0x00000001 addr 0x0000111a>DW_OP_breg7+8
[ 1]< offset pair low-off : 0x00000001 addr 0x0000111a high-off 0x00000004 addr 0x0000111d>DW_OP_breg7+16
[ 2]< offset pair low-off : 0x00000004 addr 0x0000111d high-off 0x0000001c addr 0x00001135>DW_OP_breg6+16
[ 3]< offset pair low-off : 0x0000001c addr 0x00001135 high-off 0x0000001d addr 0x00001136>DW_OP_breg7+8
Not sure whether switching to version 2 of DWARF suits you though.

Why am I not getting a Buffer Overflow?

Everything I read leads me to believe that this should cause a stack buffer overflow, but it does not:
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]) {
char password[8];
int correctPassword = 0;
printf("Password \n");
gets(password);
if(strcmp(password, "password"))
{
printf ("Wrong password entered, root privileges not granted... \n");
}
else
{
correctPassword = 1;
}
if(correctPassword)
{
printf ("Root privileges given to the user \n");
}
return 0;
}
But here is my output:
in this case, testtesttesttesttest is clearly larger than 8 characters, and, according to the source, it should cause a stack buffer overflow, but it does not. Why is this?
Reading more bytes then your buffer can contain won't always lead to a run-time error but it's a very bad and common error (read this article about smashing the stack). As I read from comments you added -fno-stack-protector to get the program to not print * stack smashing detected * but that's not a good idea. You should use scanf(" %8s",password) or something similar to limit the dimension of what you read.
Your code does cause a buffer overflow on the stack, in the sense that you have overwritten the allocated memory for the password buffer. Behold the memory that has been overwritten after you provide the input.
gcc -o Overflow Overflow.c -fno-stack-protector -g
gdb Overflow
(gdb) b 8
Breakpoint 1 at 0x4005cc: file Overflow.c, line 8.
(gdb) b 11
Breakpoint 2 at 0x4005e2: file Overflow.c, line 11.
(gdb) r
Starting program: /home/hq6/Code/SO/C/Overflow
Breakpoint 1, main (argc=1, argv=0x7fffffffde08) at Overflow.c:8
8 printf("Password \n");
(gdb) x/20x password
# Memory before overflow
0x7fffffffdd10: 0xffffde00 0x00007fff 0x00000000 0x00000000
0x7fffffffdd20: 0x00400630 0x00000000 0xf7a2e830 0x00007fff
0x7fffffffdd30: 0x00000000 0x00000000 0xffffde08 0x00007fff
0x7fffffffdd40: 0xf7ffcca0 0x00000001 0x004005b6 0x00000000
0x7fffffffdd50: 0x00000000 0x00000000 0x67fbace7 0x593e0a93
(gdb) c
Continuing.
Password
correctPassword
Breakpoint 2, main (argc=1, argv=0x7fffffffde08) at Overflow.c:11
11 if(strcmp(password, "password"))
(gdb) x/20x password
# Memory after overflow
0x7fffffffdd10: 0x72726f63 0x50746365 0x77737361 0x0064726f
0x7fffffffdd20: 0x00400630 0x00000000 0xf7a2e830 0x00007fff
0x7fffffffdd30: 0x00000000 0x00000000 0xffffde08 0x00007fff
0x7fffffffdd40: 0xf7ffcca0 0x00000001 0x004005b6 0x00000000
0x7fffffffdd50: 0x00000000 0x00000000 0x67fbace7 0x593e0a93
Whether or not a buffer overflow has undesirable side effects is undefined behavior.

Different buffer length in gdb

I am trying to exploit the following program :
#include <string.h>
int main(int argc, char *argv[]) {
char little_array[512];
if (argc > 1)
strcpy(little_array, argv[1]);
}
I want to first find the buffer length in order to overflow the stack so I use
(gdb) x/20xw $esp-532
0xffffcda8: 0x00000001 0x00000000 0x41414141 0x41414141
0xffffcdb8: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcdc8: 0x00000041 0xf7fe6b8c 0xf7ffd000 0x00000000
0xffffcdd8: 0xffffce98 0xf7fe70db 0xf7ffdaf0 0xf7fd8e08
0xffffcde8: 0x00000001 0x00000001 0x00000000 0xf7ff55ac
(gdb)
And I find the address (since I ran 'AAAA'), so the address is 0xffffcdaa .
Im running a 64bit system, disabled ASLR.
And I defined the buffer 512 bytes long.
And I get
(gdb) p 0xffffddf0 - 0xffffcdaa
$1 = 4166
(gdb)
How can this be? It has something to do with my 64bit system? Im trying to follow an old book and cant really find anything better.
I used this program to find the starting point
// find_start.c
unsigned long find_start(void)
{
__asm__("movl %esp, %eax");
}
int main()
{
printf("0x%x\n",find_start());
}
(when this program compiled with the -m32 flag the output of it gives me a starting point that gives me a little better result, 574, but still is too far)

Unable to understand a format string exploitation code

I came across this code showing format string exploitation while reading this article.
#include <stdio.h>
int main(void)
{
char secret[]="hack.se is lame";
char buffer[512];
char target[512];
printf("secret = %pn",&secret);
fgets(buffer,512,stdin);
snprintf(target,512,buffer);
printf("%s",target);
}
Executing it with following input
[root#knark]$ ./a.out
secret = 0xbffffc68
AAAA%x %x %x %x %x %x %x //Input given
AAAA4013fe20 0 0 0 41414141 33313034 30326566
- [root#knark]$
What I understand till now is the sequence of %x's will keep on printing the values at addresses above current %esp (I'm assuming that stack is growing downwards towards lower address).
What I'm unable to understand is the input given is stored in buffer array which can't be less than 512 bytes away from current %esp. So, how can the output contain 41414141 (the hex representation of AAAA) just after the 4 %x, i.e, just above the 4 addresses of current %esp. I tried hard to stare at assembly code too but I think I couldn't follow the manipulation of strings on stack.
On entry to snprintf, the stack has the following:
0xbfd257d0: 0xxxxxxxxx 0xxxxxxxxx 0xxxxxxxxx 0x080484d5
0xbfd257e0: 0xbfd25800 0x00000200 0xbfd25a00 0x00000000
0xbfd257f0: 0x00000000 0x00000000 0x00000000 0x00000000
0xbfd25800: 0x00000000 0x00000040 0xb7f22f2c 0x00000000
0xbfd25810: 0x00000000 0x00000000 0x00000000 0x00000000
0xbfd25800 -> target (initially 0x00000000 0x00000040 ...)
... -> garbage
0xbfd257e8 -> pointer to buffer
0xbfd257e4 -> 512
0xbfd257e0 -> pointer to target
0xbfd257df -> return address
target gets overwritten with the result of snprintf before snprintf gets to use its words as arguments: It first writes "AAAA" (0x41414141) at 0xbfd25800, then "%x" reads the value at 0xbfd257ec and writes it at 0xbfd25804, ..., then "%x" reads the value at 0xbfd25800 (0x41414141) and writes it at 0xbfd25814, ...
First of all, let's have a look at the stack after calling snprintf():
Reading symbols from /home/blackbear/a.out...done.
(gdb) run
Starting program: /home/blackbear/a.out
secret = 0xbffff40c
ABCDEF%x %x %x %x %x %x %x
Breakpoint 1, main () at prova.c:13
13 printf("%s",target);
(gdb) x/20x $esp
0xbfffeff0: 0xbffff00c 0x00000200 0xbffff20c 0x00155d7c
0xbffff000: 0x00155d7c 0x000000f0 0x000000f0 0x44434241
0xbffff010: 0x35314645 0x63376435 0x35353120 0x20633764
0xbffff020: 0x66203066 0x34342030 0x32343334 0x33203134
0xbffff030: 0x34313335 0x20353436 0x37333336 0x35333436
(gdb)
We can actually see, at 0xbffff00c, the string already formatted, so sprintf() wrote right there. We can also see, at 0xbfffeff0, the last argument for snprintf(): target's address, which is actually 0xbffff00c.
So I can deduce that strings are saved from the end to the beginning of their allocated space on the stack, as we can also see adding a strcpy():
blackbear#blackbear-laptop:~$ cat prova.c
#include <stdio.h>
#include <string.h>
int main(void)
{
char secret[]="hack.se is lame";
char buffer[512];
char target[512];
printf("secret = %p\n", &secret);
strcpy(target, "ABCDEF");
fgets(buffer,512,stdin);
snprintf(target,512,buffer);
printf("%s",target);
}
blackbear#blackbear-laptop:~$ gcc prova.c -g
prova.c: In function ‘main’:
prova.c:14: warning: format not a string literal and no format arguments
prova.c:14: warning: format not a string literal and no format arguments
blackbear#blackbear-laptop:~$ gdb ./a.out -q
Reading symbols from /home/blackbear/a.out...done.
(gdb) break 13
Breakpoint 1 at 0x8048580: file prova.c, line 13.
(gdb) run
Starting program: /home/blackbear/a.out
secret = 0xbffff40c
Breakpoint 1, main () at prova.c:13
13 fgets(buffer,512,stdin);
(gdb) x/10x $esp
0xbfffeff0: 0xbffff00c 0x080486bd 0x00000007 0x00155d7c
0xbffff000: 0x00155d7c 0x000000f0 0x000000f0 0x44434241
0xbffff010: 0x00004645 0x00000004
(gdb)
That's it! In conclusion, we've found the string there because strings are stored on the stack in a reversed way, and the beginning (or the end?) of target is near esp.

Resources