I wrote a small program to find how the exit() function works in Linux.
#include <stdlib.h>   /* exit() is declared here, not in unistd.h */
int main(void)
{
    exit(0);
}
And then I compiled the program with gcc.
gcc -o example -g -static example.c
In gdb, when I set a breakpoint, I got these lines.
Dump of assembler code for function exit:
0x080495a0 <+0>: sub $0x1c,%esp
0x080495a3 <+3>: mov 0x20(%esp),%eax
0x080495a7 <+7>: movl $0x1,0x8(%esp)
0x080495af <+15>: movl $0x80d602c,0x4(%esp)
0x080495b7 <+23>: mov %eax,(%esp)
0x080495ba <+26>: call 0x80494b0 <__run_exit_handlers>
End of assembler dump.
(gdb) b 0x080495a3
Function "0x080495a3" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (0x080495a3) pending.
(gdb) run
Starting program: /home/jack/Documents/overflow/example
[Inferior 1 (process 2299) exited normally]
The program does not stop at the breakpoint. Why? I compiled the program with -static, so why is the breakpoint left pending until a shared library is loaded?
You're asking gdb to break on a function called 0x080495a3. You'll need to use b *0x080495a3 instead.
(gdb) help break
Set breakpoint at specified line or function.
break [LOCATION] [thread THREADNUM] [if CONDITION]
LOCATION may be a line number, function name, or "*" and an address.
As the help says, the * tells gdb that it's an address you want to break on.
From your example:
Function "0x080495a3" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (0x080495a3) pending.
The "pending" means that the breakpoint is waiting until a function called 0x080495a3 is loaded from a shared library.
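To make the rule from the help text concrete, here is a toy classifier for the three cases it names. This is only a sketch and not gdb's real parser; the names classify_location and location_kind are made up for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Toy model only, NOT gdb's parser. It mirrors the three cases from
   "help break": "*" plus an address, a line number, or a function name. */
enum location_kind { LOC_ADDRESS, LOC_LINE, LOC_FUNCTION };

enum location_kind classify_location(const char *loc)
{
    size_t i;

    if (loc[0] == '*')
        return LOC_ADDRESS;              /* e.g. "*0x080495a3" */

    for (i = 0; loc[i] != '\0'; i++) {
        /* Anything that is not purely decimal digits is taken as a name,
           so "0x080495a3" (it contains 'x') ends up as a function name. */
        if (!isdigit((unsigned char)loc[i]))
            return LOC_FUNCTION;
    }

    /* Only digits were seen; an empty string is not a line number. */
    return i > 0 ? LOC_LINE : LOC_FUNCTION;
}
```

With this model, "0x080495a3" comes out as a function name (which matches the 'Function "0x080495a3" not defined' message), while "*0x080495a3" comes out as an address.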
You might also be interested in break-range:
(gdb) help break-range
Set a breakpoint for an address range.
break-range START-LOCATION, END-LOCATION
where START-LOCATION and END-LOCATION can be one of the following:
LINENUM, for that line in the current file,
FILE:LINENUM, for that line in that file,
+OFFSET, for that number of lines after the current line
or the start of the range
FUNCTION, for the first line in that function,
FILE:FUNCTION, to distinguish among like-named static functions.
*ADDRESS, for the instruction at that address.
The breakpoint will stop execution of the inferior whenever it executes
an instruction at any address within the [START-LOCATION, END-LOCATION]
range (including START-LOCATION and END-LOCATION).
It looks like you're trying to set a breakpoint at a function named 0x080495a3. Instead, try b *0x080495a3 to tell GDB that you want to break at a specific address.
0x080495a3 is the address of the instruction where you want to set the breakpoint. But by default gdb's b command expects a function name or a line number, so you have two ways to do this:
1) Run l once your gdb session has started. It lists the C source, and you can then set a breakpoint using a line number.
2) If you want to use the address, set the breakpoint with b *0x080495a3.
With the second way you will be able to halt at the instruction
0x080495a3 <+3>: mov 0x20(%esp),%eax
Here's a debugging scenario:
1) Create start breakpoint A and finish breakpoint B.
2) Start recording. Continue.
3) Reach breakpoint B.
4) Set watchpoint to watch writes to some piece of memory.
5) Reverse continue until watchpoint breaks execution.
Let's suppose that setting the watchpoint is only possible in step 4, not earlier, since the location of memory that should be watched is only known at that point.
Here's a simple example.
main.c:
int main(void)
{
int num;
num = 5;
return 0;
}
Debugging session:
$ cc main.c
$ gdb -q -nx -ex 'set disassembly-flavor intel' ./a.out
Reading symbols from ./a.out...
(No debugging symbols found in ./a.out)
(gdb) b *main + 0
Breakpoint 1 at 0x1129
(gdb) disas main
Dump of assembler code for function main:
0x0000000000001129 <+0>: push rbp
0x000000000000112a <+1>: mov rbp,rsp
0x000000000000112d <+4>: mov DWORD PTR [rbp-0x4],0x5
0x0000000000001134 <+11>: mov eax,0x0
0x0000000000001139 <+16>: pop rbp
0x000000000000113a <+17>: ret
End of assembler dump.
(gdb) b *main + 16
Breakpoint 2 at 0x1139
(gdb) r
Starting program: /var/tmp/test/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, 0x0000555555555129 in main ()
(gdb) record
(gdb) c
Continuing.
Breakpoint 2, 0x0000555555555139 in main ()
(gdb) p/x $rbp-0x4
$1 = 0x7fffffffe63c
(gdb) watch *(int*)0x7fffffffe63c
Hardware watchpoint 3: *(int*)0x7fffffffe63c
(gdb) reverse-continue
Continuing.
No more reverse-execution history.
0x0000555555555129 in main ()
(gdb)
The last command shows that reverse execution didn't stop where it was supposed to, namely at *main+4. Instead, it stopped where the recording started, ignoring the memory write.
My simple experiment was motivated by a video from CppCon 2015 that highlights the usefulness of the record and replay concept. I really don't know what I did wrong, because as far as I can tell I repeated every step from the video.
Reading the gdb docs didn't help either.
So why was the watchpoint ignored, and how can this be prevented?
Try telling gdb to use a software watchpoint instead of a hardware one with the command set can-use-hw-watchpoints 0.
It seems that the hardware watchpoint doesn't work here because, when moving back and forth through the reverse-execution history, no real process execution takes place; gdb only changes its internal memory to make it appear that the program executed backwards.
In other words, we should use hardware watchpoints when the real process executes on a CPU, and software watchpoints when gdb emulates execution from its recorded history.
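The idea behind a software watchpoint during replay can be sketched as "undo one recorded step, compare the watched value, repeat". The code below is a toy model under that assumption, not gdb's implementation. The function name and the snapshot array are hypothetical, and 1234 stands in for whatever garbage was in num before the store.

```c
#include <stddef.h>

/* Toy model, not gdb's code. snapshots[i] is the watched value before
   recorded step i runs; snapshots[n_steps] is the value after the last step.
   Walks the history backwards and returns the index of the most recent step
   that changed the value, or -1 if no step in the history changed it. */
long reverse_continue_to_write(const int *snapshots, size_t n_steps)
{
    size_t i;

    for (i = n_steps; i > 0; i--) {
        /* Step i-1 turned snapshots[i-1] into snapshots[i]. */
        if (snapshots[i - 1] != snapshots[i])
            return (long)(i - 1);
    }
    return -1;   /* reached the start of the recording */
}

/* Hypothetical history for main.c from the question, recorded from *main+0
   up to *main+16: push rbp, mov rbp,rsp, mov DWORD PTR [rbp-0x4],0x5,
   mov eax,0x0. Only the third step (index 2) writes the watched location. */
const int example_snapshots[5] = { 1234, 1234, 1234, 5, 5 };
```

Calling reverse_continue_to_write(example_snapshots, 4) reports step 2, which is the instruction at *main+4. A hardware watchpoint never gets this chance if nothing is really executed on the CPU.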
Thanks to Peter Cordes for the explanation in the comments.
I have a 64-bit ELF binary. I don't have its source code, don't know with which parameters it was compiled, and am not allowed to provide it here. The only relevant information I have is that the source is a .c file (so no hand-crafted assembly), compiled through a Makefile.
While reversing this binary using IDA, I stumbled upon an extremely weird construction I have never seen before and absolutely cannot explain. Here is the raw disassembly of the end of one function, in IDA syntax:
mov rax, [rsp+var_20]
xor rax, fs:28h
jnz location
add rsp, 28h
pop rbx
pop rbp
retn
location:
call __stack_chk_fail
nop dword ptr [rax]
db 2Eh
nop word ptr [rax+rax+00000000h]
...then dozens of instructions of normal and functional code
Here, we have a simple canary check, where we return if it is valid, and call __stack_chk_fail otherwise. Everything is perfectly normal. But after this check there is more assembly, and it is fully functional code.
Looking at the manual of __stack_chk_fail, I made sure that this function does exit the program, and that there is no edge case where it could continue:
Description
The interface __stack_chk_fail() shall abort the function that called it with a message that a stack overflow has been detected. The program that called the function shall then exit.
I also tried to write this small C program, to search for a method to reproduce this:
#include <stdio.h>
#include <stdlib.h>
int foo()
{
int a = 3;
printf("%d\n", a);
return 0;
int b = 7;
printf("%d\n", b);
}
int main()
{
foo();
return 0;
}
But the code after the return is simply omitted by gcc.
It does not appear either that my binary is vulnerable to a buffer overflow that I could exploit to control rip and jump to the code after the canary check. I also inspected every call and jump using objdump, and this code seems to never be called.
Could someone explain what is going on? How was this code generated in the first place? Is it a joke from the author of the binary?
I suspect you are looking at padding, followed by an unrelated function that IDA does not have a name for.
To test this hypothesis, I need the following additional information:
The address of the byte immediately after call __stack_chk_fail.
The next higher address that is the target of a call or jump instruction.
A raw hex dump of the bytes in between those two addresses.
The disassembly of four or five instructions starting at the next higher address that is the target of a call or jump instruction.
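As a rough illustration of the padding hypothesis: if the toolchain aligns function starts to 16 bytes (a common default, but an assumption here), the gap after the call can be computed from the address. The address below is hypothetical, since the real one is not in the question. With their usual encodings, the three items after the call (nop dword ptr [rax], db 2Eh, nop word ptr [rax+rax+0]) would take 3 + 1 + 9 = 13 bytes.

```c
#include <stdint.h>

/* Number of filler bytes needed to advance addr to the next multiple of
   alignment. alignment must be a power of two. Returns 0 if addr is
   already aligned. */
uint64_t padding_to_alignment(uint64_t addr, uint64_t alignment)
{
    uint64_t mask = alignment - 1;

    return (alignment - (addr & mask)) & mask;
}
```

For example, padding_to_alignment(0x401233, 16) is 13, so a call ending at an address like that would be followed by exactly 13 bytes of NOPs before the next function begins at 0x401240.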
I am trying to run C code as my operating system kernel in order to study how operating systems work. I am stuck on an infinite loop when the bootloader jumps to my C code. How can I prevent this?
My bootloader works correctly; the problem appears when it jumps to the kernel code, which is written in C and built as a .COM program. The dummy code keeps printing a character again and again, although it should run only once. It seems as if the main code is being called repeatedly. Here is the code for the startpoint.asm assembly header and the bootmain.cpp file.
Here is startpoint.asm, which is linked first so that its code is invoked automatically (written in MASM).
Note: The code is loaded at the address 2000H:0000H.
;------------------------------------------------------------
.286 ; CPU type
;------------------------------------------------------------
.model TINY ; memory of model
;---------------------- EXTERNS -----------------------------
extrn _BootMain:near ; prototype of C func
;------------------------------------------------------------
;------------------------------------------------------------
.code
main:
jmp short start ; go to main
nop
;----------------------- CODE SEGMENT -----------------------
start:
cli
mov ax,cs ; Setup segment registers
mov ds,ax ; Make DS correct
mov es,ax ; Make ES correct
mov ss,ax ; Make SS correct
mov bp,2000h
mov sp,2000h ; Setup a stack
sti
; start the program
call _BootMain
ret
END main ; End of prog
Code for bootmain.cpp
extern "C" void BootMain()
{
__asm
{
mov ah,0EH
mov al,'G'
int 10H
}
return;
}
The compiling and linker commands are as follows:
Code to compile bootmain.cpp:
CL.EXE /AT /G2 /Gs /Gx /c /Zl bootmain.cpp
Code to compile startpoint.asm:
ML.EXE /AT /c startpoint.asm
Code to link them both (In preserved order):
LINK.EXE /T /NOD startPoint.obj bootmain.obj
Expected output:
G
Actual Output:
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
Take a closer look at the end of start.
start is never called -- it is jumped to directly, and it sets up the stack itself. When _BootMain returns, the stack is empty; the ret at the end of start will pop garbage data from above the end of the stack and attempt to jump to it. If that memory contains zeroes, program flow will return to main.
You need to set up something specific to happen after _BootMain returns. If you just want the system to hang after executing _BootMain, put an infinite loop (e.g. jmp $ in MASM) at the end of start in place of the erroneous ret.
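Here is a toy model of what the stray ret does. It assumes a zero-filled 64 KiB segment, SP = 2000h as in the question, and that main sits at offset 0 of the segment. The function names and the return address 0013h are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

static uint8_t  mem[0x10000];   /* one 64 KiB segment, assumed zero-filled */
static uint16_t sp;             /* stack pointer, offset within the segment */

/* 16-bit push: decrement SP by 2, then store little-endian. */
static void push16(uint16_t value)
{
    sp = (uint16_t)(sp - 2);
    mem[sp] = (uint8_t)(value & 0xFF);
    mem[(uint16_t)(sp + 1)] = (uint8_t)(value >> 8);
}

/* 16-bit pop: load little-endian, then increment SP by 2. */
static uint16_t pop16(void)
{
    uint16_t value = (uint16_t)(mem[sp] | (mem[(uint16_t)(sp + 1)] << 8));

    sp = (uint16_t)(sp + 2);
    return value;
}

/* Replays the stack traffic of start and returns the offset that the
   final ret would jump to. */
uint16_t offset_reached_by_final_ret(uint16_t return_addr_after_call)
{
    memset(mem, 0, sizeof mem);
    sp = 0x2000;                       /* mov sp,2000h                       */
    push16(return_addr_after_call);    /* call _BootMain pushes its return   */
    (void)pop16();                     /* ret inside _BootMain pops it again */
    return pop16();                    /* stray ret: pops from 2000h, above
                                          the now-empty stack                */
}
```

In this model the final ret pops 0000h, so execution lands at offset 0, runs jmp short start again, and prints another G.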
Alternatively, consider having your bootloader set up the stack itself and call the COM executable. When that returns, the bootloader can take appropriate action.
OK, so I am really trying to understand what's going on in this example from "The Art of Exploitation", second edition. I am trying to see exactly what is happening by closely following the GDB output in the book. My greatest problem is with the last part; I included the whole thing so that everyone can see what's going on. Granted, I only have very (very) basic knowledge of assembly code. I do understand basic C.
In the last part the author says that, in the second run of the program, there is a minor difference from the first run in the address that strcpy() points to, and I just can't see it.
The program is simply
#include<stdio.h>
#include<string.h>
int main() {
char str_a[20];
strcpy(str_a, "Hello, world!\n");
printf(str_a);
}
After I compile it with the options needed to be able to debug it, I load it in GDB and enter the following:
(gdb) break 6
Breakpoint 1 at 0x80483c4: file char_array2.c, line 6.
(gdb) break strcpy
Function "strcpy" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (strcpy) pending.
(gdb) break 8
Breakpoint 3 at 0x80483d7: file char_array2.c, line 8.
(gdb)
I have no problem with this. It is my understanding that the debugger can only do this sort of thing with user-defined functions. I also know how to get around this problem with gcc options.
I also know that when the program runs, the strcpy breakpoint is resolved. Let me continue.
(gdb) run
Starting program: /home/reader/booksrc/char_array2
Breakpoint 4 at 0xb7f076f4
Pending breakpoint "strcpy" resolved
Breakpoint 1, main() at char_array2.c:7
7 strcpy(str_a, "Hello, world!\n");
(gdb) i r eip
eip 0x80483c4 0x80483c4 <main+16>
(gdb) x/5i $eip
0x80483c4 <main+16>: mov DWORD PTR [esp+4],0x80484c4
0x80483cc <main+24>: lea eax,[ebp-40]
0x80483cf <main+27>: mov DWORD PTR [esp],eax
0x80483d2 <main+30>: call 0x80482c4 <strcpy@plt>
0x80483d7 <main+35>: lea eax,[ebp-40]
(gdb) continue
Continuing.
Breakpoint 4, 0xb7f076f4 in strcpy () from /lib/tls/i686/cmov/libc.so.6
(gdb) i r eip
eip 0xb7f076f4 0xb7f076f4 <strcpy+4>
(gdb) x/5i $eip
0xb7f076f4 <strcpy+4>: mov esi,DWORD PTR [ebp+8]
0xb7f076f7 <strcpy+7>: mov eax,DWORD PTR [ebp+12]
0xb7f076fa <strcpy+10>: mov ecx,esi
0xb7f076fc <strcpy+12>: sub ecx,eax
0xb7f076fe <strcpy+14>: mov edx,eax
(gdb) continue
Continuing.
Breakpoint 3, main () at char_array2.c:8
8 printf(str_a);
(gdb) i r eip
eip 0x80483d7 0x80483d7 <main+35>
(gdb) x/5i $eip
0x80483d7 <main+35>: lea eax,[ebp-40]
0x80483da <main+38>: mov DWORD PTR [esp],eax
0x80483dd <main+41>: call 0x80482d4 <printf@plt>
0x80483e2 <main+46>: leave
0x80483e3 <main+47>: ret
(gdb)
This is the second run of the program, in which the address of strcpy is supposedly different from the one in the first run.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/reader/booksrc/char_array2
Error in re-setting breakpoint 4:
Function "strcpy" not defined.
Breakpoint 1, main () at char_array2.c:7
7 strcpy(str_a, "Hello, world!\n");
(gdb) bt
#0 main () at char_array2.c:7
(gdb) cont
Continuing.
Breakpoint 4, 0xb7f076f4 in strcpy () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0xb7f076f4 in strcpy () from /lib/tls/i686/cmov/libc.so.6
#1 0x080483d7 in main () at char_array2.c:7
(gdb) cont
Continuing.
Breakpoint 3, main () at char_array2.c:8
8 printf(str_a);
(gdb) bt
#0 main () at char_array2.c:8
(gdb)
Where is the difference? Am I wrong to think that 0xb7f076f4 is the address of strcpy? If I am correct, everything indicates that on the second run the address is still 0xb7f076f4.
Also, what is ? I can't find the explanation for this anywhere earlier in the book. If someone could be kind enough to explain this to me from the top down, I would appreciate it very much, since I don't know any expert in real life who could help me. I find the explanations vague: he explains variables and loops as if he were explaining them to a 5-year-old, but leaves much of the assembly code for us to figure out by ourselves, and I have not been very successful at that.
Any help would be greatly appreciated.
Apparently gdb turns off ASLR for the debugged process to make (session-to-session) debugging easier.
From https://sourceware.org/gdb/current/onlinedocs/gdb/Starting.html
set disable-randomization
set disable-randomization on
This option (enabled by default in GDB) will turn off the native
randomization of the virtual address space of the started program.
This option is useful for multiple debugging sessions to make the
execution better reproducible and memory addresses reusable across
debugging sessions.
Run set disable-randomization off in gdb, or put it in a .gdbinit file, and try again. Libc should now get loaded at a different address each time you run the binary.
Running watch -n 1 cat /proc/self/maps is also a nice way to see how the binary and the libraries are mapped at 'random' addresses.
As @Voo said in his comment above, the book probably refers to ASLR (Address Space Layout Randomization), which is a security feature. It changes how the address space is laid out for each execution, so you can't rely on always finding things in the same place.
If you don't see it in gdb, that means you have ASLR turned off, either globally or locally in gdb. You can check the former using cat /proc/sys/kernel/randomize_va_space and the latter using the show disable-randomization command at the gdb prompt.
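Both checks can also be done from inside a program. The sketch below is Linux-only. The function names are made up, and I am assuming that the per-process switch is the ADDR_NO_RANDOMIZE personality flag, which is how randomization is normally turned off for a single process on Linux.

```c
#include <stdio.h>
#include <sys/personality.h>

/* Returns the global setting from /proc/sys/kernel/randomize_va_space
   (0 = off, 1 or 2 = on), or -1 if the file cannot be read. */
int read_randomize_va_space(void)
{
    FILE *f = fopen("/proc/sys/kernel/randomize_va_space", "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Returns 1 if randomization is disabled for this process only
   (ADDR_NO_RANDOMIZE is set), 0 if it is not, -1 on error. */
int randomization_disabled_for_process(void)
{
    /* 0xffffffff queries the current persona without changing it. */
    int persona = personality(0xffffffff);

    if (persona == -1)
        return -1;
    return (persona & ADDR_NO_RANDOMIZE) != 0;
}
```

If a program that prints both values shows a non-zero global setting but 1 for the per-process check when started under gdb, that would explain why the addresses stay the same from run to run.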
In this question, I saw a funny piece of code which compiles (although with warnings) and produces a segmentation fault (gcc 4.4.4; clang 2.8):
main;
If we expand it, here is the result:
int main = 0;
So what is the linker's behavior here?
The linker's behavior is that it defines a symbol called main in either the program's data or BSS segment. It is 4 bytes long and initialized to 0. Ordinarily, main would instead be a symbol in the program's code segment (typically called .text) that refers to the executable code of the main function.
The C runtime starts up at a fixed entry point (typically called _start), initializes a bunch of stuff (e.g. sets up the program's arguments), and calls the main function. When main is executable code, this is all fine and dandy, but if it's instead 4 zero bytes, the program will transfer control to those zero bytes and try to execute them.
Typically, the data and BSS segments are marked as non-executable, so when you try to execute code there, the processor will raise an exception, which the OS will interpret and then terminate your program with a signal. If the segment somehow is executable, the processor will try to execute the machine instructions defined by 00 00 00 00. On x86 and x86-64, 00 00 decodes as a valid instruction (add %al,(%rax)), so execution would run through the zero bytes until that memory access faults or it reaches bytes that are not valid code, and the program would still be killed by a signal such as SIGSEGV or SIGILL.
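The "initialized to 0" part can be checked without actually crashing anything. This is a small sketch: main_like is a hypothetical stand-in for the main; object, and all_bytes_zero is a made-up helper name.

```c
#include <stddef.h>

/* Stand-in for "main;": a file-scope int with no initializer. Such an
   object has static storage duration, so C guarantees it starts as zero. */
int main_like;

/* Returns 1 if the n bytes starting at p are all zero, 0 otherwise. */
int all_bytes_zero(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    size_t i;

    for (i = 0; i < n; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}
```

all_bytes_zero(&main_like, sizeof main_like) returns 1, which is the same situation the startup code runs into when it jumps to the address of an int named main.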
On my system (CentOS 6.3), main is placed into the BSS and contains all 0's, hence the crash:
Program received signal SIGSEGV, Segmentation fault.
0x00000000006007f0 in main ()
(gdb) where
#0 0x00000000006007f0 in main ()
(gdb) l
"main" is not a function
(gdb) disass 0x6007f0
Dump of assembler code for function main:
=> 0x00000000006007f0 <+0>: add %al,(%rax)
0x00000000006007f2 <+2>: add %al,(%rax)
End of assembler dump.
(gdb) info symbol &main
main in section .bss of /home/ajd/tmp/x
(gdb) x/16b 0x6007f0
0x6007f0 <main>: 0 0 0 0 0 0 0 0
0x6007f8: 0 0 0 0 0 0 0 0
(gdb)
The symbol main is expected to be a function, not an integer. However, the linker doesn't much care about the type of main; the symbol is defined. If the symbol main is not a function with one of the prescribed signatures, then you invoke undefined behaviour.
The start-up code then calls the 'function', which is actually an address in the data segment of the program. It goes wrong because the 'code' stored at that address is invalid — the first 4 bytes are likely to be zeros; what comes later is anyone's guess. The data segment may be marked non-executable, in which case trying to execute the data will trigger a crash on that account.
When you invoke undefined behaviour, anything can happen. A crash is quite sensible here.