gdb is setting a breakpoint at an unexpected line - c

1 #include <stdio.h>
2 int main()
3 {
4 int i;
5 for (i=0; i<=10; i++)
6 {
7 puts("Hello world\n");
8 }
9 return 0;
10 //test
}
I am trying to debug the code above using gdb. The problem I am facing is that when I use the command break main, it sets the breakpoint at line 5 rather then setting the breakpoint at line 2 where it should be. Below is the output that I am getting:
(gdb) break main
Breakpoint 1 at 0x555555555141: file firstprogram.c, line 5.
I was expecting that the breakpoint would be set at line no 2 rather than line no 5.

You can only set a breakpoint at an address where there's an instruction.
Look at how your compiler turns this C function into asm. (e.g. on Godbolt targeting Linux with current GCC/clang.)
In GCC or clang output, with or without optimization, there are instructions for the function prologue (pushing a call-preserved register or making a stack frame with a frame pointer, respectively). The debug info generated by GCC and clang associates those instructions with the opening {, line 3.
(The Godbolt compiler explorer's colour syntax highlighting and mouseover highlighting is based on the debug info generated by compilers, the same info GDB uses when mapping source lines to/from addresses.)
The first instruction for anything in the function body is either the i=0 in the unoptimized version, or the mov edi, OFFSET FLAT:.LC0 / call puts in the loop body.
(GCC/clang consider the mov ebx, 11 loop counter init to not be associated with a source line, since after optimization they've transformed the loop into counting down to zero for 11 iterations. That mov-to-register instruction is part of the function prologue, according to the debug info.)

Related

Abnormal presence of code after __stack_chk_fail

I have an 64-bits ELF binary. I don't have its source code, don't know with which parameters it was compiled, and am not allowed to provide it here. The only relevant information I have is that the source is a .c file (so no hand-crafted assembly), compiled through a Makefile.
While reversing this binary using IDA, I stumbled upon an extremely weird construction I have never seen before and absolutely cannot explain. Here is the raw decompilation of one function with IDA syntax:
mov rax, [rsp+var_20]
xor rax, fs:28h
jnz location
add rsp, 28h
pop rbx
pop rbp
retn
location:
call __stack_chk_fail
nop dword ptr [rax]
db 2Eh
nop word ptr [rax+rax+00000000h]
...then dozens of instructions of normal and functional code
Here, we have a simple canary check, where we return if it is valid, and call __stack_chk_fail otherwise. Everything is perfectly normal. But after this check, there is still assembly, and of fully-functional code.
Looking at the manual of __stack_chk_fail, I made sure that this function does exit the program, and that there is no edge case where it could continue:
Description
The interface __stack_chk_fail() shall abort the function that called it with a message that a stack overflow has been detected. The program that called the function shall then exit.
I also tried to write this small C program, to search for a method to reproduce this:
#include <stdio.h>
#include <stdlib.h>
int foo()
{
int a = 3;
printf("%d\n", a);
return 0;
int b = 7;
printf("%d\n", b);
}
int main()
{
foo();
return 0;
}
But the code after the return is simply omitted by gcc.
It does not appear either that my binary is vulnerable to a buffer overflow that I could exploit to control rip and jump to the code after the canary check. I also inspected every call and jumps using objdump, and this code seems to never be called.
Could someone explain what is going on? How was this code generated in the first place? Is it a joke from the author of the binary?
I suspect you are looking at padding, followed by an unrelated function that IDA does not have a name for.
To test this hypothesis, I need the following additional information:
The address of the byte immediately after call __stack_chk_fail.
The next higher address that is the target of a call or jump instruction.
A raw hex dump of the bytes in between those two addresses.
The disassembly of four or five instructions starting at the next higher address that is the target of a call or jump instruction.

Gdb weird behaviour wrt printf vs strcpy [shared library load] [duplicate]

So I was reading Hacking the Art of Exploitation and in the book, they use the strcpy() function in their C code:
1 #include <stdio.h>
2 #include <string.h>
3
4 int main() {
5 char str_a[20];
6
7 strcpy(str_a, "Hello, world!\n");
8 printf(str_a);
9 }
They then proceed to compile their source code and analyze it with gdb. He sets a breakpoint on line 6, the strcpy function, and line 8, but when setting a break on strcpy it reads the following:
(gdb) break strcpy
Function "strcpy" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
I understand that this is because the library has not yet been loaded, so it's asking if he wants to have it as a pending breakpoint. Then he runs the program and continues through the breakpoints:
Everything works well for him, but when I tried to re-create this on my computer, I get the following:
frinto#kali:~/Documents/theclang/programs/helloworld$ gcc -m32 -g -o char_array char_array.c
frinto#kali:~/Documents/theclang/programs/helloworld$ gdb -q char_array
Reading symbols from char_array...done.
(gdb) list
1 #include <stdio.h>
2 #include <string.h>
3
4 int main() {
5 char str_a[20];
6
7 strcpy(str_a, "Hello, world!\n");
8 printf(str_a);
9 }
(gdb) break 6
Breakpoint 1 at 0x11b6: file char_array.c, line 6.
(gdb) break strcpy
Function "strcpy" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (strcpy) pending.
(gdb) break 8
Breakpoint 3 at 0x11d7: file char_array.c, line 8.
(gdb) run
Starting program: /home/frinto/Documents/theclang/programs/helloworld/char_array
Breakpoint 1, main () at char_array.c:7
7 strcpy(str_a, "Hello, world!\n");
(gdb) cont
Continuing.
Breakpoint 3, main () at char_array.c:8
8 printf(str_a);
(gdb) cont
Continuing.
Hello, world!
[Inferior 1 (process 4021) exited normally]
(gdb)
Notice how it completely skipped the strcpy breakpoint? Well, I asked a friend of mine what was the issue here, and he told me that I was missing the argument -fno-builtin when compiling. I did some minimal google searching on this argument and all I really understood is that it lets you set breakpoints on built-in functions. So I compiled the program with the -fno-builtin argument and then tried to re-create this again:
frinto#kali:~/Documents/theclang/programs/helloworld$ gcc -m32 -fno-builtin -g -o char_array char_array.c
frinto#kali:~/Documents/theclang/programs/helloworld$ gdb -q char_array
Reading symbols from char_array...done.
(gdb) list
1 #include <stdio.h>
2 #include <string.h>
3
4 int main() {
5 char str_a[20];
6
7 strcpy(str_a, "Hello, world!\n");
8 printf(str_a);
9 }
(gdb) break 6
Breakpoint 1 at 0x11c6: file char_array.c, line 6.
(gdb) break strcpy
Breakpoint 2 at 0x1040
(gdb) break 8
Breakpoint 3 at 0x11dc: file char_array.c, line 8.
(gdb) run
Starting program: /home/frinto/Documents/theclang/programs/helloworld/char_array
Breakpoint 1, main () at char_array.c:7
7 strcpy(str_a, "Hello, world!\n");
(gdb) cont
Continuing.
Breakpoint 2, 0xf7e510b0 in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) cont
Continuing.
Breakpoint 3, main () at char_array.c:8
8 printf(str_a);
(gdb) cont
Continuing.
Hello, world!
[Inferior 1 (process 3969) exited normally]
(gdb)
Now it works! I have three questions:
What exactly is the -fno-builtin argument doing?
Why does it show question marks instead of the strcpy function in
Breakpoint 2, 0xf7e510b0 in ?? () from /lib/i386-linux-gnu/libc.so.6
Why doesn't it ask to set the strcpy breakpoint as pending when I use the -fno-builtin argument?
Sorry for the long thread, I just wanted to make sure everything was understood.
From man gcc
-fno-builtin
-fno-builtin-function
Don't recognize built-in functions that do not begin with
__builtin_ as prefix. GCC normally generates special code to
handle certain built-in functions more efficiently; for
instance, calls to "alloca" may become single instructions
which adjust the stack directly, and calls to "memcpy" may
become inline copy loops. The resulting code is often both
smaller and faster, but since the function calls no longer
appear as such, you cannot set a breakpoint on those calls, nor
can you change the behavior of the functions by linking with a
different library. In addition, when a function is recognized
as a built-in function, GCC may use information about that
function to warn about problems with calls to that function, or
to generate more efficient code, even if the resulting code
still contains calls to that function. For example, warnings
are given with -Wformat for bad calls to "printf" when "printf"
is built in and "strlen" is known not to modify global memory.
With the -fno-builtin-function option only the built-in
function function is disabled. function must not begin with
__builtin_. If a function is named that is not built-in in
this version of GCC, this option is ignored. There is no
corresponding -fbuiltin-function option; if you wish to enable
built-in functions selectively when using -fno-builtin or
-ffreestanding, you may define macros such as:
#define abs(n) __builtin_abs ((n))
#define strcpy(d, s) __builtin_strcpy ((d), (s))
function builtins allow to generate a faster code by inlining the function, but as stated in the manual
you cannot set a breakpoint on those calls
Inlining a function means that, instead of generating a function call, its effects are replaced by code directly inserted by the compiler. This saves a function call and can be more efficiently optimized and generally leads to a large improvement in performances.
But, the inlined function no longer exists in the code. Debugger breakpoints are implemented by replacing instructions at specific addresses by some software traps or by using specific hardware to recognize when the breakpointed address is reached. But as the function no longer exists, no address is associated with it, and there is no way to breakpoint it.
Pending breakpoints are a mean to set a breakpoint on some code that will be dynamically loaded later by the program. With -fno-builtin, strcpy is directly available and the bp can be directly set by gdb.
Note that debugging requires specific information in the executable generated by the -g flag. Generally system libraries like libc do not have the debugging information embedded and when entering function in these libraries, gdb indicates the lack of debugging information by ??.
What exactly is the -fno-builtin argument doing?
-fno-builtin means that gcc will not try to replace library functions with builtin compiled code, and you'll not get any weirdness due to such replacements. I've been bitten by replacements of printf("%s", mystr) by puts(mystr), for example - even when I wasn't including stdio.h at all!
Why does it show question marks instead of the strcpy function
Because there's no source file or line from which the "code" of strcpy() was taken - gcc just plugged in some assembly or its GIMPLE IR or whatever, instead of the glibc strcpy() implementation.
Why doesn't it ask to set the strcpy breakpoint as pending when I use the -fno-builtin argument?
Because then you're in the regular case of a plain vanilla function call with a breakpoint, and the debugger doesn't have to scratch its head and ponder what to do.

"target record-full" in gdb makes "n" command fail on printf with "Process record does not support instruction 0xc5 at address 0x7ffff7dee6e7"?

I was tring to use "reverse-step" and "reverse-next" command inside gdb. Stack overflow tells me that I should run "target record-full" in the execution context where I wish to "rn" and "rs". But some weird error happened:
1
2 #include<stdio.h>
3 int i=0;
4 void fa()
5 {
6 ++i;
7 printf("%d\n",i);
8 ++i;
9 }
10 int main(){
11 fa();
12 return 0;
13 }
I compile and run this program:
(gdb) b 4
Breakpoint 1 at 0x40052a: file test02.c, line 4.
(gdb) r
Starting program: /home/Troskyvs/a.out
Breakpoint 1, fa () at test02.c:6
6 ++i;
(gdb) target record-full
(gdb) n
7 printf("%d\n",i);
(gdb) n # Error happens here!
Process record does not support instruction 0xc5 at address 0x7ffff7dee6e7.
Process record: failed to record execution log.
Program stopped.
_dl_runtime_resolve_avx () at ../sysdeps/x86_64/dl-trampoline.h:81
81 ../sysdeps/x86_64/dl-trampoline.h: No such file or directory.
Well if I don't run "target record-full", then the 2nd "n" will be OK and run to next line. I don't quite get the error information here.
Is it related to "target record-full"? How can I conquor it?
I tried another approach:
(gdb) set exec-direction reverse
(gdb) n
No more reverse-execution history.
fa () at test02.c:7
7 printf("%d\n",i);
(gdb) n
No more reverse-execution history.
fa () at test02.c:7
7 printf("%d\n",i);
(gdb) n
Well, it doesn't work
AVX is not supported as of GDB 7.11.1
The underlying problem seems to be that AVX instructions are not currently supported, but glibc uses them on Ubuntu 16.04 64-bit:
gdb reverse debugging avx2
https://sourceware.org/ml/gdb/2016-08/msg00028.html
rr is an awesome working alternative: https://github.com/mozilla/rr Here is a minimal working example: Setting breakpoint in GDB where the function returns
Actually for the simple case you have, a record-full should work if you add the parameter "-static" to your gcc compilation command.

How does the stack frame look like in my function?

I am a beginner at assembly, and I am curious to know how the stack frame looks like here, so I could access the argument by understanding and not algorithm.
P.S.: the assembly function is process
#include <stdio.h>
# define MAX_LEN 120 // Maximal line size
extern int process(char*);
int main(void) {
char buf[MAX_LEN];
int str_len = 0;
printf("Enter a string:");
fgets(buf, MAX_LEN, stdin);
str_len = process(buf);
So, I know that when I want to access the process function's argument, which is in assembly, I have to do the following:
push ebp
mov ebp, esp ; now ebp is pointing to the same address as esp
pushad
mov ebx, dword [ebp+8]
Now I also would like someone to correct me on things I think are correct:
At the start, esp is pointing to the return address of the function, and [esp+8] is the slot in the stack under it, which is the function's argument
Since the function process has one argument and no inner declarations (not sure about the declarations) then the stack frame, from high to low, is 8 bytes for the argument, 8 bytes for the return address.
Thank you.
There's no way to tell other than by means of debugger. You are using ia32 conventions (ebp, esp) instead of x64 (rbp, rsp), but expecting int / addresses to be 64 bit. It's possible, but not likely.
Compile the program (gcc -O -g foo.c), then run with gdb a.out
#include <stdio.h>
int process(char* a) { printf("%p", (void*)a); }
int main()
{
process((char *)0xabcd1234);
}
Break at process; run; disassemble; inspect registers values and dump the stack.
- break process
- run
- disassemble
- info frame
- info args
- info registers
- x/32x $sp - 16 // to dump stack +-16 bytes in both side of stack pointer
Then add more parameters, a second subroutine or local variables with known values. Single step to the printf routine. What does the stack look like there?
You can also use gdb as calculator: what is the difference in between sp and rax ?
It's print $sp - $rax if you ever want to know.
Tickle your compiler to produce assembler output (on Unixy systems usually with the -S flag). Play around with debugging/non-debugging flags, the extra hints for the debugger might help in refering back to the source. Don't give optimization flags, the reorganizing done by the compiler can lead to thorough confusion. Add a simple function calling into your code to see how it is set up and torn down too.

How can I know how the exit() function work?

I wrote a small program to find how the exit() function works in Linux.
#include <unistd.h>
int main()
{
exit(0);
}
And then I compiled the program with gcc.
gcc -o example -g -static example.c
In gdb, when I set a breakpoint, I got these lines.
Dump of assembler code for function exit:
0x080495a0 <+0>: sub $0x1c,%esp
0x080495a3 <+3>: mov 0x20(%esp),%eax
0x080495a7 <+7>: movl $0x1,0x8(%esp)
0x080495af <+15>: movl $0x80d602c,0x4(%esp)
0x080495b7 <+23>: mov %eax,(%esp)
0x080495ba <+26>: call 0x80494b0 <__run_exit_handlers>
End of assembler dump.
(gdb) b 0x080495a3
Function "0x080495a3" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (0x080495a3) pending.
(gdb) run
Starting program: /home/jack/Documents/overflow/example
[Inferior 1 (process 2299) exited normally]
The program does not stop at the breakpoint. Why? I use -static to compile the program, why does the breakpoint pend until the library loads into the memory?
You're asking gdb to break on a function called 0x080495a3. You'll need to use b *0x080495a3 instead.
(gdb) help break
Set breakpoint at specified line or function.
break [LOCATION] [thread THREADNUM] [if CONDITION]
LOCATION may be a line number, function name, or "*" and an address.
As the help says, The * tells gdb it's an address you want to break on.
From your example:
Function "0x080495a3" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (0x080495a3) pending.
The "pending" means that the breakpoint is waiting until a function called 0x080495a3 is loaded from a shared library.
You might also be interested in break-range:
(gdb) help break-range
Set a breakpoint for an address range.
break-range START-LOCATION, END-LOCATION
where START-LOCATION and END-LOCATION can be one of the following:
LINENUM, for that line in the current file,
FILE:LINENUM, for that line in that file,
+OFFSET, for that number of lines after the current line
or the start of the range
FUNCTION, for the first line in that function,
FILE:FUNCTION, to distinguish among like-named static functions.
*ADDRESS, for the instruction at that address.
The breakpoint will stop execution of the inferior whenever it executes
an instruction at any address within the [START-LOCATION, END-LOCATION]
range (including START-LOCATION and END-LOCATION).
It looks like that you're trying to set a breakpoint at a function named 0x080495a3. Instead try b *0x080495a3 to indicate to GDB that you want to break at a specific address.
0x080495a3 is an address of the line on which you are willing to apply break point. But the format for gdb is b (function name or line number). So You have 2 ways to do this.
1) do an l after your gdb session has started. It will list you the code in C. And then apply a break point using the line number else
2) if you want to use the address, use b *0x080495a3 way to set a break point.
This way you will be able to halt at line
0x080495a3 <+3>: mov 0x20(%esp),%eax

Resources