I know and understand the purpose of volatile variables and optimisation in general (well, I think I do!). This question relates specifically to what happens if a variable is accessed outside the module it is declared in.
In the following scenario, if funcThatWaits was called inside bar.c, it could be optimised and not fetch the value of sTheVar each loop iteration.
However, when GetTheVar is called externally could the same optimisation apply or does the function call ensure sTheVar will always be read each loop iteration?
I am not suggesting this is good code or practice, but an example for the sake of the question.
bar.h
int GetTheVar(void);
bar.c
static /*volatile*/ int sTheVar;
int GetTheVar(void)
{
return sTheVar;
}
static void someISROrFuncCalledFromAnotherThread(void)
{
sTheVar = 1;
}
foo.c
#include "bar.h"
void funcThatWaits(void)
{
while(GetTheVar() != 1) {}
}
when GetTheVar is called externally could the same optimisation apply or does the function call ensure sTheVar will always be read each loop iteration?
The same optimization may apply. For instance, if you are using LTO (Link-Time Optimization), then the compiler knows everything about GetTheVar and will likely decide funcThatWaits is an infinite loop (which, by the way, would be UB).
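One way to rule the optimisation out regardless of LTO is to restore the commented-out volatile qualifier. The following is a minimal sketch under assumptions, not the asker's actual code: SetTheVar and the maxPolls bound are hypothetical additions so that the loop can be exercised without a real ISR. Note that volatile only forces the read to happen; for a variable written by another thread, C11 atomics would probably be the more appropriate tool.

```c
#include <assert.h>

/* volatile: every read of sTheVar written in the source must actually be
   performed, even if the compiler can see the whole program (e.g. LTO). */
static volatile int sTheVar;

int GetTheVar(void)
{
    return sTheVar;
}

/* Hypothetical stand-in for the ISR in the question. */
void SetTheVar(int value)
{
    sTheVar = value;
}

/* Polls until the variable becomes 1 or maxPolls reads have been made.
   Returns the number of reads performed. */
unsigned funcThatWaits(unsigned maxPolls)
{
    unsigned polls = 0;

    assert(maxPolls > 0);
    while (polls < maxPolls) {
        polls++;
        if (GetTheVar() == 1)
            break;
    }
    return polls;
}
```

The bounded loop is only there to keep the sketch testable; the original unbounded while loop behaves the same way with respect to the volatile read.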
Function calls are not going to be optimized away since, for all the caller knows, the function being called could depend on some exogenous state.
I compiled the following three files using gcc:
foo.c
#include "bar.h"
void funcThatWaits(void) {
while ( getTheVar() != 1 );
}
bar.c
#include "foo.h"
static int theVar;
int getTheVar(void) {
return theVar;
}
void theFunc(void) {
funcThatWaits();
}
test.c
#include "bar.h"
int main() {
theFunc();
return 0;
}
Compiling those three into a.out and running objdump -d a.out, the following comes out:
00000000000005fa <main>:
5fa: 55 push %rbp
5fb: 48 89 e5 mov %rsp,%rbp
5fe: e8 25 00 00 00 callq 628 <theFunc>
603: b8 00 00 00 00 mov $0x0,%eax
608: 5d pop %rbp
609: c3 retq
000000000000060a <funcThatWaits>:
60a: 55 push %rbp
60b: 48 89 e5 mov %rsp,%rbp
60e: 90 nop
60f: e8 08 00 00 00 callq 61c <getTheVar>
614: 83 f8 01 cmp $0x1,%eax
617: 75 f6 jne 60f <funcThatWaits+0x5>
619: 90 nop
61a: 5d pop %rbp
61b: c3 retq
000000000000061c <getTheVar>:
61c: 55 push %rbp
61d: 48 89 e5 mov %rsp,%rbp
620: 8b 05 ee 09 20 00 mov 0x2009ee(%rip),%eax # 201014 <theVar>
626: 5d pop %rbp
627: c3 retq
0000000000000628 <theFunc>:
628: 55 push %rbp
629: 48 89 e5 mov %rsp,%rbp
62c: e8 d9 ff ff ff callq 60a <funcThatWaits>
631: 90 nop
632: 5d pop %rbp
633: c3 retq
634: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
63b: 00 00 00
63e: 66 90 xchg %ax,%ax
I wrote the following 2 programs using C:
First:
int foo(int x)
{
return 1;
}
int main()
{
return foo(4);
}
Second:
static int foo(int x)
{
return 1;
}
int main()
{
return foo(4);
}
Then I ran:
gcc -c my_file.c
For the first file I saw (Not full output):
000000000000000e <main>:
e: 55 push %rbp
f: 48 89 e5 mov %rsp,%rbp
12: bf 04 00 00 00 mov $0x4,%edi
17: e8 00 00 00 00 callq 1c <main+0xe>
1c: 5d pop %rbp
1d: c3 retq
And for the second:
000000000000000e <main>:
e: 55 push %rbp
f: 48 89 e5 mov %rsp,%rbp
12: bf 04 00 00 00 mov $0x4,%edi
17: e8 e4 ff ff ff callq 0 <foo>
1c: 5d pop %rbp
1d: c3 retq
My question is: why did the first file need a relocation when the function is defined (and not only declared) in the current file? This seems very strange to me.
You are looking at unresolved code.
The first version makes foo() global, so there are entries in the symbol and relocation tables that the listing does not show. <Edit>Most probably this is simply how the compiler works: when it emits a call to a global function, it puts zeroes (or anything else) in the address field, regardless of whether that function is defined in the same translation unit. Other options, other versions of the compiler, or other compilers might yield a different result.</Edit>
In the second version the compiler knows that foo() is local and resolves the call immediately, without needing to generate relocation entries.
The calls will be resolved to equal values if you link the program.
<Edit>Interesting: I tried to reproduce this with GCC 8.1.0 (MinGW-W64) on Windows, and both calls are resolved by the compiler. However, with GCC 11.1.0 of the current Manjaro Linux, it shows the described behaviour.</Edit>
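To make "resolved to equal values" concrete, here is a sketch of the arithmetic a linker performs for a PC-relative call relocation on x86-64. It assumes the relocation is of the S + A - P kind (as R_X86_64_PC32 is) with an addend of -4; neither detail is visible in the listings above, so treat both as assumptions. The function name apply_pc32 is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* S = address of the symbol being called (foo),
   A = addend stored in the relocation entry (assumed to be -4),
   P = address of the 4-byte displacement field being patched. */
int32_t apply_pc32(uint64_t S, int64_t A, uint64_t P)
{
    int64_t value = (int64_t)S + A - (int64_t)P;

    /* The result has to fit in the 32-bit displacement field. */
    assert(value >= INT32_MIN && value <= INT32_MAX);
    return (int32_t)value;
}
```

With foo at offset 0 and the call opcode at 0x17 (so the field starts at 0x18), apply_pc32(0x0, -4, 0x18) yields -0x1c, which is the byte sequence e4 ff ff ff that the second listing already contains.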
I wonder how to switch the call to a function for another inside an executable (.exe in my case)
Here is the code I try to play with
#include <stdio.h>
void hello()
{
printf("Hello world!");
}
void investigate()
{
printf("Investigate all the things!");
}
main()
{
hello();
}
Once I compiled the above code (with gcc) and got an executable (.exe) out of it, I want to switch the "hello" call with "investigate".
--Edit--
My environment: Windows 10 (64bit), mingw with gcc/g++ 4.8.1
--Edit 2--
I'm fine with a Linux answer too (any Ubuntu or openSUSE, any architecture), as the important thing for me is to have a proof of concept.
Assuming the compiler is not omitting dead functions entirely, is not inlining the function, and the call does not go through the PLT, you can simply edit the call instruction once the executable has been compiled.
Note that the two functions must be "compatible". The notion of compatibility is fuzzy; it means that the new function must satisfy at least the same assumptions the compiler made when calling the old one.
The ABI is of course one such assumption but it may not be the only one.
If your compiler omitted the dead function, you can't switch the function (one is missing).
If your compiler inlined the call, you can't switch the function (there is no call). You can work against the compiler and rewrite the code at the call site (in the C source); this is called patching.
If your compiler used the PLT, you need to change the GOT entry used by the PLT stub. You may need to read up on this a bit, but changing the linked procedure is actually a feature of the PLT machinery.
If your compiler did none of that, which should be the case for such a simple source when no optimisations are enabled, you can use objdump -d <file> to find the call site and the address of the new function:
000000000040051d <hello>:
40051d: 55 push %rbp
40051e: 48 89 e5 mov %rsp,%rbp
400521: bf f0 05 40 00 mov $0x4005f0,%edi
400526: b8 00 00 00 00 mov $0x0,%eax
40052b: e8 d0 fe ff ff callq 400400 <printf@plt>
400530: 5d pop %rbp
400531: c3 retq
0000000000400532 <investigate>:
400532: 55 push %rbp
400533: 48 89 e5 mov %rsp,%rbp
400536: bf fd 05 40 00 mov $0x4005fd,%edi
40053b: b8 00 00 00 00 mov $0x0,%eax
400540: e8 bb fe ff ff callq 400400 <printf@plt>
400545: 5d pop %rbp
400546: c3 retq
0000000000400547 <main>:
400547: 55 push %rbp
400548: 48 89 e5 mov %rsp,%rbp
40054b: b8 00 00 00 00 mov $0x0,%eax
400550: e8 c8 ff ff ff callq 40051d <hello>
400555: b8 00 00 00 00 mov $0x0,%eax
40055a: 5d pop %rbp
40055b: c3 retq
40055c: 0f 1f 40 00 nopl 0x0(%rax)
Then change the immediate value of the call instruction with the difference between the target address and the address after the end of the call instruction (it doesn't matter where the origin is as long as it's the same for both addresses).
Target = 0x400532
After the end of call = 0x400555
Difference = 0x400532 - 0x400555 = -0x23 = 0xFFFFFFDD
Change from:
400550: e8 c8 ff ff ff
to:
400550: e8 dd ff ff ff
Note that immediates are little-endian.
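The displacement arithmetic above can be sketched in C as follows. This is only an illustration of the calculation: encode_call_rel32 is a hypothetical helper, and it assumes the 5-byte near call with opcode E8 seen in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodes "call rel32" (opcode 0xE8) located at call_addr so that it
   transfers control to target. out receives the 5 instruction bytes. */
void encode_call_rel32(uint64_t call_addr, uint64_t target, uint8_t out[5])
{
    /* The displacement is relative to the end of the 5-byte instruction. */
    int64_t diff = (int64_t)target - (int64_t)(call_addr + 5);
    uint32_t rel;
    size_t i;

    assert(diff >= INT32_MIN && diff <= INT32_MAX);
    rel = (uint32_t)(int32_t)diff;

    out[0] = 0xE8;
    for (i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(rel >> (8 * i));   /* little-endian order */
}
```

For the call at 0x400550, a target of 0x400532 (investigate) gives e8 dd ff ff ff, and a target of 0x40051d (hello) gives the original e8 c8 ff ff ff.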
You can use a hex editor to edit the code. To find the offset into the file, you can either use an ELF reader and do a bit of math yourself, or simply search for the bytes of the call instruction (also check the bytes around the call to be sure).
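Searching for the instruction bytes can be sketched as below. This is a plain linear search over a buffer that is assumed to already hold the file contents; find_bytes is a hypothetical helper, and reading the file into memory is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of needle in hay,
   or -1 if it does not occur. */
long find_bytes(const unsigned char *hay, size_t hay_len,
                const unsigned char *needle, size_t needle_len)
{
    size_t i;

    assert(hay != NULL && needle != NULL && needle_len > 0);
    if (needle_len > hay_len)
        return -1;
    for (i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return (long)i;
    }
    return -1;
}
```

Including a few neighbouring bytes in the needle (for example the mov $0x0,%eax that follows the call) makes a false match less likely, since short patterns such as e8 c8 ff ff ff may in principle occur more than once.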
After the edit, the binary has been patched:
0000000000400532 <investigate>:
400532: 55 push %rbp
400533: 48 89 e5 mov %rsp,%rbp
400536: bf fd 05 40 00 mov $0x4005fd,%edi
40053b: b8 00 00 00 00 mov $0x0,%eax
400540: e8 bb fe ff ff callq 400400 <printf@plt>
400545: 5d pop %rbp
400546: c3 retq
0000000000400547 <main>:
400547: 55 push %rbp
400548: 48 89 e5 mov %rsp,%rbp
40054b: b8 00 00 00 00 mov $0x0,%eax
400550: e8 dd ff ff ff callq 400532 <investigate>
400555: b8 00 00 00 00 mov $0x0,%eax
40055a: 5d pop %rbp
40055b: c3 retq
I am having trouble replicating the stack buffer overflow example given by OWASP here.
Here is my attempt:
$ cat test.c
#include <stdio.h>
#include <string.h>
void doit(void)
{
char buf[8];
gets(buf);
printf("%s\n", buf);
}
int main(void)
{
printf("So... The End...\n");
doit();
printf("or... maybe not?\n");
return 0;
}
$ gcc test.c -o test -fno-stack-protector -ggdb
$ objdump -d test # omitted irrelevant parts i think
000000000040054c <doit>:
40054c: 55 push %rbp
40054d: 48 89 e5 mov %rsp,%rbp
400550: 48 83 ec 10 sub $0x10,%rsp
400554: 48 8d 45 f0 lea -0x10(%rbp),%rax
400558: 48 89 c7 mov %rax,%rdi
40055b: e8 d0 fe ff ff callq 400430 <gets@plt>
400560: 48 8d 45 f0 lea -0x10(%rbp),%rax
400564: 48 89 c7 mov %rax,%rdi
400567: e8 a4 fe ff ff callq 400410 <puts@plt>
40056c: c9 leaveq
40056d: c3 retq
000000000040056e <main>:
40056e: 55 push %rbp
40056f: 48 89 e5 mov %rsp,%rbp
400572: bf 4c 06 40 00 mov $0x40064c,%edi
400577: e8 94 fe ff ff callq 400410 <puts@plt>
40057c: e8 cb ff ff ff callq 40054c <doit>
400581: bf 5d 06 40 00 mov $0x40065d,%edi
400586: e8 85 fe ff ff callq 400410 <puts@plt>
40058b: b8 00 00 00 00 mov $0x0,%eax
400590: 5d pop %rbp
400591: c3 retq # this is where i took my overflow value from
400592: 90 nop
400593: 90 nop
400594: 90 nop
400595: 90 nop
400596: 90 nop
400597: 90 nop
400598: 90 nop
400599: 90 nop
40059a: 90 nop
40059b: 90 nop
40059c: 90 nop
40059d: 90 nop
40059e: 90 nop
40059f: 90 nop
$ perl -e 'print "A"x12 ."\x91\x05\x40"' | ./test
So... The End...
AAAAAAAAAAAA▒@
or... maybe not? # this shouldn't be outputted
Why isn't this working? I'm assuming that the memory address that I am supposed to insert is the retq from <main>.
My goal is to figure out how to do a stack buffer overflow that calls a function elsewhere in the program. Any help is much appreciated. :)
I'm using Windows & MSVC but you should get the idea.
Consider the following code:
#include <stdio.h>
void someFunc()
{
puts("wow, we should never get here :|");
}
// MSVC inlines this otherwise
void __declspec(noinline) doit(void)
{
char buf[8];
gets(buf);
printf("%s\n", buf);
}
int main(void)
{
printf("So... The End...\n");
doit();
printf("or... maybe not?\n");
return 0;
}
(Note: I had to compile it with /OPT:NOREF to force MSVC not to remove "unused" code and /GS- to turn off stack checks)
Now, let's open it in my favorite disassembler:
We'd like to exploit the gets vulnerability so the execution jumps to someFunc. We can see that its address is 001D1000, so if we can write enough bytes past the buffer to overwrite the return address, we'll be good. Let's take a look at the stack when gets is called:
As we can see, there's 8 bytes of our stack allocated buffer (buf), 4 bytes of some stuff (actually the PUSHed EBP), and the return address. Thus, we need to write 12 bytes of whatever and then our 4 byte return address (001D1000) to "hijack" the execution flow. Let's do just that - we'll prepare an input file with the bytes we need using a hex editor:
And indeed, when we run the program with that input, we get this:
After it prints that line, it will crash with an access violation since there was some garbage on the stack. However, there's nothing stopping you from carefully analyzing the code and preparing such bytes in your input that the program will appear to function as normal (we could overwrite the next bytes with the address of ExitProcess, so that someFunc would jump there).
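The input layout described above (12 filler bytes followed by the 4-byte return address) can be sketched as a small helper. This is an illustration only: build_input is a hypothetical name, the filler character is arbitrary, and the address 001D1000 comes from this particular build, so it will differ on other builds or runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes filler_len filler bytes followed by ret_addr in little-endian
   order into out (which must hold at least filler_len + 4 bytes).
   Returns the total number of bytes written. */
size_t build_input(uint8_t *out, size_t filler_len, uint32_t ret_addr)
{
    size_t i;

    assert(out != NULL);
    memset(out, 'A', filler_len);            /* covers buf and the saved EBP */
    for (i = 0; i < 4; i++)
        out[filler_len + i] = (uint8_t)(ret_addr >> (8 * i));
    return filler_len + 4;
}
```

Calling build_input(p, 12, 0x001D1000) produces twelve 'A' bytes followed by 00 10 1D 00, which is the byte order the x86 stack expects for that address.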
#include <stdio.h>
void DispString(const char* charList)
{
puts(charList);
}
void main()
{
DispString("Hello, world!");
}
compile: gcc -c -g test.c -o test.o
link: gcc -o test test.o
Very simple, but when I use objdump to disassemble the object file(test.o), I got the following result:
objdump -d test.o:
boot.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <DispString>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 10 sub $0x10,%rsp
8: 48 89 7d f8 mov %rdi,-0x8(%rbp)
c: 48 8b 45 f8 mov -0x8(%rbp),%rax
10: 48 89 c7 mov %rax,%rdi
13: e8 00 00 00 00 callq 18 <DispString+0x18>
18: c9 leaveq
19: c3 retq
000000000000001a <main>:
1a: 55 push %rbp
1b: 48 89 e5 mov %rsp,%rbp
1e: bf 00 00 00 00 mov $0x0,%edi
23: e8 00 00 00 00 callq 28 <main+0xe>
28: 5d pop %rbp
29: c3 retq
At offset 1e, 0 is moved into the %edi register, which looks wrong: it should be the address of the "Hello, world!" string. And the call at offset 23 targets 28 <main+0xe>? Offset 28 is just the next instruction, rather than the function DispString (which is at offset 0). Why does this happen? I've also looked at the final test executable, in which all the values are correct. So how does the linker know where to find those functions and strings?
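You are only compiling the file, so no linking has been done yet. Only once linking has been done will DispString()'s address be known to main, and the call will jump there. Until then the address fields hold zero placeholders, and the object file carries relocation entries (objdump -dr test.o shows them) that tell the linker what to fill in. So, as suggested in one of the comments, use objdump on the linked executable.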
I have this test.c on my Ubuntu14.04 x86_64 system.
void foo(int a, long b, int c) {
}
int main() {
foo(0x1, 0x2, 0x3);
}
I compiled this with gcc --no-stack-protector -g test.c -o test and got the assembly code with objdump -dS test -j .text
00000000004004ed <_Z3fooili>:
void foo(int a, long b, int c) {
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: 89 7d fc mov %edi,-0x4(%rbp)
4004f4: 48 89 75 f0 mov %rsi,-0x10(%rbp)
4004f8: 89 55 f8 mov %edx,-0x8(%rbp) // !!Attention here!!
}
4004fb: 5d pop %rbp
4004fc: c3 retq
00000000004004fd <main>:
int main() {
4004fd: 55 push %rbp
4004fe: 48 89 e5 mov %rsp,%rbp
foo(0x1, 0x2, 0x3);
400501: ba 03 00 00 00 mov $0x3,%edx
400506: be 02 00 00 00 mov $0x2,%esi
40050b: bf 01 00 00 00 mov $0x1,%edi
400510: e8 d8 ff ff ff callq 4004ed <_Z3fooili>
}
400515: b8 00 00 00 00 mov $0x0,%eax
40051a: 5d pop %rbp
40051b: c3 retq
40051c: 0f 1f 40 00 nopl 0x0(%rax)
I know that the function parameters should be pushed to stack from right to left in sequence. So I was expecting this
void foo(int a, long b, int c) {
push %rbp
mov %rsp,%rbp
mov %edi,-0x4(%rbp)
mov %rsi,-0x10(%rbp)
mov %edx,-0x14(%rbp) // c should be push on stack after b, not after a
But gcc seemed clever enough to push parameter c(0x3) right after a(0x1) to save the four bytes which should be reserved for data alignment of b(0x2). Can someone please explain this and show me some documentation on why gcc did this?
The parameters are passed in registers: edi, esi, edx (then rcx, r8, r9, and only after that on the stack), which is just what the Linux amd64 calling convention mandates.
What you see in your function is just how the compiler saves them upon entry when compiling with -O0, so they're in memory where a debugger can modify them. It is free to do it in any way it wants, and it cleverly does this space optimization.
The only reason it does this is that gcc -O0 always spills/reloads all C variables between C statements to support modifying variables and jumping between lines in a function with a debugger.
In an optimized (release) build, all of this would be optimized out.
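As a rough sketch of the register order mentioned above, the lookup below maps the position of an integer or pointer argument to the 64-bit register that carries it. It is a simplification under stated assumptions: int_arg_register is a hypothetical helper, and floating-point or large struct arguments, which follow different rules, are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the 64-bit register that carries the index-th (0-based)
   integer or pointer argument under the System V AMD64 convention,
   or NULL when the argument is passed on the stack instead. */
const char *int_arg_register(unsigned index)
{
    static const char *const regs[] = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"
    };
    const size_t count = sizeof regs / sizeof regs[0];

    assert(count == 6);
    return index < count ? regs[index] : NULL;
}
```

For foo(int a, long b, int c) this gives rdi, rsi and rdx, which matches the listing: a arrives in edi (the low half of rdi), b in rsi and c in edx.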