Segmentation fault creating a user-level thread with C and assembly - c

I am trying to understand some OS fundamentals by working through some assignments. I have already posted a similar question and got satisfying answers, but this one is slightly different and I haven't been able to debug it.
What I want to do is start a main program, malloc some space, and use it as a stack to start a user-level thread. My problem is with the return address. Here's the code so far:
[I'm editing my code to make it up-to-date to the current state of my answer ]
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#define STACK_SIZE 512
void switch_thread(int*,int*);
int k = 0;
void simple_function()
{
printf("I am the function! k is: %d\n",k);
exit(0);
}
void create_thread(void (*function)())
{
int* stack = malloc(STACK_SIZE + 32);
stack = (int* )(((long)stack & (-1 << 4)) + 0x10);
stack = (int* ) ((long)stack + STACK_SIZE);
*stack = (long) function;
switch_thread(stack,stack);
}
int main()
{
create_thread(simple_function);
assert(0);
return 0;
}
switch_thread is an assembly code I've written as follows:
.text
.globl switch_thread
switch_thread:
movq %rdi, %rsp
movq %rsi, %rbp
ret
This code runs fine under GDB and gives the expected output (that is, it passes control to simple_function and prints "I am the function! k is: 0"). But when run on its own, it gives a segmentation fault. I'm baffled by this result.
Any help would be appreciated. Thanks in advance.

Two problems with your code:
Unless your thread is actually inside a proper procedure (or a nested procedure), there's no such thing as "base pointer". This makes the value of %rbp irrelevant since the thread is not inside a particular procedure at the point of initialization.
Contrary to what you think, when the ret instruction gets executed, the value that %rsp is referring to becomes the new value of the program counter. This means that instead of *(base_pointer + 1), *(base_pointer) will be consulted when it gets executed. Again, the value of %rbp is irrelevant here.
Your code (with minimal modification to make it run) should look like this:
void switch_thread(void* stack_pointer, void (*entry_point)());
void create_thread(void (*function)())
{
char* stack_pointer = malloc(STACK_SIZE + 8); //char* so that the addition below is counted in bytes, not in ints
stack_pointer += STACK_SIZE; //you'd probably want to back up the original allocated address if you intend to free it later for any reason.
switch_thread(stack_pointer,function);
}
Your switch_thread routine should look like this:
.text
.globl switch_thread
switch_thread:
mov %rsp, %rax # move the original stack pointer to a scratch register
mov %rdi, %rsp # set stack pointer
push %rax # back up the original stack pointer
call *%rsi # call the function (an indirect call needs the * in AT&T syntax)
pop %rsp # restore the original stack pointer
ret # return to create_thread
FYI: If you're initializing a thread on your own, I suggest that you first create a proper trampoline that acts as a thread entry point (e.g. ntdll's RtlUserThreadStart). This will make things much cleaner especially if you want to make your program multithreaded and also pass in any parameters to the start routine.

base_pointer needs to be suitably aligned to store void (*)() values, otherwise you're dealing with undefined behaviour. I think you mean something like this:
void create_thread(void (*function)())
{
void (**base_pointer)();
size_t offset = STACK_SIZE + sizeof function - STACK_SIZE % sizeof function;
char *stack_pointer = malloc(offset + sizeof *base_pointer);
base_pointer = (void *)(stack_pointer + offset);
*base_pointer = function;
switch_thread(stack_pointer,base_pointer);
}
There is no need to cast malloc. It's generally a bad idea to cast pointers to integer types, or function pointers to object pointer types.
I understand that this is all portable-C nit-picky advice, but it really does help to write as much of your software as possible in portable code rather than relying upon undefined behaviour.

Related

How to change the local variable without its reference

Interview question : Change the local variable value without using a reference as a function argument or returning a value from the function
void func()
{
/*do some code to change the value of x*/
}
int main()
{
int x = 100;
printf("%d\n", x); // it will print 100
func(); // not return any value and reference of x also not sent
printf("%d\n", x); // it need to print 200
}
The value of x needs to be changed.
The answer is that you can’t.
The C programming language offers no way of doing this, and attempting to do so invariably causes undefined behaviour. This means that there are no guarantees about what the result will be.
Now, you might be tempted to exploit undefined behaviour to subvert C’s runtime system and change the value. However, whether and how this works entirely depends on the specific executing environment. For example, when compiling the code with a recent version of GCC and clang, and enabling optimisation, the variable x simply ceases to exist in the output code: There is no memory location corresponding to its name, so you can’t even directly modify a raw memory address.
In fact, the above code yields roughly the following assembly output:
main:
subq $8, %rsp
movl $100, %esi
movl $.LC0, %edi
xorl %eax, %eax
call printf
xorl %eax, %eax
call func
movl $100, %esi
movl $.LC0, %edi
xorl %eax, %eax
call printf
xorl %eax, %eax
addq $8, %rsp
ret
As you can see, the value 100 is a literal directly stored in the ESI register before the printf call. Even if your func attempted to modify that register, the modification would then be overwritten by the compiled printf call:
…
movl $200, %esi /* This is the inlined `func` call! */
movl $100, %esi
movl $.LC0, %edi
xorl %eax, %eax
call printf
…
However you dice it, the answer is: There is no x variable in the compiled output, so you cannot modify it, even accepting undefined behaviour. You could modify the output by overriding the printf function call, but that wasn’t the question.
By the design of the C language, and by the definition of a local variable, you cannot access it from outside without making it available in some way.
Some ways to make a local variable accessible to the outside world:
send a copy of it (the value);
send a pointer to it (don't save and use the pointer for too long, since the variable may be removed when its scope ends);
export it with extern if the variable is declared at file level (outside of all functions).
Hack
Only changing code in void func(), create a define.
Akin to @chqrlie.
void func()
{
/*do some code to change the value of x*/
#define func() { x = 200; }
}
int main()
{
int x = 100;
printf("%d\n", x); // it will print 100
func(); // not return any value and reference of x also not sent
printf("%d\n", x); // it need to print 200
}
Output
100
200
The answer is that you can’t, but...
I fully agree with what @virolino and @Konrad Rudolph said, and I don't want my "solution" to this problem to be recognised as best practice, but since this is some sort of challenge, one can come up with this approach.
#include <stdio.h>
static int x;
#define int
void func() {
x = 200;
}
int main() {
int x = 100;
printf("%d\n", x); // it prints 100
func(); // not return any value and reference of x also not sent
printf("%d\n", x); // it prints 200
}
The define will set int to nothing. Thus x will be the global static x and not the local one. This compiles with a warning, since the line int main() { is now only main(){. It only compiles due to the special handling of a function with return type int.
This approach is hacky and fragile, but that interviewer is asking for it. So here's an example for why C and C++ are such fun languages:
// Compiler would likely inline it anyway and that's necessary, because otherwise
// the return address would get pushed onto the stack as well.
inline
void func()
{
// volatile not required here as the compiler is told to work with the
// address (see lines below).
int tmp;
// With the line above we have pushed a new variable onto the stack.
// "volatile int x" from main() was pushed onto it beforehand,
// hence we can take the address of our tmp variable and
// decrement that pointer in order to point to the variable x from main().
*(&tmp - 1) = 200;
}
int main()
{
// Make sure that the variable doesn't get stored in a register by using volatile.
volatile int x = 100;
// It prints 100.
printf("%d\n", x);
func();
// It prints 200.
printf("%d\n", x);
return 0;
}
Boring answer: I would use a straightforward, global pointer variable:
int *global_x_pointer;
void func()
{
*global_x_pointer = 200;
}
int main()
{
int x = 100;
global_x_pointer = &x;
printf("%d\n", x);
func();
printf("%d\n", x);
}
I'm not sure what "sending reference" means. If setting a global pointer counts as sending a reference, then this answer obviously violates the stated problem's curious stipulations and isn't valid.
(On the subject of "curious stipulations", I've sometimes wished SO had another tag, something like driving-screws-with-a-hammer, because that's what these "brain teasers" always make me think of. Perfectly obvious question, perfectly obvious answer, but no, gotcha, you can't use that answer, you're stuck on a desert island and your C compiler's for statement got broken in the shipwreck, so you're supposed to be MacGyver and use a coconut shell and a booger instead. Occasionally these questions can demonstrate good lateral thinking skills and are interesting, but most of the time, they're just dumb.)

Is it possible to wrap shellcode in a C function such that control is returned to the caller after completion?

Suppose I have some arbitrary x86 instructions that I want to have executed in the context of some program, and I convert these instructions automatically or manually into shellcode. For example, the following instructions.
movq $1, %rax
cpuid
There are various questions, such as here and here, about casting shellcode to a function pointer and executing it by using a standard function invocation. However, arbitrary asm will generally not have the instructions to return to the caller after all the instructions have been completed.
I am interested in writing an "interpreter" of sorts for arbitrary shellcode, so that it can execute a bunch of instructions (perhaps they are in a file somewhere), read out the value of certain registers, and return control to the main C program. I assume the shellcode does not do something like exec and change the process, but merely runs instructions like rdpmc or cpuid.
I imagine something that looks like this, but I am not sure how I can patch the shellcode so that it returns control to the right place.
void executeAndReadRegisters(char* shellcode, int length, uint64_t* rax, uint64_t* rbx, uint64_t* rcx) {
// Modify the shellcode in some way so that it returns control to the
// current program's code after execution, right after "read out registers".
char* modifiedShellCode = malloc((length + EXTRA_NEEDED) * sizeof(char));
// How do I modify the shellcode to return to "Read out registers?"
int (*func)();
func = (int (*)()) modifiedShellCode;
(int)(*func)();
// Read out registers
asm("\t movq %%rax,%0" : "=r"(*rax));
asm("\t movq %%rbx,%0" : "=r"(*rbx));
asm("\t movq %%rcx,%0" : "=r"(*rcx));
}
int main(int argc, char **argv)
{
// Suppose this comes from a file somewhere
char shellcode[] = "...";
int length = ; // Get from external source
uint64_t rax,rbx,rcx;
executeAndReadRegisters(shellcode, length, &rax,&rbx, &rcx);
printf("%lu %lu %lu\n", rax,rbx,rcx);
}

How can I print the contents of stack in C program?

I want to, as the title says, print the contents of the stack in my C program.
Here are the steps I took:
I made a simple assembly (helper.s) file that included a function to return the address of my ebp register and a function to return the address of my esp register
.globl get_esp
get_esp:
movl %esp, %eax
ret
# get_ebp is defined similarly, and included in the .globl section
I called the get_esp () and get_ebp () functions from my C program ( fpC = get_esp (); where fpC is an int)
I (successfully, I think) printed the address of my esp and ebp registers ( fprintf (stderr, "%x", fpC); )
I tried, and failed, to print out the contents of my esp register. (I tried fprintf (sderr, "%d", *fpC); and fprintf (sderr, "%x", *((int *)fpC));, among other methods). My program hits a segmentation fault at runtime when this line is processed.
What am I doing wrong?
EDIT: This must be accomplished by calling these assembly functions to get the stack pointers.
EDIT2: This is a homework assignment.
If you're using a GNU system, you may be able to use GNU's extension to the C library for dealing with backtraces, see here.
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
void someReallyDeepFunction(void);
int main(void)
{
//call-a-lot-of-functions
}
void someReallyDeepFunction(void)
{
int count;
void *stack[50]; // can hold 50, adjust appropriately
char **symbols;
count = backtrace(stack, 50);
symbols = backtrace_symbols(stack, count);
for (int i = 0; i < count; i++)
puts(symbols[i]);
free(symbols);
}
get_esp returns esp as it is within the function. But this isn't the same as esp in the calling function, because the call operation changes esp.
I recommend replacing the function with a piece of inline assembly. This way esp won't change as you try to read it.
Also, printing to sderr wouldn't help. From my experience, stderr works much better.

Buffer overflow in C

I'm attempting to write a simple buffer overflow using C on Mac OS X 10.6 64-bit. Here's the concept:
void function() {
char buffer[64];
buffer[offset] += 7; // i'm not sure how large offset needs to be, or if
// 7 is correct.
}
int main() {
int x = 0;
function();
x += 1;
printf("%d\n", x); // the idea is to modify the return address so that
// the x += 1 expression is not executed and 0 gets
// printed
return 0;
}
Here's part of main's assembler dump:
...
0x0000000100000ebe <main+30>: callq 0x100000e30 <function>
0x0000000100000ec3 <main+35>: movl $0x1,-0x8(%rbp)
0x0000000100000eca <main+42>: mov -0x8(%rbp),%esi
0x0000000100000ecd <main+45>: xor %al,%al
0x0000000100000ecf <main+47>: lea 0x56(%rip),%rdi # 0x100000f2c
0x0000000100000ed6 <main+54>: callq 0x100000ef4 <dyld_stub_printf>
...
I want to jump over the movl instruction, which would mean I'd need to increment the return address by 42 - 35 = 7 (correct?). Now I need to know where the return address is stored so I can calculate the correct offset.
I have tried searching for the correct value manually, but either 1 gets printed or I get abort trap – is there maybe some kind of buffer overflow protection going on?
Using an offset of 88 works on my machine. I used Nemo's approach of finding out the return address.
This 32-bit example illustrates how you can figure it out, see below for 64-bit:
#include <stdio.h>
void function() {
char buffer[64];
char *p;
asm("lea 4(%%ebp),%0" : "=r" (p)); // loads address of return address
printf("%d\n", (int)(p - buffer)); // computes offset
buffer[p - buffer] += 9; // 9 from disassembling main
}
int main() {
volatile int x = 7;
function();
x++;
printf("x = %d\n", x); // prints 7, not 8
}
On my system the offset is 76. That's the 64 bytes of the buffer (remember, the stack grows down, so the start of the buffer is far from the return address) plus whatever other detritus is in between.
Obviously if you are attacking an existing program you can't expect it to compute the answer for you, but I think this illustrates the principle.
(Also, we are lucky that +9 does not carry out into another byte. Otherwise the single byte increment would not set the return address how we expected. This example may break if you get unlucky with the return address within main)
I overlooked the 64-bitness of the original question somehow. The equivalent for x86-64 is 8(%rbp) because pointers are 8 bytes long. In that case my test build happens to produce an offset of 104. In the code above substitute 8(%%rbp) using the double %% to get a single % in the output assembly. This is described in this ABI document. Search for 8(%rbp).
There is a complaint in the comments that 4(%ebp) is just as magic as 76 or any other arbitrary number. In fact the meaning of the register %ebp (also called the "frame pointer") and its relationship to the location of the return address on the stack is standardized. One illustration I quickly Googled is here. That article uses the terminology "base pointer". If you wanted to exploit buffer overflows on other architectures it would require similarly detailed knowledge of the calling conventions of that CPU.
Roddy is right that you need to operate on pointer-sized values.
I would start by reading values in your exploit function (and printing them) rather than writing them. As you crawl past the end of your array, you should start to see values from the stack. Before long you should find the return address and be able to line it up with your disassembler dump.
Disassemble function() and see what it looks like.
Offset needs to be positive, maybe 64+8, as it's a 64-bit address. Also, you should do the '+7' on a pointer-sized object, not on a char. Otherwise, if the two addresses cross a 256-byte boundary, you will have exploited your exploit...
You might try running your code in a debugger, stepping each assembly line at a time, and examining the stack's memory space as well as registers.
I always like to operate on nice data types, like this one:
struct stackframe {
char *sf_bp;
char *sf_return_address;
};
void function() {
/* the following code is dirty. */
char *dummy;
dummy = (char *)&dummy;
struct stackframe *stackframe = (struct stackframe *)(dummy + 24); /* try multiples of 4 here. */
/* here starts the beautiful code. */
stackframe->sf_return_address += 7;
}
Using this code, you can easily check with the debugger whether the value in stackframe->sf_return_address matches your expectations.

How to skip a line doing a buffer overflow in C

I want to skip a line in C, the line x=1; in main, using a buffer overflow; however, I don't know why I cannot skip from address 4002f4 to the next address 4002fb, even though I am counting 7 bytes from <main+35> to <main+42>.
I have also configured the randomization and execstack options in a Debian and AMD environment, but I am still getting x=1;. What is wrong with this procedure?
I have used gdb to debug the stack and the memory addresses:
0x00000000004002ef <main+30>: callq 0x4002a4 <function>
0x00000000004002f4 <main+35>: movl $0x1,-0x4(%rbp)
0x00000000004002fb <main+42>: mov -0x4(%rbp),%esi
0x00000000004002fe <main+45>: mov $0x4629c4,%edi
void function(int a, int b, int c)
{
char buffer[5];
int *ret;
ret = buffer + 12;
(*ret) += 8;
}
int main()
{
int x = 0;
function(1, 2, 3);
x = 1;
printf("x = %i \n", x);
return 0;
}
You must be reading the "Smashing the Stack for Fun and Profit" article. I was reading the same article and ran into the same problem: it wasn't skipping that instruction. After a debugging session of a few hours in IDA, I changed the code as shown below, and it prints x=0 and b=5.
#include <stdio.h>
void function(int a, int b) {
int c=0;
int* pointer;
pointer =&c+2;
(*pointer)+=8;
}
void main() {
int x =0;
function(1,2);
x = 3;
int b =5;
printf("x=%d\n, b=%d\n",x,b);
getch();
}
In order to alter the return address within function() to skip over the x = 1 in main(), you need two pieces of information.
1. The location of the return address in the stack frame.
I used gdb to determine this value. I set a breakpoint at function() (break function), execute the code up to the breakpoint (run), retrieve the location in memory of the current stack frame (p $rbp or info reg), and then retrieve the location in memory of buffer (p &buffer). Using the retrieved values, the location of the return address can be determined.
(compiled w/ GCC -g flag to include debug symbols and executed in a 64-bit environment)
(gdb) break function
...
(gdb) run
...
(gdb) p $rbp
$1 = (void *) 0x7fffffffe270
(gdb) p &buffer
$2 = (char (*)[5]) 0x7fffffffe260
(gdb) quit
(frame pointer address + size of word) - buffer address = number of bytes from local buffer variable to return address
(0x7fffffffe270 + 8) - 0x7fffffffe260 = 24
If you are having difficulties understanding how the call stack works, reading the call stack and function prologue Wikipedia articles may help. This shows the difficulty in making "buffer overflow" examples in C. The offset of 24 from buffer assumes a certain padding style and compile options. GCC will happily insert stack canaries nowadays unless you tell it not to.
2. The number of bytes to add to the return address to skip over x = 1.
In your case the saved instruction pointer will point to 0x00000000004002f4 (<main+35>), the first instruction after function returns. To skip the assignment you need to make the saved instruction pointer point to 0x00000000004002fb (<main+42>).
Your calculation that this is 7 bytes is correct (0x4002fb - 0x4002f4 = 7).
I used gdb to disassemble the application (disas main) and verified the calculation for my case as well. This value is best resolved manually by inspecting the disassembly.
Note that I used an Ubuntu 10.10 64-bit environment to test the following code.
#include <stdio.h>
void function(int a, int b, int c)
{
char buffer[5];
int *ret;
ret = (int *)(buffer + 24);
(*ret) += 7;
}
int main()
{
int x = 0;
function(1, 2, 3);
x = 1;
printf("x = %i \n", x);
return 0;
}
output
x = 0
This is really just altering the return address of function() rather than an actual buffer overflow. In an actual buffer overflow, you would be overflowing buffer[5] to overwrite the return address. However, most modern implementations use techniques such as stack canaries to protect against this.
What you're doing here doesn't seem to have much to do with a classic buffer overflow attack. The whole idea of a buffer overflow attack is to modify the return address of 'function'. Disassembling your program will show you where the ret instruction (assuming x86) takes its address from. This is what you need to modify to point at main+42.
I assume you want to explicitly provoke the buffer overflow here; normally you'd need to provoke it by manipulating the inputs of 'function'.
By just declaring a buffer[5] you're moving the stack pointer in the wrong direction (verify this by looking at the generated assembly); the return address is somewhere deeper inside the stack (it was put there by the call instruction). On x86, stacks grow downwards, that is, towards lower addresses.
I'd approach this by declaring an int* and moving it upward until I'm at the address where the return address has been pushed, then modify that value to point at main+42 and let function return.
You can't do that this way.
Here's a classic buffer overflow code sample. See what happens when you feed it 5 and then 6 characters from your keyboard. If you enter more (16 chars should do), you'll overwrite the base pointer, then the function's return address, and you'll get a segmentation fault. What you want to do is figure out which 4 chars overwrite the return address and make the program execute your code. Search for material on the Linux stack and memory layout.
#include <stdio.h>
void ff(){
int a=0; char b[5];
scanf("%s",b); /* deliberately unbounded: this is the overflow */
printf("b:%p a:%p\n" ,(void*)b ,(void*)&a);
printf("b:'%s' a:%d\n" ,b ,a);
}
int main() {
ff();
return 0;
}
