I'm trying to implement custom threads in C by writing a simple context-switch function plus an FCFS scheduler.
The first step I want to perform is to copy the current function's entire stack frame to a saved frame on the heap and replace it with the frame of the first task in the queue.
The problem is that after task one finishes, the stack of the second task gets corrupted, and I have no idea why.
This is the code I have:
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#define ITERATIONS 10
#define SSIZE 15
int * last;
void kwrite(const char*);
void kyield(int *);
void f1() {
int i = ITERATIONS;
while (i--) kwrite("A\n");
}
void f2() {
int i = ITERATIONS*2;
while (i--) {
printf("[%d]", i);
kwrite("B\n");
getchar();
}
}
void kwrite(const char* str) {
int a[10] = {5, 5, 5, 5, 5, 5, 5, 5, 5, 5};
write(1, str, strlen(str));
int *frame = malloc(sizeof(int)*SSIZE);
memcpy(frame, a, SSIZE*sizeof(int));
kyield(frame);
printf("ERROR\n");
}
void kyield(int * from) {
if (from == NULL) {
f1();
from = malloc(sizeof(int)*SSIZE);
memcpy(from, last, SSIZE*sizeof(int));
}
if (last == NULL) {
last = malloc(sizeof(int)*SSIZE);
memcpy(last, from, SSIZE*sizeof(int));
free(from);
f2();
exit(0);
}
int a[10] = {3, 3, 3, 3, 3, 3, 3, 3, 3, 3};
memcpy(a, last, SSIZE*sizeof(int));
memcpy(last, from, SSIZE*sizeof(int));
free(from);
}
int main(int argc, char** argv) {
kyield(NULL);
free(last);
}
It should call f1 10 times and f2 20 times, then exit. But when the variable i in f2 reaches 8, it gets corrupted on the next iteration, so the program enters an infinite loop.
Any help or suggestions would be appreciated! Have a nice day.
[edit]
I suppose the code can be a bit tricky to understand, so here is a little clarification:
main calls kyield with a NULL argument.
kyield detects this and calls f1.
f1 executes until kwrite is called.
kwrite calls kyield, passing it a copy of its current stack frame.
kyield detects that last is NULL, so it copies the stack frame (sf from now on) given by kwrite, then calls f2.
f2 does the same as f1.
The next time kyield is executed, neither from nor last will be NULL, so it overwrites its current sf with the one in last, stores the one from from in last, and finally returns. Because the stack has been altered, it jumps to the return address of the previous kwrite rather than the current one, thus jumping from the f1 thread to f2.
Your memcpy(frame, a, SSIZE*sizeof(int)) looks wrong. Your SSIZE is defined to 15, but a has only a size of 10.
This is deliberate: by copying 15 elements of 4 bytes each, we also copy the last value of rax, the saved ebp, and the return address of the function.
https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/
I can see a couple of issues with the design. The normal way of scheduling different threads is for them to get complete stacks, not to share the same stack.
What that means in this case, is that the stack depends on the addresses being used.
+---------+--------+---------+--------+---------+
| main | kyield | f1 | kwrite | kyield |
| | | |a[10] | |
+---------+--------+---------+--------+---------+
|-------------| << copied by the slice of the stack.
The slice of the stack you are taking is independent of the amount of stack used by the function preceding it, so it breaks if the functions calling kwrite have different stack requirements (each would need a different amount of state saved).
It is also broken because the information captured from the stack is incomplete. The state of execution consists of the stack plus the current values in the non-volatile registers, and those register values can leak into the other thread.
Finally, the stack also contains addresses. This scheme could only work if all the threads had identical stack requirements, because when the yield function pastes back a frame that contains addresses, those addresses always have to line up.
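To illustrate the point about each thread getting a complete stack of its own, here is a minimal sketch that uses the POSIX ucontext functions (getcontext, makecontext, swapcontext) instead of copying frames by hand. This is a different technique from the one in the question, not a fix for it. The names task_a, task_b, run_two_tasks and the 64 KiB stack size are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <ucontext.h>

#define STACK_BYTES (64 * 1024)   /* assumed size, enough for these tiny tasks */

static ucontext_t main_ctx, ctx_a, ctx_b;
static char trace[64];            /* records the order in which tasks ran */
static size_t trace_len;

static void record(char c)
{
    if (trace_len + 1 < sizeof trace) {
        trace[trace_len++] = c;
        trace[trace_len] = '\0';
    }
}

/* Each task runs on its own stack and yields by swapping contexts. */
static void task_a(void)
{
    for (int i = 0; i < 3; i++) {
        record('A');
        swapcontext(&ctx_a, &ctx_b);   /* save A's registers and stack pointer, resume B */
    }
    /* returning here resumes uc_link, i.e. main_ctx */
}

static void task_b(void)
{
    for (int i = 0; i < 3; i++) {
        record('B');
        swapcontext(&ctx_b, &ctx_a);   /* save B, resume A */
    }
}

/* Prepare a context that starts in fn on the given private stack. */
static int prepare(ucontext_t *ctx, void (*fn)(void), char *stack)
{
    if (getcontext(ctx) == -1)
        return -1;
    ctx->uc_stack.ss_sp = stack;
    ctx->uc_stack.ss_size = STACK_BYTES;
    ctx->uc_link = &main_ctx;          /* where to go when fn returns */
    makecontext(ctx, fn, 0);
    return 0;
}

/* Runs both tasks and returns the recorded trace, or NULL on failure. */
const char *run_two_tasks(void)
{
    char *stack_a = malloc(STACK_BYTES);
    char *stack_b = malloc(STACK_BYTES);
    const char *result = NULL;

    trace_len = 0;
    trace[0] = '\0';
    if (stack_a && stack_b &&
        prepare(&ctx_a, task_a, stack_a) == 0 &&
        prepare(&ctx_b, task_b, stack_b) == 0 &&
        swapcontext(&main_ctx, &ctx_a) == 0)
        result = trace;

    /* task_b is still suspended at this point, but it is never resumed again */
    free(stack_a);
    free(stack_b);
    return result;
}
```

Calling run_two_tasks() from main and printing the returned string shows the two tasks alternating. Because every context carries its own stack and saved registers, nothing depends on two functions happening to have frames of the same size.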
The code inside main.c
#include <stdio.h>
#include <unistd.h>
int main() {
int c_variable = 0; // the target
for(int x = 0; x < 100; x++) {
c_variable += 5; // increase by 5 to change the value of the int
printf("%i\n", c_variable); // print current value
sleep(8); // sleep so I have time to scan memory
}
return 0;
}
What I am trying to achieve is to read the integer c_variable and then to modify it inside another .c program. I am on linux so I did ps -A | grep main and got the PID of the running program. I then did sudo scanmem PID and entered the current value of c_variable a few times. I was left with three memory addresses and executing the command set 500 changed the value the program printed, effectively changing the memory address' value to 500 instead of 35 or whatever the program was currently at. I then executed the following code
#include <stdio.h>
int main() {
const long unsigned addr = 0x772d85fa1008; // one of the three addresses from scanmem
printf("%lu\n", addr);
return 0;
}
but I got some random long string of numbers, not the current value. The tutorials and answers I have read on how to read and write memory on Linux do not use long unsigned; they use char* or just int* instead. My memory address also seems a bit long; I have not seen memory addresses that long before. Anyhow, how do I read and write the memory address of the integer c_variable?
Edit: the output of scanmem looks something like this
info: we currently have 3 matches.
3> list
[ 0] 7771ff64b090, 6 + 1e090, stack, 20, [I64 I32 I16 I8 ]
[ 1] 7771ff64b5d8, 6 + 1e5d8, stack, 20, [I64 I32 I16 I8 ]
[ 2] 7771ff64b698, 6 + 1e698, stack, 20, [I32 I16 I8 ]
3> set 50
info: setting *0x7771ff64b090 to 0x32...
info: setting *0x7771ff64b5d8 to 0x32...
info: setting *0x7771ff64b698 to 0x32...
output
...
150
155
160
165
170
175
55
60
65
...
You're printing the address itself, in decimal notation, not what is stored at the address.
const int *addr = (int *) 0x772d85fa1008;
printf("%d\n", *addr);
You have to declare addr as a pointer type. More specifically a pointer to an integer. Its value (0x772d85fa1008) holds the address of the integer.
Then, in the printf call you dereference it to obtain the actual integer stored at the address.
Although in practice I can't vouch for whether this is going to work, since memory in modern operating systems isn't as simple as you make it out to be. But I don't have enough knowledge to assess that.
Processes running under Linux generally have their own virtualized memory space. If you want to access memory space of another process, arrangements have been made in the Linux API, see shmctl, shmget, shmat, shmdt.
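As a rough sketch of the System V shared-memory calls named above (shmget, shmat, shmdt, shmctl), the function below creates a segment, lets a forked child write an int into it, and reads the value back in the parent. Note that this only works between processes that cooperate by attaching the same segment; it does not reach into an arbitrary variable of an unrelated program. The function name share_int_with_child is made up for this example.

```c
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the parent sees after the child wrote it,
   or -1 if any step failed (so -1 is not a useful test value). */
int share_int_with_child(int value_from_child)
{
    /* create a private segment big enough for one int */
    int id = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    /* map the segment into this process's address space */
    int *shared = shmat(id, NULL, 0);
    if (shared == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    *shared = 0;

    pid_t pid = fork();            /* the child inherits the attachment */
    if (pid == -1) {
        shmdt(shared);
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    if (pid == 0) {
        *shared = value_from_child; /* write from the other process */
        shmdt(shared);
        _exit(0);
    }

    waitpid(pid, NULL, 0);          /* wait until the child has written */
    int seen = *shared;

    shmdt(shared);                  /* unmap, then mark the segment for removal */
    shmctl(id, IPC_RMID, NULL);
    return seen;
}
```

For example, share_int_with_child(500) should return 500 in the parent, even though the assignment happened in the child process.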
I know there are lots of questions here about functions that take a variable number of arguments. I also know there's lots of docs about stdarg.h and its macros. And I also know how printf-like functions take a variable number of arguments. I already tried each of those alternatives and they didn't help me. So, please, keep that in mind before marking this question as duplicate.
I'm working on the process management features of a little embedded operating system and I'm stuck on the design of a function that can create processes that run a function with a variable number of parameters. Here's a simplified version of what I want my API to look like:
// create a new process
// * function is a pointer to the routine the process will run
// * nargs is the number of arguments the routine takes
void create(void* function, uint8_t nargs, ...);
void f1();
void f2(int i);
void f3(float f, int i, const char* str);
int main()
{
create(f1, 0);
create(f2, 1, 9);
create(f3, 3, 3.14f, 9, "string");
return 0;
}
And here is a pseudocode for the relevant part of the implementation of system call create:
void create(void* function, uint8_t nargs, ...)
{
process_stack = create_stack();
first_arg = &nargs + 1;
copy_args_list_to_process_stack(process_stack, first_arg);
}
Of course I'll need to know the calling convention in order to copy from create's activation record to the new process stack, but that's not the problem. The problem is how many bytes I need to copy. Even though I know how many arguments I need to copy, I don't know how much space each of those arguments occupies, so I don't know when to stop copying.
The Xinu Operating System does something very similar to what I want to do, but I tried hard to understand the code and didn't succeed. I'll transcribe a very simplified version of Xinu's create function here. Maybe someone can understand it and help me.
pid32 create(void* procaddr, uint32 ssize, pri16 priority, char *name, int32 nargs, ...)
{
int32 i;
uint32 *a; /* points to list of args */
uint32 *saddr; /* stack address */
saddr = (uint32 *)getstk(ssize); // return a pointer to the new process's stack
*saddr = STACKMAGIC; // STACKMAGIC is just a marker to detect stack overflow
// this is the cryptic part
/* push arguments */
a = (uint32 *)(&nargs + 1); /* start of args */
a += nargs -1; /* last argument */
for ( ; nargs > 4 ; nargs--) /* machine dependent; copy args */
*--saddr = *a--; /* onto created process's stack */
*--saddr = (long)procaddr;
for(i = 11; i >= 4; i--)
*--saddr = 0;
for(i = 4; i > 0; i--) {
if(i <= nargs)
*--saddr = *a--;
else
*--saddr = 0;
}
}
I got stuck on this line: a += nargs -1;. This should move the pointer a 4*(nargs - 1) ahead in memory, right? What if an argument's size is not 4 bytes? But that is just the first question. I also didn't understand the next lines of the code.
If you are writing an operating system, you also define the calling convention(s) right? Settle for argument sizes of sizeof(void*) and pad as necessary.
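One way to read this suggestion: if every argument is required to be exactly one pointer-sized word, then nargs alone tells you how much to copy. The sketch below uses stdarg.h rather than raw address arithmetic on &nargs, and copies into a plain array that stands in for the new process stack. The name copy_word_args and the rule that callers cast every argument to uintptr_t are assumptions of this example, not part of Xinu. A float would have to be packed into a word by the caller, since arguments passed through ... undergo the default promotions (float becomes double). The count is declared as unsigned rather than uint8_t because the last named parameter before ... should not have a type that is changed by those promotions.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

/* Copies nargs word-sized arguments into dst.
   Returns the number of words copied, or 0 if they do not fit.
   Assumption: the caller passes every argument as a uintptr_t. */
size_t copy_word_args(uintptr_t *dst, size_t capacity, unsigned nargs, ...)
{
    if (nargs > capacity)
        return 0;

    va_list ap;
    va_start(ap, nargs);
    for (unsigned i = 0; i < nargs; i++)
        dst[i] = va_arg(ap, uintptr_t);   /* one fixed-size slot per argument */
    va_end(ap);

    return nargs;
}
```

A call such as copy_word_args(slots, 4, 2u, (uintptr_t)9, (uintptr_t)"string") would fill two slots; the created routine then has to convert each word back to the type it expects.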
I have a rather huge recursive function (also, I write in C), and while I have no doubt that the scenario where stack overflow happens is extremely unlikely, it is still possible. What I wonder is whether you can detect if stack is going to get overflown within a few iterations, so you can do an emergency stop without crashing the program.
In the C programming language itself, that is not possible. In general, you can't easily know that you are about to run out of stack before you actually run out. I recommend that you instead place a configurable hard limit on the recursion depth in your implementation, so you can simply abort when the depth is exceeded. You could also rewrite your algorithm to use an auxiliary data structure instead of using the stack through recursion. This gives you greater flexibility to detect an out-of-memory condition, since malloc() tells you when it fails.
However, you can get something similar with a procedure like this on UNIX-like systems:
1. Use setrlimit to set a soft stack limit lower than the hard stack limit.
2. Establish signal handlers for both SIGSEGV and SIGBUS to get notified of stack overflows. Some operating systems produce SIGSEGV for these, others SIGBUS.
3. If you get such a signal and determine that it comes from a stack overflow, raise the soft stack limit with setrlimit and set a global variable to record that this occurred. Make the variable volatile so the optimizer doesn't foil your plans.
4. In your code, at each recursion step, check if this variable is set. If it is, abort.
This may not work everywhere and requires platform-specific code to find out that the signal came from a stack overflow. Not all systems (notably, early 68000 systems) can continue normal processing after getting a SIGSEGV or SIGBUS.
A similar approach was used by the Bourne shell for memory allocation.
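The simpler recommendation from the start of this answer, a configurable hard limit on the recursion depth, might look roughly like the sketch below. The function sum_to and the variable max_depth are placeholders invented for the example; a real limit would have to be tuned to the frame size of the actual recursive function and the available stack.

```c
#include <stddef.h>

/* Hypothetical, configurable limit on how deep the recursion may go. */
static size_t max_depth = 10000;

/* Adds n + (n-1) + ... + 1 to *total recursively.
   Returns 0 on success, or -1 if the depth limit was exceeded,
   in which case *total holds only a partial result. */
int sum_to(unsigned n, size_t depth, unsigned long *total)
{
    if (depth > max_depth)
        return -1;                 /* emergency stop instead of a crash */
    if (n == 0)
        return 0;                  /* base case */
    *total += n;
    return sum_to(n - 1, depth + 1, total);
}
```

The caller starts with depth 0 and checks the return value; every level passes depth + 1 down, so the check costs one comparison per call.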
Here's a simple solution that works on Win32. It actually resembles what Wossname already posted, but is less icky :)
unsigned int get_stack_address( void )
{
unsigned int r = 0;
__asm mov dword ptr [r], esp;
return r;
}
void rec( int x, const unsigned int begin_address )
{
// here just put 100 000 bytes of memory
if ( begin_address - get_stack_address() > 100000 )
{
//std::cout << "Recursion level " << x << " stack too high" << std::endl;
return;
}
rec( x + 1, begin_address );
}
int main( void )
{
int x = 0;
rec(x,get_stack_address());
}
Here's a naive method, but it's a bit icky...
When you enter the function for the first time you could store the address of one of your variables declared in that function. Store that value outside your function (e.g. in a global). In subsequent calls compare the current address of that variable with the cached copy. The deeper you recurse the further apart these two values will be.
This will most likely cause compiler warnings (storing addresses of temporary variables) but it does have the benefit of giving you a fairly accurate way of knowing exactly how much stack you're using.
Can't say I really recommend this but it would work.
#include <stdio.h>
#include <stdlib.h>
char* start = NULL;
void recurse()
{
char marker = '#';
if(start == NULL)
start = &marker;
printf("depth: %ld\n", labs((long)(&marker - start)));
if(labs((long)(&marker - start)) < 1000)
recurse();
else
start = NULL;
}
int main()
{
recurse();
return 0;
}
An alternative method is to learn the stack limit at the start of the program, and each time in your recursive function to check whether this limit has been approached (within some safety margin, say 64 kb). If so, abort; if not, continue.
The stack limit on POSIX systems can be learned by using getrlimit system call.
Example code that is thread-safe: (note: the code assumes that the stack grows downwards, towards lower addresses, as on x86!)
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>
void *stack_limit;
#define SAFETY_MARGIN (64 * 1024) // 64 kb
void recurse(int level)
{
void *stack_top = &stack_top;
if (stack_top <= stack_limit) {
printf("stack limit reached at recursion level %d\n", level);
return;
}
recurse(level + 1);
}
int get_max_stack_size(void)
{
struct rlimit rl;
int ret = getrlimit(RLIMIT_STACK, &rl);
if (ret != 0 || rl.rlim_cur == RLIM_INFINITY) {
return 1024 * 1024 * 8; // 8 MB is the default on many platforms
}
printf("max stack size: %d\n", (int)rl.rlim_cur);
return rl.rlim_cur;
}
int main (int argc, char *argv[])
{
int x;
stack_limit = (char *)&x - get_max_stack_size() + SAFETY_MARGIN;
recurse(0);
return 0;
}
Output:
max stack size: 8388608
stack limit reached at recursion level 174549
I'm trying to understand how recursion works in C. Can anyone give me an explanation of the control flow?
#include <stdio.h>
/* printd: print n in decimal */
void printd(int n)
{
if (n < 0)
{
putchar('-');
n = -n;
}
if (n / 10) printd(n / 10);
putchar(n % 10 + '0');
}
int main()
{
printd(123);
return 0;
}
The control flow looks like this (where -> is a function call)
main()
└─> printd(123)
├─> printd(12)
│ ├─> printd(1)
│ │ └─> putchar('1')
│ └─> putchar('2')
└─> putchar('3')
Call printd(123)
(123 / 10) != 0, so Call printd(12)
(12 / 10) != 0, so Call printd(1)
(1 / 10) == 0, so Call putchar "1"
Call putchar "2"
Call putchar "3"
return 0 (from main())
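The same order of events can be checked with a small variation of printd that writes a trace into a string instead of calling putchar. This is only an illustrative sketch; trace_printd and the "call(...)" / "put(...)" labels are made up here, and negative numbers are left out to keep it short.

```c
#include <stdio.h>
#include <string.h>

/* Appends a trace of the printd recursion for a non-negative n to log.
   log must hold a NUL-terminated string and have room for cap bytes. */
void trace_printd(int n, char *log, size_t cap)
{
    size_t used = strlen(log);
    snprintf(log + used, cap - used, "call(%d) ", n);   /* on the way down */

    if (n / 10)
        trace_printd(n / 10, log, cap);                 /* leading digits first */

    used = strlen(log);
    snprintf(log + used, cap - used, "put(%d) ", n % 10); /* on the way back up */
}
```

Starting from an empty buffer, trace_printd(123, log, sizeof log) leaves "call(123) call(12) call(1) put(1) put(2) put(3) " in log: all the calls happen first, and the digits are emitted while the calls return.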
To understand recursion, you need to understand the storage model. Though there are several variations, basically "automatic" storage, the storage used to contain automatic variables, parameters, compiler temps, and call/return information, is arranged as a "stack". This is a storage structure starting at some location in process storage and "growing" either "up" (increasing addresses) or "down" (decreasing addresses) as procedures are called.
One might start out with a couple of variables:
00 -- Variable A -- 27
01 -- Variable B -- 45
Then we decide to call procedure X, so we generate a parameter of A+B:
02 -- Parameter -- 72
We need to save the location where we want control to return. Say instruction 104 is the call, so we'll make 105 the return address:
03 -- Return address -- 105
We also need to save the size of the above "stack frame" -- four words, 5 with the frame size itself:
04 -- Frame size -- 5
Now we begin executing in X. It needs a variable C:
05 -- Variable C -- 123
And it needs to reference the parameter. But how does it do that? Well, on entry a stack pointer was set to point at the "bottom" of X's "stack frame". We could make the "bottom" be any of several places, but let's make it the first variable in X's frame.
05 -- Variable C -- 123 <=== (Stack frame pointer = 5)
But we still need to reference the parameter. We know that "below" our frame (where the stack frame pointer is pointing) are (in decreasing address order) the frame size, return address, and then our parameter. So if we subtract 3 (for those 3 values) from 5 we get 2, which is the location of the parameter.
Note that at this point we don't really care if our frame pointer is 5 or 55555 -- we just subtract to reference parameters, add to reference our local variables. If we want to make a call we "stack" parameters, return address, and frame size, as we did with the first call. We could make call after call after call and just continue "pushing" stack frames.
To return, we load the frame size and the return address into registers. We subtract the frame size from the stack frame pointer, put the return address into the instruction counter, and we're back in the calling procedure.
Now this is an over-simplification, and there are numerous different ways to handle the stack frame pointer, parameter passing, and keeping track of frame size. But the basics apply regardless.
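To see the "one frame per call" idea on a real machine, a sketch like the following records the address of a local variable at each recursion level. All levels are active at the same time, so each has its own copy of local at its own address. Whether the addresses go up or down, and by how much, depends on the platform and compiler. The name record_frames is an assumption of this example.

```c
#include <stdint.h>

/* Stores the address of this call's local variable in addresses[depth],
   then recurses until max_depth levels exist.
   Returns the sum of the depth values, to show that every frame's
   local is still intact when control returns to it. */
int record_frames(int depth, int max_depth, uintptr_t *addresses)
{
    int local = depth;                       /* lives in this call's frame */
    addresses[depth] = (uintptr_t)&local;

    int below = 0;
    if (depth + 1 < max_depth)
        below = record_frames(depth + 1, max_depth, addresses);

    return local + below;                    /* local is read after the call returns */
}
```

Printing the recorded addresses for, say, four levels typically shows them a constant distance apart, which is the frame size of record_frames on that platform.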
You have recursion in C (or any other programming language) by breaking a problem into 2 smaller problems.
Your example: print a number can be broken in these 2 parts
print the first part if it exists
print the last digit
To print "123", the simpler problems are then to print "12" (12 is 123 / 10) and to print "3".
To print "12", the simpler problems are then to print "1" (1 is 12 / 10) and to print "2".
To print "1", ... just print "1".
#include <stdio.h>
#define putd(d) (printf("%d", d))
#define RECURSIVE
void rprint(int n)
{
#ifndef RECURSIVE
int i = n < 0 ? -n : n;
for (; i / 10; i /= 10)
putd(i % 10);
putd(i % 10);
if (n < 0)
putchar('-');
/* Don't forget to reverse :D */
#else
if (n < 0) {
n = -n;
putchar('-');
}
int i = n / 10;
if (i)
rprint(i);
putd(n % 10);
#endif
}
int main()
{
rprint(-321);
return 0;
}
Recursion works on the stack, i.e. first in, last out.
Recursion is the process of a function calling itself with different parameters until a base condition is reached. A stack overflow occurs when too many recursive calls are performed.
Code:
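#include <stdio.h>
int main(void)
{
printf("start");
main(); /* no base condition: recurses until the stack overflows */
printf("end"); /* never reached */
}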
Code:
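#include <stdio.h>
int fact(int n);
int main(void)
{
int n, res;
printf("enter n value");
if (scanf("%d", &n) != 1 || n < 0)
return 1; /* reject invalid or negative input */
res = fact(n);
printf("%d\n", res);
return 0;
}
int fact(int n)
{
int res;
if (n == 0)
{
res = 1; /* base condition */
}
else
{
res = n * fact(n - 1);
}
return res;
}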
I want to skip a line in C, the line x = 1; in main, using a buffer overflow. However, I don't know why I cannot skip from address 4002f4 to the next address 4002fb, even though I am counting 7 bytes from <main+35> to <main+42>.
I have also configured the randomization and execstack options in a Debian and AMD environment, but I am still getting x = 1. What is wrong with this procedure?
I have used gdb to debug the stack and the memory addresses:
0x00000000004002ef <main+30>: callq 0x4002a4 <function>
0x00000000004002f4 <main+35>: movl $0x1,-0x4(%rbp)
0x00000000004002fb <main+42>: mov -0x4(%rbp),%esi
0x00000000004002fe <main+45>: mov $0x4629c4,%edi
void function(int a, int b, int c)
{
char buffer[5];
int *ret;
ret = buffer + 12;
(*ret) += 8;
}
int main()
{
int x = 0;
function(1, 2, 3);
x = 1;
printf("x = %i \n", x);
return 0;
}
You must be reading the article Smashing the Stack for Fun and Profit. I was reading the same article and ran into the same problem: it wasn't skipping that instruction. After a debugging session of a few hours in IDA, I changed the code as shown below, and it now prints x=0 and b=5.
#include <stdio.h>
void function(int a, int b) {
int c=0;
int* pointer;
pointer =&c+2;
(*pointer)+=8;
}
void main() {
int x =0;
function(1,2);
x = 3;
int b =5;
printf("x=%d\n, b=%d\n",x,b);
getch();
}
In order to alter the return address within function() to skip over the x = 1 in main(), you need two pieces of information.
1. The location of the return address in the stack frame.
I used gdb to determine this value. I set a breakpoint at function() (break function), execute the code up to the breakpoint (run), retrieve the location in memory of the current stack frame (p $rbp or info reg), and then retrieve the location in memory of buffer (p &buffer). Using the retrieved values, the location of the return address can be determined.
(compiled w/ GCC -g flag to include debug symbols and executed in a 64-bit environment)
(gdb) break function
...
(gdb) run
...
(gdb) p $rbp
$1 = (void *) 0x7fffffffe270
(gdb) p &buffer
$2 = (char (*)[5]) 0x7fffffffe260
(gdb) quit
(frame pointer address + size of word) - buffer address = number of bytes from local buffer variable to return address
(0x7fffffffe270 + 8) - 0x7fffffffe260 = 24
If you are having difficulties understanding how the call stack works, reading the call stack and function prologue Wikipedia articles may help. This shows the difficulty in making "buffer overflow" examples in C. The offset of 24 from buffer assumes a certain padding style and compile options. GCC will happily insert stack canaries nowadays unless you tell it not to.
2. The number of bytes to add to the return address to skip over x = 1.
In your case the saved instruction pointer will point to 0x00000000004002f4 (<main+35>), the first instruction after function returns. To skip the assignment you need to make the saved instruction pointer point to 0x00000000004002fb (<main+42>).
Your calculation that this is 7 bytes is correct (0x4002fb - 0x4002f4 = 7).
I used gdb to disassemble the application (disas main) and verified the calculation for my case as well. This value is best resolved manually by inspecting the disassembly.
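The two calculations above can be written down as a small sketch. The helper names are invented for this example, and the first one assumes the layout described in this answer, where the return address sits one word above the saved frame pointer.

```c
#include <stdint.h>

/* Bytes from the start of the local buffer to the saved return address:
   (frame pointer address + size of word) - buffer address. */
uint64_t return_address_offset(uint64_t frame_pointer, uint64_t buffer,
                               uint64_t word_size)
{
    return (frame_pointer + word_size) - buffer;
}

/* Bytes to add to the saved return address so that execution resumes
   at desired_target instead of original_target. */
uint64_t skip_distance(uint64_t original_target, uint64_t desired_target)
{
    return desired_target - original_target;
}
```

With the values from the gdb session, return_address_offset(0x7fffffffe270, 0x7fffffffe260, 8) gives 24, and skip_distance(0x4002f4, 0x4002fb) gives 7. Both inputs change with compiler, flags, and platform, so they have to be read from your own debugger session.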
Note that I used an Ubuntu 10.10 64-bit environment to test the following code.
#include <stdio.h>
void function(int a, int b, int c)
{
char buffer[5];
int *ret;
ret = (int *)(buffer + 24);
(*ret) += 7;
}
int main()
{
int x = 0;
function(1, 2, 3);
x = 1;
printf("x = %i \n", x);
return 0;
}
output
x = 0
This is really just altering the return address of function() rather than an actual buffer overflow. In an actual buffer overflow, you would be overflowing buffer[5] to overwrite the return address. However, most modern implementations use techniques such as stack canaries to protect against this.
What you're doing here doesn't seem to have much to do with a classic buffer overflow attack. The whole idea of a buffer overflow attack is to modify the return address of 'function'. Disassembling your program will show you where the ret instruction (assuming x86) takes its address from. This is what you need to modify to point at main+42.
I assume you want to explicitly provoke the buffer overflow here; normally you'd need to provoke it by manipulating the inputs of 'function'.
By just declaring a buffer[5] you're moving the stack pointer in the wrong direction (verify this by looking at the generated assembly); the return address is somewhere deeper inside the stack (it was put there by the call instruction). On x86, stacks grow downwards, that is, towards lower addresses.
I'd approach this by declaring an int* and moving it upward until it reaches the address where the return address has been pushed, then modifying that value to point at main+42 and letting function return.
You can't do that this way.
Here's a classic buffer overflow code sample. See what happens once you feed it 5 and then 6 characters from your keyboard. If you go for more (16 chars should do), you'll overwrite the base pointer, then the function's return address, and you'll get a segmentation fault. What you want to do is figure out which 4 chars overwrite the return address and make the program execute your code. Google around for the Linux stack and memory structure.
#include <stdio.h>
void ff(){
int a=0; char b[5];
scanf("%s",b); /* deliberately unbounded: this is the overflow */
printf("b:%p a:%p\n" ,(void *)b ,(void *)&a);
printf("b:'%s' a:%d\n" ,b ,a);
}
int main() {
ff();
return 0;
}