Dynamic expansion of the Linux stack - c

I've noticed the Linux stack starts small and expands with page faults caused by recursion/pushes/VLAs up to the size reported by getrlimit(RLIMIT_STACK,...), give or take (8 MiB by default on my system).
Curiously, though, if I cause page faults by addressing bytes directly, still within the limit, Linux simply segfaults without expanding the mapping (there is no segfault if I do it after something like alloca or a VLA has already caused the stack to expand).
Example program:
#include <stdio.h>
#include <unistd.h>
#include <stdint.h>
#include <stdlib.h>
#define CMD "grep stack /proc/XXXXXXXXXXXXXXXX/maps"
#define CMDP "grep stack /proc/%ld/maps"
void vla(size_t Sz)
{
char b[Sz];
b[0]='y';
b[1]='\0';
puts(b);
}
#define OFFSET (sizeof(char)<<12)
int main(int C, char **V)
{
char cmd[sizeof CMD]; sprintf(cmd,CMDP,(long)getpid());
if(system(cmd)) return 1;
for(int i=0; ; i++){
printf("%d\n", i);
char *ptr = (char*)(((uintptr_t)&ptr)-i*OFFSET);
if(C>1) vla(i*OFFSET); //pass an argument to the executable to turn this on
ptr[0] = 'x';
ptr[1] = '\0';
if(system(cmd)) return 1;
puts(ptr);
}
}
What kernel code is doing this? How does it differentiate between natural stack growth and me poking around in the address space?

The Linux kernel uses the value of the stack pointer as the limit (within reasonable bounds). Accessing memory more than 65536 bytes plus the size of 32 unsigned longs below the stack pointer causes a segmentation violation. So if you access memory further down the stack, you have to make sure that the stack pointer decreases along with the accesses for the kernel to enlarge the mapping. See this snippet from arch/x86/mm/fault.c:
if (sw_error_code & X86_PF_USER) {
/*
* Accessing the stack below %sp is always a bug.
* The large cushion allows instructions like enter
* and pusha to work. ("enter $65535, $31" pushes
* 32 pointers and then decrements %sp by 65535.)
*/
if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < regs->sp)) {
bad_area(regs, sw_error_code, address);
return;
}
}
The value of the stack pointer register is key here!

Related

Finding return address of stack

I have this simple code:
#include <sys/types.h>
#include <netinet/in.h>
#include <stdio.h>
#include <ctype.h>
main(argc, argv)
char *argv[];
{
char line[512];
gets(line);
}
My goal is to find the distance between the end of the buffer and the return address on the stack.
So if my buffer (line) is 512 bytes, I could find the starting address, add 512, and know where that distance starts. But how would I find the return address on the stack?
Basically, I am just trying to figure out how to find the return address on the stack and the buffer's start address. I couldn't find them when disassembling main.
#include <stdio.h>
int main()
{
long l, k;
asm("mov %%rsp,%0" : "=r"(l));
asm("mov %%rbp,%0" : "=r"(k));
printf("Stack pointer: 0x%16.16lX\n", l);
printf("Stack frame base: 0x%16.16lX\n", k);
printf("Distance to return address: %ld\n", k - l + 8); /* the return address sits just above the saved %rbp */
}
snow ~ $ ./test
Stack pointer: 0x00007FFC95B793C0
Stack frame base: 0x00007FFC95B793D0
Distance to return address: 24
Obviously this is not portable, I'm assuming x64 and gcc here.
Caveat: this relies on %rbp being used as a frame pointer, in which case the return address sits 8 bytes above the location it points to. Sometimes %rbp isn't used as a frame pointer at all (for example with -fomit-frame-pointer, which optimized builds typically enable). Register optimizations will break it. Local variables may break it. Alignment padding may break it. Basically, don't count on it working. (Depending on the compiler and compile-time options you may need a different constant offset as well.)
I do really wonder if there isn't a better way to do whatever it is that you are trying to do... =)

simple stack-based machine in C

I have to create a simple stack-based machine. The instruction set consists of 5 instructions: push, pop, add, mult, end. I accept a source code file that has an instruction section (.text) and a data section (.data), and then I must store these in memory by simulating a memory system that uses 32-bit addresses.
An example source code file that I have to store in memory might be
.text
main:
push X
push Y
add //remove top two words in stack and add them then put result on top of stack
pop (some memory address) // stores result in the address
end
.data
X: 3 // allocate memory store the number 3
Y: 5
Any suggestions on how to do the memory system? I should probably store data in one section (maybe an array?) and instructions in another, but I can't just use array indexes since I need to use 32-bit addresses in my code.
Edit: Also, is there a way to replace the X and Y with the actual address once I've assigned the numbers 3 and 5 to a space in memory (in my data array)? Kind of like a two-pass assembler might do it.
What's wrong with arrays? If you know the size you need, they should work.
An address in your machine code would actually be an index in the array.
Using a 32-bit index with an array isn't a problem. Of course, not all indexes would be valid - only those from 0 to the size of the array. But do you need to simulate 4GB of memory, or can you set a limit on the memory size?
Just to add to ugoren's answer (and a bit OT), I think a relatively interesting approach could be to extend your specification space with a .stack section, initialized by default to empty (like in your example).
That can be used to describe the expected intermediate stages of the computation (save/restore the actual state at some point).
To implement, I would use very simple code, like
file stack.h:
#ifndef STACK
#define STACK
#include <stdio.h>
/* here should be implemented the constraint about 32 bits words... */
typedef int word;
typedef struct { int top; word* mem; int allocated; } stack;
typedef stack* stackp;
stackp new_stack();
void free_stack(stackp);
void push(stackp s, word w);
word pop(stackp p);
/* extension */
stackp read(FILE*);
void write(stackp, FILE*);
#endif
file stack.c:
/* example implementation, use - arbitrary - chunks of 2^N */
#include <stdlib.h>
#include "stack.h"
/* blocks are 256 words */
#define N (1 << 8)
stackp new_stack() {
stackp s = calloc(1, sizeof(stack));
s->mem = malloc((s->allocated = N) * sizeof(word));
return s;
}
void free_stack(stackp s) {
free(s->mem);
free(s);
}
void push(stackp s, int w) {
if (s->top == s->allocated) {
s->allocated += N;
s->mem = realloc(s->mem, s->allocated * sizeof(word));
}
s->mem[s->top++] = w;
}
word pop(stackp s) {
if (s->top == 0) { fprintf(stderr, "pop on empty stack\n"); exit(EXIT_FAILURE); }
return s->mem[--(s->top)];
}
file main.c:
#include "stack.h"
int main() {
stackp s = new_stack();
word X = 3;
word Y = 5;
push(s, X);
push(s, Y);
word Z = pop(s) + pop(s);
printf("Z=%d\n", Z);
free_stack(s);
}
file makefile:
main: main.c stack.c
to build:
make
to test:
./main
Z=8
It's worth noting a difference with respect to ugoren's answer: I stress data hiding, a valuable part of the implementation, keeping the details of the actual functions in a separate file. There we can add many details, for instance a maximum stack size (not actually enforced here), error handling, etc.
edit: to get the 'address' of a pushed word
word push(stackp s, int w) {
if (s->top == s->allocated) {
s->allocated += N;
s->mem = realloc(s->mem, s->allocated * sizeof(word));
}
s->mem[s->top] = w;
return s->top++;
}
The key to the memory system is to limit the range of the memory. Under an OS, a program can access only certain regions of memory.
So in your particular program you can say that valid programs may contain addresses starting at 0x00004000 and that the memory available to your machine is, for example, 4 MB.
Then in your program you create a virtual memory space of size 4 MB and store its beginning.
Below is an example; bear in mind it's only an example, and you have to adjust the parameters accordingly.
virtual memory start - 0x00006000 (obtained from malloc, static initialization, or whatever)
stack machine memory start - 0x00004000
offset - 0x2000 (to map between addresses in your OS and in your stack machine, you add 0x2000 to the stack machine address to get a pointer into your array; in reality the offset can be negative)
If you actually need an index into the array, just subtract the beginning of your virtual memory from the pointer.

Accessing specific memory locations in C

In assembly language we have instructions like:
movl ax, [1000]
This allows us to access specific memory locations.
But in C can we do something similar to this?
I know inline assembly code using asm() will allow you to do this,
but I would like to know about some C specific technique to achieve this.
I tried the following code and got segmentation error:
int *ptr=0xFE1DB124;
*ptr;
This again was confusing as the memory location was identified by the code given below:
int var;
printf("\nThe Address is %x",&var);
So the memory location is available, but I am still getting a segmentation fault.
Why?
Common C compilers will allow you to set a pointer from an integer and to access memory with that, and they will give you the expected results. However, this is an extension beyond the C standard, so you should check your compiler documentation to ensure it supports it. This feature is not uncommonly used in kernel code that must access memory at specific addresses. It is generally not useful in user programs.
As comments have mentioned, one problem you may be having is that your operating system loads programs into a randomized location each time a program is loaded. Therefore, the address you discover on one run will not be the address used in another run. Also, changing the source and recompiling may yield different addresses.
To demonstrate that you can use a pointer to access an address specified numerically, you can retrieve the address and use it within a single program execution:
#include <inttypes.h>
#include <stdio.h>
#include <stdint.h>
int main(void)
{
// Create an int.
int x = 0;
// Find its address.
char buf[100];
sprintf(buf, "%" PRIuPTR, (uintptr_t) &x);
printf("The address of x is %s.\n", buf);
// Read the address.
uintptr_t u;
sscanf(buf, "%" SCNuPTR, &u);
// Convert the integer value to an address.
int *p = (int *) u;
// Modify the int through the new pointer.
*p = 123;
// Display the int.
printf("x = %d\n", x);
return 0;
}
Obviously, this is not useful in a normal program; it is just a demonstration. You would use this sort of behavior only when you have a special need to access certain addresses.
To access specific memory from user space, we have to map it into the program's virtual address space using mmap(). The C code below shows this for a file:
Take a file "test_file" containing "ABCDEFGHIJ".
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <fcntl.h>
int main(void)
{
char *map_base_addr; // Base address of the file mapping
int fd; // File descriptor for the open file
int size = 10;
fd = open("test_file", O_RDWR); // Open the file for reading and writing
if (fd == -1) { perror("open"); exit(1); }
map_base_addr = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); // Map the file into memory
if (map_base_addr == MAP_FAILED) { perror("mmap"); exit(1); }
char *ch= map_base_addr;
int i;
/*Printing first 10 char*/
for(i=0; i<size; i++)
fputc(*(ch+i),stdout);
printf("\n");
*(ch+1) = 'b';
*(ch+4) = 'z';
*(ch+7) = 'x';
/*Printing char after modification*/
for(i=0; i<size; i++)
fputc(*(ch+i),stdout);
printf("\n");
/* Finally unmap the file. This will flush out any changes. */
munmap(map_base_addr, size);
exit(0);
}
The output will be:
ABCDEFGHIJ
AbCDzFGxIJ
It works for me:
#include <stdio.h>
int main(int argc, char**argv) {
int var = 7456;
printf("Adress of var = %x, var=%d\n", &var, var);
int *ptr = (int*)0x22cd28;
printf(" ptr points to %x\n", ptr);
*ptr = 123;
printf("New value of var=%d\n", var);
return 0;
}
Program output:
Adress of var = 22cd28, var=7456
ptr points to 22cd28
New value of var=123
Note:
The address is usually not the same on every execution. When I tried my example I had to run it three times before I got the address to match.
char* can point to any address (because sizeof (char) = 1). Pointers to larger objects must often be aligned on even addresses (usually ones divisible by 4).
Your question doesn't really make much sense if you are running on linux/windows/mac/whatever
http://en.wikipedia.org/wiki/Virtual_memory
You can do that only if you are programming a device without virtual memory, or if you are programming the operating system itself.
Otherwise the addresses you see are not the "real" addresses on the RAM, the operating system translates them to real addresses and if there is not a map to translate your virtual address to a real one, then you can get a segmentation fault. Keep in mind that there are other reasons that can cause a segmentation fault.

malloc under linux, implicit limit

Sorry if the title isn't as descriptive as it should be; the problem is hard to put in a few words. I am trying to find out how much memory I have available by malloc'ing and, if that worked, writing to that segment. On certain systems (all Linux on x86_64) I see segfaults when writing to the 2049th MiB. The code is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>
int main (int argc, char **argv) {
void *addr;
int megasize = 2100;
/// allocate the memory via mmap same result
//addr = mmap ((void *) 0, (size_t) megasize << 20, PROT_READ | PROT_WRITE,
// MAP_PRIVATE | MAP_ANONYMOUS, (int) -1, (off_t) 0);
addr = malloc(megasize << 20);
if (addr == MAP_FAILED) {
fprintf (stderr, "alloc of %d megabytes failed %s\n", megasize,
strerror (errno));
exit (1);
};
printf ("got %d megabytes at %p\n", megasize, addr);
{
int i;
char *p = addr;
printf("touching the %d Mb memory:\n", megasize);
for (i = 0; i < megasize; i++) {
p[i << 20] = 0;
putchar('.');
if (i%64==63) // commenting out this line i see that it really is the 2049th mb
printf(" #%d\n", i);
fflush(stdout);
};
putchar('\n');
};
/// free the memory
munmap (addr, (size_t) megasize << 20);
return 0;
}
It segfaults reliably on some systems, whereas on others it works fine. Reading the logs for the systems where it fails tells me it's not the OOM killer. There are values for megasize that I can choose which will cause malloc to fail, but those are larger.
The segfault occurs reliably for any size bigger than 2 GiB and smaller than the limit where malloc returns -1 for those systems.
I believe there is a limit I am hitting that isn't observed by malloc, and I can't figure out what it is. I tried reading out a few of the limits via getrlimit that seemed relevant, like RLIMIT_AS and RLIMIT_DATA, but those were way bigger.
This is the relevant part of my valgrindlog
==29126== Warning: set address range perms: large range [0x39436000, 0xbc836000) (defined)
==29126== Invalid write of size 1
==29126== at 0x400AAD: main (in /home/max/source/scratch/memorytest)
==29126== Address 0xffffffffb9436000 is not stack'd, malloc'd or (recently) free'd
Can anybody please give me a clue as to what the problem is?
You'll be getting an overflow when counting via int i, as int is 4 bytes wide here:
p[i << 20] = ...
Change
int i;
to be
size_t i;
size_t is the preferred type when addressing memory.
A 32-bit int cannot hold 2048 << 20 (2^31), which is the byte offset of the 2049th MiB. You're invoking undefined behavior via signed integer overflow, and you happen to get a negative number. On most 32-bit machines, adding that to a pointer wraps around and ends up giving you the address you wanted, by accident. On 64-bit machines, it gives you an address roughly 2048 MiB below the start of your block of memory (or wrapped around to the top of the 64-bit address space).
Use the proper types. Here, i should have type size_t.

Memory allocation for a matrix in C

Why is the following code resulting in Segmentation fault? (I'm trying to create two matrices of the same size, one with static and the other with dynamic allocation)
#include <stdio.h>
#include <stdlib.h>
//Segmentation fault!
int main(){
#define X 5000
#define Y 6000
int i;
int a[X][Y];
int** b = (int**) malloc(sizeof(int*) * X);
for(i=0; i<X; i++){
b[i] = malloc (sizeof(int) * Y);
}
}
Weirdly enough, if I comment out one of the matrix definitions, the code runs fine. Like this:
#include <stdio.h>
#include <stdlib.h>
//No Segmentation fault!
int main(){
#define X 5000
#define Y 6000
int i;
//int a[X][Y];
int** b = (int**) malloc(sizeof(int*) * X);
for(i=0; i<X; i++){
b[i] = malloc (sizeof(int) * Y);
}
}
or
#include <stdio.h>
#include <stdlib.h>
//No Segmentation fault!
int main(){
#define X 5000
#define Y 6000
int i;
int a[X][Y];
//int** b = (int**) malloc(sizeof(int*) * X);
//for(i=0; i<X; i++){
// b[i] = malloc (sizeof(int) * Y);
//}
}
I'm running gcc on Linux on a 32-bit machine.
Edit: Checking if malloc() succeeds:
#include <stdio.h>
#include <stdlib.h>
//No Segmentation fault!
int main(){
#define X 5000
#define Y 6000
int i;
int a[X][Y];
int* tmp;
int** b = (int**) malloc(sizeof(int*) * X);
if(!b){
printf("Error on first malloc.\n");
}
else{
for(i=0; i<X; i++){
tmp = malloc (sizeof(int) * Y);
if(tmp)
b[i] = tmp;
else{
printf("Error on second malloc, i=%d.\n", i);
return;
}
}
}
}
Nothing is printed out when I run it (except of course for "Segmentation fault").
Your a variable requires, on a 32-bit system, 5000 * 6000 * 4 = 120 MB of stack space. It's possible that this violates some limit, which causes the segmentation fault.
Also, it's of course possible that malloc() fails at some point, which might cause you to dereference a NULL pointer.
You are getting a segmentation fault which means that your program is attempting to access a memory address that has not been assigned to its process. The array a is a local variable and thus allocated memory from the stack. As unwind pointed out a requires 120 Mbytes of storage. This is almost certainly larger than the stack space that the OS has allocated to your process. As soon as the for loop walks off the end of the stack you get a segmentation fault.
In Linux the stack size is controlled by the OS not the compiler so try the following:-
$ ulimit -a
In the response you should see a line something like this:-
stack size (kbytes) (-s) 10240
This means that each process gets a 10 MB stack, nowhere near enough for your large array.
You can adjust the stack size with a ulimit -s <stack size> command but I suspect it will not allow you to select a 120Mbyte stack size!
The simplest solution is to make a a global variable instead of an local variable.
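Try increasing the heap and stack limits at link time (note that these linker options apply to PE targets such as MinGW or Cygwin; on Linux the stack limit comes from ulimit -s instead):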
gcc -Wl,--stack=xxxxx -Wl,--heap=yyyyy
Those are sizable allocations. Have you tried checking to make sure malloc() succeeds?
You might use malloc() for all your arrays, and check to make sure it succeeds each time.
A stack overflow (how appropriate!) can result in a segmentation fault which is what it seems you're seeing here.
In your third case the stack pointer is being moved to an invalid address but isn't being used for anything since the program then exits. If you put any operation after the stack allocation you should get a segfault.
Perhaps the compiler is just changing the stack pointer to some large value but never using it, and thus never causing a memory access violation.
Try initializing all of the elements of A in your third example? Your first example tries to allocate B after A on the stack, and accessing the stack that high (on the first assignment to B) might be what's causing the segfault.
Your 3rd code doesn't work either (on my system at least).
Try allocating the memory for array a on the heap instead (when the dimensions are large).
Both matrices don't fit in the limits of your memory. You can allocate only one at a time.
If you define Y as 3000 instead of 6000, your program should not issue segfault.

Resources