C, get the memory segment given the pointer - c

Is it possible to get the memory segment for a given pointer / immediate address value?
Is there a solution already available in GDB?
If not, a custom (non-portable) GCC function implementation would be fine too.
Ex :
int data = 100;
int main(void) {
int ldata = 100;
int *hdata = malloc(10 * sizeof(int));
}
getMemSeg(&data) should return "DATA"
getMemSeg(&ldata) should return "STACK"
getMemSeg(hdata) should return "HEAP"

You can read /proc/self/maps to see the details of the memory layout of the process. The linker also provides a symbol, _end, that marks the end of the program's static data (data and BSS); the brk heap begins at or after it. And on x86 and other processors where the stack grows downwards, everything else on the stack (the callers' frames) lies at a higher address than a local variable in the current stack frame. So at least for a simple program these heuristics would work:
extern int _end;
const char* getMemSeg(void* p){
int stackframe;
if(p>(void*)&stackframe)
return "STACK";
if(p<(void*)&_end)
return "DATA";
return "HEAP";
}
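The /proc/self/maps route mentioned above can be sketched as follows. This is a minimal, Linux-only sketch, not a definitive implementation: find_mapping is a hypothetical helper name, and it simply returns whatever name the kernel prints for the mapping that contains the address (typically "[heap]", "[stack]", or the path of the executable or a shared library).

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: looks up the mapping that contains p in
 * /proc/self/maps (Linux only). On success, copies the mapping's name
 * field (for example "[heap]", "[stack]", or a file path; empty for
 * anonymous mappings) into name and returns 1. Returns 0 if p is not
 * mapped or the file cannot be read. */
int find_mapping(const void *p, char *name, size_t name_size)
{
    unsigned long addr = (unsigned long)(uintptr_t)p;
    char line[1024];
    int found = 0;
    FILE *f;

    if (name_size == 0)
        return 0;
    f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return 0;

    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        char path[256] = "";
        /* Line format: start-end perms offset dev inode [pathname] */
        int n = sscanf(line, "%lx-%lx %*s %*s %*s %*s %255[^\n]",
                       &start, &end, path);
        if (n >= 2 && addr >= start && addr < end) {
            strncpy(name, path, name_size - 1);
            name[name_size - 1] = '\0';
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A getMemSeg built on this would compare the returned name against "[stack]" and "[heap]" and treat a match on the executable's own path as DATA (or code, depending on the permissions column). Note that large malloc blocks are often served by mmap and then show up as anonymous mappings rather than "[heap]".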

Related

Obtain size of array via write permission check

To obtain the length of a null-terminated string, we simply write len = strlen(str). However, I often see posts here on SO saying that to get the size of an int array, for example, you need to keep track of it on your own, and that's what I normally do. But I have a question: could we obtain the size by using some sort of write permission check that tests whether we have write permission to a block of memory? For example:
#include <stdio.h>
#include <stdbool.h>
int getSize(int *arr);
bool permissionTo(int *ptr);
int main(void)
{
int arr[3] = {1,2,3};
int size = getSize(arr) * sizeof(int);
}
int getSize(int *arr)
{
int *ptr = arr;
int size = 0;
while( permissionTo(ptr) )
{
size++;
ptr++;
}
return size;
}
bool permissionTo(int *ptr)
{
/*............*/
}
No, you can't. Memory permissions don't have this granularity on most, if not all, architectures.
Almost all CPU architectures manage memory in pages. On most things you'll run into today one page is 4kB. There's no practical way to control permissions on anything smaller than that.
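To make the granularity point concrete, here is a small sketch, assuming a POSIX system where sysconf(_SC_PAGESIZE) reports a power of two. The helper names page_base and same_page are made up for illustration. Two addresses in the same page necessarily share the same permissions, so a permission check cannot tell the last array element from the bytes that follow it in that page.

```c
#include <stdint.h>
#include <unistd.h>

/* Returns the start address of the page containing p.
 * Assumes sysconf(_SC_PAGESIZE) reports a power of two. */
uintptr_t page_base(const void *p)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    return (uintptr_t)p & ~(page_size - 1);
}

/* Returns 1 if both pointers fall into the same page and therefore
 * share the same access permissions, 0 otherwise. */
int same_page(const void *a, const void *b)
{
    return page_base(a) == page_base(b);
}
```

For a small array such as int arr[3], same_page(arr, arr + 3) is true unless the array happens to end exactly on a page boundary, so the hypothetical permissionTo() loop would keep running far past the end of the array.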
Most memory management is done by your libc allocating a largish chunk of memory from the kernel and then handing out smaller chunks of it to individual malloc calls. This is done for performance (among other things), because creating, removing, or modifying a memory mapping is an expensive operation, especially on multiprocessor systems.
For the stack (as in your example), allocation is even simpler. The kernel knows that "this large area of memory will be used by the stack", and memory accesses to it simply cause the necessary pages to be allocated to back it. The only tracking your program does of stack allocations is one register.
If what you are trying to achieve is an allocation that is convenient to use because it carries its own size around, then do this:
Wrap malloc and free, storing the size in a prefix in front of the memory (written from memory, not tested yet):
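#include <stdlib.h>
void* myMalloc(long numBytes) {
/* reserve room for the size in front of the user's memory */
/* note: the long prefix may reduce the alignment that malloc guarantees */
char* mem = malloc(numBytes+sizeof(long));
if (mem == NULL) return NULL;
((long*)mem)[0] = numBytes;
return mem+sizeof(long);
}
void myFree(void* memory) {
if (memory == NULL) return;
/* step back to the address malloc originally returned */
char* mem = (char*)memory-sizeof(long);
free(mem);
}
long memlen(void* memory) {
char* mem = (char*)memory-sizeof(long);
return ((long*)mem)[0];
}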

alloca() in caller's space

Thinking about returning dynamic or automatic arrays. Not really C-related.
The usual technique to return an array is: A) callee allocates on heap and returns, B) caller allocates on stack and passes to callee.
// A
void caller(void) {
int *a = callee();
free(a);
}
int *callee(void) {
int *a = malloc(10 * sizeof(*a));
return a;
}
// B
void caller(void) {
int a[10]; callee(a, sizeof(a) / sizeof(a[0]));
}
void callee(int *a, size_t n) {
//
}
Case A may lead to an unnecessary allocate-free cycle, while case B requires syntactic clutter in the caller. In B the callee also can't compute n, because the size is fixed by the caller beforehand. We also can't return automatic storage, because it is destroyed on return (accessing it afterwards is UB).
But what if we introduced a new return_auto operator that returns from the callee but leaves its stack frame intact, as if the caller had done all the work on its own stack?
// C
void caller(void) {
int *a = callee();
}
int *callee() {
int a[compute_n()];
return_auto a;
}
I mean, the caller could inherit the callee's stack frame and all these issues would disappear. Its stack frame would look like this after return_auto:
[caller frame]
arguments
ret-pointer
locals
int *a = callee.a
[callee frame] (defunct)
arguments
ret-pointer
locals
int a[n] (still alive)
[end-of-callee-frame]
[end-of-caller-frame]
In machine code (x86 at least) this might be implemented by jumping to the ret-pointer at ss:ebp instead of mov esp, ebp / ret n. We already have VLAs in modern C, and this looks very similar, only slightly more complex.
Of course this should be used with care, because a series of return_auto's would leave a pretty big pile of dead frames on the stack, which would be "collected" only when the outermost caller returns (normally). But stack allocations are insanely cheap, and in theory some algorithms could benefit from not calling malloc/free at all. This is also interesting from a code-structuring perspective, not just for performance.
Does anyone know where this technique is implemented / stack frames joined?
(C is just an example here)
Okay, it needs a simple example.
void caller(Context *ct) {
char *s = make_s(ct);
printf("%s\n", s);
}
char *make_s(Context *ct) {
const char *tag = "?", *name = "*";
if (ct->use_tag) tag = ct->tag;
else if (ct->app) tag = ct->app->tag;
if (ct->app) name = ct->app->name;
size_t len = strlen(tag)+strlen(name)+10;
char s[len];
snprintf(s, len, "%s.object(%s)", name, tag);
return_auto s;
}
Obviously, for now we need to expand that in the caller's body (probably via a macro, with all the caveats that entails) or do asprintf/malloc in the callee and free in the caller.
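For comparison, the malloc-in-callee alternative might look roughly like this in standard C. It is only a sketch: the Context and App types are hypothetical, reconstructed from the fields the example uses, and the two-pass snprintf call is one common way to size the buffer without the magic +10.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical types, reconstructed only from the fields used above. */
typedef struct {
    const char *tag;
    const char *name;
} App;

typedef struct {
    int use_tag;
    const char *tag;
    const App *app;
} Context;

/* Returns a malloc'ed string that the caller must free, or NULL on failure. */
char *make_s(const Context *ct)
{
    const char *tag = "?", *name = "*";
    if (ct->use_tag) tag = ct->tag;
    else if (ct->app) tag = ct->app->tag;
    if (ct->app) name = ct->app->name;

    /* First pass: with a NULL buffer, snprintf only computes the length. */
    int len = snprintf(NULL, 0, "%s.object(%s)", name, tag);
    if (len < 0)
        return NULL;

    char *s = malloc((size_t)len + 1);
    if (s == NULL)
        return NULL;

    /* Second pass: format into the exactly sized buffer. */
    snprintf(s, (size_t)len + 1, "%s.object(%s)", name, tag);
    return s;
}
```

The caller then does char *s = make_s(ct); prints it, and calls free(s), which is exactly the allocate-free cycle that return_auto is meant to avoid.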
This seems like a very bad idea for any non-trivial scenario. Remember that a stack frame contains all the local variables along with the return address, saved base pointer, and so on. In your model, a caller would need to "inherit" the whole frame as part of its own frame. Then consider that you might pass this returned value to some OTHER function. What if that function also wants to return more than just an integral value? You would easily end up with a huge stack frame for main(). Any heap implementation is probably more space efficient.

How to save stack and heap

How can I save (and restore) the stack and the heap of a program at a specific point in the program?
Consider a program like this:
int main()
{
int a;
int *b;
b = (int*)malloc(sizeof(int));
a = 1;
*b = 2;
save_stack_and_heap(); // looking for this...
a = 3;
*b = 4;
restore_stack_and_heap(); // ... and this
printf("%d %d\n",a,*b);
return 0;
}
The output should be:
1 2
It boils down to (am I right?): How do I get the pointers to the stack and to the heap and their sizes?
edit
I want to use this for a number of things. One of them is to write code that can handle hardware failure by checkpointing and being able to restart at a checkpointed state.
Let's focus on the stack, as heap allocations can be tracked otherwise (good old malloc preload for instance).
The code should be reusable. There can be any possible number and type of variables on the stack.
The best would be standard C99. Next best Posix conform. Next best Linux conform.
I am usually using GCC but I would prefer not to use built ins...
int main()
{
int a = 1;
int *b = malloc(sizeof(int));
*b = 2;
if (fork() == 0) {
a = 3;
*b = 4;
return 0;
}
int status;
wait(&status);
printf("%d %d\n",a,*b);
return 0;
}
So you haven't given a lot of scope of what you are trying to achieve but I will try and tackle a few perspectives and at least something that can get you started.
"It boils down to (am I right?): How do I get the pointers to the stack and to the heap and their sizes?"
The stack is a large thing, and often expandable in size. I'm going to skip the heap bit, as you are going to struggle to save all the heaps (that kind of doesn't make any sense). Getting a pointer into the stack is as easy as declaring a variable and taking its address, like so:
int a = 5;
void *stack_ptr = &a;
void *another_stack_ptr = &stack_ptr;
// We could go on forever with this....
That is not the base address of the stack, however. If you want to find that, there may be many methods, and even APIs (I think there is one on Windows). You can even just walk in both directions from an address on the stack until you get a page fault. That is likely to mark the beginning and end of the stack. The following might work, no guarantees. You will need to set up an exception handler to handle the page fault so your app doesn't crash.
int variable = 5;
int *stack_start = &variable;
int *stack_end = stack_start;
int *last_good_address = NULL;
// Setup an exception handler
...
// Try accessing addresses lower than the variable address
for(;;)
{
int try_read = stack_start[0];
// The read didn't trigger an exception, so store the address
last_good_address = stack_start;
stack_start--;
}
// Catch exception
... stack_start = last_good_address
// Setup an exception handler
...
// Try accessing addresses higher than the variable address
for(;;)
{
int try_read = stack_end[0];
// The read didn't trigger an exception, so store the address
last_good_address = stack_end;
stack_end++;
}
// Catch exception
... stack_end = last_good_address
So if you have the base and end address of the stack you can now memcpy it into some memory (I'd advise against the stack though!).
If you just want to copy a few variables, because copying the entire stack would be crazy, the conventional method would be to save them prior to a call
int a = 5;
int b = 6;
int c = 7;
// save old values
int a_old = a;
int b_old = b;
int c_old = c;
some_call(&a, &b, &c);
// do whatever with old values
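If the values that matter are known, the same idea scales a little better when they are grouped into one struct that can be copied as a unit. The following is only a sketch under that assumption; State, Snapshot, save_state and restore_state are hypothetical names, and the heap object is saved by value rather than by pointer.

```c
/* Hypothetical program state: everything that must survive a checkpoint
 * lives in one struct instead of in loose local variables. */
typedef struct {
    int a;
    int *b;          /* heap object owned by the state */
} State;

/* Hypothetical snapshot: the scalar plus a copy of the value b points to. */
typedef struct {
    int a;
    int b_value;
} Snapshot;

/* Copies the current values out of the state. */
Snapshot save_state(const State *s)
{
    Snapshot snap = { s->a, *s->b };
    return snap;
}

/* Writes the saved values back, reusing the existing heap object. */
void restore_state(State *s, const Snapshot *snap)
{
    s->a = snap->a;
    *s->b = snap->b_value;
}
```

With a = 1 and *b = 2 at the time of save_state, changing them to 3 and 4 and then calling restore_state brings back 1 and 2, which is the output the question asks for. It does not cover arbitrary locals, which is what the next approach tries to address.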
I'll assume that you have written a function that has 10,000 variables on the stack, and you don't want to have to save them all manually. The following should work in this case. It uses _AddressOfReturnAddress (an MSVC compiler intrinsic) to get the highest possible address in the current function's stack frame and allocates some stack memory to get the lowest current address. It then copies everything in between.
Disclaimer: This has not been compiled, and is unlikely to work out of the box, but I believe the theory is sound.
// Get the address of the return address, this is the highest address in the current stack frame.
// If you over-write this you are in trouble
char *end_of_function_stack = _AddressOfReturnAddress();
// Allocate some fresh memory on the stack
char *start_of_function_stack = alloca(16);
// Calculate the difference between our freshly allocated memory and the address of the return address
// Remember to subtract the size of our allocation from this to not include it in the stack size.
ptrdiff_t stack_size = (end_of_function_stack - start_of_function_stack) - 16;
// Calculation should not be negative
assert(stack_size > 0);
// Allocate some memory to save stack variables
void *save_the_stack = malloc(stack_size);
// Copy the variables
memcpy(save_the_stack, &start_of_function_stack[16], stack_size);
That's about all I can offer you with the limited information in your question.
I think you're looking to reuse the variable names a and b in this case? You should declare new variables of the same name in a nested scope!
int main()
{
int a=1;
int *b = (int*)malloc(sizeof(int));
*b=2;
{
int a=3;
int *b = (int*)malloc(sizeof(int));
*b=4;
}//beware, other lang such as C# may persist stack variables after this point
//old a,b should be reachable here
}

Why is alloca different from just creating a local variable?

I read that there is a function called alloca that allocates memory from the stack frame of the current function rather than the heap. The memory is automatically released when the function exits.
What is the point of this, and how is it any different from just creating an array, a structure, or a local variable within the function? They would go on the stack and would be destroyed at the end of the function as well.
PS: I saw the other alloca question and it didn't answer how these two things are different :)
When you use alloca, you get to specify how many bytes you want at run time. With a local variable, the amount is fixed at compile time. Note that alloca predates C's variable-length arrays.
With alloca you can create a dynamic array (something that normally requires malloc) AND it's VERY fast. The advantages and disadvantages of GCC's alloca are described here:
http://www.gnu.org/s/hello/manual/libc/Variable-Size-Automatic.html#Variable-Size-Automatic
I think the following are different:
void f()
{
{
int x;
int * p = &x;
}
// no more x
}
void g()
{
{
int * p = alloca(sizeof(int));
}
// memory still allocated
}
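A slightly larger sketch of the same difference: because alloca'ed memory is released when the function returns rather than at the end of the enclosing block, nodes allocated inside a loop body stay usable after the loop. This assumes a platform that provides alloca via <alloca.h> (glibc does); sum_with_alloca_list is a made-up name.

```c
#include <alloca.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Builds a list of the values 1..n from alloca'ed nodes and returns
 * their sum. Assumes n is small enough for the nodes to fit on the stack. */
int sum_with_alloca_list(int n)
{
    struct node *head = NULL;

    for (int i = 1; i <= n; i++) {
        /* Allocated inside the loop body, yet not released when this
         * iteration's block ends. */
        struct node *nd = alloca(sizeof *nd);
        nd->value = i;
        nd->next = head;
        head = nd;
    }

    /* All nodes are still valid here; they go away on return. */
    int sum = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        sum += p->value;
    return sum;
}
```

A block-scoped array or VLA declared inside the loop could not be used this way, since its lifetime ends with each iteration.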
Until gcc and C99 adopted Variable-length arrays, alloca offered significantly more power than simple local variables in that you could allocate arrays whose length is not known until runtime.
The need for this can arise at the boundary between two data representations. In my postscript interpreter, I use counted strings internally; but if I want to use a library function, I have to convert to a nul-terminated representation to make the call.
OPFN_ void SSsearch(state *st, object str, object seek) {
//char *s, *sk;
char s[str.u.c.n+1], sk[seek.u.c.n+1]; /* VLA */
//// could also be written:
//char *s,*sk;
//s = alloca(str.u.c.n+1);
//sk = alloca(seek.u.c.n+1);
char *r;
//if (seek.u.c.n > str.u.c.n) error(st,rangecheck);
//s = strndup(STR(str), str.u.c.n);
//sk = strndup(STR(seek), seek.u.c.n);
memcpy(s, STR(str), str.u.c.n); s[str.u.c.n] = '\0';
memcpy(sk, STR(seek), seek.u.c.n); sk[seek.u.c.n] = '\0';
r = strstr(s, sk);
if (r != NULL) { int off = r-s;
push(substring(str, off + seek.u.c.n, str.u.c.n - seek.u.c.n - off)); /* post */
push(substring(str, off, seek.u.c.n)); /* match */
push(substring(str, 0, off)); /* pre */
push(consbool(true));
} else {
push(str);
push(consbool(false));
}
//free(sk);
//free(s);
}
There is also a dangerous usage of alloca, which is easily avoided by preferring VLAs. You cannot use alloca safely within the argument list of a function call. So don't ever do this:
char *s = strcpy(alloca(strlen(t)+1), t);
That's what VLAs are for:
char s[strlen(t)+1];
strcpy(s,t);

Basic question: C function to return pointer to malloc'ed struct

About C structs and pointers...
Yesterday I wrote something like the following code (reconstructed in part from memory):
typedef struct {
unsigned short int iFrames;
unsigned short int* iTime; // array with elements [0..x] holding the timing for each frame
} Tile;
Tile* loadTile(char* sFile)
{
// expecting to declare enough space for one complete Tile structure, of which the base memory address is stored in the tmpResult pointer
Tile* tmpResult = malloc(sizeof(Tile));
// do things that set values to the Tile entity
// ...
// return the pointer for further use
return tmpResult;
}
void main()
{
// define a tile pointer and set its value to the returned pointer (this should also be allowed in one row)
// Expected to receive the VALUE of the pointer - i.e. the base memory address at where malloc made space available
Tile* tmpTile;
tmpTile = loadTile("tile1.dat");
// get/set elements of the tile
// ...
// free the tile
free(tmpTile);
}
What I see: I can use the malloced Tile structure inside the function, but once I try to access it in main, I get an error from Visual Studio about the heap (which tells me that something is freed after the call returns).
If I change it so that I malloc space in main and pass the pointer to this space to the loadTile function as an argument (so that the function no longer returns anything), then it does work. But I am confident that I should also be able to let the loadTile function malloc the space and return a pointer to that space, right?!
Thanks!!
There's nothing wrong with what you're trying to do, or at least not from the code here. However, I'm concerned about this line:
unsigned short int* iTime; // array with elements [0..x] holding the timing for each frame
That isn't true unless you're also mallocing iTime somewhere:
Tile* tmpResult = malloc(sizeof(Tile));
tmpResult->iTime = malloc(sizeof(short) * n);
You will need to free it when you clean up:
free(tmpTile->iTime);
free(tmpTile);
You are probably writing over memory you don't own. I guess that in this section:
// do things that set values to the Tile entity
you're doing this:
tmpResult->iFrames = n;
for (i = 0 ; i < n ; ++i)
{
tmpResult->iTime [i] = <some value>;
}
which is wrong, you need to allocate separate memory for the array:
tmpResult->iTime = malloc (sizeof (short int) * n);
before writing to it. This make freeing the object more complex:
free (tile->iTime);
free (tile);
Alternatively, do this:
typedef struct {
unsigned short int iFrames;
unsigned short int iTime [1]; // array with elements [0..x] holding the timing for each frame
} Tile;
and malloc like this:
tile = malloc (sizeof (Tile) + sizeof (short int) * (n - 1)); // -1 since Tile already has one int defined.
and the for loop remains the same:
for (i = 0 ; i < n ; ++i)
{
tmpResult->iTime [i] = <some value>;
}
but freeing the tile is then just:
free (tile);
as you've only allocated one chunk of memory, not two. This works because C (and C++) does not do range checking on arrays.
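Since C99 the same single-allocation layout can be written with a flexible array member, which avoids the n - 1 arithmetic and is the form the standard actually defines. A minimal sketch, where tile_create is a hypothetical helper and the timings are simply zeroed:

```c
#include <stdlib.h>

/* Variant of the question's Tile using a C99 flexible array member. */
typedef struct {
    unsigned short iFrames;
    unsigned short iTime[];   /* storage follows the struct in the same block */
} Tile;

/* Hypothetical helper: allocates a Tile with room for n frame timings,
 * all set to zero. Returns NULL on allocation failure. */
Tile *tile_create(unsigned short n)
{
    Tile *tile = malloc(sizeof *tile + n * sizeof tile->iTime[0]);
    if (tile == NULL)
        return NULL;
    tile->iFrames = n;
    for (unsigned short i = 0; i < n; ++i)
        tile->iTime[i] = 0;
    return tile;
}
```

As with the [1] version, a single free(tile) releases both the header and the array. Flexible array members are C only; C++ does not have them.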
Your code, with as few changes as I could live with, works for me:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
unsigned short int iFrames;
unsigned short int* iTime;
} Tile;
Tile *loadTile(char* sFile) {
Tile *tmpResult = malloc(sizeof *tmpResult);
if (!tmpResult) return NULL;
/* do things that set values to the Tile entity */
/* note that iTime is uninitialized */
tmpResult->iFrames = 42;
(void)sFile; /* silence the unused-parameter warning */
return tmpResult;
}
int main(void) {
Tile* tmpTile;
tmpTile = loadTile("tile1.dat");
if (!tmpTile) return 1;
printf("value: %d\n", tmpTile->iFrames);
free(tmpTile);
return 0;
}
The code you showed looks OK, the error must be in the elided code.
Whatever problem you are having, it is not in the code shown in this question. Make sure you are not clobbering the pointer before returning it.
This should work fine... could just be a warning from VisualStudio that you are freeing a pointer in a different function than it was malloced in.
Technically, your code will work on a C compiler. However, allocating dynamically inside functions and returning pointers to the allocated data is an excellent way of creating memory leaks - therefore it is very bad programming practice. A better way is to allocate the memory in the caller (main in this case). The code unit allocating the memory should be the same one that frees it.
Btw, even in a Windows program, main() should be declared to return int; void main() is not standard C, and a strictly conforming compiler may reject it.

Resources