C - How to restrict address access in heap? - c

For variables stored on the stack, we can use static to prevent access from other files. Is there any way to prevent pointers in other files from accessing a certain address?

First, to get things out of the way: static variables are never allocated on the stack. They are essentially global variables that simply don't pollute the global namespace. It's trivial to get a pointer to a static variable and change it through that pointer, because static only restricts the visibility of the name; it is enforced by the compiler, not at run time.
Back to the actual question, though: no, C gives you no way to restrict or check memory accesses through a pointer directly. How would you even know whether the memory you're accessing is valid? You can do something along those lines, though. For example, you can wrap malloc and free with your own memory management functions and keep track of the allocated and freed memory along with metadata. You can then use another wrapper function that takes care of pointer dereferencing and checks the metadata as you desire. Code can still use raw pointers to wreak havoc, so this doesn't buy you much real protection.
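A minimal sketch of that wrapper idea, assuming a small fixed-size table is enough. The names tracked_malloc, tracked_free, tracked_is_valid and tracked_read_byte are made up for illustration and are not part of any standard library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_TRACKED 64  /* arbitrary table size for this sketch */

struct tracked_block {
    unsigned char *base;  /* start of the allocation, NULL if the slot is unused */
    size_t size;          /* size of the allocation in bytes */
};

static struct tracked_block tracked[MAX_TRACKED];

/* Allocate memory and record it. Returns NULL on failure or if the table is full. */
void *tracked_malloc(size_t size)
{
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].base == NULL) {
            void *p = malloc(size);
            if (p != NULL) {
                tracked[i].base = p;
                tracked[i].size = size;
            }
            return p;
        }
    }
    return NULL;
}

/* Free a recorded block. Returns 0 on success, -1 if p was never recorded. */
int tracked_free(void *p)
{
    if (p == NULL)
        return -1;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].base == p) {
            free(p);
            tracked[i].base = NULL;
            tracked[i].size = 0;
            return 0;
        }
    }
    return -1;
}

/* Returns 1 if [p, p + len) lies entirely inside one live recorded block.
   Addresses are compared as integers, which works on common flat-memory
   platforms but is implementation-defined in general. */
int tracked_is_valid(const void *p, size_t len)
{
    uintptr_t addr = (uintptr_t)p;
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].base == NULL)
            continue;
        uintptr_t start = (uintptr_t)tracked[i].base;
        if (addr >= start && addr - start <= tracked[i].size &&
            len <= tracked[i].size - (addr - start))
            return 1;
    }
    return 0;
}

/* Checked dereference: aborts (via assert) if p is not inside a recorded block. */
unsigned char tracked_read_byte(const unsigned char *p)
{
    assert(tracked_is_valid(p, 1));
    return *p;
}
```

Nothing stops other code from skipping tracked_read_byte and dereferencing the raw pointer, which is exactly the limitation described above.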

Related

Is there a standard C malloc like function that takes double pointer to avoid fragmentation?

The problem with the current malloc function is that it does not keep track of the variable that stores the returned memory location. As a result, fragmentation can occur because it is harder to move memory around.
The MMU can solve this only to a certain extent. Let's say that malloc instead took a double pointer and kept track of the variable. The allocator could then move memory around during calls to free and update the stored memory location.
It is highly unlikely that I am the first to think about this, so I am wondering whether there is a standard C or POSIX function that does this.
I understand that this idea is not perfect. The program would have to pass around the same variable instead of copying it. However, it does solve the issue of fragmentation, which matters to me because I work with low-memory devices.
Of course, the realloc() function will, if necessary, move a previously allocated block of memory to a new (larger) location. However, it does not necessarily impact fragmentation.
My solution (in C) has been (in low-memory conditions) to do my own memory management.
Though not the same mechanism, the closest thing to what you are speaking of are smart pointers as implemented with the boost libraries. However, they are built for C++.
Smart pointers are 'smart' in that they don't hang around after they aren't needed (and you don't have to free them) so they avoid most of the fragmentation problems you cite.
Keeping track of which variable points to which memory location cannot be done by just passing a pointer to a pointer. What if the address of the allocated memory is copied to another variable?
Unlike Java, C does not keep track of object references. In your case, you may be better off managing memory on your own: preallocate a large chunk of memory, split it as needed, and keep track of usage. In brief, implement your own memory management.
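One common shape for "preallocate a large chunk and split it as needed" is a pool of fixed-size blocks. The sketch below is only an illustration under that assumption; the block size, block count, and the pool_* names are invented here. Because every block has the same size, freeing and reallocating cannot fragment the pool:

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE  32  /* bytes per block, chosen arbitrarily */
#define POOL_BLOCK_COUNT 16  /* number of blocks, chosen arbitrarily */

/* A block is either handed out (payload) or sitting on the free list (next).
   Alignment is that of a pointer, which is enough for this sketch. */
union pool_block {
    union pool_block *next;
    unsigned char payload[POOL_BLOCK_SIZE];
};

static union pool_block pool_storage[POOL_BLOCK_COUNT];  /* the preallocated chunk */
static union pool_block *pool_free_list;
static size_t pool_used;

/* Put every block on the free list. Must be called once before pool_alloc. */
void pool_init(void)
{
    pool_free_list = NULL;
    pool_used = 0;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Hand out one block of POOL_BLOCK_SIZE bytes, or NULL if the pool is exhausted. */
void *pool_alloc(void)
{
    union pool_block *b = pool_free_list;
    if (b == NULL)
        return NULL;
    pool_free_list = b->next;
    pool_used++;
    return b;
}

/* Return a block obtained from pool_alloc to the pool. */
void pool_free(void *p)
{
    if (p == NULL)
        return;
    assert(pool_used > 0);  /* catches freeing more blocks than were handed out */
    union pool_block *b = p;
    b->next = pool_free_list;
    pool_free_list = b;
    pool_used--;
}
```

The trade-off is that every request costs a full block, so this only pays off when the objects are of similar size.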
The idea is not that attractive when you start thinking about implementation details. You can of course write a function that returns a double pointer. But what about the intermediate pointer? Where should it be stored? The program itself clearly cannot store it, because the intermediate pointer should have exactly the same lifetime as the pointed-to memory. So the allocator itself should store and manage it, and the memory where the intermediate pointer lives is not movable. But the killer misfeature is that the scheme is not thread-safe: as pointers are free to change under the hood, each pointer dereference now requires a lock.
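To make the "allocator stores the intermediate pointer" point concrete, here is a rough single-threaded sketch of a handle-based allocator with compaction. Everything in it (the h_* names, the arena size, the slot count) is hypothetical, and it hands out unaligned byte ranges, so it is only suitable for byte data:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE  256  /* arbitrary sizes for this sketch */
#define MAX_HANDLES 8

/* The "intermediate pointer" lives here, inside the allocator, as an offset. */
struct slot {
    size_t offset;  /* current position of the block in the arena */
    size_t size;    /* block size in bytes, always > 0 when used */
    int used;
};

static unsigned char arena[ARENA_SIZE];
static struct slot slots[MAX_HANDLES];
static size_t arena_top;  /* index of the first byte past the last block */

/* Slide all live blocks toward the start of the arena, lowest offset first. */
void h_compact(void)
{
    size_t write = 0;
    for (;;) {
        int next = -1;
        /* Blocks already moved end before 'write', so only unmoved ones match. */
        for (int i = 0; i < MAX_HANDLES; i++)
            if (slots[i].used && slots[i].offset >= write &&
                (next < 0 || slots[i].offset < slots[next].offset))
                next = i;
        if (next < 0)
            break;
        memmove(&arena[write], &arena[slots[next].offset], slots[next].size);
        slots[next].offset = write;
        write += slots[next].size;
    }
    arena_top = write;
}

/* Returns a handle (slot index) or -1 on failure. May move existing blocks. */
int h_alloc(size_t size)
{
    if (size == 0 || size > ARENA_SIZE)
        return -1;
    if (size > ARENA_SIZE - arena_top) {
        h_compact();
        if (size > ARENA_SIZE - arena_top)
            return -1;
    }
    for (int i = 0; i < MAX_HANDLES; i++) {
        if (!slots[i].used) {
            slots[i].offset = arena_top;
            slots[i].size = size;
            slots[i].used = 1;
            arena_top += size;
            return i;
        }
    }
    return -1;
}

void h_free(int h)
{
    assert(h >= 0 && h < MAX_HANDLES && slots[h].used);
    slots[h].used = 0;
}

/* Current address of the block. Only valid until the next h_alloc call. */
void *h_deref(int h)
{
    assert(h >= 0 && h < MAX_HANDLES && slots[h].used);
    return &arena[slots[h].offset];
}
```

Note how the pointer returned by h_deref goes stale as soon as a later h_alloc compacts the arena. With more than one thread, every dereference would need a lock for exactly that reason.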

Freeing other variable types in C

C does not have garbage collection, hence whenever we allocate memory using malloc/calloc/realloc, we need to manually free it after its use is over. How are variables of other data types like int, char etc. handled by C? How is the memory allocated to those variables released?
That depends. If you allocate any of those data types with malloc/calloc/realloc you will still need to free them.
On the other hand, variables declared inside a function (without static) are called automatic variables, and they are released automatically when that function returns.
The point here is not the data type per se but the storage location. malloc/calloc/realloc allocate memory on the heap, whereas automatic variables are (typically) allocated on the stack.
The heap is managed entirely by the programmer, while the stack works so that when a function returns, its stack frame is popped, and the memory that frame occupied will simply be overwritten when another function is called.
To get a better feel for this, take a look at the memory layout of a C program. Other useful references are the free(3) man page and the Wikipedia page on automatic variables.
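A small illustration of the difference, using the same type (int) in both storage locations. The function names sum_to and make_int are invented for this example:

```c
#include <assert.h>
#include <stdlib.h>

/* 'sum' and 'i' are automatic: they stop existing when the function returns.
   Only the returned value is copied out to the caller. */
int sum_to(int n)
{
    assert(n >= 0);
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* The int allocated here lives on the heap and outlives this function.
   The caller owns it and must pass it to free() when done. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;  /* 'p' itself is automatic; the memory it points to is not */
}
```

A caller would write something like `int *p = make_int(5); ... free(p);`, while nothing needs to be released after calling sum_to.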
Hope this helps!
Resources (such as memory) have nothing to do with variables. You never have to think about variables. You only have to think about the resource itself, and you need to manage the lifetime of the resource. There are function calls that acquire a resource (such as malloc) and give you a handle for the resource (such as a void pointer), and you have to call another function (such as free) later on with that handle to release the resource.
Memory is only one example, C standard I/O-files work the same way, as do mutexes, sockets, window handles, etc. (In C++, add "dynamically allocated object" to the list.) But the central concept is that of the resource, the thing that needs acquiring and releasing. Variables have nothing to do with it except for the trivial fact that you can use variables to store the resource handles.
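As a sketch of the acquire/release pairing with two different kinds of resource, the hypothetical function below acquires a FILE handle with fopen and a buffer with malloc, then releases them in reverse order on every path:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096  /* read buffer size, chosen arbitrarily */

/* Counts the bytes in a file. Returns the count, or -1 on any failure. */
long count_file_bytes(const char *path)
{
    assert(path != NULL);
    long total = -1;
    size_t n;

    FILE *fp = fopen(path, "rb");              /* acquire resource 1 */
    if (fp == NULL)
        return -1;

    unsigned char *buf = malloc(CHUNK_SIZE);    /* acquire resource 2 */
    if (buf == NULL)
        goto close_file;

    total = 0;
    while ((n = fread(buf, 1, CHUNK_SIZE, fp)) > 0)
        total += (long)n;
    if (ferror(fp))
        total = -1;

    free(buf);                                  /* release resource 2 */
close_file:
    fclose(fp);                                 /* release resource 1 */
    return total;
}
```

The variables fp and buf are ordinary automatic variables that need no cleanup of their own. What gets released is the file and the memory block they refer to.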

Which memory locations to use for variable storage

Higher-level languages such as JavaScript don't give the programmer a choice as to where variables are stored. But C does. My question is: are there any guidelines as to where to store variables, e.g. dependent on size, usage, etc.?
As far as I understand, there are three possible locations to store data (excluding the code segment used for actual code):
DATA segment
Stack
Heap
So transient small data items should be stored on the stack?
What about data items which must be shared between functions? These items could be stored on the heap or in the data segment. How do you decide which to choose?
You're looking through the wrong end of the telescope. You don't specify particular memory segments in which to store a variable (particularly since the very concept of a "memory segment" is highly platform-dependent).
In C code, you decide a variable's lifetime, visibility, and modifiability based on what makes sense for the code, and based on that the compiler will generate the machine code to store the object in the appropriate segment (if applicable).
For example, any variables declared at file scope (outside of any function) or with the keyword static will have static storage duration, meaning they are allocated at program startup and held until the program terminates; these objects may be allocated in a data segment or bss segment. Variables declared within a function or block without the static keyword have automatic storage duration, and are (typically) allocated on the stack.
String literals and other compile-time constant objects are often (but not always!) allocated in a read-only segment. Numeric literals like 3.14159 and character constants like 'A' are not objects, and do not (typically) have memory allocated for them; rather, those values are embedded directly in the machine code instructions.
The heap is reserved for dynamic storage, and variables as such are not stored there; instead, you use a library call like malloc to grab a chunk of the heap at runtime, and assign the resulting pointer value to a variable allocated as described above. The variable will live in either the stack or a data segment, while the memory it points to lives on the heap.
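The storage durations described above can be put side by side in a few lines. The names here are made up for illustration, and the segment mentioned in each comment is only what typical implementations do, not something the language guarantees:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

int call_count;  /* file scope: static storage duration, typically in bss */

/* Returns 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    static int id = 0;  /* static storage duration, visible only in this function */
    call_count++;
    return ++id;
}

/* Returns a heap array holding 0, 1, 4, 9, ... The caller must free() it. */
int *make_squares(size_t n)
{
    assert(n > 0);
    /* 'squares' is an automatic variable (typically on the stack or in a
       register); the memory it points to comes from the heap. */
    int *squares = malloc(n * sizeof *squares);
    if (squares == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        squares[i] = (int)(i * i);
    return squares;
}
```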
Ideally, functions should communicate solely through parameters, return values, and exceptions (where applicable); functions should not share data through an external variable (i.e., a global). Function parameters are usually allocated on the stack, although some platforms may pass parameters via registers.
You should prefer local/stack variables to global or heap variables when those variables are small, used often and in a relatively small/limited scope. That will give the compiler more opportunities to optimize the code using them as it'll know they aren't going to change between function calls unless you pass around pointers to them.
Also, the stack is usually relatively small and allocating large structures or arrays on it may lead to stack overflows, especially so in recursive code.
Another thing to consider is the use of global variables in multithreaded programs. You want to minimize chances of race conditions, and one strategy for that is making functions thread-safe and reentrant by not using any global resources in them directly (if malloc() is thread-safe, errno is per-thread, etc., you can use them, of course).
Btw, using local variables instead of global variables also improves code readability as the variables are located close to the place where they're used and you can quickly find out their type and where and how they're used.
Other than that, if your code is correct, there shouldn't be much practical difference between making variables local, global, or heap-allocated (of course, malloc() can fail, and you should remember to check for that).
C only allows you to specify where data is stored indirectly, via the scope of the variable and/or how it is allocated. For example, a variable local to a function is typically a stack variable unless it is declared static, in which case it will likely be in DATA/BSS. Memory allocated dynamically via malloc (or new in C++) will typically be on the heap.
However, there's no guarantee of any of that, only the implication of it.
That said, the one thing that is guaranteed to be a bad idea is declaring large local variables in functions; they are a common source of strange errors and stack overflows. Very large arrays and structures are best allocated dynamically, with the pointers kept in local or global variables as required.
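For instance, a million doubles take 8 MiB, which is around or above many default stack limits, so a local array of that size is asking for trouble. A hedged sketch of the heap version, with an invented function name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Fills an n-element array with 0, 1, ..., n-1 and returns the average.
   Returns -1.0 if the array cannot be allocated.
   Declaring 'double samples[n]' as a local instead could overflow the
   stack for large n; the heap has no such small fixed limit. */
double ramp_average(size_t n)
{
    assert(n > 0);
    if (n > SIZE_MAX / sizeof(double))
        return -1.0;  /* n * sizeof(double) would overflow */

    double *samples = malloc(n * sizeof *samples);
    if (samples == NULL)
        return -1.0;

    for (size_t i = 0; i < n; i++)
        samples[i] = (double)i;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];

    free(samples);
    return sum / (double)n;
}
```

Only the pointer samples (a few bytes) lives in the function's stack frame.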

Is there a pointer as fast as the stack in C? (Without indirection but still on the heap)

I am creating a decompiler from IL (Compiled C#\VB code). Is there any way to create a reference in C?
Edit:
I want something faster than a pointer, like the stack. Is there such a thing?
A reference is just a syntactically sugar-coated pointer; a pointer will do just fine.
Stack and pointer are two completely independent concepts.
A reference is just like a pointer, a way to access/pass a variable without copying it.
On the other hand, stack and heap are two different places where variables live.
The decision whether or not a variable should live on the stack or on the heap is totally independent from the way you pass it around.
If you need a local variable with a lifetime automatically coupled to your function scope, declare it on the stack. Allocation is fast, but the object is gone when the function scope ends. Taking this into account, you can pass the variable by value or by pointer to other functions.
If you need a variable that survives the function scope, you need to make it global (or static) or allocate it dynamically on the heap. Allocation is a bit slower, but once the object exists you can use it like any other. You can pass it by value or by pointer as well. (Bear in mind that you need to de-allocate dynamically created objects eventually.)
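To illustrate that where an object lives is independent of how it is passed, the sketch below uses one function taking a pointer on both a stack object and a heap object. The struct and function names are made up for the example:

```c
#include <assert.h>
#include <stdlib.h>

struct point {
    int x;
    int y;
};

/* Takes a pointer, so it works the same for automatic, static, and heap objects. */
void translate(struct point *p, int dx, int dy)
{
    assert(p != NULL);
    p->x += dx;
    p->y += dy;
}

/* Creates a point on the heap that survives the calling scope.
   The caller must free() it eventually. Returns NULL on allocation failure. */
struct point *point_new(int x, int y)
{
    struct point *p = malloc(sizeof *p);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;
}
```

With a local `struct point local = {1, 2};` the call is `translate(&local, 10, 20);`, and with a heap object from point_new it is `translate(heap, 1, 1);`. The callee cannot tell the difference.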
If heap allocation is indeed a performance bottleneck, you should make sure that you use automatic variables (on stack) where possible. Then, do profiling of your allocation patterns. And finally optimize your allocation strategy.

Checking if something was malloced

Given a pointer to some variable, is there a way to check whether it was statically or dynamically allocated?
Quoting from your comment:
im making a method that will basically get rid of a struct. it has a data member which is a pointer to something that may or may not be malloced.. depending on which one, i would like to free it
The correct way is to add another member to the struct: a pointer to a deallocation function.
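A minimal sketch of that suggestion, assuming a struct with a single data pointer. The struct layout and the item_* names are invented here; the only real library functions used are malloc, free, memcpy and strlen:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct item {
    char *data;
    void (*release)(void *);  /* how to release data; NULL means nothing to do */
};

/* Wraps memory the item does not own (e.g. a static buffer). */
struct item item_from_static(char *text)
{
    struct item it = { text, NULL };
    return it;
}

/* Wraps a malloc'ed copy of text. On allocation failure data is NULL. */
struct item item_from_copy(const char *text)
{
    struct item it = { NULL, NULL };
    size_t len = strlen(text) + 1;
    it.data = malloc(len);
    if (it.data != NULL) {
        memcpy(it.data, text, len);
        it.release = free;  /* remember the matching deallocation function */
    }
    return it;
}

/* Gets rid of the item without needing to know how data was allocated. */
void item_destroy(struct item *it)
{
    assert(it != NULL);
    if (it->release != NULL)
        it->release(it->data);
    it->data = NULL;
    it->release = NULL;
}
```

Whoever creates the item decides how it is released, so item_destroy never has to guess whether the pointer came from malloc.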
It is not just static versus dynamic allocation. There are several possible allocators, of which malloc() is just one.
On Unix-like systems, it could be:
A static variable
On the stack
On the stack but dynamically allocated (i.e. alloca())
On the heap, allocated with malloc()
On the heap, allocated with new
On the heap, in the middle of an array allocated with new[]
On the heap, within a struct allocated with malloc()
On the heap, within a base class of an object allocated with new
Allocated with mmap
Allocated with a custom allocator
Many more options, including several combinations and variations of the above
On Windows, you also have several runtimes, LocalAlloc, GlobalAlloc, HeapAlloc (with several heaps which you can create easily), and so on.
You must always release memory with the correct release function for the allocator you used. So, either the part of the program responsible for allocating the memory should also free the memory, or you must pass the correct release function (or a wrapper around it) to the code which will free the memory.
You can also avoid the whole issue by either requiring the pointer to always be allocated with a specific allocator or by providing the allocator yourself (in the form of a function to allocate the memory and possibly a function to release it). If you provide the allocator yourself, you can even use tricks (like tagged pointers) to allow one to also use static allocation (but I will not go into the details of this approach here).
Raymond Chen has a blog post about it (Windows-centric, but the concepts are the same everywhere): Allocating and freeing memory across module boundaries
The ACE library does this all over the place. You may be able to check how they do it. In general you probably shouldn't need to do this in the first place though...
Since the heap, the stack, and the static data area generally occupy different ranges of memory, it is possible with intimate knowledge of the process memory map, to look at the address and determine which allocation area it is in. This technique is both architecture and compiler specific, so it makes porting your code more difficult.
Many libc malloc implementations store a header just before each returned memory block. The header holds fields used by free(), such as the size of the block, and in some implementations a 'magic' value. This magic value protects against the user accidentally freeing a pointer that wasn't allocated (or freeing a block whose header was overwritten by the user). It's very system-specific, so you'd have to look at the implementation of your libc to see exactly what magic value, if any, is there.
Once you know that, you move the given pointer back to point at header and then check it for the magic value.
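Poking at the real libc header is not portable, so the sketch below does not do that. Instead it applies the same header-plus-magic idea to a hypothetical wrapper around malloc, where the header layout is under your control. The magic_* names and the magic constant are made up, and max_align_t requires C11:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_MAGIC 0xA110CA7Eul  /* arbitrary marker value */

/* Header placed immediately before every block handed out by magic_malloc. */
union block_header {
    struct {
        unsigned long magic;
        size_t size;
    } info;
    max_align_t align;  /* keeps the user memory after the header aligned */
};

/* Like malloc, but prefixes the block with a header carrying the magic value. */
void *magic_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(union block_header))
        return NULL;
    union block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->info.magic = BLOCK_MAGIC;
    h->info.size = size;
    return h + 1;  /* user memory starts right after the header */
}

/* Steps back to where a header would be and checks the magic value.
   Caution: for a pointer that did not come from magic_malloc this reads
   memory that may not be readable at all, so it is a heuristic at best. */
int magic_is_ours(const void *p)
{
    const union block_header *h = (const union block_header *)p - 1;
    return h->info.magic == BLOCK_MAGIC;
}

/* Frees a block from magic_malloc; asserts if the header looks wrong. */
void magic_free(void *p)
{
    if (p == NULL)
        return;
    assert(magic_is_ours(p));
    union block_header *h = (union block_header *)p - 1;
    h->info.magic = 0;  /* makes an accidental double free more likely to be caught */
    free(h);
}
```

Even in this controlled form the check can give a false positive if unrelated memory happens to contain the magic value, which is one reason the other answers recommend tracking ownership explicitly.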
Can you hook into malloc() itself, like the malloc debuggers do, using LD_PRELOAD or something? If so, you could keep a table of all the allocated pointers and use that. Otherwise, I'm not sure. Is there a way to get at malloc's bookkeeping information?
Not as a standard feature.
A debug version of your malloc library might have some function to do this.
If you know the scope the pointer should be coming from, you can compare its address to something you know to be static and treat it as malloced only if it's far away. If its scope is unknown, though, you can't really trust that.
1.) Obtain a map file for the code you have.
2.) The underlying process/hardware target platform should have a memory map file, which typically indicates the starting address of each memory block (stack, heap, global), the size of that block, and the read-write attributes of that block.
3.) After getting the address of the object (pointer variable) from the map file in 1.), try to see which block that address falls into. You might get some idea.

Resources