Should I release data after the use of get_user_pages_fast? - c

I'm using get_user_pages_fast: I allocate a memory buffer in user space and get the corresponding pages in kernel space.
Should I free the struct page ** after I'm done with this memory, or call some specific release function?
Thanks!

From the documentation on get_user_pages() (which has similar functionality, but takes more parameters and needs a semaphore held):
Each page returned must be released with a put_page call when it is finished with. vmas will only remain valid while mmap_sem is held.
As a side note, if you were going to free something, it would be the struct page * values returned (i.e. freeing a struct page), not the struct page ** itself, since the pointer you pass in is only used as a place to store the return values.
However, you typically shouldn't assume you are meant to free arbitrary things in the kernel without knowing that you're supposed to. In general, the kernel provides functions to create and destroy whatever objects you're working with.
Often, when you're given a pointer, there's a lot more going on behind the scenes. There could be semaphores, reference counts, and so on. It may also be a pointer to a "real" live object in the kernel, not just some structure made for you, so freeing it could rip the rug out from under other pieces of code.
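A minimal sketch of the usual pattern (assuming the older four-argument get_user_pages_fast() signature; the exact prototype varies by kernel version, so treat this as illustrative only):

#include <linux/mm.h>
#include <linux/slab.h>

static int pin_user_buffer(unsigned long uaddr, int nr_pages)
{
        struct page **pages;
        int i, pinned;

        /* The array that holds the struct page pointers is ours to manage. */
        pages = kmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
        if (!pages)
                return -ENOMEM;

        /* Pin the user pages; returns the number of pages actually pinned. */
        pinned = get_user_pages_fast(uaddr, nr_pages, 1 /* write */, pages);
        if (pinned < 0) {
                kfree(pages);
                return pinned;
        }

        /* ... use the pages (kmap, DMA setup, etc.) ... */

        /* Release each pinned page when finished; do NOT kfree(pages[i]). */
        for (i = 0; i < pinned; i++)
                put_page(pages[i]);

        /* Only the array we allocated ourselves is freed with kfree. */
        kfree(pages);
        return 0;
}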

Related

Why do all system calls in Linux pass arguments to the kernel using "call by reference"?

If we look at the syscalls.h file in the Linux kernel, we can see that almost all the arguments of the system calls are passed by reference. For example:
asmlinkage long sys_open_by_handle_at(int mountdirfd,
                                      struct file_handle __user *handle,
                                      int flags);
Here, file_handle is passed as a pointer. Why isn't the value simply passed to the kernel?
Efficiency.
Many (most?) systems implement function calls by pushing argument values onto a stack. If you pass a struct or any other complex data type by value, you'd need to copy it to the stack. There's no reason to do this, since the kernel has access to the entire memory space of the process. Aside from the copy cost, you'd also increase the stack space needed.
In addition, the kernel will need to copy any data it needs to retain into kernel memory space. The kernel can't rely on user space code behavior. (It's also not going to free anything obtained from user space, which eliminates any concerns over mixing up responsibility for reclaiming memory.)
Finally, realistically, coders working in the kernel need to be very comfortable with working with pointers. There's really no advantage to passing by value once you're completely comfortable with pointers.
This part is a bit more of an opinion, but I think there's also a strong legacy effect. The Unix kernel and C developed somewhat in tandem. See https://en.wikipedia.org/wiki/C_(programming_language) for some of the history. It's been a long time, but if I recall correctly, older versions of C wouldn't allow you to pass a struct by value. Regardless, working with pointers was highly idiomatic in C (and I would say still is). In other words, this is just how things have always been done.
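To illustrate the earlier point about the kernel copying what it needs into kernel space: kernel code pulls the pointed-to data in with copy_from_user before using it. A hedged sketch (the surrounding function is invented; in real code struct file_handle is variable-length, so only the fixed header would be copied this way):

#include <linux/fs.h>
#include <linux/uaccess.h>

/* Hypothetical syscall body: copy the user's struct into kernel memory. */
static long do_open_by_handle(int mountdirfd, struct file_handle __user *uhandle,
                              int flags)
{
        struct file_handle fh;

        /* Never dereference a user pointer directly; copy it in first. */
        if (copy_from_user(&fh, uhandle, sizeof(fh)))
                return -EFAULT;

        /* Real code would now copy the variable-length part based on fh.handle_bytes. */
        /* ... work with the kernel-space copy 'fh' ... */
        return 0;
}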
The memory spaces for user mode and kernel mode are different. When you make a system call, the MMU and the Linux memory-management code make sure that the user-space process, running in its own virtual address space, is properly mapped onto the physical address space the kernel works with.
Variables in user mode stay in the process's virtual address space. They can't just be passed by value in system calls and be expected to get mapped into the physical address space.
This is my understanding; I would love to discuss and clarify it if needed.
Principally, I understand that the struct file_handle parameter of the function sys_open_by_handle_at() is an "in" parameter, i.e. it is not modified by the function. Therefore it could just as well be passed by value. I see about three reasons why this is not done. All of them are surely valid for this particular function; at least the last one (K&R) applies to all struct arguments, in all system calls.
The struct can have a size of e.g. 128 bytes, which would be slow to copy onto the stack.
Passing a pointer obviates the need to know the struct definition on the caller side. The struct is an "opaque handle" filled by a previous call to [sys_]name_to_handle_at(). The caller doesn't want to and actually shouldn't be burdened with the details of the struct's contents. (Leaving the caller innocent obviates the need to recompile the program because the struct's layout changes. I can also imagine that the contents differs between file system types.)
Unix, and even its open-source counterpart Linux, is older than C99. I suppose that for the longest time K&R C was the lowest common denominator C standard the kernel sources adhered to. In K&R C it is simply not possible to pass structs by value.

Which data structure works best in shared memory scenario and fast lookup

I am still at the conceptual stage of a project; I am yet to start the code implementation. A subtask is this:
Two processes will request data from a commonly accessed DLL. This DLL would store the data in a buffer in memory. If I just instantiate a structure within the DLL and store data in it, then each process instance will have a separate structure and the data won't be common. So I need a shared-memory implementation. Now, another requirement I have is fast lookup time within the data. I am not sure how an AVL tree can be stored within a shared memory space. Is there an implementation available on the internet for an AVL tree/hashmap that can be stored in shared memory? Also, is this the right approach to the problem, or should I be using something else altogether?
TIA!
Whether this is the right approach depends on various factors, such as how expensive the data is to produce, whether the processes need to communicate with each other concerning the data, and so on. The rest of this answer assumes that you really do need a lookup structure in shared memory.
You can use any data structure, provided that you can allocate storage for both your data and the data structure's internals in your shared memory space. This typically means that you won't be able to use malloc for it, since each process' heap usually remains private. You will need your own custom allocator.
Let's say you chose AVL trees. Here's a library that implements them: https://github.com/fbuihuu/libtree. It looks like in this library, the "internal" AVL node data is stored intrusively in your "objects." Intrusive means that you reserve fields to be used by the library when declaring your object struct. So, as long as you allocate space for your objects in shared memory, using your custom allocator, and also allocate space for the root tree struct there as well, the whole tree should be accessible to multiple processes. You just have to make sure that the shared memory itself is mapped to the same address range in each process.
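For illustration, an intrusive layout looks roughly like this (field names here are invented, not libtree's actual API):

/* The library's bookkeeping lives inside your own object. */
struct avl_node {
    struct avl_node *left;
    struct avl_node *right;
    int balance;
};

struct my_record {
    int key;
    char payload[64];
    struct avl_node node;   /* reserved for the tree library's use */
};

As long as every struct my_record (and the tree root) is carved out of the shared region by your custom allocator, the whole tree lives in shared memory.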
If you used a non-intrusive AVL implementation, meaning that each node is represented by an internal struct which then points to a separate struct containing your data, the library or your implementation would have to allow you to specify the allocator for the internal struct somehow, so that you could make sure the space will be allocated in shared memory.
As for how to write the custom allocator, that really depends on your usage and the system. You need to consider if you will ever need to "resize" the shared memory region, whether the system allows you to do that, whether you will allocate only fixed-width blocks inside the region, or you need to support blocks with arbitrary length, whether it's acceptable to spread your data structures over multiple shared memory regions, how your processes can synchronize and communicate, and so on. If you go this route, you should ask a new question on the topic. Be sure to mention what system you are using (Windows?) and what your constraints are.
EDIT
Just to further discourage you from doing this unless it's necessary: if, for example, your data is expensive to produce but you don't care whether the processes build up their own independent lookup structures once the data is available to them, then you can have the DLL write the data to a simple ring buffer in shared memory, and let the rest of the code take it from there. Building up two AVL trees isn't really a problem unless they are going to be very large.
Also, if you only care about concurrency, and it's not important for there to be two processes, you may be able to make them both threads of one process.
In the case of Windows, the functions Microsoft recommends for shared memory can return different pointer values to the same shared memory in each process. This means that within the shared memory, offsets (from the start of the shared memory) have to be used instead of pointers. For example, in a linked list there is a next offset instead of a next pointer. You may want to create macros to convert offsets to pointers and pointers to offsets, as sketched below.
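A minimal sketch of such macros (base is whatever address the current process mapped the region at; the node type is just an example):

#include <stddef.h>

/* Convert between process-local pointers and region-relative offsets. */
#define PTR_TO_OFF(base, p)   ((size_t)((char *)(p) - (char *)(base)))
#define OFF_TO_PTR(base, off) ((void *)((char *)(base) + (off)))

/* A shared-memory linked list stores offsets, never raw pointers. */
struct shm_node {
    size_t next_off;   /* offset of the next node within the region */
    int    value;
};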

How to write a simple malloc function in C

As an assignment in operating systems we have to write our own code for malloc and free in the C programming language. I know that if I just asked for the code there would be no point in me studying. The problem I'm facing is not knowing where to put the code that initializes the 50000-byte char array and builds the two lists, free and used. In my functions I can't trigger that initialization to happen automatically, and a third-party main program will be used to test my functions.
If my file is mymalloc.c or whatever:
void *myalloc(size_t size)
{
    /* code for allocating memory */
}

void myfree(void *ptr)
{
    /* code for freeing the memory */
}
Where does the code for initializing the memory space and the lists go?
I will provide you with the basic concept which you can use to write your own code for malloc() and free() functions using C.
Assume that we have a contiguous block of memory of a certain size. It will be our abstract sense of memory which will carry all the requested memory allocations plus the data structures that are used to hold data about those allocated blocks.
We use a simple linked list to carry the data related to the allocated as well as free blocks of memory.
Its structure is as follows.
struct block {
    size_t size;          /* size of the block this header refers to */
    int free;             /* flag used to identify whether the block is free or not */
    struct block *next;   /* points to the next metadata block */
};
You will need two source files for this purpose. One is mymalloc.h, the header file, which contains the initialization parts and the function prototypes of the rest of the functions we are going to implement. The other is the mymalloc.c source file, which contains all the necessary function implementations.
There needs to be a function to initialize the first free memory block.
You also need a function to split a block of memory which has more than enough space for the requested size, and another to scan through the linked list and merge any consecutive blocks that are free, so that external fragmentation is reduced.
Note: we use the first-fit algorithm to find a free block to allocate memory.
I think this will help anyone who is looking for a simple way to write their own malloc and free functions in C. Please follow the following link for a detailed explanation; a minimal sketch also follows below.
http://tharikasblogs.blogspot.com/p/how-to-write-your-own-malloc-and-free.html
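A minimal first-fit sketch along those lines (fixed 50000-byte arena, no block splitting or merging shown; purely illustrative, not production code):

#include <stddef.h>

#define ARENA_SIZE 50000

struct block {
    size_t size;
    int free;
    struct block *next;
};

static char arena[ARENA_SIZE];
static struct block *head = NULL;   /* first metadata block, lives inside the arena */

static void init_arena(void)
{
    head = (struct block *)arena;
    head->size = ARENA_SIZE - sizeof(struct block);
    head->free = 1;
    head->next = NULL;
}

void *myalloc(size_t size)
{
    struct block *b;

    if (head == NULL)
        init_arena();

    /* First fit: use the first free block that is big enough. */
    for (b = head; b != NULL; b = b->next) {
        if (b->free && b->size >= size) {
            b->free = 0;                /* a fuller version would split large blocks */
            return (void *)(b + 1);     /* usable memory starts right after the header */
        }
    }
    return NULL;                        /* no block big enough */
}

void myfree(void *ptr)
{
    if (ptr != NULL)
        ((struct block *)ptr - 1)->free = 1;   /* a fuller version would merge neighbours */
}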
I think you only have to implement a memory manager, so you don't have to use brk, sbrk, and so on.
Just put the managed memory in a simple array and fragment it somehow. Since it's homework, you want to make it as simple as possible, or else you'll run into problems due to the complexity/time constraints of your assignment.
You only have to decide which tactic you want to use. I'd suggest the buddy system, though it's a bit more complicated than the simplest schemes; maybe fixed-size fragmentation is simpler.
Maybe this is also a good read.
Don't do something low-level as suggested in the other answers.
The implementation greatly depends upon operating system and architecture, anyhow you may take a look at this: http://www.raspberryginger.com/jbailey/minix/html/lib_2ansi_2malloc_8c-source.html
(and study how it works!).
If you are on a Unix system, you can look at the manuals of brk and sbrk. Those system calls "push"/set the limit of the heap.
Using those you can manage your memory pages, allocating them as you need.
I would advise using a linked list to manage your different allocated spaces, and building functions to split them or to merge them if they are free.
If you need to try your code with high-level applications, you can name your functions malloc/free, compile them into a shared object (.so) and then use the LD_PRELOAD and LD_LIBRARY_PATH environment variables to load your .so and replace the system's malloc.
Every command you run will then use your shared object, and thus your malloc, telling you whether your malloc is stable or whether it fails to cope with reality.
If you need a clear example of this I'd be happy to put some code here, but I do not want to make my answer too hard to read.
First, you could make a fake malloc which always fails:
/* fake malloc */
void *myalloc(size_t sz)
{
    return NULL;
}
but that is "cheating". You want to make a malloc which is useful.
You probably want to make a system call which asks the kernel for memory. Of course, you'll need the symmetrical syscall to release memory. On Linux and many POSIX systems you'll often use the mmap and munmap syscalls.
(You could also use sbrk, but using mmap with munmap is easier and more general.)
The idea is that you get big chunks of memory (with mmap) and then manage smaller memory zones inside them. The interesting detail is how to manage these smaller zones. You may want to deal with large mallocs differently from "small" allocations.
You really want to read the Wikipedia page on memory allocation.
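A hedged sketch of grabbing a big chunk with mmap and giving it back with munmap (Linux/POSIX; error handling kept minimal):

#include <sys/mman.h>
#include <stddef.h>

#define CHUNK_SIZE (1024 * 1024)   /* ask the kernel for 1 MiB at a time */

static void *get_chunk(void)
{
    void *p = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

static void release_chunk(void *p)
{
    munmap(p, CHUNK_SIZE);
}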
You could have a global static variable that is initialized to zero. Then check that variable at the start of your malloc and free function. In your malloc function, if the variable is zero then initialize whatever you need, and then set the variable to non-zero. In your free function, just return if the variable is zero.
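For example (init_heap is a hypothetical function that would set up your 50000-byte array and the free/used lists):

static int initialized = 0;   /* zero-initialized global flag */

void *myalloc(size_t size)
{
    if (!initialized) {
        init_heap();          /* hypothetical one-time setup of the arena and lists */
        initialized = 1;
    }
    /* ... normal allocation path ... */
    return NULL;              /* placeholder */
}

void myfree(void *ptr)
{
    if (!initialized)
        return;               /* nothing was ever allocated */
    /* ... normal free path ... */
}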
More concretely, here is a (very) simple malloc:
#include <unistd.h>

void *my_malloc(size_t size)
{
    return sbrk(size);   /* note: sbrk returns (void *)-1 on failure, not NULL */
}
man sbrk will help you.
The problem now is to create a free, and to create an efficient malloc :-)
If you want to test your malloc, you can do it like this:
$> LD_PRELOAD=/mypath/my_malloc.so /bin/ls
but first you need to build your malloc as a dynamic library, because what gets preloaded is a .so.
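For instance, something along these lines should produce the shared object (exact flags may vary with your toolchain):
$> gcc -shared -fPIC -o my_malloc.so my_malloc.c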

Patterns for freeing memory in C?

I'm currently working on a C-based application and am a bit stuck on freeing memory in a non-antipattern fashion. I am a memory-management amateur.
My main problem is that I declare memory structures in various different scopes, and these structures get passed around by reference to other functions. Some of those functions may throw errors and exit().
How do I go about freeing my structures if I exit() in one scope, but not all my data structures are in that scope?
I get the feeling I need to wrap it all up in a pseudo exception handler and have the handler deal with the freeing, but that still seems ugly because it would have to know about everything I may or may not need to free...
Consider writing wrappers around malloc and using them in a disciplined way. Track the memory that you do allocate (in a linked list, maybe) and use a wrapper around exit to enumerate your allocations and free them. You could also name each allocation with an additional parameter and a member of your linked-list structure. In applications where allocated memory is highly scope-dependent you will find yourself leaking memory, and this can be a good method to dump the allocations and analyze them.
UPDATE:
Threading in your application will make this very complex. See other answers regarding threading issues.
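A hedged sketch of that idea (names are invented; a matching tracked_free that de-registers entries is omitted, and a threaded version would need locking as noted above):

#include <stdlib.h>

struct alloc_record {
    void *ptr;
    const char *name;            /* optional label for later analysis */
    struct alloc_record *next;
};

static struct alloc_record *allocations = NULL;

void *tracked_malloc(size_t size, const char *name)
{
    void *p = malloc(size);
    if (p != NULL) {
        struct alloc_record *rec = malloc(sizeof(*rec));
        if (rec != NULL) {
            rec->ptr = p;
            rec->name = name;
            rec->next = allocations;
            allocations = rec;
        }
    }
    return p;
}

/* Free everything still registered, then exit. */
void tracked_exit(int status)
{
    struct alloc_record *rec = allocations;
    while (rec != NULL) {
        struct alloc_record *next = rec->next;
        free(rec->ptr);
        free(rec);
        rec = next;
    }
    exit(status);
}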
You don't need to worry about freeing memory when exit() is called. When the process exits, the operating system will free all of the associated memory.
I think to answer this question appropriately, we would need to know about the architecture of your entire program (or system, or whatever the case may be).
The answer is: it depends. There are a number of strategies you can use.
As others have pointed out, on a modern desktop or server operating system, you can exit() and not worry about the memory your program has allocated.
This strategy changes, for example, if you are developing on an embedded operating system where exit() might not clean everything up. Typically what I see is when individual functions return due to an error, they make sure to clean up anything they themselves have allocated. You wouldn't see any exit() calls after calling, say, 10 functions. Each function would in turn indicate an error when it returns, and each function would clean up after itself. The original main() function (if you will - it might not be called main()) would detect the error, clean up any memory it had allocated, and take the appropriate actions.
When you just have scopes-within-scopes, it's not rocket science. Where it gets difficult is if you have multiple threads of execution, and shared data structures. Then you might need a garbage collector or a way to count references and free the memory when the last user of the structure is done with it. For example, if you look at the source to the BSD networking stack, you'll see that it uses a refcnt (reference count) value in some structures that need to be kept "alive" for an extended period of time and shared among different users. (This is basically what garbage collectors do, as well.)
You can create a simple memory manager for malloc'd memory that is shared between scopes/functions.
Register it when you malloc it, de-register it when you free it. Have a function that frees all registered memory before you call exit.
It adds a bit of overhead, but it helps keep track of memory. It can also help you hunt down pesky memory leaks.
Michael's advice is sound - if you are exiting, you don't need to worry about freeing the memory since the system will reclaim it anyway.
One exception to that is shared memory segments - at least under System V Shared Memory. Those segments can persist longer than the program that creates them.
One option not mentioned so far is to use an arena-based memory allocation scheme, built on top of standard malloc(). If the entire application uses a single arena, your cleanup code can release that arena, and all is freed at once. (APR - Apache Portable Runtime - provides a pools feature which I believe is similar; David Hanson's "C Interfaces and Implementations" provides an arena-based memory allocation system; I've written one that you could use if you wanted to.) You can think of this as "poor man's garbage collection".
As a general memory discipline, every time you allocate memory dynamically, you should understand which code is going to release it and when it can be released. There are a few standard patterns. The simplest is "allocated in this function; released before this function returns". This keeps the memory largely under control (if you don't run too many iterations on the loop that contains the memory allocation), and scopes it so that it can be made available to the current function and the functions it calls. Obviously, you have to be reasonably sure that the functions you call are not going to squirrel away (cache) pointers to the data and try to reuse them later after you've released and reused the memory.
The next standard pattern is exemplified by fopen() and fclose(); there's a function that allocates a pointer to some memory, which can be used by the calling code, and then released when the program has finished with it. However, this often becomes very similar to the first case - it is usually a good idea to call fclose() in the function that called fopen() too.
Most of the remaining 'patterns' are somewhat ad hoc.
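A minimal sketch of the arena idea built on top of malloc() (illustrative only; APR's pools and Hanson's arenas are far more complete):

#include <stdlib.h>

/* An arena hands out pieces of one big malloc'd block and frees it all at once. */
struct arena {
    char *base;
    size_t size;
    size_t used;
};

struct arena *arena_create(size_t size)
{
    struct arena *a = malloc(sizeof(*a));
    if (a == NULL)
        return NULL;
    a->base = malloc(size);
    if (a->base == NULL) {
        free(a);
        return NULL;
    }
    a->size = size;
    a->used = 0;
    return a;
}

void *arena_alloc(struct arena *a, size_t n)
{
    void *p;
    n = (n + 7) & ~(size_t)7;      /* keep allocations 8-byte aligned */
    if (a->used + n > a->size)
        return NULL;               /* a real arena would grow by chaining blocks */
    p = a->base + a->used;
    a->used += n;
    return p;
}

/* One call releases everything ever handed out from this arena. */
void arena_destroy(struct arena *a)
{
    free(a->base);
    free(a);
}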
People have already pointed out that you probably don't need to worry about freeing memory if you're just exiting (or aborting) your code in case of error. But just in case, here's a pattern I developed and use a lot for creating and tearing down resources in case of error. NOTE: I'm showing a pattern here to make a point, not writing real code!
int foo_create(foo_t **foo_out)
{
    int res;
    foo_t *foo;
    bar_t bar;
    baz_t baz;

    res = bar_create(&bar);
    if (res != 0)
        goto fail_bar;

    res = baz_create(&baz);
    if (res != 0)
        goto fail_baz;

    foo = malloc(sizeof(*foo));
    if (foo == NULL) {
        res = -1;                /* or whatever error code is appropriate */
        goto fail_alloc;
    }

    foo->bar = bar;
    foo->baz = baz;
    /* ... etc. etc., you get the idea ... */

    *foo_out = foo;
    return 0;                    /* meaning OK */

    /* tear down stuff, in reverse order of construction */
fail_alloc:
    baz_destroy(baz);
fail_baz:
    bar_destroy(bar);
fail_bar:
    return res;                  /* propagate error code */
}
I can bet I'm going to get some comments saying "this is bad because you use goto". But this is a disciplined and structured use of goto that makes code clearer, simpler, and easier to maintain if applied consistently. You can't achieve a simple, documented tear-down path through the code without it.
If you want to see this in real in-use commercial code, take a look at, say, arena.c from the MPS (which is coincidentally a memory management system).
It's a kind of poor-man's try...finish handler, and gives you something a bit like destructors.
I'm going to sound like a greybeard now, but in my many years of working on other people's C code, lack of clear error paths is often a very serious problem, especially in network code and other unreliable situations. Introducing them has occasionally made me quite a bit of consultancy income.
There are plenty of other things to say about your question -- I'm just going to leave it with this pattern in case that's useful.
Very simply, why not have a reference-counted implementation: when you create an object and pass it around, you increment and decrement its reference count (remember to do so atomically if you have more than one thread).
That way, when an object is no longer used (zero references) you can safely delete it, or delete it automatically in the call that decrements the reference count.
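A single-threaded sketch of the idea (for multiple threads you would use atomic operations on the count):

#include <stdlib.h>

struct object {
    int refcount;
    /* ... payload ... */
};

struct object *object_create(void)
{
    struct object *o = malloc(sizeof(*o));
    if (o != NULL)
        o->refcount = 1;      /* the creator holds the first reference */
    return o;
}

void object_retain(struct object *o)
{
    o->refcount++;
}

void object_release(struct object *o)
{
    if (--o->refcount == 0)
        free(o);              /* last user gone: safe to delete */
}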
This sounds like a task for a Boehm garbage collector.
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
Depends on the system of course whether you can or should afford to use it.

Checking if something was malloced

Given a pointer to some variable, is there a way to check whether it was statically or dynamically allocated?
Quoting from your comment:
I'm making a method that will basically get rid of a struct. It has a data member which is a pointer to something that may or may not be malloced; depending on which, I would like to free it.
The correct way is to add another member to the struct: a pointer to a deallocation function.
It is not just static versus dynamic allocation. There are several possible allocators, of which malloc() is just one.
On Unix-like systems, it could be:
A static variable
On the stack
On the stack but dynamically allocated (i.e. alloca())
On the heap, allocated with malloc()
On the heap, allocated with new
On the heap, in the middle of an array allocated with new[]
On the heap, within a struct allocated with malloc()
On the heap, within a base class of an object allocated with new
Allocated with mmap
Allocated with a custom allocator
Many more options, including several combinations and variations of the above
On Windows, you also have several runtimes, LocalAlloc, GlobalAlloc, HeapAlloc (with several heaps which you can create easily), and so on.
You must always release memory with the correct release function for the allocator you used. So, either the part of the program responsible for allocating the memory should also free the memory, or you must pass the correct release function (or a wrapper around it) to the code which will free the memory.
You can also avoid the whole issue by either requiring the pointer to always be allocated with a specific allocator or by providing the allocator yourself (in the form of a function to allocate the memory and possibly a function to release it). If you provide the allocator yourself, you can even use tricks (like tagged pointers) to allow one to also use static allocation (but I will not go into the details of this approach here).
Raymond Chen has a blog post about it (Windows-centric, but the concepts are the same everywhere): Allocating and freeing memory across module boundaries
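As a sketch of the approach suggested above (storing a deallocation function pointer alongside the data; all names here are invented):

#include <stdlib.h>

/* Hypothetical struct: the owner records how (or whether) to free the data. */
struct holder {
    void *data;
    void (*release)(void *);   /* NULL means "do not free" (e.g. static storage) */
};

void holder_destroy(struct holder *h)
{
    if (h->release != NULL)
        h->release(h->data);   /* e.g. free, or a custom deallocator */
}

A malloc'd member would be registered with release = free, while a pointer to a static buffer would use release = NULL.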
The ACE library does this all over the place. You may be able to check how they do it. In general you probably shouldn't need to do this in the first place though...
Since the heap, the stack, and the static data area generally occupy different ranges of memory, it is possible with intimate knowledge of the process memory map, to look at the address and determine which allocation area it is in. This technique is both architecture and compiler specific, so it makes porting your code more difficult.
Most libc malloc implementations work by storing a header before each returned memory block, with fields (used by the free() call) containing the size of the block as well as a 'magic' value. The magic value protects against the user accidentally freeing a pointer which wasn't alloc'd (or freeing a block whose header was overwritten by the user). It's very system-specific, so you'd have to look at the implementation of your libc to see exactly what magic value is there.
Once you know that, you move the given pointer back to point at the header and then check it for the magic value.
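Purely as an illustration of the layout being described (real allocators differ, so don't rely on this):

#include <stddef.h>
#include <stdint.h>

#define ALLOC_MAGIC 0xDEADBEEFu    /* invented value, for illustration only */

struct malloc_header {
    size_t   size;    /* usable size of the block that follows the header */
    uint32_t magic;   /* checked by free() to catch bogus pointers */
};

/* Given p, a pointer previously returned by this hypothetical allocator: */
static int looks_allocated(void *p)
{
    struct malloc_header *h = (struct malloc_header *)p - 1;
    return h->magic == ALLOC_MAGIC;
}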
Can you hook into malloc() itself, like the malloc debuggers do, using LD_PRELOAD or something? If so, you could keep a table of all the allocated pointers and use that. Otherwise, I'm not sure. Is there a way to get at malloc's bookkeeping information?
Not as a standard feature.
A debug version of your malloc library might have some function to do this.
You can compare its address to something you know to be static and say it's malloced only if it's far away, provided you know the scope it should be coming from; but if its scope is unknown, you can't really trust that.
1.) Obtain a map file for the code you have.
2.) The underlying process/hardware target platform should have a memory map file which typically indicates the starting address of each memory region (stack, heap, globals), the size of that block, and the read/write attributes of that memory block.
3.) After getting the address of the object (pointer variable) from the map file in 1.), try to see which block that address falls into. You might get some idea.
