Better to lock on a shared resource, or have a thread to fulfill requests? - c

I have a shared memory pool from which many different threads may request an allocation. Requesting an allocation from this will occur a LOT in every thread, however the number of threads is likely to be small, often with only 1 thread running. I am not sure which of the following ways to handle this is better.
Ultimately I may need to implement both and see which produces more favorable results... I also fear that even thinking of #2 may be premature optimization at this point as I don't actually have the code that uses this shared resource written yet. But the problem is so darn interesting that it continues to distract me from the other work.
1) Create a mutex and have each thread lock it before obtaining the allocation, then unlock it.
2) Have each thread register a request slot. When a thread needs an allocation, it puts the request in its slot, then blocks (while (result == NULL) { usleep() }) waiting for the slot to receive a result. A single thread continuously iterates over the request slots, making the allocations and assigning each one to the result in its slot.
Number 1 is the simple solution, but a single thread could potentially hog the lock if the timing is right. The second is more complex, but ensures fairness among threads when pulling from the resource. However it still blocks the requesting threads, and if there are many threads the iteration could burn cycles without doing any actual allocations until it finds a request to fulfill.
NOTE: C on Linux using pthreads

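Solution 2 is bogus. It's an ugly hack and it does not ensure memory synchronization: polling a plain variable in a usleep() loop gives no guarantee that the requesting thread ever sees the result written by the allocating thread, or sees it in a consistent state.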
I would say go with solution 1, but I'm a little bit skeptical of the fact that you mentioned "memory pool" to begin with. Are you just trying to allocate memory, or is there some other resource you're managing (e.g. slots in some special kind of memory, memory-mapped file, textures in video memory, etc.)?
If you are just allocating memory, then you're completely right to be worried about premature optimization. The whole problem is premature optimization, and the system malloc will do as well as or better than your memory pool will do. (Or if your code will be running on one of the few systems with a pathologically broken malloc like some video game consoles, just drop in a replacement only on those known-broken systems.)
If you really do have a special resource you need to manage, start with solution 1 and see how it works. If you have problems, you might find you can improve it with a condition variable where the resource manager notifies you when a slot can be allocated, but I really doubt this will be necessary.
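As a rough illustration of solution 1, here is a minimal sketch of a mutex-protected pool of fixed-size blocks. The names (pool_alloc, pool_free) and the sizes are hypothetical and not taken from the question; a real pool for a special resource would manage that resource instead of a byte array.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define BLOCK_SIZE  256   /* hypothetical block size */
#define BLOCK_COUNT 16    /* hypothetical pool capacity */

static _Alignas(max_align_t) unsigned char blocks[BLOCK_COUNT][BLOCK_SIZE];
static bool block_used[BLOCK_COUNT];
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns one free block, or NULL if the pool is exhausted. */
void *pool_alloc(void)
{
    void *result = NULL;

    pthread_mutex_lock(&pool_lock);
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        if (!block_used[i]) {
            block_used[i] = true;
            result = blocks[i];
            break;
        }
    }
    pthread_mutex_unlock(&pool_lock);
    return result;
}

/* Returns a block previously obtained from pool_alloc() to the pool. */
void pool_free(void *p)
{
    size_t offset = (size_t)((unsigned char *)p - &blocks[0][0]);
    size_t i = offset / BLOCK_SIZE;

    assert(offset % BLOCK_SIZE == 0 && i < BLOCK_COUNT);
    pthread_mutex_lock(&pool_lock);
    assert(block_used[i]);   /* catches a double free in debug builds */
    block_used[i] = false;
    pthread_mutex_unlock(&pool_lock);
}
```

The critical section is a short scan, so with a handful of threads the lock should rarely be contended; measuring it is the only way to know whether anything more elaborate is needed.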

Related

Freeing memory after the program returns EXIT_FAILURE and terminates [duplicate]

Let's say I have the following C code:
#include <stdlib.h>

int main(void) {
    int *p = malloc(10 * sizeof *p);
    if (p != NULL)
        *p = 42;
    return 0; // Exiting without freeing the allocated memory
}
When I compile and execute that C program, i.e. after allocating some space in memory, will the memory I allocated still be allocated (i.e. basically taking up space) after I exit the application and the process terminates?
It depends on the operating system. The majority of modern (and all major) operating systems will free memory not freed by the program when it ends.
Relying on this is bad practice and it is better to free it explicitly. The issue isn't just that your code looks bad. You may decide you want to integrate your small program into a larger, long running one. Then a while later you have to spend hours tracking down memory leaks.
Relying on a feature of an operating system also makes the code less portable.
In general, modern general-purpose operating systems do clean up after terminated processes. This is necessary because the alternative is for the system to lose resources over time and require rebooting due to programs which are poorly written or simply have rarely-occurring bugs that leak resources.
Having your program explicitly free its resources anyway can be good practice for various reasons, such as:
If you have additional resources that are not cleaned up by the OS on exit, such as temporary files or any kind of change to the state of an external resource, then you will need code to deal with all of those things on exit, and this is often elegantly combined with freeing memory.
If your program starts having a longer lifetime, then you will not want the only way to free memory to be to exit. For example, you might want to convert your program into a server (daemon) which keeps running while handling many requests for individual units of work, or your program might become a small part of a larger program.
However, here is a reason to skip freeing memory: efficient shutdown. For example, suppose your application contains a large cache in memory. If when it exits it goes through the entire cache structure and frees it one piece at a time, that serves no useful purpose and wastes resources. Especially, consider the case where the memory pages containing your cache have been swapped to disk by the operating system; by walking the structure and freeing it you're bringing all of those pages back into memory all at once, wasting significant time and energy for no actual benefit, and possibly even causing other programs on the system to get swapped out!
As a related example, there are high-performance servers that work by creating a process for each request, then having it exit when done; by this means they don't even have to track memory allocation, and never do any freeing or garbage collection at all, since everything just vanishes back into the operating system's free memory at the end of the process. (The same kind of thing can be done within a process using a custom memory allocator, but requires very careful programming; essentially making one's own notion of “lightweight processes” within the OS process.)
My apologies for posting so long after the last post to this thread.
One additional point. Not all programs make it to graceful exits. Crashes and ctrl-C's, etc. will cause a program to exit in uncontrolled ways. If your OS did not free your heap, clean up your stack, delete static variables, etc, you would eventually crash your system from memory leaks or worse.
Interesting aside to this, crashes/breaks in Ubuntu, and I suspect all other modern OSes, do have problems with "handled" resources. Sockets, files, devices, etc. can remain "open" when a program ends/crashes. It is also good practice to close anything with a "handle" or "descriptor" as part of your cleanup prior to graceful exit.
I am currently developing a program that uses sockets heavily. When I get stuck in a hang I have to ctrl-C out of it, thus stranding my sockets. I added a std::vector to collect a list of all opened sockets and a sigaction handler that catches SIGINT and SIGTERM. The handler walks the list and closes the sockets. I plan on making a similar cleanup routine for use before throws that will lead to premature termination.
Anyone care to comment on this design?
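One way to picture the handler design described above, in C rather than C++, is the sketch below. It is only an assumption-laden outline: the function names and the fixed-size table are hypothetical, and it deliberately restricts the handler to close() and _exit(), which POSIX lists as async-signal-safe. Walking a container that the interrupted code might be modifying at that moment is not safe in general, which is why the table here is a plain array with a counter that is published last.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_TRACKED 64   /* hypothetical limit for this sketch */

static int tracked_fds[MAX_TRACKED];
static volatile sig_atomic_t tracked_count = 0;

/* Call from normal code after each successful socket()/accept(). */
int track_fd(int fd)
{
    assert(fd >= 0);
    if (tracked_count >= MAX_TRACKED)
        return -1;
    tracked_fds[tracked_count] = fd;
    tracked_count = tracked_count + 1;   /* publish the slot after filling it */
    return 0;
}

/* Closes every tracked descriptor; uses only close(). */
void close_tracked_fds(void)
{
    for (int i = 0; i < tracked_count; i++)
        close(tracked_fds[i]);
    tracked_count = 0;
}

static void on_terminate(int signo)
{
    (void)signo;
    close_tracked_fds();
    _exit(1);   /* _exit() is async-signal-safe; exit() is not */
}

/* Installs the handler for SIGINT and SIGTERM; returns 0 on success. */
int install_cleanup_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_terminate;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGTERM, &sa, NULL);
}
```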
What's happening here (in a modern OS), is that your program runs inside its own "process." This is an operating system entity that is endowed with its own address space, file descriptors, etc. Your malloc calls are allocating memory from the "heap", or unallocated memory pages that are assigned to your process.
When your program ends, as in this example, all of the resources assigned to your process are simply recycled/torn down by the operating system. In the case of memory, all of the memory pages that are assigned to you are simply marked as "free" and recycled for the use of other processes. Pages are a lower-level concept than what malloc handles-- as a result, the specifics of malloc/free are all simply washed away as the whole thing gets cleaned up.
It's the moral equivalent of, when you're done using your laptop and want to give it to a friend, you don't bother to individually delete each file. You just format the hard drive.
All this said, as all other answerers are noting, relying on this is not good practice:
You should always be programming to take care of resources, and in C that means memory as well. You might end up embedding your code in a library, or it might end up running much longer than you expect.
Some OSs (older ones and maybe some modern embedded ones) may not maintain such hard process boundaries, and your allocations might affect others' address spaces.
Yes. The OS cleans up resources. Well ... old versions of NetWare didn't.
Edit: As San Jacinto pointed out, there are certainly systems (aside from NetWare) that do not do that. Even in throw-away programs, I try to make a habit of freeing all resources just to keep up the habit.
Yes, the operating system releases all memory when the process ends.
It depends. Operating systems will usually clean it up for you, but if you're working on, for instance, embedded software then it might not be released.
Just make sure you free it. It can save you a lot of time later when you might want to integrate it into a large project.
That really depends on the operating system, but for all operating systems you'll ever encounter, the memory allocation will disappear when the process exits.
I think direct freeing is best. Undefined behaviour is the worst thing, so if you have access while it's still defined in your process, do it, there are lots of good reasons people have given for it.
As to where, or whether, I found that in W98, the real question was 'when' (I didn't see a post emphasising this). A small template program (for MIDI SysEx input, using various malloc'd spaces) would free memory in the WM_DESTROY bit of the WndProc, but when I transplanted this to a larger program it crashed on exit. I assumed this meant I was trying to free what the OS had already freed during a larger cleanup. If I did it on WM_CLOSE, then called DestroyWindow(), it all worked fine, instant clean exit.
While this isn't exactly the same as MIDI buffers, there is similarity in that it is best to keep the process intact, clean up fully, then exit. With modest memory chunks this is very fast. I found that many small buffers worked faster in operation and cleanup than fewer large ones.
Exceptions may exist, as someone said when avoiding hauling large memory chunks back out of a swap file on disk, but even that may be minimised by keeping more, and smaller, allocated spaces.

Is memcpy() a sleeping function?

I would like to copy the content of an array without using a for loop. The copy is made while holding a spinlock.
Is there any chance that memcpy() can sleep?
Things that might happen with memcpy (or with really any memory access in general):
If part of the source or destination is inaccessible (invalid) memory, memcpy could crash your process, which might leave a shared spinlock in a bad state.
If part of the source memory needs to be paged in, memcpy can block while the kernel grabs the memory for you.
If part of the source or destination is memory-mapped to I/O, memcpy might block while the kernel performs that I/O. (In extreme cases, like memory-mapped network files, memcpy might block indefinitely).
The kernel is also free to swap your process out at any point during the copy, which means the copy could take arbitrarily long to actually complete.
However, memcpy does not do anything that a regular memory access wouldn't do. So, using it with a spinlock should be safe (as safe as accessing the memory normally would be, anyway).
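For a user-space program, the pattern under discussion might look like the following sketch. It assumes POSIX spinlocks (pthread_spin_lock) and a small, hypothetical shared table; the function names are made up for illustration. The point is only that memcpy() sits inside the critical section like any other memory access, so the copy should stay small.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define TABLE_LEN 8   /* hypothetical size: keep the critical section short */

static pthread_spinlock_t table_lock;
static int shared_table[TABLE_LEN];

/* Must be called once before the table is used; returns 0 on success. */
int table_init(void)
{
    return pthread_spin_init(&table_lock, PTHREAD_PROCESS_PRIVATE);
}

/* Replaces the shared table with the contents of `in`. */
void table_store(const int in[TABLE_LEN])
{
    int rc = pthread_spin_lock(&table_lock);
    assert(rc == 0);
    (void)rc;
    memcpy(shared_table, in, sizeof shared_table);
    pthread_spin_unlock(&table_lock);
}

/* Copies the shared table into `out` while holding the spinlock. */
void table_snapshot(int out[TABLE_LEN])
{
    int rc = pthread_spin_lock(&table_lock);
    assert(rc == 0);
    (void)rc;
    memcpy(out, shared_table, sizeof shared_table);
    pthread_spin_unlock(&table_lock);
}
```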
I detect some inconsistency in your question. Let me explain.
A spinlock, or a busy lock in general, keeps the process (or thread) that is waiting for the lock on the CPU instead of releasing the CPU to another process (or thread). This means a very fast handover when the lock is freed, but a very expensive model for long wait times.
That said, if you are using a spinlock, the reason must be that the loop the waiting thread uses to check whether the lock has been freed is not expected to run more than three or four times; otherwise the CPU is wasted checking over and over whether the lock has been freed.
This strongly discourages doing blocking operations like the one you ask about. (A memory copy rarely has to deal with a non-present resource such as a paged-out memory page, but when it does, the threads waiting on your spinlock will go through millions of checks.)
Spinlocks were designed to protect very small chunks of memory, where the critical section amounts to at most two or three memory accesses. In that case a spinlock solves the problem, because spinning briefly is far cheaper than putting the thread to sleep and rescheduling it. But this is clearly at odds with the memcpy(3) function, which is a general copy function that allows large memory copies in one shot. The time the resource is locked by one thread can mean millions of checks by another thread (on a different core, which is the other precondition for using a spinlock: a waiter on another core that expects to see the lock released after two or three checks).
In my opinion, the only use for a spinlock is to protect a semaphore's counter, or to protect access to a condition variable or a mutex, but never to protect a general memory copy or a large resource. In those cases, it is better to use a normal, sleeping lock. If you plan to use memcpy(3), the only thing I can assume is that you use the lock to protect large amounts of memory while they are copied into; that is better handled with a semaphore or a mutex.
In modern kernels, waking a process is so efficient that user-mode spinlocks are almost never worthwhile.
In conclusion, my guess is that the thing to reconsider is not the use of memcpy() on a protected shared memory region, but the use of a spinlock itself to do the protection. In most cases it will be a waste of resources, and will make your system heavier and slower.

What are out-of-memory handling strategies in C programming?

One strategy that I thought of myself is allocating 5 megabytes of memory (or whatever number you feel necessary) at program startup.
Then when at any point program's malloc() returns NULL, you free the 5 megabytes and call malloc() again, which will succeed and let the program continue running.
What do you think about this strategy?
And what other strategies do you know?
Thanks, Boda Cydo.
Handle malloc failures by exiting gracefully. With modern operating systems, pagefiles, etc you should never pre-emptively brace for memory failure, just exit gracefully. It is unlikely you will ever encounter out of memory errors unless you have an algorithmic problem.
Also, allocating 5MB for no reason at startup is insane.
For the last few years, the (embedded) software I have been working with generally does not permit the use of malloc(). The sole exception to this is that it is permissible during the initialization phase, but once it is decided that no more memory allocations are allowed, all future calls to malloc() fail. As memory may become fragmented due to malloc()/free() it becomes difficult at best in many cases to prove that future calls to malloc() will not fail.
Such a scenario might not apply to your case. However, knowing why malloc() is failing can be useful. The following technique that we use in our code since malloc() is not generally available might (or might not) be applicable to your scenario.
We tend to rely upon memory pools. The memory for each pool is allocated during the transient startup phase. Once we have the pools, we get an entry from the pool when we need it, and release it back to the pool when we are done. Each pool is configurable, and is usually reserved for a particular object type. We can track the usage of each over time. If we run out of pool entries, we can find out why. If we don't, we have the option of making our pool smaller and save some resources.
Hope this helps.
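The pool approach described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the object type msg_t, the capacity, and the function names are hypothetical, and the sketch is single-threaded (a shared pool would need a lock around get and put). It shows a pool sized up front for one object type, with a high-water mark so that usage can be tracked over time.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_ENTRIES 4   /* hypothetical capacity, fixed at startup */

typedef struct msg { int id; char payload[32]; } msg_t;   /* hypothetical object type */

/* A slot holds either a live object or a link in the free list. */
typedef union slot { union slot *next_free; msg_t obj; } slot_t;

static slot_t slots[POOL_ENTRIES];
static slot_t *free_list;
static size_t in_use, high_water;

/* Builds the free list; call once during the startup phase. */
void msg_pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < POOL_ENTRIES; i++) {
        slots[i].next_free = free_list;
        free_list = &slots[i];
    }
    in_use = high_water = 0;
}

/* Takes an entry from the pool, or returns NULL when the pool is empty. */
msg_t *msg_pool_get(void)
{
    slot_t *s = free_list;

    if (s == NULL)
        return NULL;
    free_list = s->next_free;
    if (++in_use > high_water)
        high_water = in_use;
    return &s->obj;
}

/* Releases an entry obtained from msg_pool_get() back to the pool. */
void msg_pool_put(msg_t *m)
{
    slot_t *s = (slot_t *)m;

    assert(s >= slots && s < slots + POOL_ENTRIES);
    s->next_free = free_list;
    free_list = s;
    in_use--;
}

/* Largest number of entries ever in use at once; useful for sizing the pool. */
size_t msg_pool_high_water(void)
{
    return high_water;
}
```

If the high-water mark stays well below the capacity, the pool can be made smaller; if msg_pool_get() ever returns NULL, the counters show which pool ran out.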
As a method of testing that you handle out of memory situations gracefully, this can be a reasonably useful technique.
Under any other circumstance, it sounds useless at best. You're causing the out of memory situation to happen, then fixing the problem by freeing memory you didn't need to start with.
"try-again-later". Just because you're OOM now, doesn't mean you will be later when the system is less busy.
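#include <stdlib.h>  /* malloc */
#include <unistd.h>  /* sleep */

/* Try malloc() up to 100 times, sleeping one second after each failure. */
void *smalloc(size_t size) {
    for (int i = 0; i < 100; i++) {
        void *p = malloc(size);
        if (p)
            return p;
        sleep(1);
    }
    return NULL;
}
You should of course think a lot about where you employ such a strategy as it is quite hideous, but it has saved some of our systems in various cases.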
It actually depends on a policy you'd like to implement, meaning, what is the expected behavior of your program when it's out of memory.
A great solution would be to allocate memory during initialization only and never during runtime. In this case you'll never run out of memory if the program managed to start.
Another could be freeing resources when you hit memory limit. It'd be difficult to implement and test.
Keep in mind that when you are getting NULL from malloc it means both physical and virtual memory have no more free space, meaning your program is swapping all the time, making it slow and the computer unresponsive.
You actually need to make sure (by estimated calculation or by checking the amount of memory in runtime) that the expected amount of free memory the computer has is enough for your program.
Generally the purpose of freeing the memory is so that you have enough to report the error before you terminate the program.
If you are just going to keep running, there is no point in preallocating the emergency reserve.
Most modern OSes in their default configuration allow memory overcommit, so your program won't get NULL from malloc() at all, or at least not until it somehow (by error, I guess) exhausts all available address space (not memory).
And then it writes to some perfectly legal memory location, gets a page fault, there is no memory page in the backing store and BANG (SIGBUS): you're dead, and there is no good way out of there.
So just forget about it, you can't handle it.
Yeah, this doesn't work in practice. First for a technical reason, a typical low-fragmentation heap implementation doesn't make large free blocks available for small allocations.
But the real problem is that you don't know why you ran out of virtual memory space. And if you don't know why then there's nothing you can do to prevent that extra memory from being consumed very rapidly and still crash your program with OOM. Which is very likely to happen, you've already consumed close to two gigabytes, that extra 5 MB is a drop of water on a hot plate.
Any kind of scheme that switches the app into 'emergency mode' is very impractical. You'll have to abort running code so that you can stop, say, loading an enormous data file. That requires an exception. Now you're back to what you already had before, std::bad_alloc.
I want to second the sentiment that the 5mb pre-allocation approach is "insane", but for another reason: it's subject to race conditions. If the cause of memory exhaustion is within your program (virtual address space exhausted), another thread could claim the 5mb after you free it but before you get to use it. If the cause of memory exhaustion is lack of physical resources on the machine due to other processes using too much memory, those other processes could claim the 5mb after you free it (if the malloc implementation returns the space to the system).
Some applications, like a music or movie player, would be perfectly justified just exiting/crashing on allocation failures - they're managing little if any modifiable data. On the other hand, I believe any application that is being used to modify potentially-valuable data needs to have a way to (1) ensure that data already on disk is left in a consistent, non-corrupted state, and (2) write out a recovery journal of some sort so that, on subsequent invocations, the user can recover any data lost when the application was forced to close.
As we've seen in the first paragraph, due to race conditions your "malloc 5mb and free it" approach does not work. Ideally, the code to synchronize data and write recovery information would be completely allocation-free; if your program is well-designed, it's probably naturally allocation-free. One possible approach if you know you will need allocations at this stage is to implement your own allocator that works in a small static buffer/pool, and use it during allocation-failure shutdown.
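A minimal sketch of such a static-buffer allocator is shown below, under the assumption that the shutdown path only needs a few small allocations and never frees them. The name emergency_alloc and the buffer size are hypothetical. Because the buffer is part of the program image, it cannot be claimed by another thread or process the way a freed heap block can.

```c
#include <assert.h>
#include <stddef.h>

#define EMERGENCY_BYTES 8192   /* hypothetical size of the static reserve */

static _Alignas(max_align_t) unsigned char emergency_buf[EMERGENCY_BYTES];
static size_t emergency_used;

/*
 * Bump allocator over the static buffer: hands out aligned chunks and never
 * frees them. Intended only for the allocation-failure shutdown path, from a
 * single thread.
 */
void *emergency_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);

    if (size > EMERGENCY_BYTES)
        return NULL;
    /* Round up so that every returned pointer stays suitably aligned. */
    size_t rounded = (size + align - 1) / align * align;

    assert(emergency_used <= EMERGENCY_BYTES);
    if (rounded > EMERGENCY_BYTES - emergency_used)
        return NULL;

    void *p = emergency_buf + emergency_used;
    emergency_used += rounded;
    return p;
}
```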

Safety nets in complex multi-threaded code?

As a developer who has just finished writing thousands of lines of complex multi-threaded 'C' code in a project, and which is going to be enhanced, modified etc. by several other developers unfamiliar with this code in the future, I wanted to find out what kind of safety nets do you guys try to put in such code? As an example I could do these:
Define accessor macros for lock-protected structure members, which assert that the corresponding lock is held. This makes it clear that these members are lock-protected to anyone unfamiliar with this code.
Functions which are supposed to be called with some spinlock held assert that the spinlock is being held.
What kind of safety nets have you put into multi-threaded code that you have written?
What kind of problems have you faced when other developers modified such code?
What kind of debugging aids have you put into such code?
Thanks for your comments.
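The two ideas listed in the question could be sketched as below. pthreads has no portable call to ask whether the current thread holds a mutex, so this sketch assumes a hypothetical wrapper (checked_lock_t) that records its owner; all names are made up for illustration, and the owner is identified by the address of a thread-local variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* The address of this thread-local variable identifies the calling thread. */
static _Thread_local char thread_token;

/* Hypothetical wrapper: a mutex that remembers which thread holds it. */
typedef struct {
    pthread_mutex_t mutex;
    _Atomic(void *) owner;   /* &thread_token of the holder, or NULL */
} checked_lock_t;

#define CHECKED_LOCK_INIT { .mutex = PTHREAD_MUTEX_INITIALIZER, .owner = NULL }

bool lock_held_by_me(checked_lock_t *l)
{
    return atomic_load(&l->owner) == (void *)&thread_token;
}

void checked_lock(checked_lock_t *l)
{
    int rc = pthread_mutex_lock(&l->mutex);
    assert(rc == 0);
    (void)rc;
    atomic_store(&l->owner, (void *)&thread_token);
}

void checked_unlock(checked_lock_t *l)
{
    assert(lock_held_by_me(l));   /* catches double unlock and unlock by a non-owner */
    atomic_store(&l->owner, NULL);
    pthread_mutex_unlock(&l->mutex);
}

struct stats {
    checked_lock_t lock;
    long hits;   /* protected by `lock`; access only through STATS_HITS() */
};

/* Accessor macro: every access asserts that the caller holds the lock. */
#define STATS_HITS(s) (*(assert(lock_held_by_me(&(s)->lock)), &(s)->hits))

/* Must be called with s->lock held; the accessor enforces this in debug builds. */
void stats_record_hit_locked(struct stats *s)
{
    STATS_HITS(s) += 1;
}
```

Note that the macro evaluates its argument twice, and that the checks disappear when NDEBUG is defined, so they document and test the locking discipline rather than enforce it in release builds.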
There are a number of things we do in our product (a hypervisor designed to help you find concurrency bugs in applications) that are more generally useful. Note that we do these in our code itself (because it's a highly concurrent piece of software) and that some of these are useful whether or not you are writing concurrent code.
Like you, we have the ability to assert(lock_held(...)) and use it.
We also (because we have our own scheduler) can assert(single_threaded()) for those (rare) situations where we count on no other thread being active in the system.
Memory corruption from one thread to another is pretty common (and hard to debug), so we do two things to address this. Sprinkled throughout our thread stack are some magic cookies. We periodically (in our get_thread_id() function) invoke a validate_thread_stack() function that checks these cookies to make sure the stack is not corrupted.
Our malloc sticks magic cookies before and after a malloc block of memory and checks these on free. If anyone overruns their data these can be used to find the corruption early.
On free() we blast a well known pattern (in our case 0xdddd...) over the memory. This nicely corrupts anyone else who had a dangling pointer left over to that memory region.
We have a guard page (a memory page not mapped into the address space) near the bottom of the thread stack. If the thread overruns its stack, we catch it via page fault and drop into our debugger.
Our locks are witnessed. Check out the FreeBSD lock witness code. It's like that, but homebrew. Basically the witness code is a lightweight way of detecting potential deadlocks by looking at cycles in the lock acquisition graph.
Our locks are also wrapped with accessors that record the file/line number of acquisition and release. For double unlocks or double locks, you get pretty debug information on your screwup.
Our locks are also profiled. Once you get your code working you want it working well. We track the usual things like how many acquisitions, how long it took to acquire it.
In our system, we have an expectation that locks are not contended (we carefully designed the code this way). So if you wait for a spin lock longer than a second or two in our system you get dropped into the debugger, because it's most likely not a good thing.
Our variables that are meant to be updated atomically are wrapped inside of C structs. The reason for this is to prevent sloppy code where you mix good use, atomic_increment(&var);, and bad use, var++. We make it very hard to write the latter code.
"volatile" is forbidden in our code base because it's ambiguously implemented by compilers. It's a bad way to try and cobble together synchronization.
And of course code reviews. If you can't explain your concurrency assumptions and locking discipline to a colleague, then there's definitely issues with the code :-)
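The malloc cookies and the fill pattern on free mentioned in this answer could be sketched along these lines. This is not the answerer's implementation: the names (dbg_malloc, dbg_check, dbg_free), the cookie value, and the layout are assumptions for illustration. Only the 0xdd fill byte comes from the answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define COOKIE      0xC0FFEE11u   /* hypothetical magic value */
#define POISON_BYTE 0xDD          /* fill pattern mentioned in the answer */

/* Header placed before the user block; the union keeps user data aligned. */
typedef union {
    struct { size_t size; uint32_t cookie; } info;
    max_align_t align;
} header_t;

/* Allocates `size` bytes with a cookie before and after the user block. */
void *dbg_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(header_t) - sizeof(uint32_t))
        return NULL;

    header_t *h = malloc(sizeof *h + size + sizeof(uint32_t));
    if (h == NULL)
        return NULL;
    h->info.size = size;
    h->info.cookie = COOKIE;

    unsigned char *user = (unsigned char *)(h + 1);
    uint32_t tail = COOKIE;
    memcpy(user + size, &tail, sizeof tail);   /* tail may be unaligned */
    return user;
}

/* Returns 1 if both cookies around the block are intact, 0 otherwise. */
int dbg_check(const void *p)
{
    const header_t *h = (const header_t *)p - 1;
    uint32_t tail;

    if (h->info.cookie != COOKIE)
        return 0;
    memcpy(&tail, (const unsigned char *)p + h->info.size, sizeof tail);
    return tail == COOKIE;
}

/* Verifies the cookies, overwrites the whole block, then releases it. */
void dbg_free(void *p)
{
    if (p == NULL)
        return;
    assert(dbg_check(p));

    header_t *h = (header_t *)p - 1;
    size_t total = sizeof *h + h->info.size + sizeof(uint32_t);

    /* Written through a volatile pointer so the fill is not optimized away. */
    volatile unsigned char *bytes = (volatile unsigned char *)h;
    for (size_t i = 0; i < total; i++)
        bytes[i] = POISON_BYTE;
    free(h);
}
```

Because the header cookie is poisoned too, a second dbg_free() of the same pointer is likely to trip the assertion, although reading freed memory is undefined behaviour and tools such as a sanitizer are the more reliable way to catch it.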
Make everything absolutely obvious, so that other developers cannot miss the synchronization scope when they view subsections of the code in isolation.
For example, don't hold a lock in code that spans multiple files.
Seems like you've answered your own question: put lots of assertions into the code. They will tell other developers what invariants and preconditions must hold.
