In C, I know it is good practice to always check if a newly malloced variable is null right after allocating. If so, I can output an error e.g. perror and exit the program.
But what about more complicated programs? E.g. I have main call a function f1 (returns an int), which calls a function f2 (returns a char*), which calls a function f3 (returns a double), and I fail to malloc inside f3.
In this case, I can't seem to just output an error and exit (and may even have memory leaks) since f3 will still force me to first return a double. Then f2 will force me to return a char*, etc. In this case, it seems very painful to keep track of the errors and exit appropriately. What is the proper way to efficiently cover these sorts of errors across functions?
The obvious solution is to design your program with care, so that every function that does dynamic allocation has some means to report errors. Most often the return value of the function is used for this purpose.
In well-designed programs, errors bounce back all the way up the call stack, so that they are dealt with at the application level.
In the specific case of dynamic memory allocation, it is always best to leave the allocation to the caller whenever possible.
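As a minimal sketch of both points, assuming hypothetical versions of the question's f1, f2 and f3: each function reports failure through its return value, the actual result comes back through an out-parameter, and the caller supplies the text buffer so f2 never has to allocate.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical f3: the double comes back through *out; the return value is a status. */
static int f3(size_t n, double *out) {
    assert(out != NULL);
    if (n == 0) {                  /* nothing to allocate, nothing can fail */
        *out = 0.0;
        return 0;
    }
    double *scratch = malloc(n * sizeof *scratch);
    if (scratch == NULL)
        return -1;                 /* report the failure instead of exiting */
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        scratch[i] = (double)i;
        sum += scratch[i];
    }
    free(scratch);
    *out = sum;
    return 0;
}

/* Hypothetical f2: writes into a caller-owned buffer instead of returning a malloc'ed char*. */
static int f2(size_t n, char *buf, size_t buf_size) {
    double value;
    if (f3(n, &value) != 0)
        return -1;                 /* pass the failure up unchanged */
    int written = snprintf(buf, buf_size, "%.1f", value);
    if (written < 0 || (size_t)written >= buf_size)
        return -1;                 /* caller's buffer was too small */
    return 0;
}

/* Hypothetical f1: the top level decides what a failure means for the user. */
static int f1(size_t n) {
    char text[32];
    if (f2(n, text, sizeof text) != 0) {
        fprintf(stderr, "f1: could not compute result\n");
        return -1;
    }
    printf("result: %s\n", text);
    return 0;
}
```

main would then check f1's return value and choose the exit status. No function in the chain needs to call exit itself, and nothing is left allocated when an error travels upward.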
It's always a problem. You need a disciplined approach.
Firstly, every dynamic pointer must be "owned" by someone. C won't help you here; you just have to specify the owner yourself. Generally the three patterns are:
a) A function calls malloc(), then calls free().
b) We have two matching functions, one which returns a buffer or dynamic structure, one which destroys it. The function that calls create also calls the destroy.
c) We have a set of nodes we are inserting into a graph, at random throughout the program. It needs to be managed like b: one function creates the root, then calls the delete which destroys the entire graph.
The rule is owner holds and frees.
If you return a pointer, return 0 on out of memory. If you return an integer, return -1. Errors get propagated up until some high level code knows what user-level operation has failed and aborts it.
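Pattern (b) together with these return conventions could look like the sketch below. The name_list type and its functions are hypothetical, invented only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container used only for this illustration. */
struct name_list {
    char **names;
    size_t count;
    size_t capacity;
};

/* Returns NULL on out of memory; whoever calls create owns the result. */
static struct name_list *name_list_create(size_t capacity) {
    struct name_list *list = malloc(sizeof *list);
    if (list == NULL)
        return NULL;
    list->names = calloc(capacity ? capacity : 1, sizeof *list->names);
    if (list->names == NULL) {
        free(list);                /* undo the partial construction */
        return NULL;
    }
    list->count = 0;
    list->capacity = capacity;
    return list;
}

/* Returns 0 on success, -1 on failure (list full or out of memory). */
static int name_list_add(struct name_list *list, const char *name) {
    assert(list != NULL && name != NULL);
    if (list->count == list->capacity)
        return -1;
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);
    list->names[list->count++] = copy;
    return 0;
}

/* Frees everything the list owns; safe to call with NULL. */
static void name_list_destroy(struct name_list *list) {
    if (list == NULL)
        return;
    for (size_t i = 0; i < list->count; i++)
        free(list->names[i]);
    free(list->names);
    free(list);
}
```

The function that calls name_list_create is also the one that calls name_list_destroy, so a failed name_list_add never has to clean up anything but its own temporary.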
The other answers are correct that the correct way to handle this is to make sure that every function that can allocate memory can report failure to its caller, and every caller handles the possibility. And, of course, you have a test malloc shim that arranges to test every possible allocation failure.
But in large C programs, this becomes intractable — the number of cases that need testing increases exponentially with the number of malloc callsites, for starters — so it is very common to see a function like this in a utils.c file:
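#include <stdio.h>
#include <stdlib.h>

extern const char *program_name; /* defined elsewhere, e.g. set from argv[0] */

void *
xmalloc(size_t n)
{
    void *rv = malloc(n);
    if (!rv) {
        fprintf(stderr, "%s: memory exhausted\n", program_name);
        exit(1);
    }
    return rv;
}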
All other code in the program always calls xmalloc, never malloc, and can assume it always succeeds. (And you also have xcalloc, xrealloc, xstrdup, etc.)
Libraries cannot get away with this, but applications can.
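The companion wrappers follow the same shape. Here is a rough sketch of two of them; the _sketch suffix and the die_out_of_memory helper are made up for this example, and real projects differ in details such as the message and exit status.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared failure path for all wrappers. */
static void die_out_of_memory(void) {
    fputs("memory exhausted\n", stderr);
    exit(1);
}

/* realloc that never returns NULL to the caller. */
static void *xrealloc_sketch(void *ptr, size_t n) {
    void *rv = realloc(ptr, n ? n : 1);   /* sidestep the size-0 corner case */
    if (rv == NULL)
        die_out_of_memory();
    return rv;
}

/* strdup built on top of the wrapper above. */
static char *xstrdup_sketch(const char *s) {
    assert(s != NULL);
    size_t len = strlen(s) + 1;
    char *copy = xrealloc_sketch(NULL, len);   /* realloc(NULL, n) acts like malloc(n) */
    memcpy(copy, s, len);
    return copy;
}
```

Callers use the results without checking, exactly as with xmalloc, and still free them with plain free.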
One way to unwind across several functions is exception handling (which requires C++ rather than C).
When an exception is thrown, control returns directly to the matching catch block.
But take care of memory allocated in the intermediate functions, since control jumps straight to the catch block and skips their normal cleanup.
Sample code for reference:
// Example program
#include <cstdlib>
#include <iostream>
#include <string>
using namespace std ;
int f1()
{
int *p = (int*) malloc(sizeof(int)) ;
if(p == NULL)
{
throw(1) ;
}
//Code flow continues.
return 0 ;
}
char *g()
{
char *p = NULL ; // initialized so that g() does not return an indeterminate pointer
f1() ;
cout << "Inside fun g*" << endl ;
return p;
}
int f2()
{
g() ;
cout << "Inside fun f2" << endl ;
return 0 ;
}
int main()
{
try
{
f2() ;
}
catch(int a)
{
cout << "Caught memory exception" << endl ;
}
return 0 ;
}
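The example above is C++. In plain C the nearest standard tool is setjmp/longjmp, which is a different mechanism from exceptions and is swapped in here only as an illustration. The sketch assumes hypothetical f1, f2 and f3 shaped like the ones in the question, plus a simulate_failure flag that stands in for a real malloc failure.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf oom_env;              /* where to resume after an allocation failure */
static int simulate_failure = 0;     /* hypothetical switch standing in for a real OOM */

static double f3(void) {
    double *buf = simulate_failure ? NULL : malloc(sizeof *buf);
    if (buf == NULL)
        longjmp(oom_env, 1);         /* unwinds past f2 and f1 in one step */
    *buf = 3.14;
    double value = *buf;
    free(buf);
    return value;
}

static char *f2(void) {
    static char text[32];            /* static storage, so nothing to leak on a jump */
    snprintf(text, sizeof text, "%.2f", f3());
    return text;
}

static int f1(void) {
    return f2()[0] == '3' ? 0 : 1;
}

/* Returns f1's result, or -1 if an allocation failed somewhere below. */
static int run_with_recovery(int fail) {
    simulate_failure = fail;
    if (setjmp(oom_env) != 0)
        return -1;                   /* reached only through longjmp */
    return f1();
}
```

The same warning applies as with exceptions: anything the intermediate functions allocated before the jump is not freed automatically, so this only stays leak-free if they hold no heap memory at that point or register it somewhere the recovery code can reach.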
I was working on a project for a course on Operating Systems. The task was to implement a library for dealing with threads, similar to pthreads, but much simpler. The purpose of it is to practice scheduling algorithms. The final product is a .a file. The course is over and everything worked just fine (in terms of functionality).
Though, I got curious about an issue I faced. On three different functions of my source file, if I add the following line, for instance:
fprintf(stderr, "My lucky number is %d\n", 4);
I get a segmentation fault. The same doesn't happen if stdout is used instead, or if the formatting doesn't contain any variables.
That leaves me with two main questions:
Why does it only happen in three functions of my code, and not the others?
Could the creation of contexts using getcontext() and makecontext(), or the changing of contexts using setcontext() or swapcontext(), interfere with the standard file descriptors?
My intuition says those functions could be responsible for that. Even more when given the fact that the three functions of my code in which this happens are functions that have contexts which other parts of the code switch to. Usually by setcontext(), though swapcontext() is used to go to the scheduler, for choosing another thread to execute.
Additionally, if that is the case, then:
What is the proper way to create threads using those functions?
I'm currently doing the following:
/*------------------------------------------------------------------------------
Funct: Creates an execution context for the function and arguments passed.
Input: uc -> Pointer where the context will be created.
funct -> Function to be executed in the context.
arg -> Argument to the function.
Return: If the function succeeds, 0 will be returned. Otherwise -1.
------------------------------------------------------------------------------*/
static int create_context(ucontext_t *uc, void *funct, void *arg)
{
if(getcontext(uc) != 0) // Gets a context "model"
{
return -1;
}
stack_t *sp = (stack_t*)malloc(STACK_SIZE); // Stack area for the execution context
if(!sp) // A stack area is mandatory
{
return -1;
}
uc->uc_stack.ss_sp = sp; // Sets stack pointer
uc->uc_stack.ss_size = STACK_SIZE; // Sets stack size
uc->uc_link = &context_end; // Sets the context to go after execution
makecontext(uc, funct, 1, arg); // "Makes everything work" (can't fail)
return 0;
}
This code is probably a little modified, but it is originally an online example on how to use ucontext.
Assuming glibc, the explanation is that fprintf with an unbuffered stream (such as stderr by default) internally creates an on-stack buffer which has a size of BUFSIZ bytes. See the function buffered_vfprintf in stdio-common/vfprintf.c. BUFSIZ is 8192, so you end up with a stack overflow because the stack you create is too small.
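A sketch of the fix, assuming a POSIX system that still provides the ucontext functions: give the context a stack comfortably larger than BUFSIZ. The 64 KiB figure and all names below are choices made for this example, not values taken from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

#define SKETCH_STACK_SIZE (64 * 1024)   /* assumed size, several times BUFSIZ */

static ucontext_t main_ctx, worker_ctx;
static int worker_ran = 0;

/* Runs on the malloc'ed stack; fprintf needs room for its temporary buffer. */
static void worker(void) {
    fprintf(stderr, "My lucky number is %d\n", 4);
    worker_ran = 1;
}

/* Returns 0 once the worker has run to completion, -1 on any failure. */
static int run_worker_on_big_stack(void) {
    if (getcontext(&worker_ctx) != 0)
        return -1;
    void *stack = malloc(SKETCH_STACK_SIZE);
    if (stack == NULL)
        return -1;
    worker_ctx.uc_stack.ss_sp = stack;
    worker_ctx.uc_stack.ss_size = SKETCH_STACK_SIZE;
    worker_ctx.uc_link = &main_ctx;      /* resume here when worker returns */
    makecontext(&worker_ctx, worker, 0);
    int rc = swapcontext(&main_ctx, &worker_ctx);
    free(stack);                         /* the worker has finished by now */
    return rc == 0 ? 0 : -1;
}
```

Typing the stack as void * rather than stack_t * also matches what ss_sp expects, since the allocation is raw stack memory and not a stack_t object.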
When using malloc in C, I'm always told to check to see if any errors occurred by checking if it returned a NULL value. While I definitely understand why this is important, it is a bit of a bother constantly typing out 'if' statements and whatever I want inside them to check whether the memory was successfully allocated for each individual instance where I use malloc. To make things quicker, I made a function as follows to check whether it was successful.
#include <stdio.h>
#include <stdlib.h>
#define MAX 25
void MallocCheck(char* Check);
char *Option1, *Option2;
int main(){
    Option1 = (char *)malloc(sizeof(char) * MAX);
    MallocCheck(Option1);
    Option2 = (char *)malloc(sizeof(char) * MAX);
    MallocCheck(Option2);
    return 0;
}
/* Returns nothing: it either falls through or exits the program. */
void MallocCheck(char* Check){
    if(Check == NULL){
        puts("Memory Allocation Error");
        exit(1);
    }
}
However, I have never seen someone else doing something like this no matter how much I search so I assume it is wrong or otherwise something that shouldn't be done.
Is using a user-defined function for this purpose wrong and if so, why is that the case?
Error checking is a good thing.
Making a helper function to code quicker, better is a good thing.
The details depend on coding goals and your group's coding standards.
OP's approach is not bad. I prefer to handle the error with the allocation. The following reports the failure on stderr and does not complain about a NULL return when 0 bytes are requested (which is not an out-of-memory failure).
void *malloc_no_return_on_OOM(size_t size) {
void *p = malloc(size);
if (p == NULL && size > 0) {
// Make messages informative
fprintf(stderr, "malloc(%zu) failure\n", size);
// or
perror("malloc() failure");
exit(1);
}
return p;
}
Advanced: One could code a DEBUG version that reports the caller's function and line by using a macro.
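One possible shape for such a macro is sketched below. The names malloc_or_die_at and MALLOC_OR_DIE are hypothetical; the predefined __FILE__, __LINE__ and __func__ are expanded at the call site, so the message points at the caller rather than at the helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: like the function above, but told where the call came from. */
static void *malloc_or_die_at(size_t size, const char *file, int line,
                              const char *func) {
    void *p = malloc(size);
    if (p == NULL && size > 0) {
        fprintf(stderr, "%s:%d: %s: malloc(%zu) failure\n",
                file, line, func, size);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* The macro captures the caller's location automatically. */
#define MALLOC_OR_DIE(size) \
    malloc_or_die_at((size), __FILE__, __LINE__, __func__)
```

A call then reads like a normal allocation, for example `int *v = MALLOC_OR_DIE(4 * sizeof *v);`.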
This is an addendum to @chux's answer and the comments.
As stated, DRY code is generally a good thing and malloc errors are often handled the same way within a specific implementation.
It's true that some systems (notably, Linux) offer optimistic malloc implementations, meaning malloc can return a valid (non-NULL) pointer even when the memory is not actually available, and the error is reported using a signal the first time data is written to the returned pointer... which makes error handling slightly more complex than the code in the question.
However, moving the error check to a different function might incur a performance penalty, unless the compiler / linker catches the issue and optimizes the function call away.
This is a classic use case for inline functions (on newer compilers) or macros.
i.e.
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
void handle_no_memory(int sig) {
if (sig == SIGSEGV) {
perror("Couldn't allocate or access memory");
/* maybe use longjmp to stay in the game...? Or not... */
exit(SIGSEGV);
}
}
/* Using a macro: */
#define IS_MEM_VALID(ptr) \
    do { \
        if ((ptr) == NULL) { \
            handle_no_memory(SIGSEGV); \
        } \
    } while (0) /* do/while keeps the macro safe inside if/else */
/* OR an inline function: */
static inline void *is_mem_valid(void *ptr) {
if (ptr == NULL)
handle_no_memory(SIGSEGV);
return ptr;
}
int main(int argc, char const *argv[]) {
/* consider setting a signal handler - `sigaction` is better, but I'm lazy. */
signal(SIGSEGV, handle_no_memory);
/* using the macro */
void *data_macro = malloc(1024);
IS_MEM_VALID(data_macro);
/* using the inline function */
void *data_inline = is_mem_valid(malloc(1024));
}
Both macros and inline functions prevent code jumps and function calls, since the if statement is now part of the function instead of an external function.
When using inline, the compiler will take the assembly code and place it within the function (instead of performing a function call). For this, we must trust the compiler to do its job properly (it usually does its job better than we do).
When using macros, the preprocessor takes care of things and we don't need to trust the compiler.
In both cases the function / macro is local to the file (notice the static keyword), allowing any optimizations to be performed by the compiler (not the linker).
Good luck.
The Matasano blog calls “Checking the return value of malloc()” a “C Programming Anti-Idiom.” Instead malloc() should automatically call abort() for you if it fails. The argument is that, since you’ll usually want to abort the program if malloc() fails, that should be the default behaviour instead of something you have to laboriously type—or maybe forget to type!—every time.
Without getting into the merits of the idea, what’s the easiest way to set this up? I’m looking for something that would automatically detect memory allocation failures by other library functions such as asprintf() too. A portable solution would be wonderful, but I’d also be happy with something mac-specific.
Summarizing the best answers below:
Mac run-time solution
Set the MallocErrorAbort=1 environment variable before running your program. Automatically works for all memory allocation functions.
Mac/linux run-time solution
Use a dynamic library shim to load a custom malloc() wrapper at runtime with LD_PRELOAD or DYLD_INSERT_LIBRARIES. You will likely want to wrap calloc(), realloc(), &c. as well.
Mac/linux compiled solution
Define your own malloc() and free() functions, and access the system versions using dlsym(RTLD_NEXT, "malloc") as shown below. Again, you will likely want to wrap calloc(), realloc(), &c. as well.
#define _GNU_SOURCE /* needed for RTLD_NEXT with glibc */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
void *(*system_malloc)(size_t) = NULL;
void* malloc(size_t bytes) {
if (system_malloc == NULL) {
system_malloc = dlsym(RTLD_NEXT, "malloc");
}
void* ret = system_malloc(bytes);
if (ret == NULL) {
perror("malloc failed, aborting");
abort();
}
return ret;
}
int main() {
void* m = malloc(10000000000000000l);
if (m == NULL) {
perror("malloc failed, program still running");
}
return 0;
}
Linux compiled solution
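Use __malloc_hook and __realloc_hook as described in the glibc manual. Note that these hooks were deprecated and then removed from the API in glibc 2.34, so this only works with older glibc versions.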
Mac compiled solution
Use the malloc_default_zone() function to access the heap’s data structure, unprotect the memory page, and install a hook in zone->malloc:
#include <malloc/malloc.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>
static void* (*system_malloc)(struct _malloc_zone_t *zone, size_t size);
static void* my_malloc(struct _malloc_zone_t *zone, size_t size) {
void* ret = system_malloc(zone, size);
if (ret == NULL) {
perror("malloc failed, aborting");
abort();
}
return ret;
}
int main() {
malloc_zone_t *zone = malloc_default_zone();
if (zone->version != 8) {
fprintf(stderr, "Unknown malloc zone version %d\n", zone->version);
abort();
}
system_malloc = zone->malloc;
if (mprotect(zone, getpagesize(), PROT_READ | PROT_WRITE) != 0) {
perror("mprotect (unprotect) failed");
abort();
}
zone->malloc = my_malloc;
if (mprotect(zone, getpagesize(), PROT_READ) != 0) {
perror("mprotect failed");
abort();
}
void* m = malloc(10000000000000000l);
if (m == NULL) {
perror("malloc failed, program still running");
}
return 0;
}
For completeness you will likely want to wrap calloc(), realloc(), and the other functions defined in malloc_zone_t in /usr/include/malloc/malloc.h as well.
Just wrap malloc() in some my_malloc() function that does this instead. In a lot of cases it's actually possible to handle not being able to allocate memory so this type of behaviour would be undesirable. It's easy to add functionality to malloc() but not to remove it, which is probably why it behaves this way.
Another thing to keep in mind is that this is a library you're calling into. Would you like to make a library call and have the library kill your application without you being able to have a say in it?
I guess I missed the part about asprintf but libc exports some hooks you can use (what valgrind does essentially) that let you override the malloc behavior. Here's a reference to the hooks themselves, if you know C well enough it's not hard to do.
http://www.gnu.org/savannah-checkouts/gnu/libc/manual/html_node/Hooks-for-Malloc.html
man malloc on my Mac gives the following information. It looks like you want MallocErrorAbort.
ENVIRONMENT
The following environment variables change the behavior of the allocation-related functions.
MallocLogFile <f>
Create/append messages to the given file path <f> instead of writing to the standard error.
MallocGuardEdges
If set, add a guard page before and after each large block.
MallocDoNotProtectPrelude
If set, do not add a guard page before large blocks, even if the MallocGuardEdges environment variable is set.
MallocDoNotProtectPostlude
If set, do not add a guard page after large blocks, even if the MallocGuardEdges environment variable is set.
MallocStackLogging
If set, record all stacks, so that tools like leaks can be used.
MallocStackLoggingNoCompact
If set, record all stacks in a manner that is compatible with the malloc_history program.
MallocStackLoggingDirectory
If set, records stack logs to the directory specified instead of saving them to the default location (/tmp).
MallocScribble
If set, fill memory that has been allocated with 0xaa bytes. This increases the likelihood that a program making assumptions about the contents of freshly allocated memory will fail. Also if set, fill memory that has been deallocated with 0x55 bytes. This increases the likelihood that a program will fail due to accessing memory that is no longer allocated.
MallocCheckHeapStart <s>
If set, specifies the number of allocations <s> to wait before beginning periodic heap checks every <n> as specified by MallocCheckHeapEach. If MallocCheckHeapStart is set but MallocCheckHeapEach is not specified, the default check repetition is 1000.
MallocCheckHeapEach <n>
If set, run a consistency check on the heap every <n> operations. MallocCheckHeapEach is only meaningful if MallocCheckHeapStart is also set.
MallocCheckHeapSleep <t>
Sets the number of seconds to sleep (waiting for a debugger to attach) when MallocCheckHeapStart is set and a heap corruption is detected. The default is 100 seconds. Setting this to zero means not to sleep at all. Setting this to a negative number means to sleep (for the positive number of seconds) only the very first time a heap corruption is detected.
MallocCheckHeapAbort <b>
When MallocCheckHeapStart is set and this is set to a non-zero value, causes abort(3) to be called if a heap corruption is detected, instead of any sleeping.
MallocErrorAbort
If set, causes abort(3) to be called if an error was encountered in malloc(3) or free(3), such as a calling free(3) on a pointer previously freed.
MallocCorruptionAbort
Similar to MallocErrorAbort but will not abort in out of memory conditions, making it more useful to catch only those errors which will cause memory corruption. MallocCorruptionAbort is always set on 64-bit processes.
MallocHelp
If set, print a list of environment variables that are paid heed to by the allocation-related functions, along with short descriptions. The list should correspond to this documentation.
Note the comments under MallocCorruptionAbort about the behaviour of MallocErrorAbort.
For most of my own code, I use a series of wrapper functions — emalloc(), erealloc(), ecalloc(), efree(), estrdup(), etc — that check for failed allocations (efree() is a straight pass-through function for consistency) and do not return when an allocation fails. They either exit or abort. This is basically what Jesus Ramos suggests in his answer; I agree with what he suggests.
However, not all programs can afford to have that happen. I'm just in the process of fixing up some code I wrote which does use these functions so that it can be reused in a context where it is not OK to exit on allocation error. For its original purpose (security checks during the very early stages of process startup), it was fine to exit on error, but now it needs to be usable after the system is running, when a premature exit is not allowed. So, the code has to deal with those paths where the code used to be able to assume 'no return on allocation failure'. That's a tad painful. It can still take a conservative view: an allocation failure means the request is not safe, and it is processed accordingly. But not all code can afford to fail with abort on memory allocation failure.
I'm struggling to come up with a clean way to handle my allocated memory in C. Suppose I have something like this:
void function(int arg) {
char *foo;
foo = (char *)malloc(sizeof(char) * 100);
int i = func1(arg, &foo);
char *bar;
bar = (char *)malloc(sizeof(char) * 100);
int j = func2(&bar);
free(foo);
free(bar);
}
My problem is that func1 and func2 may encounter error and exit(1), so I need to free foo and bar when that happens.
If func1 encounters error, I just call free(foo) and I'll be good. But if func2 encounters error, I cannot just call free(bar) since I also need to free foo. This can get really complicated and I feel that this is not the right way to handle memory.
Am I missing anything here? It will be awesome if someone can point me the right direction. Thanks!
If a function calls exit you don't have to clean your memory usage at all, it will be freed by the OS. But if you need to release other resources (e.g. lock file, clean temp files, ...) then you can use the atexit function or if you use the gnu libc the on_exit function to do the job.
If func1() or func2() is going to call exit(1) on some condition, you don't have to worry about freeing memory for foo or bar, as the operating system will typically clean up once the process exits.
As long as you free at the right time (before going out of function) during the normal course of execution, you are just fine with no memory leak.
I think there is one simple approach to this problem.
Just maintain an allocCode throughout your program when allocating resources, something like the one given below.
There are a few key points to remember. First, don't use break statements in your switch. Increment allocCode after every successful allocation of a resource. For every resource added, add a case at the top of the switch, with a number one higher. Calling freeResourceBeforeExit() will then free all your resources in the correct order. Since there is no break, the switch enters at the correct position and falls through, freeing every resource below its entry point.
I will write the pseudocode.
int allocCode = 0;
int freeResourceBeforeExit()
{
switch(allocCode)
{
case 4:
free(resource3);
case 3:
free(resource2);
case 2:
free(resource1);
case 1:
free(resource0);
}
exit(0);
}
int main()
{
...
resource0 = malloc(10);
allocCode++;
func1();
...
resource1 = malloc(100);
allocCode++;
func2();
...
resource2 = malloc(1000);
allocCode++;
...
func3();
...
resource3 = malloc(10000);
allocCode++;
func4();
...
so on..
}
Hope this helps !
If you divide your work into several parts, it will be much easier to manage your resources.
void part1(int arg) {
char *foo;
foo = (char *)malloc(sizeof(char) * 100);
int i = func1(arg, &foo);
free(foo);
}
void part2(void) {
char *bar;
bar = (char *)malloc(sizeof(char) * 100);
int j = func2(&bar);
free(bar);
}
void function(int arg) {
part1(arg);
part2();
}
Now each part can free its parameter before exiting, if needed.
In principle you could install a handler with atexit that knows how to free your buffers. The handler will be called as a consequence of func1 calling exit. It's not very pleasant to use -- the handler takes no parameters, which means you need to use globals (or local static variables) to store the thing that needs to be freed. It can't be unregistered, which means you need to set those globals to null (or some other special value) to indicate that you've freed the resources yourself, but the handler will still be called. Normally you'd use the atexit handler as the "hook" from which to hang your own resource cleanup framework.
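A bare-bones sketch of that arrangement follows, with a single global buffer standing in for foo. All names are hypothetical; a real framework would track a list of resources rather than one pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Global, because an atexit handler receives no arguments. */
static char *pending_foo = NULL;
static int cleanup_registered = 0;

/* Called by the C runtime when exit() runs or main returns. */
static void free_pending_buffers(void) {
    free(pending_foo);               /* free(NULL) is a harmless no-op */
    pending_foo = NULL;
}

/* Allocates the buffer and makes sure the handler is installed exactly once. */
static char *acquire_foo(size_t size) {
    assert(pending_foo == NULL);     /* this sketch tracks a single buffer */
    if (!cleanup_registered) {
        if (atexit(free_pending_buffers) != 0)
            return NULL;
        cleanup_registered = 1;
    }
    pending_foo = malloc(size);
    return pending_foo;
}

/* Normal-path release: null the global so the handler has nothing left to do. */
static void release_foo(void) {
    free(pending_foo);
    pending_foo = NULL;
}
```

If func1 calls exit while the buffer is live, the handler frees it. If the code reaches release_foo first, the handler still runs at exit but finds only a null pointer.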
In practice, that's usually too much hassle for a few malloced buffers, because when the program exits a full-featured OS will in any case release all memory reserved by the process.
It can even be costly to free memory before exit - in order to free each allocation with free, the memory will be touched, which means it needs to be dragged into cache from main memory or even from swap. For a large number of small allocations that might take a while. When the OS does it for you, it just unmaps the memory map for the process and starts re-using that address space / memory / swap space for other things in future. There are benefits to cleaning up (for example, it makes your code easier to re-use and it makes real leaks easier to find), but also costs.
By the way, it's rather anti-social of the function func1 to call exit on error, because as you've discovered it places limits on users of the function. They can't recover even if they think their program can/should carry on despite func1 failing. func1 has in effect declared that it is too important for the program to even dream of continuing without its results. Yes, GMP, I do mean you.
One way of dealing with it:
void function() {
int size = 100;
char bar[size];
char foo[size];
// do stuff
//compiler frees bar and foo for you
}
What is the best way for unit testing code paths involving a failed malloc()? In most instances, it probably doesn't matter because you're doing something like
thingy *my_thingy = malloc(sizeof(thingy));
if (my_thingy == NULL) {
fprintf(stderr, "We're so screwed!\n");
exit(EXIT_FAILURE);
}
but in some instances you have choices other than dying, because you've allocated some extra stuff for caching or whatever, and you can reclaim that memory.
However, it is precisely in those instances where you can try to recover from a failed malloc() that you're doing something tricky and error prone in a code path that's pretty unusual, making testing especially important. How do you actually go about doing this?
I saw a cool solution to this problem which was presented to me by S. Paavolainen. The idea is to override the standard malloc(), which you can do just in the linker, by a custom allocator which
reads the current execution stack of the thread calling malloc()
checks if the stack exists in a database that is stored on hard disk
if the stack does not exist, adds the stack to the database and returns NULL
if the stack did exist already, allocates memory normally and returns
Then you just run your unit test many times: this system automatically enumerates through different control paths to malloc() failure and is much more efficient and reliable than e.g. random testing.
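The stack database is beyond a short example, but a simplified relative of the idea fits in a few lines: instead of keying on the call stack, fail the Nth allocation and rerun the test with N = 0, 1, 2, ... until a run completes without an injected failure. This counter-based variant is a substitute technique, not the system described above, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical test allocator: fails exactly one call, selected by fail_at. */
static long fail_at = -1;            /* index of the call that fails; -1 = never */
static long call_count = 0;

static void *test_malloc(size_t size) {
    if (call_count++ == fail_at)
        return NULL;
    return malloc(size);
}

/* Hypothetical code under test: three allocations, each of which may fail. */
struct pair { char *first; char *second; };

static char *dup_string(const char *s) {
    size_t len = strlen(s) + 1;
    char *copy = test_malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

static void pair_destroy(struct pair *p) {
    if (p == NULL)
        return;
    free(p->first);
    free(p->second);
    free(p);
}

static struct pair *pair_create(const char *a, const char *b) {
    struct pair *p = test_malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->first = dup_string(a);
    p->second = dup_string(b);
    if (p->first == NULL || p->second == NULL) {
        pair_destroy(p);             /* recovery path being tested */
        return NULL;
    }
    return p;
}

/* Fails call 0, then call 1, ... until a run succeeds untouched.
   Returns how many failure points were exercised, or -1 if it never succeeds. */
static long exercise_all_failures(void) {
    for (long n = 0; n < 1000; n++) {
        fail_at = n;
        call_count = 0;
        struct pair *p = pair_create("left", "right");
        if (p != NULL) {
            pair_destroy(p);
            fail_at = -1;
            return n;
        }
    }
    fail_at = -1;
    return -1;
}
```

Running this under a leak checker confirms that each recovery path cleans up after itself. The stack-keyed version has the advantage that it keeps working when the order of allocations changes between runs.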
I suggest creating a specific function for your special malloc code that you expect could fail and you could handle gracefully. For example:
void* special_malloc(size_t bytes) {
    void* ptr = malloc(bytes);
    if(ptr == NULL) {
        /* Do something crafty, e.g. reclaim cached memory and retry */
    }
    return ptr; /* still NULL if the recovery attempt did not help */
}
Then you could unit-test this crafty business by passing in some bad values for bytes. You could put this in a separate library and make a mock library that behaves specially for testing the functions which call this one.
This is kinda gross, but if you really want unit testing, you could do it with #ifdefs:
thingy *my_thingy = malloc(sizeof(thingy));
#ifdef MALLOC_UNIT_TEST_1
my_thingy = NULL;
#endif
if (my_thingy == NULL) {
fprintf(stderr, "We're so screwed!\n");
exit(EXIT_FAILURE);
}
Unfortunately, you'd have to recompile a lot with this solution.
If you're using linux, you could also consider running your code under memory pressure by using ulimit, but be careful.
Write your own library that implements malloc by randomly failing or calling the real malloc (either statically linked or explicitly dlopened),
then LD_PRELOAD it.
In FreeBSD I once simply overloaded C library malloc.o module (symbols there were weak) and replaced malloc() implementation with one which had controlled probability to fail.
So I linked statically and started to perform testing. srandom() finished the picture with controlled pseudo-random sequence.
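The replaced module itself is not shown, so the following is only a guess at the general shape: an allocator that fails with a configurable probability and a fixed seed, so that a failing run can be reproduced. It uses the standard srand()/rand() rather than srandom(), and the function names are invented for the sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Percentage of calls that should fail, 0..100. */
static int failure_percent = 0;

/* A fixed seed makes the sequence of failures repeatable between runs. */
static void flaky_seed(unsigned seed, int percent) {
    srand(seed);
    failure_percent = percent;
}

/* Fails pseudo-randomly; otherwise behaves like malloc. */
static void *flaky_malloc(size_t size) {
    if (rand() % 100 < failure_percent)
        return NULL;
    return malloc(size);
}

/* Records which of the first `calls` allocations failed, one bit per call
   (calls must not exceed the bit width of unsigned). */
static unsigned failure_pattern(unsigned seed, int percent, int calls) {
    unsigned bits = 0;
    flaky_seed(seed, percent);
    for (int i = 0; i < calls; i++) {
        void *p = flaky_malloc(16);
        bits = (bits << 1) | (p == NULL);
        free(p);
    }
    return bits;
}
```

Logging the seed at the start of each test run is what makes a crash found this way debuggable: rerunning with the same seed replays the same failures, provided the program makes the same sequence of calls.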
Also look here for a set of good tools that, in my opinion, you seem to need. At least they overload malloc() / free() to track leaks, so they look like a usable place to add anything you want.
You could hijack malloc by using some defines and global parameter to control it... It's a bit hackish but seems to work.
#include <stdio.h>
#include <stdlib.h>
#define malloc(x) fake_malloc(x)
struct {
size_t last_request;
int should_fail;
void *(*real_malloc)(size_t);
} fake_malloc_params;
void *fake_malloc(size_t size) {
fake_malloc_params.last_request = size;
if (fake_malloc_params.should_fail) {
return NULL;
}
return (fake_malloc_params.real_malloc)(size);
}
int main(void) {
fake_malloc_params.real_malloc = malloc;
void *ptr = NULL;
ptr = malloc(1);
printf("last: %zu\n", fake_malloc_params.last_request);
printf("ptr: %p\n", ptr);
return 0;
}