A project I am working on involves a flight vehicle with GNC code written in a C library (.out). We must call this C code from LabVIEW (the primary avionics software) in the form of a .out library, and the nature of the software requires static pointers to store data between successive calls to the function. We call the GNC executive function at regular intervals throughout a flight. I'm now trying to call this function using a Matlab MEX wrapper in a DLL on Windows, and this has uncovered some memory management issues.
I am declaring the structures at the beginning of the function like this:
static Nav_str *Nav_IN_OUT_ptr;
static hguid_ref *Guid_IN_OUT_ptr;
static HopControl *Control_IN_OUT_ptr;
Nav_IN_OUT_ptr = (Nav_str *)malloc(sizeof(Nav_str));
Guid_IN_OUT_ptr = (hguid_ref *)malloc(sizeof(hguid_ref));
Control_IN_OUT_ptr = (HopControl *)malloc(sizeof(HopControl));
This happens during every run of the function. However, after this function is called iteratively several times, it always crashes with a memory segmentation fault after it tries to exit. My understanding was that this memory was supposed to clean itself up, is that incorrect?
In order to clean it up manually, I added these lines to the end, to be called only on a clean-up iteration:
free(Nav_IN_OUT_ptr);
free(Guid_IN_OUT_ptr);
free(Control_IN_OUT_ptr);
Is this the correct way to free this memory? Can I free this memory? Might there be another reason for the segmentation error other than C not giving up the memory properly after the last call, or Matlab not properly managing its memory? I've searched all over for someone with a similar problem (even contacting Mathworks) without much luck, so any comments or suggestions would be much appreciated.
Failing to free memory is not going to cause a segmentation fault. Your problem most likely lies somewhere else. The likely causes are:
Overflowing a buffer
Using a pointer to memory that has previously been free'd.
Using a bad pointer value, somehow set incorrectly.
Trying to free a pointer that was not returned by malloc (or that was already free'd)
My understanding was that this memory was supposed to clean itself up, is that incorrect?
Yes, that is incorrect: you need to call free() to release the memory back to the heap. I would also suggest that you set the pointer to NULL after the free; this may help you catch condition 2 from above.
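A minimal sketch of the free-then-NULL habit suggested above. The macro name FREE_AND_NULL is made up for illustration; it is not a standard facility.

```c
#include <stdlib.h>

/* Hypothetical helper: release a block and clear the pointer, so that a
 * later use of the stale pointer usually shows up as an immediate NULL
 * dereference rather than as quiet heap corruption.
 * Calling it twice is harmless because free(NULL) does nothing. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

This only clears the one variable passed in; any other copies of the pointer still dangle.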
Nav_IN_OUT_ptr = (Nav_str *)malloc(sizeof(Nav_str));
This statement is questionable. What is the Nav_str type? If the data is actually a string rather than a structure, are you sure you don't mean to allocate the string's strlen()+1 bytes?
I also need to ask what is the purpose for making your pointers static? Static function variables are basically globals, and only to be used in rare cases.
Your code does have a memory leak - it is allocating that memory each time the function is called. Even your current method still has the memory leak - if you only call free() once, in the final iteration, then you have only freed the most recent allocation.
However, a memory leak will not generally cause a segmentation fault (unless your memory leak exhausts all available memory, causing subsequent malloc() calls to return NULL).
If you wish to have static structures that are only allocated once and re-used, you do not need to use malloc() at all - you can simply change your declarations to:
static Nav_str Nav_IN_OUT;
static hguid_ref Guid_IN_OUT;
static HopControl Control_IN_OUT;
... and use Nav_IN_OUT.field instead of Nav_IN_OUT_ptr->field, and &Nav_IN_OUT in place of Nav_IN_OUT_ptr (if you are directly passing the pointer value to other functions).
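A small sketch of that allocate-nothing approach. NavState and nav_step are hypothetical stand-ins for the real GNC types and executive function, which are not shown in the question.

```c
#include <stddef.h>

/* Hypothetical stand-in for a structure such as Nav_str. */
typedef struct {
    double        position;
    unsigned long calls;
} NavState;

/* One step of a periodic routine. The static object exists for the whole
 * program run, is zero-initialized before the first call, and keeps its
 * contents between calls, so no malloc() or free() is involved. */
unsigned long nav_step(double velocity, double dt, double *position_out)
{
    static NavState nav;              /* persists across calls */

    nav.position += velocity * dt;
    nav.calls++;
    if (position_out != NULL) {
        *position_out = nav.position;
    }
    return nav.calls;
}
```

Calling nav_step(2.0, 0.5, &pos) twice leaves pos at 2.0, because the second call continues from the state left by the first.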
My understanding was that this memory was supposed to clean itself up, is that incorrect?
Sorry, but you were incorrect. :) Memory allocated with malloc() will persist until you manually remove it with free(). (You did get this right in the end. Hooray. :)
Is this the correct way to free this memory? Can I free this memory?
That is the correct way to free the memory, but it might not be in the correct place. In general, try to write your free() calls the same time you write your malloc() calls.
Maybe you allocate at the start of a function and then free at the end of the function. (In that case, on-stack memory use might be better, if the memory is only ever used by functions called by the original function.)
Maybe you have a foo_init() function that calls malloc() and creates associated contexts from an API, then you pass that context into other routines that operate on that data, and then you need to place the free() calls into a foo_destroy() or foo_free() or similar routine. All your callers then need to balance the foo_init() and foo_free() calls. This would be especially appropriate if you can't just write the foo_init() and foo_destroy() calls in one function; say, your objects might need to be removed at some random point in a larger event loop.
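A minimal sketch of such a paired interface. The struct contents are placeholders invented for the example; only the foo_init()/foo_destroy() pairing is the point.

```c
#include <stdlib.h>

/* Hypothetical context object; the fields are placeholders. */
typedef struct foo {
    double *samples;
    size_t  count;
} foo;

/* Allocate a context holding `count` zero-filled samples.
 * Returns NULL on failure, in which case nothing is left allocated. */
foo *foo_init(size_t count)
{
    foo *f = malloc(sizeof *f);
    if (f == NULL) {
        return NULL;
    }
    f->samples = calloc(count, sizeof *f->samples);
    if (f->samples == NULL && count != 0) {
        free(f);                      /* undo the partial construction */
        return NULL;
    }
    f->count = count;
    return f;
}

/* Release everything foo_init() allocated. Accepts NULL, like free(). */
void foo_destroy(foo *f)
{
    if (f == NULL) {
        return;
    }
    free(f->samples);
    free(f);
}
```

Every successful foo_init() is then balanced by exactly one foo_destroy(), wherever in the event loop that happens to be.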
And maybe the data should just be allocated once and live forever. That would be correct for some application designs, and it's tough to tell just from the variable names whether or not these blocks of data should live forever.
Might there be another reason for the segmentation error other than C not giving up the memory properly after the last call, or Matlab not properly managing its memory?
There sure could be; perhaps this memory is being returned too soon, perhaps some pointer is being free()ed two or more times, or you're overwriting your buffers. (That malloc(sizeof(Nav_str)) call is a little worrying: if Nav_str turns out to be a typedef for a pointer rather than a structure, it allocates just four or eight bytes, based on the pointer size on your platform. And if the data is really a string and you replace sizeof with strlen(), note that strlen() does not count the NUL byte at the end of the string; malloc(len+1); is the usual pattern for allocating memory for a string, and I get concerned any time I don't see that +1 in the call.)
Some time with valgrind would doubtless help find memory errors, and maybe some time with Electric Fence could help. valgrind is definitely newer, and can definitely handle 'large' programs better (since electric fence will allocate a new page for every malloc(), it can be expensive).
Related
I am coding for an embedded system using the ARM cross toolchain arm-none-eabi-gcc. The code runs on FreeRTOS, which has its own heap memory management, so I want to override malloc(), free() and realloc() in libc and wrap them to simply call the FreeRTOS functions. There is only one problem: FreeRTOS does not have realloc(). That's strange, but my code definitely needs it. So I want to understand what will happen if I only override malloc() and free() but keep the libc version of realloc(). Also, providing my own realloc() that just calls malloc() with the new size and does a memcpy() once the new block is allocated does not look safe enough to me. The new size is usually larger than the old size in my application, so a memcpy() with a size larger than the originally allocated block could cause an invalid memory access. Is that possible?
Thanks in advance.
-woody
Partially replacing the allocator (replacing some functions but not others) can't work. At worst, you will get serious heap data structure corruption from one implementation interpreting another's data structures as its own. It's possible to harden against this so that things just fail to link or fail to allocate (provide null pointer results) at runtime if this is done, and I did this in musl libc as described in these commits:
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=c9f415d7ea2dace5bf77f6518b6afc36bb7a5732
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=618b18c78e33acfe54a4434e91aa57b8e171df89
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=b4b1e10364c8737a632be61582e05a8d3acf5690
but I doubt many other implementations take the same precautions. And they won't make what you want actually work; they'd just prevent catastrophic results.
If you really need realloc, you're going to have to make a working one for the implementation you're adopting. The easiest way to do that is make it just malloc, memcpy, and free, but indeed you need a way to determine the length argument to pass to memcpy. If you just pass the new length, it might be safe on a microcontroller without MMU, as long as your lengths aren't so large they risk running over into an MMIO range or something. But the right thing to do is read the malloc implementation enough to understand where it's stored the allocated size, and write your own code to extract that. At that point you can write a correct/valid realloc using memcpy.
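One way to sketch this, under stated assumptions: rather than digging the size out of the underlying allocator's private bookkeeping (which is what the paragraph above recommends), the wrapper below records the size itself in a small header of its own. That is a different technique, chosen because it works without knowing the allocator's internals. The names wrap_malloc, wrap_free and wrap_realloc are hypothetical, and the standard malloc()/free() stand in for the underlying allocator; on FreeRTOS the calls to substitute would be pvPortMalloc() and vPortFree().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every block. The union with max_align_t
 * (C11) keeps the user pointer suitably aligned for any object type. */
typedef union {
    size_t      size;
    max_align_t align;
} block_header;

void *wrap_malloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(block_header)) {
        return NULL;                          /* size would overflow */
    }
    block_header *h = malloc(sizeof *h + n);  /* underlying allocator */
    if (h == NULL) {
        return NULL;
    }
    h->size = n;
    return h + 1;                             /* user data follows header */
}

void wrap_free(void *p)
{
    if (p != NULL) {
        free((block_header *)p - 1);          /* underlying deallocator */
    }
}

void *wrap_realloc(void *p, size_t n)
{
    if (p == NULL) {
        return wrap_malloc(n);
    }
    if (n == 0) {
        wrap_free(p);
        return NULL;
    }
    size_t old = ((block_header *)p - 1)->size;
    void *q = wrap_malloc(n);
    if (q == NULL) {
        return NULL;                          /* old block stays valid */
    }
    memcpy(q, p, old < n ? old : n);          /* never read past old block */
    wrap_free(p);
    return q;
}
```

The cost is one header per allocation, and every block must go through all three wrappers; mixing them with the raw allocator functions would corrupt the heap for exactly the reason given above.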
I have a large body of legacy code that I inherited. It has worked fine until now. Suddenly at a customer trial that I cannot reproduce inhouse, it crashes in malloc. I think that I need to add instrumentation e.g on top of malloc I have my own malloc that stores some meta information about each malloc e.g. who has made the malloc call. When it crashes, I can then look up the meta information and see what was happening. I had done something similar years ago but cannot recall it now...I am sure people have come up with better ideas. Will be glad to have inputs.
Thanks
Is memory allocation broken?
Try valgrind.
Malloc is still crashing.
Okay, I'm going to have to assume that you mean SIGSEGV (segmentation fault) is firing in malloc. This is usually caused by heap corruption. Heap corruption, which does not by itself cause a segmentation fault at the moment it happens, is usually the result of an array access outside of the array's bounds. This is usually nowhere near the point where you call malloc.
malloc stores a small header of information "in front of" the memory block that it returns to you. This information usually contains the size of the block and a pointer to the next block. Needless to say, changing either of these will cause problems. Usually, the next-block pointer is changed to an invalid address, and the next time malloc is called, it eventually dereferences the bad pointer and segmentation faults. Or it doesn't and starts interpreting random memory as part of the heap. Eventually its luck runs out.
Note that free can have the same thing happen, if the block being released or the free block list is messed up.
How you catch this kind of error depends entirely on how you access the memory that malloc returns. A malloc of a single struct usually isn't a problem; it's malloc of arrays that usually gets you. Using a negative (-1 or -2) index will usually give you the block header for your current block, and indexing past the array end can give you the header of the next block. Both are valid memory locations, so there will be no segmentation fault.
So the first thing to try is range checking. You mention that this appeared at the customer's site; maybe it's because the data set they are working with is much larger, or that the input data is corrupt (e.g. it says to allocate 100 elements and then initializes 101), or they are performing things in a different order (which hides the bug in your in-house testing), or doing something you haven't tested. It's hard to say without more specifics. You should consider writing something to sanity check your input data.
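As a rough illustration of range checking, here is a hypothetical checked store. The function name and return convention are invented for the example; legacy code would more likely wrap its existing array accesses in something similar.

```c
#include <stddef.h>

/* Hypothetical checked store: writes value to data[index] only when the
 * index lies inside the array. Returns 0 on success, -1 when the write
 * would have landed outside the array (for example on a malloc header). */
int checked_store(int *data, size_t length, size_t index, int value)
{
    if (data == NULL || index >= length) {
        return -1;
    }
    data[index] = value;
    return 0;
}
```

Because the index is a size_t, a negative index converted to size_t becomes a huge value and is rejected by the same comparison.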
Try Asan
AddressSanitizer (aka ASan) is a memory error detector for C/C++. It finds:
Use after free (dangling pointer dereference)
Heap buffer overflow
Stack buffer overflow
Global buffer overflow
Use after return
Use after scope
Initialization order bugs
Memory leaks
See these links to learn more about it and how to use it:
https://github.com/google/sanitizers/wiki/AddressSanitizer and
https://github.com/google/sanitizers/wiki/AddressSanitizerFlags
I know this is old, but issues like this will continue to exist as long as we have pointers. Although valgrind is the best tool for this purpose, it has a steep learning curve and often the results are too intimidating to understand.
Assuming you are working on some *nix, another tool I can suggest is Electric Fence. Quote:
Electric Fence helps you detect two common programming bugs:
software that overruns the boundaries of a malloc() memory allocation,
software that touches a memory allocation that has been released by free().
Unlike other malloc() debuggers, Electric Fence will detect read accesses
as well as writes, and it will pinpoint the exact instruction that causes
an error.
Usage is amazingly simple. Just link your code with one additional library: -lefence
When you run the application, a core file will be generated at the moment memory is corrupted, instead of later when the corrupted memory is used.
The problem with the current malloc function is that it does not keep track of the variable that stores the returned memory location. As such, fragmentation can occur because it is harder to move memory around.
The MMU can solve this only to a certain extent. Let's say that malloc instead took a double pointer and kept track of the variable. Calls to free would then be able to move memory around and update the stored memory location.
It is highly unlikely that I am the first to think about this so I am wondering if there is a standard C function that does this or POSIX function?
I understand that this idea is not perfect. The program would have to pass around the same variable instead of copying it however it does solve the issue of fragmentation which does matter to me as I work with low memory devices.
Of course, the realloc() function will, if necessary, move a previously allocated block of memory to a new (larger) location. However, it does not necessarily impact fragmentation.
My solution (in C) has been (in low-memory conditions) to do my own memory management.
Though not the same mechanism, the closest thing to what you are speaking of are smart pointers as implemented with the boost libraries. However, they are built for C++.
Smart pointers are 'smart' in that they don't hang around after they aren't needed (and you don't have to free them) so they avoid most of the fragmentation problems you cite.
Keeping track of which variable points to which memory location cannot be done by just passing a pointer to a pointer. What if the address of the allocated memory is copied to another variable?
C is unlike Java that keeps track of object references. In your case, you may be better off managing memory on your own by preallocating a large chunk of memory and splitting it as needed, keeping track of usage, in brief, implementing your own memory management.
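A minimal sketch of that kind of do-it-yourself management: a pool of fixed-size blocks carved out of one preallocated array. Because every block has the same size, the pool cannot fragment. The block count and block size are arbitrary assumptions, the function names are hypothetical, and the code is not thread safe.

```c
#include <stddef.h>

#define POOL_BLOCKS     8    /* assumed number of blocks */
#define POOL_BLOCK_SIZE 32   /* assumed payload bytes per block */

/* A block is either handed out (payload in use) or on the free list
 * (next in use), so the list link costs no extra memory. */
typedef union pool_block {
    union pool_block *next;
    max_align_t       align;                    /* keeps payload aligned */
    unsigned char     payload[POOL_BLOCK_SIZE];
} pool_block;

static pool_block  pool_storage[POOL_BLOCKS];   /* the preallocated chunk */
static pool_block *pool_free_list;
static int         pool_ready;

/* Thread every block onto the free list; runs once, on first use. */
static void pool_setup(void)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
    pool_ready = 1;
}

/* Returns a block of POOL_BLOCK_SIZE bytes, or NULL when none are left. */
void *pool_alloc(void)
{
    if (!pool_ready) {
        pool_setup();
    }
    pool_block *b = pool_free_list;
    if (b == NULL) {
        return NULL;
    }
    pool_free_list = b->next;
    return b;
}

/* Puts a block obtained from pool_alloc() back on the free list. */
void pool_release(void *p)
{
    if (p == NULL) {
        return;
    }
    pool_block *b = p;
    b->next = pool_free_list;
    pool_free_list = b;
}
```

The trade-off is wasted space when requests are smaller than the block size; several pools with different block sizes are a common refinement.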
The idea is not that attractive when you start thinking about implementation details. You can of course write a function that returns a double pointer. What about the intermediate pointer? Where should it be stored? The program itself clearly cannot store it, because the intermediate pointer should have exactly the same lifetime as the pointed-to memory. So the allocator itself should store and manage it. And the memory where the intermediate pointer lives is not movable. But the killer misfeature is that the scheme is not thread safe. As pointers are free to change under the hood, each pointer dereference now requires a lock.
From what I understand, because malloc dynamically assigns memory, you need to free that memory so that it can be used again.
What happens if you return a char* that was created using malloc (i.e. how are you supposed to free that)?
If you leave the pointer as it is and exit the application, will it be freed? (I can't find a definite answer on this; some say yes, some say no.)
The caller has to free it (or arrange for it to be freed). This means that functions that create and return resources need to document exactly how it should be freed.
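For illustration, a hypothetical function that returns malloc'd memory and documents the ownership rule in its comment. The name dup_upper is invented for the example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated upper-case copy of s, or NULL if the
 * allocation fails. Ownership passes to the caller, who must release
 * the result with free(). */
char *dup_upper(const char *s)
{
    size_t len = strlen(s);
    char *copy = malloc(len + 1);         /* +1 for the terminating NUL */
    if (copy == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        copy[i] = (char)toupper((unsigned char)s[i]);
    }
    copy[len] = '\0';
    return copy;
}
```

A caller would write char *p = dup_upper("abc"); use p, and then call free(p) exactly once.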
Most OSes will free the memory when the program exits, as part of the definition of a "process". The C standard doesn't care what happens, it's beyond the scope of the program. Not all OSes have a full process abstraction, but desktop-style OSes certainly do.
The main reasons to free it before that are:
If you free memory as soon as possible, often a long time before process exit, your program uses less memory total.
If you don't free it, and you later want to change your program into a routine within another program, that perhaps is called many times, then suddenly you require many times as much memory as before (memory leak).
There are debugging tools that will help you identify memory leaks, by warning you about memory that is still allocated when the program exits. These don't really help much if there's a lot of deliberately-leaked junk to wade through.
If you don't free it and you hit any problems, it's much harder to go back later and find all the memory that needs freeing, than it is to do it right in the first place.
There are so many cases where you do need to free the memory (to prevent huge memory use in long-running programs), that your default strategy must be to clean pretty much everything up anyway.
The vaguely plausible reasons not to free are:
Less code.
If you have squillions of blocks to free individually, immediately before program exit, then it might be much faster to let the OS drop the whole process.
Stuff which is created on demand and stored in globals might be quite difficult to clean up safely, if you don't know exactly where it's used. Think of some kind of cache that's populated as you go along, that might have MRU rules to limit how much memory it occupies, so it's not an unlimited leak. OK, so this is one bad thing (unrestricted globals) causing another bad thing (unfreed memory), but it's worth knowing about as a reason why you might see unfreed blocks in existing code, and you can't necessarily just go in and fix them.
The reasons for freeing almost always outweigh the reasons against.
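If you have a pointer to memory created by malloc, freeing that memory, using that pointer, will do the right thing. Yes, there is some magic involved, since free() is never told how big the block is; this is taken care of by the allocator in your C library, which keeps its own record of each block.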
Yes, if you ignore the memory freeing, and exit the application, the OS will release the memory. However, it's considered bad practice to leave it unfreed. The OS may not do the right thing (especially in embedded settings), or may not do it in a timely fashion. Also, if you're running your program continuously, you may end up consuming a growing amount of memory, eventually consuming it all, and running out of memory and crashing.
Yes. If you malloc, you need to free. You are guaranteeing memory leaks while your program is running if you don't free.
Free it.
Always.
Period.
Yes, every call to malloc() has to be matched with a call to free().
To answer your specific questions:
You have to explicitly document your API telling the user whether the returned pointer has to be free()'d
The OS will free all memory allocated to the process.
If you write the function yourself: Avoid doing that.
Instead, let the caller pass a buffer, let the caller specify the buffer's size and copy the data into that buffer. That way, you can use your function from other modules that don't use the same heap (other programming languages, different C runtime...)
If you for whatever reason can not use such an interface, specify in the function's documentation that the caller has to free the returned pointer after it is done with it.
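A sketch of the caller-supplied-buffer style described above. The function name format_label and the "item-%d" text are made up for the example; snprintf() does the bounded copy.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes a label such as "item-7" into the caller's buffer and never
 * writes more than `size` bytes. Returns the length the full label needs
 * (excluding the NUL); a return value >= size means it was truncated. */
int format_label(char *buf, size_t size, int id)
{
    return snprintf(buf, size, "item-%d", id);
}
```

No heap is involved, so the function can be called from code that uses a different allocator or runtime, and the caller decides where the buffer lives.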
If you are using a library function: Have a look at the documentation.
If the documentation states that you have to free, do so.
If the documentation states that you don't have to, there might be some global cleanup function that has to be called to free the module's resources.
Regarding your second question, freeing before exiting is recommended. Technically, skipping it won't hurt, but if you ever want to reuse your code in a bigger project, you will be thankful that you wrote the correct cleanup in the first place.
The C standard has no concept of the system environment outside of a single program's execution, so it cannot specify what happens "after the program exits". At the same time, nowhere does it make any requirement that memory obtained with malloc should or must be released with free before a call to exit or a return from main, and I think it's pretty clear that the intention is that exiting without manually freeing memory will not leave resources tied up - much like how calling exit without closing all files first automatically closes them (including flushing them).
Now, as for whether you should or should not call free, that depends a lot on your particular program.
Any library code should free any memory that it obtained purely for internal use as soon as possible.
A library which returns allocated objects to the calling program should always provide a corresponding call to free those objects.
A library which performs any allocations as part of a global initialization (note: this is a very bad design, but sometimes inevitable) should provide a way for the application to reverse that initialization and free everything that was allocated. This is especially important if the library might ever be loaded dynamically (even as a consequence of satisfying another dynamically-loaded library's dependencies).
So far I've only talked about library code. At this point, all that's left is allocations made by the application itself or on the application's behalf by libraries. My view, and I will admit that it is unorthodox, is that freeing such objects is not just unnecessary but harmful. The main reason I say this is that most long-lived applications will have accumulated quite a bit of allocated memory which they are not making significant use of (think of the undo buffer in a word processor or the history in a browser). On a moderately loaded system, much of this data has been swapped to disk by the time the application terminates. If you want to free it, you're going to end up walking all over swapped-out memory addresses tracking down all the pointers to free,
putting useless wear on the physical components of the hard drive
making the user wait for your application to exit
causing other still-in-use applications' data to get swapped out, making them run slower
All of this in the name of a ridiculous "you must free everything you allocate" rule.
For short-lived applications, it's not so much of a big deal, but you can often simplify the implementation of short-lived applications that perform a single linear task and exit if you don't bother freeing all the memory they allocate. Think of most unix command line utilities. Is there any use to writing the loops for sed to free all its compiled regular expressions before exiting? Couldn't programmers' time be spent on something more productive?
1) The same way you'd free the memory normally, i.e.
p = func();
//...
free(p);
The trick is in making sure that you always do it...
2) Generally speaking, yes. But you should still free any memory you use as good practice. Not spending the time to figure out where to free the memory is just being lazy.
Let's take those one point at a time...
If you return a char * that you know was created with malloc, then yes, it is your responsibility to free that. You can do that with free(myCharPtr).
The OS will claim the memory back, and it won't be lost forever, but there's technically no guarantee that it will be reclaimed right when the application dies. That just depends on the operating system.
I wouldn't go so far as to say every malloc must be freed, but I would say that, no matter how long a program runs, there must be a bounded number of allocations (and total size) that won't be freed. The number need not be a static constant, but it must be specifiable in terms of something else (e.g. this program processes widgets; it will allocate one 64-byte struct for each quizzix in the largest widget). One may not know beforehand the size of the largest widget, but if e.g. one knows that the temporary storage required to process a widget is proportional to the square of its size, one might safely infer that the largest widget will be small enough that the total amount of memory stranded will be pretty slight.
I read somewhere that it is disastrous to use free to get rid of an object not created by calling malloc, is this true? why?
That's undefined behavior - never try it.
Let's see what happens when you try to free() an automatic variable. The heap manager will have to deduce how to take ownership of the memory block. To do so, it will either have to use some separate structure that lists all allocated blocks, which is very slow and rarely used, or hope that the necessary data is located near the beginning of the block.
The latter is used quite often, and here's how it is supposed to work. When you call malloc(), the heap manager allocates a slightly bigger block, stores service data at the beginning, and returns an offset pointer. Something like:
void* malloc( size_t size )
{
    void* block = tryAlloc( size + sizeof( size_t) );
    if( block == 0 ) {
        return 0;
    }
    // the following is for illustration, more service data is usually written
    *((size_t*)block) = size;
    return (size_t*)block + 1;
}
Then free() will try to access that data by offsetting the passed pointer. But if the pointer is to an automatic variable, whatever data happens to be there will sit where free() expects to find the service data. Hence undefined behavior. Often the service data is also modified by free() so that the heap manager can take ownership of the block, so if the pointer passed is to an automatic variable, some unrelated memory will be both read and modified.
Implementations may vary but you should never make any specific assumptions. Only call free() on addresses returned by malloc() family functions.
By the standard, it's "undefined behavior" - i.e. "anything can happen". That's usually bad things, though.
In practice: free'ing a pointer means modifying the heap. The C runtime virtually never validates whether the pointer passed comes from the heap; that would be too costly in either time or memory. Combine these two factoids, and you get "free(non-malloced-ptr) will write something somewhere". The result may be some of "your" data modified behind your back, an access violation, or trashing vital runtime structures, such as a return address on the stack.
Example: A disastrous scenario:
Your heap is implemented as a simple list of free blocks. malloc means removing a suitable block from the list, free means adding it to the list again. (a typical if trivial implementation)
You free() a pointer to a local variable on the stack. You are "lucky" because the modification goes into irrelevant stack space. However, part of the stack is now on your free list.
Because of the allocator design and your allocation patterns, malloc is unlikely to return this block soon. Later, in a completely unrelated part of the program, you actually do get this block as a malloc result. Writing to it trashes some local variables up the stack, and on return some vital pointer contains garbage and your app crashes. Symptoms, repro and location are completely unrelated to the actual cause.
Debug that.
It is undefined behaviour. And logically, if behaviour is undefined, you cannot be sure what has happened, and if the program is still operating properly.
Some people have pointed out here that this is "undefined behavior". I'm going to go farther and say that on some implementations, this will either crash your program or cause data corruption. It has to do with how "malloc" and "free" are implemented.
One possible way to implement malloc/free is to put a small header before each allocated region. On a malloc'd region, that header would contain the size of the region. When the region is freed, that header is checked and the region is added to the appropriate freelist. If this happens to you, this is bad news. For example, if you free an object allocated on the stack, suddenly part of the stack is in the freelist. Then malloc might return that region in response to a future call, and you'll scribble data all over your stack. Another possibility is that you free a string constant. If that string constant is in read-only memory (it often is), this hypothetical implementation would cause a segfault and crash either after a later malloc or when free adds the object to its freelist.
This is a hypothetical implementation I am talking about, but you can use your imagination to see how it could go very, very wrong. Some implementations are very robust and are not vulnerable to this precise type of user error. Some implementations even allow you to set environment variables to diagnose these types of errors. Valgrind and other tools will also detect these errors.
Strictly speaking, this is not true. calloc() and realloc() are valid object sources for free(), too. ;)
Please have a look at what undefined behavior means. malloc() and free() on a conforming hosted C implementation are built to standards. The standards say the behavior of calling free() on a heap block that was not returned by malloc() (or something wrapping it, e.g. calloc()) is undefined.
This means, it can do whatever you want it to do, provided that you make the necessary modifications to free() on your own. You won't break the standard by making the behavior of free() on blocks not allocated by malloc() consistent and even possibly useful.
In fact, there could be platforms that (themselves) define this behavior. I don't know of any, but there could be some. There are several garbage-collecting / logging malloc() implementations that might let it fail more gracefully while logging the event. But that's implementation-defined, not standards-defined, behavior.
Undefined simply means don't count on any kind of consistent behavior unless you implement it yourself without breaking any defined behavior. Finally, implementation defined does not always mean defined by the host system. Many programs link against (and ship) uclibc. In that case, the implementation is self contained, consistent and portable.
It would certainly be possible for an implementation of malloc/free to keep a list of the memory blocks that have been allocated and, if the user tries to free a block that isn't in this list, do nothing.
However, since the standard says that this isn't a requirement, most implementations will treat all pointers coming into free as valid.
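A rough sketch of such a list-keeping allocator, layered on top of the standard malloc()/free() rather than replacing them. The names tracked_malloc and tracked_free are hypothetical, the lookup is a slow linear search, and the code is not thread safe.

```c
#include <stdlib.h>

/* One list node per live allocation. */
typedef struct track_node {
    void              *block;
    struct track_node *next;
} track_node;

static track_node *track_head;

/* Allocates a block and records it. Returns NULL on failure (a zero-size
 * request for which malloc() returns NULL is also reported as failure). */
void *tracked_malloc(size_t size)
{
    track_node *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->block = malloc(size);
    if (node->block == NULL) {
        free(node);
        return NULL;
    }
    node->next = track_head;
    track_head = node;
    return node->block;
}

/* Frees p only if tracked_malloc() handed it out and it is still live.
 * Returns 1 when the block was freed, 0 when p was unknown and ignored. */
int tracked_free(void *p)
{
    for (track_node **link = &track_head; *link != NULL;
         link = &(*link)->next) {
        if ((*link)->block == p) {
            track_node *node = *link;
            *link = node->next;           /* unlink before releasing */
            free(node->block);
            free(node);
            return 1;
        }
    }
    return 0;
}
```

Passing the address of a local variable to tracked_free() returns 0 and touches nothing, which is the "do nothing" behavior described above, paid for with a list walk on every free.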