Avoiding malloc/free Overhead in structure allocations in C


I am reading this book on pointers and experimenting with its examples:
http://shop.oreilly.com/product/0636920028000.do
In chapter 6, under the heading "Avoiding malloc/free Overhead",
the author suggests how to avoid malloc/free overhead when doing lots of structure memory allocations/deallocations.
Below is the way he wrote the functions:
#define LIST_SIZE 10
Person *list[LIST_SIZE];
void initializeList()
{
int i=0;
for(i=0; i<LIST_SIZE; i++)
{
list[i] = NULL;
}
}
Person *getPerson()
{
int i=0;
for(i=0; i<LIST_SIZE; i++)
{
if(list[i] != NULL)
{
Person *ptr = list[i];
list[i] = NULL;
return ptr;
}
}
Person *person = (Person*)malloc(sizeof(Person));
return person;
}
void deallocatePerson(Person *person)
{
free(person->firstName);
free(person->lastName);
free(person->title);
}
Person *returnPerson(Person *person)
{
int i=0;
for(i=0; i<LIST_SIZE; i++)
{
if(list[i] == NULL)
{
list[i] = person;
return person;
}
}
deallocatePerson(person);
free(person);
return NULL;
}
What I understood from his code is that he creates a memory pool: an array of pointers to struct Person, with each element initialized to NULL.
Next we get memory from the pool using the getPerson function. This function checks for != NULL, which I think will fail every time. So it will be the same as calling malloc, and memory never gets assigned from the pool.
Is my understanding correct?
Is this the way to handle the overhead?
What would be the correct way to do it? Any source/link would be appreciated.

Next we will get a memory from pool using getPerson function. This function, checks against !=NULL which I think will fail every time.
The check will fail every time as long as you continue calling getPerson repeatedly. However, if you do a mixture of getPerson and returnPerson, some NULL checks will succeed, because returnPerson puts non-NULL values into the array.
This observation is key to understanding the approach: the array serves as a small temporary storage for struct Person blocks that have been allocated with malloc, but are no longer in use. Rather than calling malloc again, your code grabs an available block from this special list, if there is one available.
In situations when you make thousands of allocations, but never keep more than LIST_SIZE objects active at any given time, the number of malloc calls is limited to LIST_SIZE.
Is this the way to handle overhead?
This is a variation on using lookaside lists, an optimization technique so important that Microsoft created an API for its use in driver code. A simpler approach would use Person *list[LIST_SIZE] as a stack of released blocks, i.e. with the index of the last released block and no loop.
Another approach would be to set up a linked list of such blocks, reusing the memory of the block itself to store the next pointer. This technique may be too complex for an introductory book, though.
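The stack variant mentioned above could look roughly like this. It is only a sketch under assumptions: Person is simplified to a struct with no owned strings, and pool, pool_top, get_person and return_person are names made up for this illustration, not the book's code.

```c
#include <stdlib.h>

#define LIST_SIZE 10

/* Simplified stand-in for the book's Person (assumption: no owned strings). */
typedef struct {
    int id;
} Person;

static Person *pool[LIST_SIZE]; /* released blocks, used as a stack */
static int pool_top = 0;        /* number of blocks currently cached */

/* Pop a cached block if one exists; otherwise fall back to malloc. */
Person *get_person(void)
{
    if (pool_top > 0)
        return pool[--pool_top];
    return malloc(sizeof(Person));
}

/* Push the block onto the cache, or free it when the cache is full. */
void return_person(Person *person)
{
    if (person == NULL)
        return;
    if (pool_top < LIST_SIZE)
        pool[pool_top++] = person;
    else
        free(person);
}
```

Both operations are constant time because the index of the last released block replaces the linear search.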

First of all, what overhead is the author referring to? For dynamic memory allocation we call malloc to allocate memory and free to release it. During this process the allocator has to search the heap for available memory, and may have to ask the operating system for more. To avoid this overhead, he is suggesting that if you know roughly how often a struct will be dynamically allocated, you can reserve a pool of memory up front when your application loads, which significantly reduces the allocation and deallocation overhead. That is true to a certain extent: if your server is already running a lot of applications and the processors are very busy, you can go for this kind of approach. But there are drawbacks as well. You have reserved a pool of memory from your heap in advance, and if it is not utilized properly, this leads to poor memory management.

I think the point of the example might be that a Person object holds pointers to additional memory. We can see from the deallocatePerson function that there are 3 pointers to strings inside the struct:
void deallocatePerson(Person *person)
{
free(person->firstName);
free(person->lastName);
free(person->title);
}
It means that to construct a complete Person you need several calls to malloc (1 for the struct itself and 3 for the strings).
So by saving a complete struct, including its strings, one getPerson call replaces four calls to malloc. That makes it likely to save some execution time.
Otherwise, I would not be surprised if malloc/free internally holds a similar array or linked list of recently used memory blocks ready to be recycled. If you have just free'd a memory block of the correct size, a new call to malloc will likely locate that block very fast.
Had Person been a simple struct without pointers to additional storage, the local caching would be less likely to improve performance (and might instead add overhead through the linear search).

Related

Why does malloc need to be used for dynamic memory allocation in C?

I have been reading that malloc is used for dynamic memory allocation. But if the following code works...
int main(void) {
int i, n;
printf("Enter the number of integers: ");
scanf("%d", &n);
// Dynamic allocation of memory?
int int_arr[n];
// Testing
for (int i = 0; i < n; i++) {
int_arr[i] = i * 10;
}
for (int i = 0; i < n; i++) {
printf("%d ", int_arr[i]);
}
printf("\n");
}
... what is the point of malloc? Isn't the code above just a simpler-to-read way to allocate memory dynamically?
I read on another Stack Overflow answer that if some sort of flag is set to "pedantic", then the code above would produce a compile error. But that doesn't really explain why malloc might be a better solution for dynamic memory allocation.

Look up the concepts of stack and heap; there are a lot of subtleties around the different types of memory. Local variables inside a function live on the stack and only exist within the function.
In your example, int_arr only exists until the function it is defined in returns. You can pass it down to functions you call while it is still alive, but you couldn't return int_arr and expect it to work.
malloc() is used when you want to create a chunk of memory which exists on the heap. malloc returns a pointer to this memory. This pointer can be passed around as a variable (eg returned) from functions and can be used anywhere in your program to access your allocated chunk of memory until you free() it.
Example:
```c
#include <stdlib.h>

int *make_array(int length);
int *make_array_wrong(int length);

int main(int argc, char **argv){
    int length = 10;
    int *built_array = make_array(length); //malloc memory and pass heap pointer
    int *array = make_array_wrong(length); //will not work. Array in function was in stack and no longer exists when function has returned.
    built_array[3] = 5; //ok
    array[3] = 5; //bad: undefined behaviour, shown only to illustrate the mistake
    free(built_array);
    return 0;
}
int *make_array(int length){
    int *my_pointer = malloc(length * sizeof(int));
    //do some error checking for real implementation
    return my_pointer;
}
int *make_array_wrong(int length){
    int array[length];
    return array; //wrong: returns the address of a local array
}
```
Note:
There are plenty of ways to avoid having to use malloc at all, by pre-allocating sufficient memory in the callers, etc. This is recommended for embedded and safety critical programs where you want to be sure you'll never run out of memory.

Just because something looks prettier does not make it a better choice.
VLAs have a long list of problems, not the least of which is that they are not a sufficient replacement for heap-allocated memory.
The primary -- and most significant -- reason is that VLAs are not persistent dynamic data. That is, once your function terminates, the data is reclaimed (it exists on the stack, of all places!), meaning any other code still hanging on to it is SOL.
Your example code doesn't run into this problem because you aren't using it outside of the local context. Go ahead and try to use a VLA to build a binary tree, then add a node, then create a new tree and try to print them both.
The next issue is that the stack is not an appropriate place to allocate large amounts of dynamic data -- it is for function frames, which have a limited space to begin with. The global memory pool, OTOH, is specifically designed and optimized for this kind of usage.
It is good to ask questions and try to understand things. Just be careful that you don't believe yourself smarter than the many, many people who took what now is nearly 80 years of experience to design and implement systems that quite literally run the known universe. Such an obvious flaw would have been immediately recognized long, long ago and removed before either of us were born.
VLAs have their place, but it is, alas, small.

Declaring local variables takes the memory from the stack. This has two ramifications.
That memory is destroyed once the function returns.
Stack memory is limited, and is used for all local variables, as well as function return addresses. If you allocate large amounts of memory, you'll run into problems. Only use it for small amounts of memory.

When you have the following in your function code:
int int_arr[n];
It means you allocated space in the function's stack frame; once the function returns, that space ceases to exist.
Imagine a use case where you need to return a data structure to a caller, for example:
Car* create_car(const char *model, const char *make)
{
Car* new_car = malloc(sizeof(*new_car));
...
return new_car;
}
Now, once the function finishes, you will still have your car object, because it was allocated on the heap.

The memory allocated by int int_arr[n] is reserved only until execution of the routine ends (when it returns or is otherwise terminated, as by longjmp). That means you cannot allocate things in one order and free them in another. You cannot allocate a temporary work buffer, use it while computing some data, then allocate another buffer for the results, and free the temporary work buffer. To free the work buffer, you have to return from the function, and then the result buffer will be freed too.
With automatic allocations, you cannot read from a file, allocate records for each of the things read from the file, and then delete some of the records out of order. You simply have no dynamic control over the memory allocated; automatic allocations are forced into a strictly last-in first-out (LIFO) order.
You cannot write subroutines that allocate memory, initialize it and/or do other computations, and return the allocated memory to their callers.
(Some people may also point out that the stack memory commonly used for automatic objects is commonly limited to 1-8 mebibytes while the memory used for dynamic allocation is generally much larger. However, this is an artifact of settings selected for common use and can be changed; it is not inherent to the nature of automatic versus dynamic allocation.)
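To make the work-buffer scenario concrete, here is a small sketch using malloc. The function name compute_squares and what it computes are invented for the illustration; the point is only the order of allocation and release.

```c
#include <stdlib.h>

/* Returns a heap array of n squares, or NULL on failure. Caller frees it. */
int *compute_squares(size_t n)
{
    int *work = malloc(n * sizeof *work);     /* temporary work buffer */
    if (work == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        work[i] = (int)i;

    int *result = malloc(n * sizeof *result); /* allocated after work */
    if (result != NULL)
        for (size_t i = 0; i < n; i++)
            result[i] = work[i] * work[i];

    free(work);    /* released first, although it was allocated first */
    return result; /* outlives this function */
}
```

With automatic storage neither step is possible: the work buffer could not be released before the result, and the result could not be handed back to the caller.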

If the allocated memory is small and used only inside the function, malloc is indeed unnecessary.
If the memory amount is extremely large (usually MB or more), the above example may cause stack overflow.
If the memory is still used after the function has returned, you need malloc or a global variable (static allocation).
Note that dynamic allocation through local variables as above (a variable-length array) may not be supported by some compilers.

Passing struct by reference vs after malloc

What is the difference between these two approaches in terms of data visibility and memory overhead (any other differences would be great as well):
Passing a local struct by reference
Passing a pointer to the allocated memory
typedef struct student_data_t_ {
char name[30];
int score;
} student_data_t;
void fill_values (student_data_t *data) {
snprintf(data->name, 30, "Howard");
data->score = 20;
return;
}
int main (void) {
student_data_t record;
student_data_t *ptr = NULL;
fill_values(&record); // <1> passing the struct by reference
ptr = (student_data_t *)malloc(sizeof(student_data_t));
if (!ptr) {
printf("NOMEM");
return 0;
}
fill_values(ptr); // <2> passing after allocating memory
if (ptr) {
free(ptr);
}
return 0;
}

For a local variable the memory overhead will exist entirely in the stack for that thread. If you use a large structure in say an embedded system, this can become a concern, as you risk a stack overflow. Generally you will use only the number of bytes requested, but stack space can be limited. (Some applications I work in have a stack of 512 bytes or less)
In the case of using malloc, you are allocating memory from the heap. This avoids the stack size concerns, but adds the requirement that you free the memory when you are done.
Visibility is determined by the variable you store the pointer in.
It is very dangerous to pass a local variable to a separate thread, and it can lead to undefined behavior if the local variable becomes invalid, say due to the function returning.

Passing a local structure is bound to be faster, since the program can do groovy things at compilation time. From a machine code point of view, a local structure is effectively at a fixed offset in the stack frame.
When you start to use malloc there is bound to be a processing overhead, and there is also a space issue. Although both structures are the same, malloc will probably "use" more memory than sizeof(struct) just to store the data. malloc also keeps bookkeeping data recording the size of each allocated block, which is what allows free to need only an address as its parameter.
One of the biggest issues is development time: introducing malloc and free to a program increases the chance of bugs, especially segmentation faults, not to mention the hard-to-track-down "invisible" bug of a memory leak.
But malloc and calloc are the usual way to deal with user input of unknown size: you never know how much data users are going to enter, and a fixed text input buffer of, say, 2 KB can easily turn out to be too small.

malloc checkpoints

I am sure someone must have implemented something like this already!
What I am looking for is the ability to "checkpoint" the heap state and then clear all allocations that have happened since the last checkpoint.
Basically what I am looking for is a natural corollary of the _CrtMemCheck APIs.
Something like this (preferably cross-platform):
//we save the heap state here in s1
_CrtMemCheckpoint( &s1 );
//allocs and frees
//Get rid of all allocs since checkpoint s1 that have not been freed!
_CrtMemClearAllObjectsSince(&s1);

There is no standard way to use mark/release memory allocation in C. If you know for a fact that all malloc/free calls will be used in a LIFO fashion, you may be able to link in your own malloc/free functions using something like the following:
#define MY_HEAP_SIZE 12345678
unsigned char my_mem[MY_HEAP_SIZE];
unsigned char *my_alloc_ptr = my_mem;
void *malloc(size_t size)
{
void *ret = my_alloc_ptr;
if (size <= MY_HEAP_SIZE && ((my_alloc_ptr - my_mem)+size) <= MY_HEAP_SIZE)
{
my_alloc_ptr += size;
return (void*)ret;
}
else
return (void*)0;
}
void free(void *ptr)
{
if (ptr)
my_alloc_ptr = ptr;
}
This approach requires zero bytes of overhead per allocation block, but calling free() on any block will also free all blocks that were allocated later. An alternative approach, which could be used if the external code doesn't use malloc/free in LIFO order but it would be okay if blocks aren't freed until your code does so, would be to make free() do nothing, but have some other function which behaves like free above. More sophisticated variations are possible as well, but in cases where the first approach will suffice, there's no beating its efficiency. Very nice for embedded systems (though I'd usually call it something other than malloc).
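The alternative described above, with a separate release function instead of free, might be sketched as follows. The names arena_alloc, arena_mark and arena_release and the 4096-byte size are assumptions for the example. Unlike the code above, this sketch rounds each request up so that later blocks stay aligned, which costs a few bytes per block. It needs a C11 compiler for _Alignas and max_align_t.

```c
#include <stddef.h>

#define ARENA_SIZE 4096

static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Bump-allocate size bytes, rounded up so the next block stays aligned. */
void *arena_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    if (size > ARENA_SIZE)
        return NULL;
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > ARENA_SIZE - arena_used)
        return NULL;
    void *p = &arena[arena_used];
    arena_used += rounded;
    return p;
}

/* Checkpoint: remember how much of the arena is in use. */
size_t arena_mark(void)
{
    return arena_used;
}

/* Release every block allocated since the given checkpoint. */
void arena_release(size_t mark)
{
    if (mark <= arena_used)
        arena_used = mark;
}
```

Usage would be: take a mark, allocate freely, then call arena_release with that mark to drop everything allocated since.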

You can modify malloc()/free() using hooks to remember allocated memory (for example, suppose that you record each new pointer in an array of pointers). Then you can have two functions:
int get_checkpoint(), that returns the next free array index,
void free_until(int checkpoint), that frees memory from the current stored pointer in the array backwards, until checkpoint is reached.
This way, you can do:
int cpoint = get_checkpoint();
LibraryDoSomething();
free_until(cpoint);
Of course, this technique is still dangerous; calling a C library function can have side effects (such as internal allocations it expects to keep) that you can easily break. The best advice is still that of Amardeep.
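A rough sketch of the two functions, assuming explicit wrapper functions rather than real malloc hooks (so memory that a library allocates by calling malloc directly is not tracked). tracked_malloc, tracked_free and MAX_TRACKED are hypothetical names introduced here.

```c
#include <stdlib.h>

#define MAX_TRACKED 1024

static void *tracked[MAX_TRACKED]; /* pointers in allocation order */
static int tracked_count = 0;

/* Allocate and record the pointer; fails if the table is full. */
void *tracked_malloc(size_t size)
{
    if (tracked_count == MAX_TRACKED)
        return NULL;
    void *p = malloc(size);
    if (p != NULL)
        tracked[tracked_count++] = p;
    return p;
}

/* Free one block early (p must come from tracked_malloc) and clear
   its slot so that free_until does not free it a second time. */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    for (int i = tracked_count - 1; i >= 0; i--) {
        if (tracked[i] == p) {
            tracked[i] = NULL;
            break;
        }
    }
    free(p);
}

/* Returns the next free array index. */
int get_checkpoint(void)
{
    return tracked_count;
}

/* Free everything recorded since the checkpoint, newest first. */
void free_until(int checkpoint)
{
    if (checkpoint < 0)
        checkpoint = 0;
    while (tracked_count > checkpoint)
        free(tracked[--tracked_count]); /* free(NULL) is a no-op */
}
```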

Another possible and interesting solution could be the use of LD_PRELOAD. As the man page for LD_PRELOAD states "This can be used to selectively override functions in other shared libraries."
Thus, you can have your own implementations of malloc and free wherein you can implement the required checks and then call the default malloc or free.
You can check the details here: http://somethingswhichidintknow.blogspot.com/2009/10/dll-injection.html

What are the main things that should be considered while deallocating the memory in C?

I tried to find a tutorial that explicitly explains what needs to be kept in mind while deallocating memory, but I could not find one. Can anybody let me know the principal things that a programmer should keep in mind while deallocating memory in C? I am currently dealing with linked lists. There are some cases where a new linked list is created from two or more existing linked lists. For example:
list l1;
list l2;
list l3 = list_append(l1,l2);
list l4 = list_append(l3,l1);
list l5 = list_append(l3,l4);
What is the sequence of deallocation that I have to follow to deallocate the memory?
Here list_append is a function that returns a copy of the list.

When using the malloc/free family of functions there are two rules to be obeyed.
You can only free valid memory returned by a malloc family allocator, and freeing it renders it invalid (so double freeing is an error as is freeing memory not obtained from malloc).
It is an error to access memory after it has been freed.
And here is the important part: the allocator provides no facilities to help you obey these rules.
You have to manage it yourself. This means that you have to arrange the logic of your program to ensure that these rules are always followed. This is C, and huge amounts of tedious and complex responsibility are dumped on your shoulders.
Let me suggest two patterns that are fairly safe:
Allocate and free in the same context
//...
{
SomeData *p = malloc(sizeof(SomeData));
if (!p) { /* handle failure to allocate */ }
// Initialize p
// use p various ways
// free any blocks allocated and assigned to members of p
free(p);
}
//...
Here you know that the data p points to is allocated once and freed once and only used in between. If the initialization and freeing of SomeData's contents are non-trivial you should wrap them up in a couple of functions so that this reduces to
//...
{
SomeData *p = NewSomeData(i,f,"name"/*,...*/); // this handles initialization
if (!p) { /* handle failure to allocate */ }
// use p various ways
ReleaseSomeData(p); // this handles freeing any blocks
// allocated and assigned to members of p
}
//...
Call this one "Scope Ownership". You'll note that it is not much different from local automatic variables and provides you with only a few options not available with automatic variables.
Call the second option "Structure Ownership": Here responsibility for deleting the allocated block is handed to a larger structure:
List L = NewList();
//...
while (something) {
// ...
Node n= NewNode(nodename);
if (!n) { /* handle failure to allocate */ }
ListAdd(L,n); // <=== Here the list takes ownership of the
// node and you should only access the node
// through the list.
n = NULL; // This is not a memory leak because L knows where the new block is, and
// deleting the knowledge outside of L prevents you from breaking the
// "access only through the structure" constraint.
//...
}
// Later
RemoveListNode(L,key); // <== This routine manages the deletion of one node
// found using key. This is why keeping a separate copy
// of n to access the node would have been bad (because
// your separate copy won't get notified that the block
// is no longer valid).
// much later
ReleaseList(L); // <== Takes care of deleting all remaining nodes
Given that you have a list with nodes to be added and removed, you might consider the Structure Ownership pattern, so remember: once you give the node to the structure you only access it through the structure.

The question in general makes little sense, the only reasonable answer seems rather obvious:
That the memory was dynamically allocated in the first instance
That following deallocation you do not attempt to use the memory again
That you maintain at least one pointer to the allocation until you need to deallocate it (i.e. don't let your only reference to the block go out of scope or be destroyed).
This second requirement can be assisted by setting the pointer to NULL or zero after deallocation, but the pointer may be held elsewhere too, so it is not foolproof.
The third requirement is particularly an issue in complex data structures where the allocated memory may contain structures that themselves contain pointers to allocated memory. You will of course need to deallocate these prior to deallocating the higher level structure.
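As an illustration of that ordering, here is a minimal sketch with a made-up Record type that owns one separately allocated string; record_new and record_free are hypothetical helper names.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *name;  /* separately allocated string owned by the record */
    int score;
} Record;

/* Build a record; returns NULL if either allocation fails. */
Record *record_new(const char *name, int score)
{
    Record *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->name = malloc(strlen(name) + 1);
    if (r->name == NULL) {
        free(r); /* do not leak the outer block on partial failure */
        return NULL;
    }
    strcpy(r->name, name);
    r->score = score;
    return r;
}

/* Free the inner block first; after free(r), r->name must not be read. */
void record_free(Record *r)
{
    if (r == NULL)
        return;
    free(r->name);
    free(r);
}
```

Freeing r before r->name would leave no valid way to reach the string, so it would leak.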

Can anybody let me know what are the principal things that a programmer should keep in mind while deallocating the memory in C.
The basic principles are pretty straightforward: any memory allocated using the *alloc family of functions, including malloc, calloc or realloc, must be deallocated by a corresponding call to free().
When passing a pointer (memory address) to free(), keep in mind that the only valid memory addresses you can pass to free() are memory addresses which were previously returned by one of the *alloc functions. Once a memory address has been passed to free(), that memory address is no longer valid and cannot be used for any other purpose.

The first principle is:
Whatever you allocate (with calloc / malloc) you need to free eventually.
In your case, if lists are deep copied on every append, I don't see what the problem is. You need to free every list separately.
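To show what "deep copied" means here, a sketch of a copying list_append is given below. The node layout and the helpers list_push and list_free are assumptions, since the question does not show its list type.

```c
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node;

/* Free every node of one list. */
void list_free(node *head)
{
    while (head != NULL) {
        node *next = head->next;
        free(head);
        head = next;
    }
}

/* Push a value on the front; returns the new head, or NULL on failure. */
node *list_push(node *head, int value)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = head;
    return n;
}

/* Return a new list holding copies of a's nodes followed by b's nodes. */
node *list_append(const node *a, const node *b)
{
    node *head = NULL;
    node **tail = &head;
    const node *src[2] = { a, b };
    for (int i = 0; i < 2; i++) {
        for (const node *p = src[i]; p != NULL; p = p->next) {
            node *n = malloc(sizeof *n);
            if (n == NULL) {
                list_free(head); /* undo the partial copy */
                return NULL;
            }
            n->value = p->value;
            n->next = NULL;
            *tail = n;
            tail = &n->next;
        }
    }
    return head;
}
```

Because no list shares nodes with another, l1 through l5 can each be passed to list_free exactly once, in any order.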

Use valgrind(1) to show you, in your particular case, what objects still exist at termination and then modify your code to ensure that unnecessary ones are freed.

Well the best answer is a question.. why the heck are you writing code in C instead of using a higher level language with better memory management?

C Memory Management

I've always heard that in C you have to really watch how you manage memory. And I'm still beginning to learn C, but thus far, I have not had to do any memory managing related activities at all.. I always imagined having to release variables and do all sorts of ugly things. But this doesn't seem to be the case.
Can someone show me (with code examples) an example of when you would have to do some "memory management" ?

There are two places where variables can be put in memory. When you create a variable like this:
int a;
char c;
char d[16];
The variables are created in the "stack". Stack variables are automatically freed when they go out of scope (that is, when the code can't reach them anymore). You might hear them called "automatic" variables, but that has fallen out of fashion.
Many beginner examples will use only stack variables.
The stack is nice because it's automatic, but it also has two drawbacks: (1) The compiler needs to know in advance how big the variables are, and (2) the stack space is somewhat limited. For example: in Windows, under default settings for the Microsoft linker, the stack is set to 1 MB, and not all of it is available for your variables.
If you don't know at compile time how big your array is, or if you need a big array or struct, you need "plan B".
Plan B is called the "heap". You can usually create variables as big as the Operating System will let you, but you have to do it yourself. Earlier postings showed you one way you can do it, although there are other ways:
int size;
// ...
// Set size to some value, based on information available at run-time. Then:
// ...
char *p = (char *)malloc(size);
(Note that variables in the heap are not manipulated directly, but via pointers)
Once you create a heap variable, the problem is that the compiler can't tell when you're done with it, so you lose the automatic releasing. That's where the "manual releasing" you were referring to comes in. Your code is now responsible to decide when the variable is not needed anymore, and release it so the memory can be taken for other purposes. For the case above, with:
free(p);
What makes this second option "nasty business" is that it's not always easy to know when the variable is not needed anymore. Forgetting to release a variable when you don't need it will cause your program to consume more memory than it needs to. This situation is called a "leak". The "leaked" memory cannot be used for anything until your program ends and the OS recovers all of its resources. Even nastier problems are possible if you release a heap variable by mistake before you are actually done with it.
In C and C++, you are responsible to clean up your heap variables like shown above. However, there are languages and environments such as Java and .NET languages like C# that use a different approach, where the heap gets cleaned up on its own. This second method, called "garbage collection", is much easier on the developer but you pay a penalty in overhead and performance. It's a balance.
(I have glossed over many details to give a simpler, but hopefully more leveled answer)

Here's an example. Suppose you have a strdup() function that duplicates a string:
char *strdup(char *src)
{
char * dest;
dest = malloc(strlen(src) + 1);
if (dest == NULL)
abort();
strcpy(dest, src);
return dest;
}
And you call it like this:
int main(void)
{
char *s;
s = strdup("hello");
printf("%s\n", s);
s = strdup("world");
printf("%s\n", s);
}
You can see that the program works, but you have allocated memory (via malloc) without freeing it up. You have lost your pointer to the first memory block when you called strdup the second time.
This is no big deal for this small amount of memory, but consider the case:
for (i = 0; i < 1000000000; ++i) /* billion times */
s = strdup("hello world"); /* 11 characters + NUL = 12 bytes */
You have now used up about 12 GB of memory (possibly more, depending on your memory manager), and if your process has not crashed it is probably running pretty slowly.
To fix, you need to call free() for everything that is obtained with malloc() after you finish using it:
s = strdup("hello");
free(s); /* now not leaking memory! */
s = strdup("world");
...
Hope this example helps!

You have to do "memory management" when you want to use memory on the heap rather than the stack. If you don't know how large to make an array until runtime, then you have to use the heap. For example, you might want to store something in a string, but don't know how large its contents will be until the program is run. In that case you'd write something like this:
char *string = malloc(stringlength); // stringlength is the number of bytes to allocate
// Do something with the string...
free(string); // Free the allocated memory

I think the most concise way to answer the question is to consider the role of the pointer in C. The pointer is a lightweight yet powerful mechanism that gives you immense freedom at the cost of immense capacity to shoot yourself in the foot.
In C the responsibility of ensuring your pointers point to memory you own is yours and yours alone. This requires an organized and disciplined approach, unless you forsake pointers, which makes it hard to write effective C.
The posted answers to date concentrate on automatic (stack) and heap variable allocations. Using stack allocation does make for automatically managed and convenient memory, but in some circumstances (large buffers, recursive algorithms) it can lead to the horrendous problem of stack overflow. Knowing exactly how much memory you can allocate on the stack is very dependent on the system. In some embedded scenarios a few dozen bytes might be your limit, in some desktop scenarios you can safely use megabytes.
Heap allocation is less inherent to the language. It is basically a set of library calls that grants you ownership of a block of memory of given size until you are ready to return ('free') it. It sounds simple, but is associated with untold programmer grief. The problems are simple (freeing the same memory twice, or not at all [memory leaks], not allocating enough memory [buffer overflow], etc) but difficult to avoid and debug. A highly disciplined approach is absolutely mandatory in practice but of course the language doesn't actually mandate it.
I'd like to mention another type of memory allocation that's been ignored by other posts. It's possible to statically allocate variables by declaring them outside any function. I think in general this type of allocation gets a bad rap because it's used by global variables. However there's nothing that says the only way to use memory allocated this way is as an undisciplined global variable in a mess of spaghetti code. The static allocation method can be used simply to avoid some of the pitfalls of the heap and automatic allocation methods. Some C programmers are surprised to learn that large and sophisticated C embedded and games programs have been constructed with no use of heap allocation at all.
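A disciplined use of static allocation might look like the sketch below: a fixed table of slots that is private to one source file and handed out through two functions. The Message type, the slot count and the function names are all invented for the example.

```c
#include <stddef.h>

#define MAX_MESSAGES 8

typedef struct {
    char text[32];
    int in_use;
} Message;

/* Storage reserved for the whole program run; no malloc involved. */
static Message messages[MAX_MESSAGES];

/* Hand out an unused slot, or NULL when all slots are taken. */
Message *message_acquire(void)
{
    for (size_t i = 0; i < MAX_MESSAGES; i++) {
        if (!messages[i].in_use) {
            messages[i].in_use = 1;
            return &messages[i];
        }
    }
    return NULL;
}

/* Mark the slot as free again. */
void message_release(Message *m)
{
    if (m != NULL)
        m->in_use = 0;
}
```

The worst-case memory use is fixed when the program is built, which is the property embedded code usually wants, at the price of a hard upper limit on live objects.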

There are some great answers here about how to allocate and free memory, and in my opinion the more challenging side of using C is ensuring that the only memory you use is memory you've allocated - if this isn't done correctly what you end up with is the cousin of this site - a buffer overflow - and you may be overwriting memory that's being used by another application, with very unpredictable results.
An example:
#include <stdlib.h>
#include <string.h>
int main() {
char* myString = (char*)malloc(5*sizeof(char));
strcpy(myString, "abcd");
}
At this point you've allocated 5 bytes for myString and filled them with "abcd\0" (strings end in a null - \0). Note that the copy needs strcpy: writing myString = "abcd" would only repoint myString at a string literal and leak the allocated block.
If your string copy was
strcpy(myString, "abcde");
You would be copying "abcde" into the 5 bytes you've had allocated to your program, and the trailing null character would be put just past the end of them - a part of memory that hasn't been allocated for your use and could be free, but could equally be in use by something else - This is the critical part of memory management, where a mistake will have unpredictable (and sometimes unrepeatable) consequences.

A thing to remember is to always initialize your pointers to NULL, since an uninitialized pointer may contain an arbitrary but valid memory address, which lets pointer errors go ahead silently. If every pointer is initialized to NULL, you can usually catch the case where you use it without having set it. The reason is that most operating systems leave the virtual address 0x00000000 unmapped, so that dereferencing a null pointer raises a protection fault.

Also you might want to use dynamic memory allocation when you need to define a huge array, say int[10000]. You can't just put it in stack because then, hm... you'll get a stack overflow.
Another good example would be an implementation of a data structure, say linked list or binary tree. I don't have a sample code to paste here but you can google it easily.

(I'm writing because I feel the answers so far aren't quite on the mark.)
The case where memory management becomes worth mentioning is when you have a problem / solution that requires you to create complex structures. (If your programs crash because you allocate too much space on the stack at once, that's a bug.) Typically, the first data structure you'll need to learn is some kind of list. Here's a singly linked one, off the top of my head:
typedef struct listelem { struct listelem *next; void *data;} listelem;
listelem * create(void * data)
{
listelem *p = calloc(1, sizeof(listelem));
if(p) p->data = data;
return p;
}
listelem * delete(listelem * p)
{
listelem *next = p->next;
free(p);
return next;
}
void deleteall(listelem * p)
{
while(p) p = delete(p);
}
void foreach(listelem * p, void (*fun)(void *data) )
{
for( ; p != NULL; p = p->next) fun(p->data);
}
listelem * merge(listelem *p, listelem *q)
{
listelem *head = p; /* remember the head; p walks to the tail */
while(p != NULL && p->next != NULL) p = p->next;
if(p) {
p->next = q;
return head;
} else
return q;
}
Naturally, you'd like a few other functions, but basically, this is what you need memory management for. I should point out that there are a number of tricks that are possible with "manual" memory management, e.g.,
Using the fact that malloc is guaranteed (by the language standard) to return a pointer suitably aligned for any object type (on mainstream platforms that means a multiple of at least 4, usually 8 or 16, so the low bits are zero),
allocating extra space for some sinister purpose of your own,
creating memory pools.
Get a good debugger... Good luck!
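To make the "memory pools" item concrete, here is a minimal sketch of a fixed-size pool that hands out blocks from a static array through a free list. The capacity, the block size, and the names pool_init, pool_alloc and pool_free are all assumptions for illustration. Blocks are only aligned as strictly as a pointer, which is enough for simple structs but not for every type.

```c
#include <stddef.h>

#define POOL_BLOCKS 8          /* assumed capacity, chosen for illustration */

typedef union block {
    union block *next_free;    /* valid only while the block is unused */
    char payload[32];          /* assumed maximum object size */
} block;

static block pool[POOL_BLOCKS];
static block *free_list = NULL;

/* Threads every block onto the free list; call once before use. */
void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        pool[i].next_free = free_list;
        free_list = &pool[i];
    }
}

/* Returns an unused block, or NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    block *b = free_list;
    if (b != NULL)
        free_list = b->next_free;
    return b;
}

/* Puts a block obtained from pool_alloc back on the free list. */
void pool_free(void *p)
{
    block *b = p;
    b->next_free = free_list;
    free_list = b;
}
```

Both pool_alloc and pool_free are a couple of pointer assignments, which is where the saving over repeated malloc/free calls comes from.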

#Euro Micelli
One negative to add is that pointers to the stack are no longer valid when the function returns, so you cannot return a pointer to a stack variable from a function. This is a common error and a major reason why you can't get by with just stack variables. If your function needs to return a pointer, then you have to malloc and deal with memory management.
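A short sketch of the difference, using a made-up point type: the broken variant is shown only inside a comment, and make_point is the heap-based alternative.

```c
#include <stdlib.h>

typedef struct point { int x, y; } point;

/* WRONG (shown only as a comment): returning &p hands back an
   address that stops being valid as soon as the function returns.

   point *bad_point(int x, int y) { point p = { x, y }; return &p; }
*/

/* Heap version: the object outlives the call; the caller must free it. */
point *make_point(int x, int y)
{
    point *p = malloc(sizeof *p);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;
}
```

The price is that every successful make_point call must eventually be matched by a free.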

#Ted Percival:
...you don't need to cast malloc()'s return value.
You are correct, of course. I believe that has always been true, although I don't have a copy of K&R to check.
I don't like a lot of the implicit conversions in C, so I tend to use casts to make "magic" more visible. Sometimes it helps readability, sometimes it doesn't, and sometimes it causes a silent bug to be caught by the compiler. Still, I don't have a strong opinion about this, one way or another.
This is especially likely if your compiler understands C++-style comments.
Yeah... you caught me there. I spend a lot more time in C++ than C. Thanks for noticing that.

In C, you actually have two different choices. One, you can let the system manage the memory for you. Alternatively, you can do that by yourself. Generally, you would want to stick to the former as long as possible. However, auto-managed memory in C is extremely limited and you will need to manually manage the memory in many cases, such as:
a. You want the variable to outlive the function that creates it, and you don't want a global variable. For example:
struct pair{
int val;
struct pair *next;
};
struct pair* new_pair(int val){
struct pair* np = malloc(sizeof(struct pair));
if(np == NULL) return NULL; /* allocation failed */
np->val = val;
np->next = NULL;
return np;
}
b. You want memory whose size is only known at run time. The most common example is an array without a fixed length:
int *my_special_array;
my_special_array = malloc(sizeof(int) * number_of_element);
for(i=0; i
c. You want to do something REALLY dirty. For example, I want one struct to represent many kinds of data and I don't like union (union looks soooo messy):
struct data{
int data_type;
long data_in_mem; /* holds a pointer converted to an integer */
};
struct animal{/*something*/};
struct person{/*some other thing*/};
struct animal* read_animal();
struct person* read_person();
/*In main*/
struct data sample;
sample.data_type = input_type;
switch(input_type){
case DATA_PERSON:
sample.data_in_mem = (long)read_person();
break;
case DATA_ANIMAL:
sample.data_in_mem = (long)read_animal();
break; /* without this, control falls through into default */
default:
printf("Oh hoh! I warn you, that again and I will seg fault your OS");
}
See, on many platforms a long is wide enough to hold any data pointer. That is not guaranteed, though: on 64-bit Windows, for instance, long is 32 bits while pointers are 64 bits, so intptr_t (from <stdint.h>) or plain void * is the portable choice. Just remember to free it, or you WILL regret it. This is among my favorite tricks to have fun in C :D.
However, generally, you would want to stay away from your favorite tricks (T___T). You WILL crash your program, sooner or later, if you use them too often. As long as you don't use *alloc and free, it is safe to say that you are still a virgin, and that the code still looks nice.
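For case (b), an array whose length is not known in advance, one common approach is to grow the storage with realloc as elements arrive. This is a sketch under that assumption, not the code from the answer above; int_vec, int_vec_push and int_vec_free are hypothetical names, and the doubling policy is just one reasonable choice.

```c
#include <stdlib.h>

typedef struct int_vec {
    int *items;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
} int_vec;

/* Appends value, doubling the capacity when full.
   Returns 0 on success, -1 if realloc fails (the vector is unchanged). */
int int_vec_push(int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *tmp = realloc(v->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        v->items = tmp;
        v->cap = new_cap;
    }
    v->items[v->len++] = value;
    return 0;
}

/* Releases the storage and resets the vector to the empty state. */
void int_vec_free(int_vec *v)
{
    free(v->items);
    v->items = NULL;
    v->len = v->cap = 0;
}
```

A vector starts out as int_vec v = { NULL, 0, 0 }; since realloc(NULL, n) behaves like malloc(n), no separate initial allocation is needed.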

Sure: when you create an object that has to exist outside of the scope you use it in. Here is a contrived example, written C++-style (bear in mind my syntax may be off; I'm rusty, but this example will still illustrate the concept):
class MyClass
{
SomeOtherClass *myObject;
public:
MyClass()
{
//The object is created when the class is constructed
//Note sizeof(*myObject): sizeof(myObject) would only be the size of the pointer
myObject = (SomeOtherClass*)malloc(sizeof(*myObject));
}
~MyClass()
{
//The class is destructed
//If you don't free the object here, you leak memory
free(myObject);
}
void SomeMemberFunction()
{
//Some use of the object
myObject->SomeOperation();
}
};
In this example, I'm using an object of type SomeOtherClass during the lifetime of MyClass. The SomeOtherClass object is used in several functions, so I've dynamically allocated the memory: the SomeOtherClass object is created when MyClass is created, used several times over the life of the object, and then freed once MyClass is freed.
Obviously, if this were real code, there would be no reason (aside from possibly stack memory consumption) to create myObject in this way. But this type of object creation/destruction becomes useful when you have a lot of objects and want to finely control when they are created and destroyed (so that your application doesn't suck up 1GB of RAM for its entire lifetime, for example). In a windowed environment this is pretty much mandatory, as objects that you create (buttons, say) need to exist well outside of any particular function's (or even class's) scope.
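The same ownership idea can be sketched in plain C with a create/destroy pair of functions. Here widget stands in for SomeOtherClass and owner for MyClass; all names are invented for this illustration.

```c
#include <stdlib.h>

typedef struct widget { int value; } widget;   /* stand-in for SomeOtherClass */
typedef struct owner { widget *obj; } owner;   /* stand-in for MyClass */

/* "Constructor": allocates the owner and the object it holds. */
owner *owner_create(void)
{
    owner *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->obj = calloc(1, sizeof *o->obj);   /* zero-initialized widget */
    if (o->obj == NULL) {
        free(o);
        return NULL;
    }
    return o;
}

/* "Member function": uses the owned object across calls. */
int owner_bump(owner *o)
{
    return ++o->obj->value;
}

/* "Destructor": frees the owned object first, then the owner. */
void owner_destroy(owner *o)
{
    if (o == NULL)
        return;
    free(o->obj);
    free(o);
}
```

The owned widget lives exactly as long as its owner, from owner_create to owner_destroy, independent of which function happens to be running.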