A chapter of the book I have been reading focuses on memory management, allocating space using malloc() and related functions on Linux.
Before I read this I would make relatively small programs without allocating space.
Is it acceptable to not do anything in the way of memory allocation for applications whose memory footprint remains under 50MB? What are the repercussions of not doing so?
I think the answers are missing an important point. The size of memory is a relatively specific technical detail which isn't of primary interest. The crucial difference is that between automatic and dynamic storage, and the associated lifetime:
Automatic storage ends at the end of the scope.
Dynamic storage begins with malloc() and ends with free(), entirely at the discretion (and responsibility) of the user.
If you can and if it makes sense, everything should be automatic. This entails locality and well-defined interfaces. However, in C (not so much in C++) there comes a time when you need to talk about objects that aren't local to the scope. That's when we need dynamic allocation.
The prime example is your typical linked list. The list consists of nodes:
typedef struct node_tmp
{
    int data;
    struct node_tmp * next;
    struct node_tmp * prev;
} node;
Now, talking about such a list boils down to talking about any one of its nodes and walking along the prev/next pointers. However, the actual nodes cannot sensibly be part of any local scope, so they are usually dynamically allocated:
node * create_list()
{
    node * p = malloc(sizeof(node));  // allocate the head node
    p->prev = p->next = 0;
    return p;
}

void free_list(node * p)              // call with head node
{
    while (p->next)
    {
        node * tmp = p;
        p = p->next;
        free(tmp);                    // free each node in turn...
    }
    free(p);                          // ...and finally the last one
}
void append_to_end(node * p, int data); // etc.
Here the list nodes exist outside any scope, and you have to bring them to life manually using malloc(), and clean them up when you're done.
You can use linked lists even in the tiniest of programs, but there's no real way around the manual allocation.
Edit: I thought of another example that should really convince you: You might think that you can just make the list with automatically allocated nodes:
node n1, n2, n3; // an automatic linked list
n1.prev = n3.next = 0;
n1.next = &n2; n2.prev = &n1; n2.next = &n3; n3.prev = &n2;
But note that you cannot do this dynamically! "Dynamic" means "at runtime", but automatic variables have to be determined entirely at compile time.
Suppose you wanted a program that reads integers from the user. If it's even, you add it to the list, if it's odd you ignore it, and if it's zero you stop. You cannot possibly realize such a program with automatic allocation, because the allocation needs are only determined at runtime.
It is in such a scenario that you require malloc().
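As a minimal sketch of that scenario (the node struct is repeated here so the snippet is self-contained), the program has to call malloc() once per even number, because only at runtime does it learn how many nodes are needed:
#include <stdio.h>
#include <stdlib.h>

typedef struct node_tmp
{
    int data;
    struct node_tmp *next;
    struct node_tmp *prev;
} node;

int main(void)
{
    node *head = NULL;
    int value;

    /* keep reading integers until the user enters 0 */
    while (scanf("%d", &value) == 1 && value != 0)
    {
        if (value % 2 != 0)
            continue;                     /* odd numbers are ignored */

        node *n = malloc(sizeof(node));   /* one allocation per even input */
        if (n == NULL)
            return 1;
        n->data = value;
        n->prev = NULL;
        n->next = head;                   /* push onto the front of the list */
        if (head != NULL)
            head->prev = n;
        head = n;
    }

    /* ... use the list, then walk it and free() every node ... */
    while (head != NULL)
    {
        node *tmp = head;
        head = head->next;
        free(tmp);
    }
    return 0;
}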
If you can do without malloc for small applications, you probably just don't need any heap space. Little utility programs or toy programs often don't. The things you might be doing wrong, though, to get by when you should be using the heap are:
Arrays. If you find yourself declaring large arrays 'just to make sure everything fits', then you should perhaps be using malloc. At the least, handle the overflow condition so you can check that they really are big enough. With dynamically allocated arrays, you can grow them on the fly when you find you need more space (see the sketch after this list).
Doing too much recursion. C sometimes benefits from flattening recursion into loops over arrays, because unlike functional languages it can't always optimise the recursion away. If you are getting your storage space by making lots of nested calls just to create it, that's pretty dangerous (the program might overflow the stack and crash on you one day).
Using static pools of objects (structs, classes). Perhaps you have a ring buffer and 15 objects that could be in it, and you have them statically allocated because you know your buffer will never hold more than 15 entries. That's kind of OK, but allowing the buffer to grow by adding more structs, created with malloc, might be nicer.
There are probably plenty more situations where programs that don't use malloc could benefit from having it added.
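As a minimal sketch of the first point (the starting size and doubling policy here are just illustrative), a dynamically allocated array can grow with realloc() instead of being declared 'big enough' up front:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t capacity = 16;              /* start small instead of "big enough" */
    size_t count = 0;
    int *values = malloc(capacity * sizeof *values);
    if (values == NULL)
        return 1;

    int v;
    while (scanf("%d", &v) == 1)
    {
        if (count == capacity)         /* grow on demand instead of overflowing */
        {
            size_t new_capacity = capacity * 2;
            int *tmp = realloc(values, new_capacity * sizeof *values);
            if (tmp == NULL)           /* realloc failure leaves the old block valid */
            {
                free(values);
                return 1;
            }
            values = tmp;
            capacity = new_capacity;
        }
        values[count++] = v;
    }

    printf("read %zu values\n", count);
    free(values);
    return 0;
}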
The size of an application and the use of malloc() are two independent things. malloc() is used to allocate memory at runtime, when sizes are not known at compilation time.
Anyway, if you do know the maximum size of the structures that you want to play with, you can statically allocate them and build an application without using malloc(). Space critical software is an example of such applications.
You were probably allocating memory statically at compile time, but not dynamically.
The possible issues when allocating everything statically are:
you waste memory, because you always allocate for an upper limit plus a margin;
in some cases your application will run out of memory (because your estimate was wrong, for example), and since you cannot add new memory resources at runtime this can be fatal.
That being said, in some cases, such as real-time embedded systems, it is a requirement not to allocate any memory dynamically at runtime (because you have hard memory constraints, or because allocating memory can break real-time guarantees).
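As a rough sketch of such a static scheme (the names and the limit are made up for illustration), all storage is reserved up front and the only failure mode is hitting the compile-time bound:
#include <stddef.h>

#define MAX_READINGS 16            /* known worst case, fixed at compile time */

struct sensor_reading {
    int id;
    double value;
};

/* all storage reserved statically: no malloc(), no fragmentation, no heap failure path */
static struct sensor_reading readings[MAX_READINGS];
static size_t reading_count = 0;

int add_reading(int id, double value)
{
    if (reading_count == MAX_READINGS)
        return -1;                 /* the estimate was wrong: nothing can be added at runtime */
    readings[reading_count].id = id;
    readings[reading_count].value = value;
    reading_count++;
    return 0;
}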
Related
If someone is talking about solving a problem with the C programming language, and they say that dynamically created structures are the way to go, what are they likely to be referring to? Is there another name for this, perhaps?
The assignment requires that you use dynamic memory allocation for your data structures. Your program may not use statically allocated memory, that is, for example int array[65536];. Instead, all of this needs to be allocated on demand using malloc/calloc/realloc (and eventually freed using free).
C allows you to allocate new memory while a program is running, according to your needs (that's why it is referred to as "dynamic allocation").
For example: you have a very basic structure, a linked list, but you don't know how many nodes will be required during the execution of your program. So in your code you declare that, every time you need to store a new node in the list, the program must take the required amount of memory and allocate a new node (which will be attached to the existing list):
typedef struct Node {
    int datum;
    struct Node *next;
} Node;
then later on you can say:
Node *new_node = malloc(sizeof(Node));
In the same fashion you can free memory at run time:
free(new_node);
If your question is "How do I create a structure dynamically?", then, without knowing exactly what you want to ask, I'll answer on that assumption.
A dynamic data structure is a data structure that changes in size as a program needs it to by allocating and de-allocating memory from the heap -- a term used to describe unused memory available to the central processing unit (CPU) at any given time. A dynamic data structure lets a programmer control precisely how much memory is consumed by his or her program. When a dynamic data structure created in the C programming language allocates blocks of memory from the heap, it uses pointers to link those blocks together into a data structure of some kind. The data structure will return a block of memory to the heap when it doesn't need it any longer. This system of recycling memory blocks makes a program's use of memory very efficient.
With all the recent attention C gets, I have read that there are ways to minimize the use of malloc in C, and that doing so is considered very good practice. However, I have no idea when, how, or why such practices are good. So my question is: could some experienced C programmers give examples where one could (or should) write something without malloc, in a way that would be really non-obvious to a newbie C programmer (who would thus simply use malloc)? Maybe you have some experience from factoring malloc out into something else.
P.S. Some posts I read referenced the Quake 3 source code and how it avoids the use of malloc, so if someone knows what is done there, it would be interesting to hear, since for now I would like to avoid digging into Quake's code aimlessly. (If they avoid using malloc, searching for malloc will not give many results, I suppose, and the code base is most likely not as simple as individual examples could be.)
I don't know about totally avoiding malloc, but you can certainly reduce it.
The basic concept is a memory pool: a large buffer you allocate once and then use for many objects, instead of requesting lots of small allocations.
You might use this in a real-world situation where you are sending events into a queue to be processed by another thread. The event objects might be smallish structures and you really need to avoid making thousands of calls to malloc every second.
The answer of course is to draw these event objects from a pool. If you need to, you can even use parts of your pool buffer to form a list so that you can quickly index memory that has been returned to the pool. These are generally known as free-lists.
You do have to be careful about memory alignment, as you can severely impact performance by having misaligned data. But you can handle all that with a little maths.
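A minimal sketch of such a pool with a free-list (the event type, pool size, and function names are made up for illustration); one malloc up front replaces thousands of per-event allocations:
#include <stdlib.h>

typedef struct event {
    int type;
    int payload;
    struct event *next_free;     /* links unused slots into the free-list */
} event;

#define POOL_SIZE 1024

static event *pool = NULL;       /* one big allocation instead of thousands of small ones */
static event *free_list = NULL;

static int pool_init(void)
{
    pool = malloc(POOL_SIZE * sizeof *pool);
    if (pool == NULL)
        return 0;
    for (size_t i = 0; i < POOL_SIZE; i++)   /* chain every slot onto the free-list */
    {
        pool[i].next_free = free_list;
        free_list = &pool[i];
    }
    return 1;
}

static event *event_alloc(void)
{
    if (free_list == NULL)
        return NULL;             /* pool exhausted */
    event *e = free_list;
    free_list = e->next_free;
    return e;
}

static void event_release(event *e)
{
    e->next_free = free_list;    /* hand the slot back to the pool, no free() call */
    free_list = e;
}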
Don't freak out about these concepts. A pool doesn't actually have to be that sophisticated. Consider this:
int ** matrix = malloc( rows * sizeof(int*) );
for( int i = 0; i < rows; i++ ) {
    matrix[i] = malloc( cols * sizeof(int) );
}
I see this all the time, and it's a pet peeve of mine. Why would you do that, when you can do this:
int ** matrix = malloc( rows * sizeof(int*) );
matrix[0] = malloc( rows * cols * sizeof(int) );
for( int i = 1; i < rows; i++ ) {
    matrix[i] = matrix[i-1] + cols;
}
And of course, that reduces to this (beware of potential alignment issues in your first row though - I've ignored it here for the sake of clarity)
int ** matrix = malloc( rows * sizeof(int*) + rows * cols * sizeof(int) );
matrix[0] = (int*)(matrix + rows);   /* the data area starts just past the row pointers */
for( int i = 1; i < rows; i++ ) {
    matrix[i] = matrix[i-1] + cols;
}
The cool thing about that last example is how easy it is to delete your matrix =)
free( matrix );
Oh, and zeroing the matrix is just as easy...
memset( matrix[0], 0, rows * cols * sizeof(int) );
In the scenario where you need small, dynamic sized arrays in local scope, there is alloca() which allocates from the stack and doesn't need you to explicitly free the memory (it gets freed when the function returns), and there are variable length arrays (VLA):
#include <alloca.h>   /* for alloca(); on some platforms it is declared in <stdlib.h> instead */

void meh(int s) {
    float *foo = alloca(s * sizeof(float));
    float frob[s];
} // note: foo and frob are freed upon returning
If you know all the sizes of arrays, lists, stacks, trees, whatever data structures your program needs beforehand, you can allocate the required memory statically by defining arrays of constant number of elements. Pros: no memory management, no memory fragmentation, fast. Cons: limited use, wasted memory.
You can implement a custom memory allocator on top of malloc() or whatever your OS provides, allocate a big chunk of memory once and then carve it up without calling standard malloc() functions. Pros: fast. Cons: not quite trivial to implement right.
Another (and a rather perverse) way of avoiding malloc() would be to store most of your data in files instead of memory. Pros: virtually none.
You may also use local variables and deep function calls (or explicit recursion) to allocate space for data on the go if you're certain that the program's stack is going to be big enough. Pros: no memory management, easy, fast. Cons: limited use.
As an example of a working midsize project that avoids malloc() I can offer my pet project, Smaller C compiler. It statically allocates a number of arrays and it also allocates small local variables inside recursive functions. Beware, the code hasn't been beautified yet and it's not something small or easy to understand if you're fairly new to programming, C or compilers.
The primary reason for not using malloc in some particular cases is probably the fact that it employs a generic, one-size-fits-all approach to memory allocation.
Other approaches, such as memory pools and slab allocation may offer benefits in the case of having well-known allocation needs.
For example, it is much more advantageous for an allocator to assume that the allocated objects will be of a fixed size, or assume that their lifetime will be relatively short. A generic allocator cannot make such assumptions and cannot therefore perform optimally in such scenarios.
The potential benefits can include a decreased memory footprint due to the specialized allocator having a more condensed bookkeeping. A generic allocator most probably holds a larger amount of metadata for each allocated object, whereas an allocator that "knows" in advance what the object's size will be can probably omit it from the metadata.
It can also make a difference in allocation speed - a custom allocator will probably be able to find an empty slot faster.
This is all relative, of course, but the questions you should ask before choosing a custom allocation scheme are:
Do you need to allocate and deallocate a large number of objects having the same size? (Slab allocation)
Can these objects be disposed at once without the overhead of individual calls? (Memory pools)
Is there a logical grouping of the individually allocated objects? (Cache aware allocation)
The bottom line is, you have to inspect the allocation needs and patterns of your program carefully, and then decide whether a custom allocation scheme can be beneficial.
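For example, here is a minimal arena (memory pool) sketch for the second case, where all the objects can be disposed of at once; the names and the 16-byte alignment choice are just illustrative:
#include <stdlib.h>

typedef struct arena {
    unsigned char *base;   /* one big block obtained from malloc */
    size_t size;
    size_t used;
} arena;

static int arena_init(arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* carve out the next chunk; no per-object bookkeeping, no individual free() */
static void *arena_alloc(arena *a, size_t n)
{
    n = (n + 15u) & ~(size_t)15u;      /* keep allocations 16-byte aligned */
    if (a->used + n > a->size)
        return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* dispose of every object from this arena with a single call */
static void arena_release(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}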
There are several reasons to avoid malloc - the biggest one in my mind is "no malloc, no free", to paraphrase Bob Marley... So, no memory leaks from "forgetting" to call free.
And of course, you should always check for NULL when allocating memory dynamically; avoiding dynamic allocation removes those checks and so reduces the amount and complexity of the code.
Unfortunately, the alternative of running out of stack space or overrunning a fixed-size global is often worse, as it either crashes immediately with no meaningful error message to the user (stack overflow) or silently overflows a buffer in the globals; checking the boundaries of the global buffers will avoid this, but what do you do if you detect it? There aren't many choices.
The other part is of course that a call to malloc can be substantially expensive compared to local variables. This is particularly the case when you hit malloc/free calls in "hot paths", parts of the code that are called very often. There is also memory overhead in using malloc on small memory sections: the overhead, from past experience in Visual Studio, is around 32 bytes of "header", rounded to 16- or 32-byte boundaries, so an allocation of 1 byte actually takes up 64 bytes. An allocation of 17 bytes would also take up 64 bytes...
Of course, like ALL engineering/software design, it is not "you MUST NOT USE malloc" but "avoid malloc if there is a simple/suitable alternative". It's wrong to make everything a global variable several times larger than it needs to be just to avoid malloc, but it's equally wrong to call malloc/free for every frame or every object in a graphics drawing loop.
I haven't looked at the code of Quake, but I worked on some code in 3DMark 2000 [I think my name is still in the credits of the product]. That's written in C++, but it avoids using new/delete in the rendering code. It's all done in the setup/teardown of a frame, with very few exceptions.
Allocating a bigger block of memory is usually faster, so my advice would be allocate a big block and then create a memory pool from it. Implement your own functions to "free" memory back to the pool and to allocate memory from it.
If I want to create a 2D array with dimensions specified by user input, can't I just do this sequentially in the main function? Once I have the dimensions from scanf, I then create an array with those dimensions. From what I understood, malloc is supposed to be used when the space required is not known until runtime. I wouldn't have known the space required until runtime, but I didn't have to allocate the memory dynamically, and it would work anyway, right? Perhaps I'm completely misunderstanding something.
Generally there are three reasons to use dynamic allocation in C:
The size is not known until runtime (another alternative is VLAs, but those are C99 and potentially dangerous, see reason 2).
The size is (most likely) too big for the stack, risking stack overflow.
The object needs to live on the heap giving it a longer life than "automatic" storage.
malloc is generally used when the space requirements aren't known at compile-time, or when you need the object to persist beyond the scope that it was created in.
In C99 (unlike earlier versions of C), you can also define variable-length arrays without using malloc. But many people consider them evil, because there's no way to catch an out-of-memory condition.
But if you want something that acts like a multidimensional array (in the sense of being able to index it like x[i][j]) where the dimensions aren't known until runtime, and which needs to outlive its scope or be larger than the stack allows, you will need to involve malloc somewhere.
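Here is a minimal sketch of that (the dimensions come from user input; the single-block layout mirrors the matrix trick shown earlier):
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t rows, cols;
    if (scanf("%zu %zu", &rows, &cols) != 2 || rows == 0 || cols == 0)
        return 1;

    /* Option 1: a VLA, automatic storage, released when the block ends.
       Fine for small sizes, but there is no way to detect failure:
       int vla[rows][cols]; */

    /* Option 2: one heap block of row pointers plus data, indexable as m[i][j] */
    int **m = malloc(rows * sizeof *m);
    if (m == NULL)
        return 1;
    m[0] = malloc(rows * cols * sizeof **m);
    if (m[0] == NULL)
    {
        free(m);
        return 1;
    }
    for (size_t i = 1; i < rows; i++)
        m[i] = m[i - 1] + cols;

    m[rows - 1][cols - 1] = 42;      /* use it like a 2D array */

    free(m[0]);
    free(m);
    return 0;
}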
Here's an archetypal example of the need for dynamic allocation: making a dynamic container. Since you don't know the number of elements in advance, each element has to be allocated dynamically. Since you populate the container in a loop, each element must outlive the loop scope. This is the "zero-one-many" rule at its barest, and the "many" part of it immediately entails dynamic allocation:
int value;
node * my_list = NULL;
while (get_one_more_input(&value) == SUCCESS)
{
    node * elem = malloc(sizeof(node));
    elem->data = value;
    elem->next = my_list;
    my_list = elem;
}
The crux here is that the actual list node is only populated inside the loop which reads the input data, but clearly it must outlive that scope. Thus no automatic object will do, because it would only live to the end of the scope, and no static array will do either, because you cannot know the number of loop iterations beforehand. Dynamic lifetime and storage management is the only solution here, and that's what it is primarily intended for.
In C you will be doing a lot of this by hand, since dynamic data structures are at the very heart of computing. C++ makes a lot of this much easier and safer by wrapping all the dynamic management logic into hidden-away, reusable code that you never need to look at as a consumer (though it's still doing the exact same thing).
I apologize in advance if this is an incredibly dumb question...
Currently I have a circular linked list. The number of nodes is normally held static. When I want to add to it, I malloc a number of nodes (e.g. 100000 or so) and splice them in. This part works fine when I malloc the nodes one by one.
I want to attempt to allocate by blocks:
NODE *temp_node = node->next;
NODE *free_nodes = malloc( size_block * sizeof( NODE ) );
node->next = free_nodes;
for ( i = 0; i < size_block - 1; i++ ) {
    free_nodes[i].src = 1;
    free_nodes[i].dst = 0;
    free_nodes[i].next = &free_nodes[i+1];
}
free_nodes[size_block - 1].next = temp_node;
The list works as long as I don't attempt to free anything ('glibc detected: double free or corruption' error). Intuitively, I think that is because freeing it doesn't free the single node, and looping through the normal way is attempting to free it multiple times (plus freeing the entire block probably screws up all the other pointers from the nodes that still exist?), but:
Could somebody please explain to me explicitly what is happening?
Is there a way to allocate the nodes by blocks and not break things?
The purpose of this is because I am calling malloc hundreds of thousands of times, and it would be nice if things were faster. If there is a better way around this, or I can't expect it to get faster, I would appreciate hearing that too. :)
Could somebody please explain to me explicitly what is happening?
Exactly what you said. You allocated a single contiguous block of memory for all the nodes, so only the pointer returned by that one malloc can be passed to free; freeing it releases every node at once, and freeing the individual nodes is what corrupts the heap.
Is there a way to allocate the nodes by blocks and not break things?
Allocate a separate memory segment for each node. In your code (which isn't complete), that would be something like:
NODE *free_nodes[size_block];   /* now an array of pointers: one malloc per node */
for ( i = 0; i < size_block ; i++ ) {
    free_nodes[i] = malloc( sizeof( NODE ) );
}
(The linking then becomes free_nodes[i]->next = free_nodes[i+1];, and each node can be freed individually.)
First, the way you allocated your nodes in blocks, you always have to free the whole block with exactly the same start address as you got from malloc. There is no way around this, malloc is designed like this.
Putting up your own way around this is complicated and usually not worth it. Modern malloc/free implementations manage their internal buffers quite efficiently (for their own bookkeeping, not for user allocations), and it will be hard for you to achieve something better, where better means more efficient while still guaranteeing the consistency of your data.
Before losing yourself in such a project measure where the real bottlenecks of your program are. If the allocation part is a problem there is still another possibility that is more likely to be the cause, namely bad design. If you are using so many elements in your linked list such that allocation dominates, probably a linked list is just not the appropriate data structure. Think of using an array with a moving cursor or something like that.
When you free a node you free the entire allocation that the node was allocated with. You must somehow arrange to free the entire group of nodes at once.
Probably your best bet is to keep a list of "free" nodes and reuse those rather than allocating/freeing each node. And with some effort you can arrange to keep the nodes in blocks and allocate from the "most used" block first such that if an entire block goes empty you can free it.
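A rough sketch of that idea (the NODE layout is assumed to match the question, and the helper names are made up): nodes are still carved out of big blocks, but an individual node is never passed to free(); it just goes back onto a free-list for reuse.
#include <stdlib.h>

/* assumed to match the NODE from the question */
typedef struct NODE {
    int src;
    int dst;
    struct NODE *next;
} NODE;

#define BLOCK_SIZE 100000

static NODE *free_list = NULL;      /* recycled nodes, never passed to free() */

static NODE *node_get(void)
{
    if (free_list == NULL)
    {
        /* carve a whole block into the free-list with a single malloc */
        NODE *block = malloc(BLOCK_SIZE * sizeof *block);
        if (block == NULL)
            return NULL;
        for (size_t i = 0; i < BLOCK_SIZE; i++)
        {
            block[i].next = free_list;
            free_list = &block[i];
        }
        /* note: for simplicity the block pointer itself is not remembered here;
           a real version would also keep it so the blocks can be freed at exit */
    }
    NODE *n = free_list;
    free_list = n->next;
    return n;
}

static void node_put(NODE *n)
{
    n->next = free_list;            /* recycle instead of calling free() */
    free_list = n;
}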
Is there any way in C to know if a memory block has previously been freed with free()? Can I do something like...
if(isFree(pointer))
{
//code here
}
OK, if you need to check whether a pointer has already been freed, you may want to reconsider your design. You should never have to track a reference count on a pointer, or whether it has been freed. Also, some pointers don't point to dynamically allocated memory at all, so I hope you mean pointers obtained from malloc(). This is my opinion, but again: if you have a solid design, you should know when the things your pointers point to are done being used.
The only place I have seen this not work is in monolithic kernels because pages in memory need a usage count because of shared mappings among other things.
In your case, simply set unused pointers to NULL and check for that. This gives you a guaranteed way of knowing, in the case that you have unused fields in structures that were malloc'd. A simple rule: wherever you free a pointer that needs to be checked in the above way, just set it to NULL, and replace isFree(p) with a check for p == NULL. This way no reference count needs to be tracked, and you know for sure whether your pointer is valid or pointing to garbage.
No, there is no way.
You can, however, use a little code discipline as follows:
Always always always guard allocations with malloc:
void * vp;
if((vp = malloc(SIZE))==NULL){
    /* do something dreadful here to respond to the out of mem */
    exit(-1);
}
After freeing a pointer, set it to 0
free(vp); vp = (void*)0;
/* I like to put them on one line and think of them as one operation */
Anywhere you'd be tempted to use your "is freed" function, just say
if(vp == NULL){
    /* it's been freed already */
}
Update
@Jesus in the comments says:
I can't really recommend this, because as soon as you're done with that memory the pointer should go out of scope immediately (or at least at the end of the function that releases it); these dangling pointers' existence just doesn't sit right with me.
That's generally good practice when possible; the problem is that in real life in C it's often not possible. Consider as an example a text editor that contains a doubly-linked list of lines. The list is really simple:
struct line {
    struct line * prev;
    struct line * next;
    char * contents;
};
I define a guarded_malloc function that allocates memory
void * guarded_malloc(size_t sz){
    void * vp = malloc(sz);
    if(vp == NULL)
        exit(-1);   /* bail out on allocation failure */
    return vp;
}
and create list nodes with newLine()
struct line * newLine(){
    struct line * lp;
    lp = (struct line *) guarded_malloc(sizeof(struct line));
    lp->prev = lp->next = NULL;
    lp->contents = NULL;
    return lp;
}
I add text in string s to my line
lp->contents = guarded_malloc(strlen(s)+1);
strcpy(lp->contents,s);
and don't quibble that I should be using the bounded-length forms, this is just an example.
Now, how can I implement deleting the contents of a line in such a way that the char * contents pointer goes out of scope right after freeing? I can't; the pointer member lives as long as the node does, so setting it to NULL is the practical alternative.
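In practice that means the discipline above: free the member and null it out. A hypothetical helper might look like:
void freeLineContents(struct line * lp){
    free(lp->contents);
    lp->contents = NULL;   /* the member cannot go out of scope before the node does, so null it */
}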
I see nobody has addressed the reason why what you want is fundamentally impossible. To free a resource (in this case memory, but the same applies to basically any resource) means to return it to a resource pool where it's available for reuse. The only way the system could provide a reasonable answer to "Has the memory block at address X already been freed?" is to prevent this address from ever being reused, and store with it a status flag indicating whether it was "freed". But in this case, it has not actually been freed, since it is not available for reuse.
As others have said, the fact that you're trying to answer this question means you have fundamental design errors you need to address.
In general the only way to do this portably is to replace the memory allocation functions. But if you're only concerned about your own code, a fairly common technique is to set pointers to NULL after you free() them, so any subsequent use will throw an exception or segfault:
free(pointer);
pointer = NULL;
For a platform-specific solution, you may be interested in the Win32 function IsBadReadPtr (and others like it). This function will be able to (almost) predict whether you will get a segmentation fault when reading from a particular chunk of memory.
Note: IsBadReadPtr has been deprecated by Microsoft.
However, this does not protect you in the general case, because the operating system knows nothing of the C runtime heap manager, and if a caller passes in a buffer that isn't as large as you expect, then the rest of the heap block will continue to be readable from an OS perspective.
Pointers carry no information with them other than where they point. The best you can do is say "I know how this particular compiler version allocates memory, so I'll dereference memory, move the pointer back 4 bytes, check the size, make sure it matches..." and so on. You cannot do it in a standard fashion, since memory allocation is implementation defined. Not to mention they might not have dynamically allocated it at all.
On a side note, I recommend reading 'Writing Solid Code' by Steve McGuire. Excellent sections on memory management.