I have a big C project, and thanks to valgrind I cleaned up some mess I had made with memory management. I fixed everything except for one thing, and after a week of analysis I'm starting to think that it is valgrind misunderstanding my code rather than my mistake. The program runs fine, but that means nothing (I've seen a program run fine for weeks and then hang because of a flipped 31st bit in an int).
My code uses the following idea: there is a storage (in terms of my project, a "warehouse") which holds all the kinds of structures I want to build. I use the trick from Xlib to keep this as small as possible in memory:
typedef struct
{
// data
} TypeA;
typedef struct
{
// data
} TypeB;
typedef struct
{
// data
} TypeC;
typedef union
{
TypeA typea;
TypeB typeb;
TypeC typec;
} UniType;
typedef struct
{
int type;
UniType data;
} Element;
Then I create an element:
SmlErrors SmlWhsAdd(SmlElement element, SmlIndex * index)
{
SML_CHECKPTR(index);
SmlElement * ptrold = warehouse.elem;
warehouse.elem = realloc(warehouse.elem,
(++warehouse.elemcount) * sizeof(SmlElement));
if (!(warehouse.elem))
{
warehouse.elem = ptrold;
*index = 0;
warehouse.elemcount--;
return SML_ERR_BADALLOC;
}
warehouse.elem[warehouse.elemcount - 1] = element;
*index = (warehouse.elemcount - 1);
return SML_ERR_SUCCESS;
}
For those who think that ptr = realloc(ptr... is bad: look closer, I save the old pointer and restore it afterwards. I'm planning to replace all allocation calls with a myalloc wrapper that crashes the program there instead of continuing.
Valgrind is silent about this code, except for one case. One of my "TypeX" structures (to be exact, a child of a child of TypeX) contains an array:
SmlIndex sprite[SML_THEMEBLOCK_SIZE];
Every sprite is also an index into the warehouse, so the array is nested inside one of the elements of that same warehouse (as in "the house that Jack built").
I use the aforementioned function to write the value inside of one of the sprites:
SML_CHECKLOC(SmlImageCreate(&(widget->sprite[i]),
widget->geometry.size));
// Which calls `WhsAdd` with `&(widget->sprite[i])` as the `index` parameter.
Every time I call it this way, valgrind complains about "Invalid write of size 4". Every time I try to use the value from sprite[x] after that, it reports "Invalid read of size 4". To be exact, it complains about the following line:
*index = (warehouse.elemcount - 1);
My system is 32-bit, and SmlIndex is uint32_t.
Please give me a clue about where to dig. After a week of research I'm out of ideas. That's why I've started to think that it might be a bug in valgrind; I've also heard that it behaves strangely with unions and the structs inside them.
One more thing.
widget->sprite[i] = 0; // No complaints.
SmlImageCreate(&(widget->sprite[i]), ...) // Complaints.
Can someone give me a hand, please? I'm drowning in that swamp. Any suggestions about where to look. Anything.
UPD:
MCVE: http://pastebin.com/r5T5ZBPC
Regards,
Alex.
(Edit history note: the MCVE originally used uninitialized variables, however after initializing all of those, the problem persists)
In the MCVE, the issue comes from:
SmlWhsAdd(sprite, &(warehouse.elem[window].data.wdg.sprite[0]));
The second argument is a pointer into the space allocated by an earlier call to realloc.
However, inside the SmlWhsAdd function, realloc is called on this space, which may allocate a new block and free the old one. When that happens, the second argument is left pointing into freed space, and the write through it is what valgrind reports.
To fix this, my suggestion would be to review all uses of SmlWhsAdd and avoid passing a pointer into warehouse.elem.
One option might be to use a temporary variable and then assign the index after the call. Another option might be to pass some other information which allows the SmlWhsAdd function to compute the location to write the index after it performs realloc. Or, if the index is always at the end, you don't even need that parameter at all, because the caller can compute warehouse.elemcount - 1 afterwards.
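The first two options can be sketched as follows. This is a minimal illustration, not the MCVE's code: `Warehouse`, `Element`, `Index`, `whs_add` and `whs_add_sprite` are hypothetical stand-ins for the project's types and functions, and the element here holds nothing but a small sprite array.

```c
#include <stdint.h>
#include <stdlib.h>

#define SPRITES 4

typedef uint32_t Index;

typedef struct {
    int   type;
    Index sprite[SPRITES];
} Element;

typedef struct {
    Element *elem;
    size_t   count;
} Warehouse;

/* Appends `e` and stores its index in *out. `out` must NOT point into
 * w->elem, because realloc may move that block. Returns 0 on success. */
static int whs_add(Warehouse *w, Element e, Index *out)
{
    Element *grown = realloc(w->elem, (w->count + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;              /* w->elem is still valid and unchanged */
    w->elem = grown;
    w->elem[w->count] = e;
    *out = (Index)w->count;
    w->count++;
    return 0;
}

/* Adds a new element and records its index in sprite slot `slot` of the
 * element at index `owner`. The destination is named by index, so its
 * address is computed only after realloc has finished. */
static int whs_add_sprite(Warehouse *w, Element e, Index owner, size_t slot)
{
    Index tmp;                  /* temporary lives outside the warehouse */
    if (whs_add(w, e, &tmp) != 0)
        return -1;
    w->elem[owner].sprite[slot] = tmp;
    return 0;
}
```

The point of the design is that nothing holds an address inside the array across the call that may move it; only indices survive a realloc.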
I want to know how to store custom objects (not their pointers) in C. I have created a custom structure called Node
#define MAXQ 100
typedef struct {
int state[MAXQ];
int height;
} Node;
(which works) and I want to store a few of these Nodes in a container (without using pointers, since they are not stored elsewhere) so I can access them later.
The internet seems to suggest something like calloc() so my last attempt was to make a container Neighbors following this example, with numNeighbors being just an integer:
Node Neighbors = (Node*)calloc(numNeighbors, sizeof(Node));
At compilation, I got an error from this line saying
initializing 'Node' with an expression of incompatible type 'void *'
and in places where I referred to this container (as in Neighbors[i]) I got errors of
subscripted value is not an array, pointer, or vector
Since I'm spoiled by Python, I have no idea if I've got my syntax all wrong (it should tell you something that I'm still not there after scouring a ton of tutorials, docs, and stackoverflows on malloc(), calloc() and the like), or if I am on a completely wrong approach to storing custom objects (searching "store custom objects in C" on the internet gives irrelevant results dealing with iOS and C# so I would really appreciate some help).
EDIT: Thanks for the tips everyone, it finally compiled without errors!
You can create a regular array using your custom struct:
Node Neighbors[10];
You can then reference them like any other array, for example:
Neighbors[3].height = 10;
If your C implementation supports C99-style variable-length arrays (VLAs), simply define your array:
Node Neighbors[numNeighbors];
(Note that VLA has no error reporting mechanism. A failed allocation results in undefined behavior, which probably expresses itself as a crash.)
Otherwise, you will need dynamic allocation. calloc is suitable, but it returns a pointer representing the contiguous allocation.
Node *Neighbors = calloc(numNeighbors, sizeof(*Neighbors));
Note, do not cast the result of malloc/calloc/realloc when programming in C. It is not required, and in the worst case, can mask a fatal error.
I want to store a few of these Nodes in a container (without using pointers, since they are not stored elsewhere) so I can access them later.
If you know the number of them at compile time (or at the very least a reasonable maximum), then you can create an array of stack-allocated objects. For instance, say you are OK with a maximum of 10 objects:
#define MAX_NODES 10
Node nodes[MAX_NODES];
int number_nodes = 0;
Then when you add an object, you keep number_nodes in sync (so that you know where to put the next one). Technically, you will always have 10, but you only use the ones you want or need. Removing objects is similar, although more involved if you want to take out some in the middle.
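The bookkeeping described above might look like the sketch below. The helper names `nodes_add` and `nodes_remove` are made up for illustration, and the array is placed at file scope here so that both helpers can reach it. Removal from the middle is done by shifting the later nodes down, which is one choice among several.

```c
#include <string.h>

#define MAXQ 100
#define MAX_NODES 10

typedef struct {
    int state[MAXQ];
    int height;
} Node;

static Node nodes[MAX_NODES];
static int  number_nodes = 0;

/* Copies *n into the next free slot. Returns its position, or -1 if full. */
static int nodes_add(const Node *n)
{
    if (number_nodes == MAX_NODES)
        return -1;
    nodes[number_nodes] = *n;
    return number_nodes++;
}

/* Removes the node at `pos` by shifting the later ones down one slot,
 * which preserves their order. Returns 0 on success, -1 on a bad position. */
static int nodes_remove(int pos)
{
    if (pos < 0 || pos >= number_nodes)
        return -1;
    memmove(&nodes[pos], &nodes[pos + 1],
            (size_t)(number_nodes - pos - 1) * sizeof nodes[0]);
    number_nodes--;
    return 0;
}
```

If the order of the nodes does not matter, copying the last node into the vacated slot would avoid the shift.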
However, if you don't know how many you will have (nor a maximum); or even if you know but they are way too many to fit in the stack; then you are forced to use the heap (typically with malloc() and free()):
int number_nodes; // unknown until runtime or too big
Node * nodes = malloc(sizeof(Node) * number_nodes);
...
free(nodes);
In any case, you will be using pointers in the dynamically allocated memory case, and most probably in the stack case as well.
Python is hiding and doing all this dance for you behind the scenes -- which is quite useful and time saving as you have probably already realized, as long as you do not need precise control over it (read: performance).
malloc and calloc are for dynamic allocation, and they need pointer variables. I don't see any reason for you to use dynamic allocation. Just define a regular array until you have a reason not to.
#define MAXQ 100
#define NUM_NEIGHBORS 50
typedef struct {
int state[MAXQ];
int height;
} Node;
int main(void)
{
Node Neighbors[NUM_NEIGHBORS];
Neighbors[0].state[0] = 0;
Neighbors[0].height = 1;
}
Here NUM_NEIGHBORS needs to be a compile-time constant, so the array size is fixed. If you want the size to be decided at run time, then you need dynamic allocation and, inevitably, pointers:
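#include <stdlib.h>
#define MAXQ 100
typedef struct {
int state[MAXQ];
int height;
} Node;
int main(void)
{
int numNeighbors = 50;
Node *Neighbors = calloc(numNeighbors, sizeof *Neighbors);
if (Neighbors == NULL)
return 1; /* allocation failed */
Neighbors[0].state[0] = 0;
Neighbors[0].height = 1;
free(Neighbors); /* release the array when done */
return 0;
}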
While freeing some pointers, I get an access violation.
In order to know what's going on, I've decided to ask to free the pointers at an earlier stage in the code, even directly after memory has been allocated, and still it crashes.
It means that something is seriously wrong in the way my structures are handled in memory.
I know that in a previous version of the code there was a keyword before the definition of some variables, but that keyword is lost (it was part of a #define clause I can no longer find).
Does anybody know what's wrong in this piece of code or what the mentioned keyword should be?
typedef unsigned long longword;
typedef struct part_tag { struct part_tag *next;
__int64 fileptr;
word needcount;
byte loadflag,lock;
byte partdat[8192];
} part;
static longword *partptrs;
<keyword> part *freepart;
<keyword> part *firstpart;
void alloc_parts (void) {
part *ps;
int i;
partptrs = (longword*)malloc (number_of_parts * sizeof(longword)); // number... = 50
ps = (part*)&freepart;
for (i=0; i<number_of_parts; i++) {
ps->next = (struct part_tag*)malloc(sizeof(part));
partptrs[i] = (longword)ps->next;
ps = ps->next;
ps->fileptr = 0; ps->loadflag = 0; ps->lock = 0; ps->needcount = 0; // fill in "ps" structure
};
ps->next = nil;
firstpart = nil;
for (i=0; i<number_of_parts; i++) {
ps = (part*)partptrs[i];
free(ps); /* <-- here it already crashes at the first occurrence (i=0) */
};
}
Thanks in advance
In the comments somebody asks why I'm freeing pointers directly after allocating them. This is not how the program originally was written, but in order to know what's causing the access violation I've rewritten in that style.
Originally:
alloc_parts();
<do the whole processing>
free_parts();
In order to analyse the access violation I've adapted the alloc_parts() function into the source code excerpt I've written there. The point is that even directly after allocating memory, the freeing is going wrong. How is that even possible?
In the meantime I've observed another weird phenomenon:
While allocating the memory, the values of ps seem to be "complete" address values. While trying to free the memory, the values of ps only contain the last digits of the memory addresses.
Example of complete address : 0x00000216eeed6150
Example of address in freeing loop : 0x00000000eeed6150 // terminating digits are equal,
// so at least something is right :-)
This problem was caused by the longword type: it seems that this type was too small to hold entire memory addresses. I've replaced this by another type (unsigned long long) but the problem still persists.
Finally, after a long time of misery, the problem is solved:
The program was originally meant as a 32-bit application, which means that the original type unsigned long was sufficient to keep memory addresses.
However, this program gets compiled now as a 64-bit application, hence the mentioned type is not sufficiently large anymore to keep 64-bit memory addresses, hence another type has been used for solving this issue:
typedef intptr_t longword;
This solves the issue.
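The truncation can be reproduced in isolation. The sketch below assumes a platform that provides `uintptr_t` and `uint32_t` in `<stdint.h>`; the helper names are hypothetical. It uses `uintptr_t`, the unsigned counterpart of `intptr_t`. On 64-bit Windows `unsigned long` is still 32 bits wide, which would produce exactly the kind of shortened addresses shown above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Converts a pointer to an integer wide enough to hold it and back.
 * The C standard guarantees that this round trip through uintptr_t
 * yields a pointer equal to the original. */
static void *roundtrip_uintptr(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits;
}

/* Returns nonzero if storing `p` in a 32-bit integer would lose bits,
 * i.e. if the pointer could not be recovered from such an integer. */
static int truncates_in_32_bits(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (uintptr_t)(uint32_t)bits != bits;
}
```

A simpler alternative, where the surrounding code allows it, might be to declare the table as an array of `part *` and avoid the integer conversion altogether.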
@Andrew Henle: sorry, I didn't realise that your comment contained the actual solution to this problem.
EDIT: Updated code with new Pastebin link but it's still stopping at the info->citizens[x]->name while loop. Added realloc to loops and tidied up the code. Any more comments would be greatly appreciated
I'm having a few problems with memory allocation overflowing
http://pastebin.com/vukRGkq9 (v2)
No matter what I try, simply not enough memory is being allocated for info->citizens and gdb is often saying that it cannot access info->citizens[x]->name.
On occasion, I'll even get KERN_INVALID_ADDRESS errors directly after printf statements for strlen (Strlen is not used in the code at the point where gdb halts due to the error, but I'm assuming printf uses strlen in some way). I think it's something to do with how the structure is being allocated memory. So I was wondering if anyone could take a look?
You shouldn't call malloc(sizeof(PEOPLE*)), because it allocates only enough bytes for a single pointer (4 bytes on a 32-bit architecture).
It seems that what you want is malloc(sizeof(PEOPLE) * N), where N is the maximum number of PEOPLE you want to put into that memory chunk.
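A common way to avoid this mistake is to take the size from the pointer being assigned rather than from a type name. The sketch below uses a hypothetical `Person` struct as a stand-in for PEOPLE, whose real definition is only in the pastebin, and a made-up helper `people_create`.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    char *name;
    bool  exists;
} Person;   /* hypothetical stand-in for PEOPLE */

/* Allocates room for `n` Person structs (not n pointers).
 * `sizeof *people` is the size of one struct, whatever its type is,
 * so the expression stays correct if the type of `people` changes. */
static Person *people_create(size_t n)
{
    Person *people = calloc(n, sizeof *people);
    return people;      /* NULL on failure; release with free() */
}
```

With this layout the elements are accessed as `people[i].name`, with no second level of pointers to allocate.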
Clearly the problem lies with:
info->citizens = malloc(sizeof(PEOPLE *));
info->citizens[0] = malloc(sizeof(PEOPLE *));
info->citizens[1] = malloc(sizeof(PEOPLE *));
Think logically about what you are trying to do here.
Your structs should almost certainly not contain members such as:
time_t *modtimes;
mode_t *modes;
bool *exists;
Instead you should simply use:
time_t modtimes;
mode_t modes;
bool exists;
In that way you do not need to dynamically allocate them, or subsequently release them. The reasons are that a) they're small and b) their size is known in advance. You would use:
char *name;
for a string field because it's not small and you don't know in advance how large it is.
Elsewhere in the code, you have the following:
if(top)
{
PEOPLE *info;
info = malloc(sizeof(PEOPLE *));
}
If top is true then this code allocates a pointer and then immediately leaks it -- the scope of the second info is limited to the if statement so you can neither use it later nor can you release it later. You would need to do something like this:
PEOPLE *process(PEOPLE *info, ...)
{
if (top)
{
info = malloc(sizeof(PEOPLE));
}
info->name = strdup("Henry James");
info->exists = true;
return info;
}
It seems you have one too many levels of indirection. Why are you using **citizens instead of *?
Also, apart from the fact that you are allocating the space for a pointer, not the struct, there are a couple of weird things. For example, the local variable info declared on line 31 goes out of scope when the block closes at line 34, so the initial allocation is lost.
You need to think more clearly about what data is where.
Lots of memory allocation issues with this code. Those mentioned above plus numerous others, for example:
info->citizens[masterX]->name = malloc(sizeof(char)*strlen(dp->d_name)+1);
info->citizens[masterX]->name = dp->d_name;
You cannot copy strings in C through assignment (using =). You can write this as:
info->citizens[masterX]->name = malloc(strlen(dp->d_name)+1);
strcpy(info->citizens[masterX]->name, dp->d_name);
Or you could condense the whole allocate & copy as follows:
info->citizens[masterX]->name = strdup(dp->d_name);
Similarly at lines 143/147 (except in that case you have also allocated one byte too few in your malloc call).
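For reference, the allocate-and-copy step can be written as a small helper. This is a sketch with a made-up name, `copy_string`; it does the same job as strdup using only standard C library calls. The copy matters here because d_name points into a buffer that a later readdir call on the same directory stream may overwrite.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of `src`, or NULL on failure.
 * The caller owns the copy and must free() it. */
static char *copy_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *dst = malloc(len);
    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}
```

The copy is independent of the source buffer, so later changes to the source do not affect it.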
OK, I hope I explain this one correctly.
I have a struct:
typedef struct _MyData
{
char Data[256];
int Index;
} MyData;
Now, I run into a problem. Most of the time MyData.Data is OK with 256, but in some cases I need to expand the amount of chars it can hold to different sizes.
I can't use a pointer.
Is there any way to resize Data at run time? How?
Code is appreciated.
EDIT 1:
While I am very thankful for all the comments, the "maybe try this...", "do that", or "what you are doing is wrong..." comments are not helping. Code is the help here. Please, if you know the answer, post the code.
Please note that:
I cannot use pointers. Please don't try to figure out why, I just can't.
The struct is being injected into another program's memory that's why no pointers can be used.
Sorry for being a bit rough here, but I asked the question here because I already tried all the different approaches that I thought might work.
Again, I am looking for code. At this point I am not interested in "might work..." or " have you considered this..."
Thank you and my apologies again.
EDIT 2
Why was this set as answered?
You can use a flexible array member
typedef struct _MyData
{
int Index;
char Data[];
} MyData;
So that you can then allocate the right amount of space
MyData *d = malloc(sizeof *d + sizeof(char[100]));
/* d->Data[0] through d->Data[99] can now be used */
Later, you can free it, allocate another chunk of memory and make a pointer to MyData point to it, at which time you will have more or fewer elements in the flexible array member (realloc). Note that you will have to save the length somewhere, too.
Before C99 there was no flexible array member: char Data[] was simply regarded as an array of incomplete type, and the compiler would complain about it. In that case I recommend one of two possible ways out:
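Put together, the allocate, store-the-length and realloc steps could look like this sketch. It assumes a C99 compiler; the struct is renamed `FlexData` and given a `Length` member purely for illustration, and the two helper names are made up.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int    Index;
    size_t Length;      /* bytes available in Data (added for this sketch) */
    char   Data[];      /* flexible array member, C99 */
} FlexData;

/* Allocates a FlexData with room for `length` zeroed bytes in Data. */
static FlexData *flexdata_create(size_t length)
{
    FlexData *d = malloc(sizeof *d + length);
    if (d != NULL) {
        d->Index = 0;
        d->Length = length;
        memset(d->Data, 0, length);
    }
    return d;
}

/* Resizes Data to `length` bytes, keeping the existing contents up to
 * the smaller of the two sizes. Returns the (possibly moved) object,
 * or NULL on failure, in which case `d` is untouched and still valid. */
static FlexData *flexdata_resize(FlexData *d, size_t length)
{
    FlexData *grown = realloc(d, sizeof *grown + length);
    if (grown != NULL)
        grown->Length = length;
    return grown;
}
```

Note that the object itself is still reached through a pointer; only the array inside it is stored inline.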
Using a pointer: char *Data and make it point to the allocated memory. This won't be as convenient as using the embedded array, because you will possibly need to have two allocations: One for the struct, and one for the memory pointed to by the pointer. You can also have the struct allocated on the stack instead, if the situation in your program allows this.
Using a char Data[1] instead, but treat it as if it were bigger, so that it overlays the whole allocated object. This is formally undefined behavior, but is a common technique, so it's probably safe to use with your compiler.
The problem here is your statement "I can't use a pointer". You will have to, and it will make everything much easier. Hey, realloc even copies your existing data, what more do you want?
So why do you think you can't use a pointer? Better try to fix that.
You would re-arrange the structure like that
typedef struct _MyData
{
int Index;
char Data[256];
} MyData;
And allocate instances with malloc/realloc like that:
my_data = (MyData*) malloc ( sizeof(MyData) + extra_space_needed );
This is an ugly approach and I would not recommend it (I would use pointers), but is an answer to your question how to do it without a pointer.
A limitation is that it allows for only one variable-size member per struct, and that member has to be the last one.
Let me sum up two important points I see in this thread:
The structure is used to interact between two programs through some IPC mechanism
The destination program cannot be changed
You cannot therefore change that structure in any way, because the destination program is stuck trying to read it as currently defined. I'm afraid you are stuck.
You can try to find ways to get the equivalent behavior, or find some evil hack to force the destination program to read a new structure (e.g., modifying the binary offsets in the executable). That's all pretty application specific so I can't give much better guidance than that.
You might consider writing a third program to act as an interface between the two. It can take the "long" messages and do something with them, and pass the "short" messages onward to the old program. You can inject that in between the IPC mechanisms fairly easily.
You may be able to do this like this, without allocating a pointer for the array:
typedef struct _MyData
{
int Index;
char Data[1];
} MyData;
Later, you allocate like this:
int bcount = 256;
MyData *foo;
foo = (MyData *)malloc(sizeof(*foo) + bcount);
realloc:
int newbcount = 512;
MyData *resized_foo;
resized_foo = realloc((void *)foo, sizeof(*foo) + newbcount);
From what you're saying, it looks like you definitely have to keep MyData as a static block of data. In that case I think the only option open to you is to somehow (optionally) chain these data structures together in a way that can be re-assembled by the other process.
You'd need an additional member in MyData, e.g.
typedef struct _MyData
{
int Sequence;
char Data[256];
int Index;
} MyData;
Where Sequence identifies the descending sequence in which to re-assemble the data (a sequence number of zero would indicate the final data buffer).
The problem is in the way you're putting the question. Don't think about C semantics: instead, think like a hacker. Explain exactly how you are currently getting your data into the other process at the right time, and also how the other program knows where the data begins and ends. Is the other program expecting a null-terminated string? If you declare your struct with a char[300] does the other program crash?
You see, when you say "passing data" to the other program, you might be [a] tricking the other process into copying what you put in front of it, [b] tricking the other program into letting you overwrite its normally 'private' memory, or [c] some other approach. No matter which is the case, if the other program can take your larger data, there is a way to get it to them.
I find KIV's trick quite usable. Though, I would suggest investigating the pointer issue first.
If you look at the malloc implementations
(check this IBM article, Listing 5: Pseudo-code for the main allocator),
When you allocate, the memory manager allocates a control header and
then free space following it based on your requested size.
This is very much like saying,
typedef struct _MyData
{
int size;
char Data[1]; // we are going to break the array-bound up-to size length
} MyData;
Now, your problem is,
How do you pass such a (mis-sized?) structure to this other process?
That brings us to the question,
How does the other process figure out the size of this data?
I would expect a length field as part of the communication.
If you have all that, what's wrong with passing a pointer to the other process?
Will the other process identify the difference between a pointer to a
structure and one to allocated memory?
You can't reallocate it manually.
You can do some tricks which I used when I was working on a simple data-holding system (a very simple filesystem).
typedef struct
{
int index ;
char x[250];
} data_ztorage_250_char;
typedef struct
{
int index;
char x[1000];
} data_ztorage_1000_char;
int main(void)
{
char just_raw_data[sizeof(data_ztorage_1000_char)];
data_ztorage_1000_char* big_struct;
data_ztorage_250_char* small_struct;
big_struct = (data_ztorage_1000_char*)just_raw_data; //now you have the big struct
// note that the line above is the same as writing
// big_struct = (data_ztorage_1000_char*)(&just_raw_data[0]);
small_struct = (data_ztorage_250_char*)just_raw_data;//now you have the small struct
//both structs start at the same location and share the same memory
//addressing the data is done like this:
small_struct -> index = 250;
}
You don't state what the Index value is for.
As I understand it you are passing data to another program using the structure shown.
Is there a reason why you can't break the data you want to send into chunks of 256 bytes and then set the index value accordingly? E.g.
Data is 512 bytes so you send one struct with the first 256 bytes and index=0, then another with the next 256 bytes in your array and Index=1.
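A sketch of that chunking is shown below. It keeps the struct layout from the question and assumes, which the question does not confirm, that the receiving program would interpret Index as the chunk position. The function name `split_into_chunks` is made up.

```c
#include <stddef.h>
#include <string.h>

typedef struct _MyData {
    char Data[256];
    int  Index;
} MyData;

/* Splits `len` bytes from `src` into 256-byte chunks, writing at most
 * `max_chunks` structs to `out`. Index is the chunk's position (0, 1, ...).
 * A short final chunk is zero-padded. Returns the number of chunks the
 * data needs, which may be larger than max_chunks. */
static size_t split_into_chunks(const char *src, size_t len,
                                MyData *out, size_t max_chunks)
{
    size_t chunk_size = sizeof out[0].Data;
    size_t needed = (len + chunk_size - 1) / chunk_size;
    for (size_t i = 0; i < needed && i < max_chunks; i++) {
        size_t offset = i * chunk_size;
        size_t n = len - offset < chunk_size ? len - offset : chunk_size;
        memset(out[i].Data, 0, chunk_size);
        memcpy(out[i].Data, src + offset, n);
        out[i].Index = (int)i;
    }
    return needed;
}
```

Each resulting struct has exactly the size the other program already expects, so nothing about the layout changes; only the meaning given to Index does.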
How about a really, really simple solution? Could you do:
typedef struct _MyData
{
char Data[1024];
int Index;
} MyData;
I have a feeling I know your response will be "No, because the other program I don't have control over expects 256 bytes"... And if that is indeed your answer to my answer, then my answer becomes: this is impossible.
I've always heard that in C you have to really watch how you manage memory. And I'm still beginning to learn C, but thus far, I have not had to do any memory managing related activities at all.. I always imagined having to release variables and do all sorts of ugly things. But this doesn't seem to be the case.
Can someone show me (with code examples) an example of when you would have to do some "memory management" ?
There are two places where variables can be put in memory. When you create a variable like this:
int a;
char c;
char d[16];
The variables are created in the "stack". Stack variables are automatically freed when they go out of scope (that is, when the code can't reach them anymore). You might hear them called "automatic" variables, but that has fallen out of fashion.
Many beginner examples will use only stack variables.
The stack is nice because it's automatic, but it also has two drawbacks: (1) The compiler needs to know in advance how big the variables are, and (2) the stack space is somewhat limited. For example: in Windows, under default settings for the Microsoft linker, the stack is set to 1 MB, and not all of it is available for your variables.
If you don't know at compile time how big your array is, or if you need a big array or struct, you need "plan B".
Plan B is called the "heap". You can usually create variables as big as the Operating System will let you, but you have to do it yourself. Earlier postings showed you one way you can do it, although there are other ways:
int size;
// ...
// Set size to some value, based on information available at run-time. Then:
// ...
char *p = (char *)malloc(size);
(Note that variables in the heap are not manipulated directly, but via pointers)
Once you create a heap variable, the problem is that the compiler can't tell when you're done with it, so you lose the automatic releasing. That's where the "manual releasing" you were referring to comes in. Your code is now responsible to decide when the variable is not needed anymore, and release it so the memory can be taken for other purposes. For the case above, with:
free(p);
What makes this second option "nasty business" is that it's not always easy to know when the variable is not needed anymore. Forgetting to release a variable when you don't need it will cause your program to consume more memory than it needs to. This situation is called a "leak". The "leaked" memory cannot be used for anything until your program ends and the OS recovers all of its resources. Even nastier problems are possible if you release a heap variable by mistake before you are actually done with it.
In C and C++, you are responsible to clean up your heap variables like shown above. However, there are languages and environments such as Java and .NET languages like C# that use a different approach, where the heap gets cleaned up on its own. This second method, called "garbage collection", is much easier on the developer but you pay a penalty in overhead and performance. It's a balance.
(I have glossed over many details to give a simpler, but hopefully more balanced answer)
Here's an example. Suppose you have a strdup() function that duplicates a string:
char *strdup(char *src)
{
char * dest;
dest = malloc(strlen(src) + 1);
if (dest == NULL)
abort();
strcpy(dest, src);
return dest;
}
And you call it like this:
int main(void)
{
char *s;
s = strdup("hello");
printf("%s\n", s);
s = strdup("world");
printf("%s\n", s);
}
You can see that the program works, but you have allocated memory (via malloc) without freeing it up. You have lost your pointer to the first memory block when you called strdup the second time.
This is no big deal for this small amount of memory, but consider the case:
for (i = 0; i < 1000000000; ++i) /* billion times */
s = strdup("hello world"); /* 11 characters + '\0' = 12 bytes */
You have now used up about 12 GB of memory (possibly more, depending on your memory manager), and if your process has not crashed it is probably running pretty slowly.
To fix, you need to call free() for everything that is obtained with malloc() after you finish using it:
s = strdup("hello");
free(s); /* now not leaking memory! */
s = strdup("world");
...
Hope this example helps!
You have to do "memory management" when you want to use memory on the heap rather than the stack. If you don't know how large to make an array until runtime, then you have to use the heap. For example, you might want to store something in a string, but don't know how large its contents will be until the program is run. In that case you'd write something like this:
char *string = malloc(stringlength); // stringlength is the number of bytes to allocate
// Do something with the string...
free(string); // Free the allocated memory
I think the most concise way to answer the question is to consider the role of the pointer in C. The pointer is a lightweight yet powerful mechanism that gives you immense freedom at the cost of immense capacity to shoot yourself in the foot.
In C the responsibility of ensuring your pointers point to memory you own is yours and yours alone. This requires an organized and disciplined approach, unless you forsake pointers, which makes it hard to write effective C.
The posted answers to date concentrate on automatic (stack) and heap variable allocations. Using stack allocation does make for automatically managed and convenient memory, but in some circumstances (large buffers, recursive algorithms) it can lead to the horrendous problem of stack overflow. Knowing exactly how much memory you can allocate on the stack is very dependent on the system. In some embedded scenarios a few dozen bytes might be your limit, in some desktop scenarios you can safely use megabytes.
Heap allocation is less inherent to the language. It is basically a set of library calls that grants you ownership of a block of memory of a given size until you are ready to return ('free') it. It sounds simple, but is associated with untold programmer grief. The problems are simple (freeing the same memory twice, or not at all [memory leaks], not allocating enough memory [buffer overflow], etc.) but difficult to avoid and debug. A highly disciplined approach is absolutely mandatory in practice, but of course the language doesn't actually mandate it.
I'd like to mention another type of memory allocation that's been ignored by other posts. It's possible to statically allocate variables by declaring them outside any function. I think in general this type of allocation gets a bad rap because it's used by global variables. However there's nothing that says the only way to use memory allocated this way is as an undisciplined global variable in a mess of spaghetti code. The static allocation method can be used simply to avoid some of the pitfalls of the heap and automatic allocation methods. Some C programmers are surprised to learn that large and sophisticated C embedded and games programs have been constructed with no use of heap allocation at all.
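As a small illustration of that idea, here is a sketch of a fixed pool with static storage duration that hands out slots without touching the heap. The names `Slot`, `pool`, `slot_acquire` and `slot_release` are invented for this example, and the pool size of 8 is arbitrary.

```c
#include <stddef.h>

#define POOL_SIZE 8

typedef struct {
    int id;
    int in_use;
} Slot;

/* Storage with static duration: it exists for the whole run of the
 * program and is zero-initialised, with no malloc or free involved. */
static Slot pool[POOL_SIZE];

/* Returns a free slot, or NULL if all POOL_SIZE slots are taken. */
static Slot *slot_acquire(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* Marks a slot obtained from slot_acquire as free again. */
static void slot_release(Slot *s)
{
    s->in_use = 0;
}
```

Because `pool` is declared `static` at file scope, it is not visible to other translation units, so it behaves more like a private resource than a global variable. The trade-off is a hard upper limit fixed at compile time.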
There are some great answers here about how to allocate and free memory, and in my opinion the more challenging side of using C is ensuring that the only memory you use is memory you've allocated. If this isn't done correctly, what you end up with is the cousin of this site, a buffer overflow, and you may be overwriting memory that's being used for something else, with very unpredictable results.
An example:
#include <stdlib.h>
#include <string.h>
int main(void) {
char* myString = (char*)malloc(5*sizeof(char));
strcpy(myString, "abcd"); /* copies into the buffer; myString = "abcd" would only repoint the pointer and leak the allocation */
}
At this point you've allocated 5 bytes for myString and filled them with "abcd\0" (strings end in a null character, \0).
If your string copy was
strcpy(myString, "abcde");
you would be writing "abcde" into the 5 bytes you've had allocated to your program, and the trailing null character would be put just past the end of them, in a part of memory that hasn't been allocated for your use. It could be free, but it could equally be in use for something else. This is the critical part of memory management, where a mistake will have unpredictable (and sometimes unrepeatable) consequences.
A thing to remember is to always initialize your pointers to NULL, since an uninitialized pointer may contain a pseudorandom valid memory address, which can let pointer errors go ahead silently. By making sure a pointer is initialized with NULL, you can catch any use of this pointer before it has been given a real value. The reason is that most operating systems leave the virtual address 0x00000000 unmapped, so dereferencing a null pointer raises a fault (such as a general protection exception) instead of silently reading or writing memory.
Also you might want to use dynamic memory allocation when you need to define a huge array, say int[10000]. You can't just put it in stack because then, hm... you'll get a stack overflow.
Another good example would be an implementation of a data structure, say linked list or binary tree. I don't have a sample code to paste here but you can google it easily.
(I'm writing because I feel the answers so far aren't quite on the mark.)
The reason memory management is worth mentioning is that some problems require you to create complex structures. (If your program crashes because you allocate too much space on the stack at once, that's a bug.) Typically, the first data structure you'll need to learn is some kind of list. Here's a singly linked one, off the top of my head:
#include <stdlib.h> /* calloc, free */
typedef struct listelem { struct listelem *next; void *data;} listelem;
listelem * create(void * data)
{
listelem *p = calloc(1, sizeof(listelem));
if(p) p->data = data;
return p;
}
listelem * delete(listelem * p)
{
listelem *next = p->next; /* remember the successor before p is freed */
free(p);
return next;
}
void deleteall(listelem * p)
{
while(p) p = delete(p);
}
void foreach(listelem * p, void (*fun)(void *data) )
{
for( ; p != NULL; p = p->next) fun(p->data);
}
listelem * merge(listelem *p, listelem *q)
{
listelem *head = p; /* keep the start of the first list to return it */
if(p == NULL) return q;
while(p->next != NULL) p = p->next;
p->next = q;
return head;
}
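As an illustration of the kind of extra function such a list tends to need, here is a sketch of a prepend and a length function. It repeats the listelem type so it compiles on its own; the names push_front and length are mine, not part of the answer.

```c
#include <stdlib.h>

typedef struct listelem { struct listelem *next; void *data; } listelem;

/* Allocates a node holding data and links it in front of head.
   Returns the new head, or NULL if allocation fails (head is left untouched). */
listelem *push_front(listelem *head, void *data)
{
    listelem *p = calloc(1, sizeof *p);
    if (p == NULL)
        return NULL;
    p->data = data;
    p->next = head;
    return p;
}

/* Counts the nodes reachable from p; an empty list (NULL) has length 0. */
size_t length(const listelem *p)
{
    size_t n = 0;
    for (; p != NULL; p = p->next)
        n++;
    return n;
}
```

Note that the list only owns its nodes here; whatever data points at still has to be freed (or not) by whoever allocated it.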
Naturally, you'd like a few other functions, but basically, this is what you need memory management for. I should point out that there are a number of tricks that are possible with "manual" memory management, e.g.,
Using the fact that malloc is guaranteed (by the language standard) to return a pointer suitably aligned for any object type (in practice a multiple of 8 or 16 on common platforms, not a guaranteed fixed number),
allocating extra space for some sinister purpose of your own,
creating memory pools.
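To make the last item concrete, here is a rough sketch of a very simple pool: one malloc up front, then allocations are carved out of that block by bumping an offset. The pool type and the pool_init, pool_alloc and pool_destroy names are hypothetical, and the code assumes a C11 compiler (for _Alignof and max_align_t).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump-allocator pool. */
typedef struct {
    unsigned char *base;  /* block obtained from malloc */
    size_t size;          /* total bytes in the block */
    size_t used;          /* bytes handed out so far */
} pool;

/* Grabs one block of size bytes. Returns 1 on success, 0 on failure. */
int pool_init(pool *p, size_t size)
{
    p->base = malloc(size);
    p->size = (p->base != NULL) ? size : 0;
    p->used = 0;
    return p->base != NULL;
}

/* Hands out n bytes, or NULL if n is 0 or the pool is exhausted.
   Each request is rounded up so every returned pointer stays max-aligned. */
void *pool_alloc(pool *p, size_t n)
{
    const size_t align = _Alignof(max_align_t);
    size_t remaining = p->size - p->used;
    if (n == 0 || n > remaining)
        return NULL;
    size_t rounded = (n + align - 1) / align * align;
    if (rounded > remaining)
        return NULL;
    void *result = p->base + p->used;
    p->used += rounded;
    return result;
}

/* Releases everything allocated from the pool in one call. */
void pool_destroy(pool *p)
{
    free(p->base);
    p->base = NULL;
    p->size = p->used = 0;
}
```

The attraction is that individual objects never need their own free; the cost is that nothing is returned to the system until the whole pool is destroyed.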
Get a good debugger... Good luck!
#Euro Micelli
One negative to add is that pointers to the stack are no longer valid when the function returns, so you cannot return a pointer to a stack variable from a function. This is a common error and a major reason why you can't get by with just stack variables. If your function needs to return a pointer, then you have to malloc and deal with memory management.
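A short sketch of the difference, assuming a made-up function make_label that formats a string for its caller. The broken stack version is shown only in a comment; the working one returns heap memory that the caller must free.

```c
#include <stdio.h>
#include <stdlib.h>

/* WRONG, for comparison only: buf lives on the stack and is gone on return.
 *
 *   char *make_label_bad(int id) {
 *       char buf[32];
 *       snprintf(buf, sizeof buf, "item-%d", id);
 *       return buf;            // dangling pointer
 *   }
 */

/* Hypothetical helper: formats "item-<id>" into heap memory.
   Returns NULL on failure; the caller must free() the result. */
char *make_label(int id)
{
    int needed = snprintf(NULL, 0, "item-%d", id);  /* length without the '\0' */
    if (needed < 0)
        return NULL;
    char *buf = malloc((size_t)needed + 1);
    if (buf != NULL)
        snprintf(buf, (size_t)needed + 1, "item-%d", id);
    return buf;
}
```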
#Ted Percival:
...you don't need to cast malloc()'s return value.
You are correct, of course. I believe that has always been true, although I don't have a copy of K&R to check.
I don't like a lot of the implicit conversions in C, so I tend to use casts to make "magic" more visible. Sometimes it helps readability, sometimes it doesn't, and sometimes it causes a silent bug to be caught by the compiler. Still, I don't have a strong opinion about this, one way or another.
This is especially likely if your compiler understands C++-style comments.
Yeah... you caught me there. I spend a lot more time in C++ than C. Thanks for noticing that.
In C, you actually have two different choices. One, you can let the system manage the memory for you. Alternatively, you can do that by yourself. Generally, you would want to stick to the former as long as possible. However, auto-managed memory in C is extremely limited and you will need to manually manage the memory in many cases, such as:
a. You want the variable to outlive the function that creates it, and you don't want to have a global variable. For example:
struct pair{
int val;
struct pair *next;
};
struct pair* new_pair(int val){
struct pair* np = malloc(sizeof(struct pair));
if(np == NULL) return NULL; /* allocation failed */
np->val = val;
np->next = NULL;
return np;
}
b. You want to have dynamically allocated memory. The most common example is an array without a fixed length:
int *my_special_array;
my_special_array = malloc(sizeof(int) * number_of_element);
for(i=0; i
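The loop in the snippet above is cut off, so here is a complete sketch of the same idea instead of a guess at the missing part. The helper name make_sequence and the choice to fill the array with 0, 1, 2, ... are my own assumptions for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: returns an array of n ints holding 0, 1, ..., n-1,
   or NULL if n is 0, n is too large, or the allocation fails.
   The caller must free() the result. */
int *make_sequence(size_t n)
{
    if (n == 0 || n > (size_t)-1 / sizeof(int))
        return NULL;                 /* nothing to do, or n * sizeof(int) would overflow */
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
    return a;
}
```

The point is simply that n is only known at run time, which is exactly the case a fixed-size array declaration cannot handle.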
c. You want to do something REALLY dirty. For example, I would want a struct to represent many kinds of data and I don't like union (union looks soooo messy):
struct data{
int data_type;
long data_in_mem;
};
struct animal{/*something*/};
struct person{/*some other thing*/};
struct animal* read_animal();
struct person* read_person();
/*In main*/
struct data sample;
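sample.data_type = input_type;
switch(input_type){
case DATA_PERSON:
sample.data_in_mem = (long)read_person(); /* explicit cast: pointer stored as an integer */
break;
case DATA_ANIMAL:
sample.data_in_mem = (long)read_animal();
break; /* without this, control falls through into default */
default:
printf("Oh hoh! I warn you, that again and I will seg fault your OS");
}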
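See, a long value is enough to hold a pointer to ANYTHING on most Unix-like platforms, though not everywhere: on 64-bit Windows a long is only 32 bits, so void * (or intptr_t from <stdint.h>) is the portable choice. Just remember to free it, or you WILL regret it. This is among my favorite tricks to have fun in C :D.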
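For comparison, here is a sketch of the same trick with a void * in place of the long, which avoids depending on the width of long. The field names, the placeholder struct contents and the make_person and release helpers are all hypothetical.

```c
#include <stdlib.h>

enum data_type { DATA_PERSON, DATA_ANIMAL };
struct person { int age; };    /* placeholder field */
struct animal { int legs; };   /* placeholder field */

struct data {
    enum data_type type;  /* says what ptr points at */
    void *ptr;            /* heap object owned by this struct */
};

/* Wraps a freshly allocated person; ptr is NULL if malloc fails. */
struct data make_person(int age)
{
    struct data d = { DATA_PERSON, NULL };
    struct person *p = malloc(sizeof *p);
    if (p != NULL) {
        p->age = age;
        d.ptr = p;
    }
    return d;
}

/* Frees the payload and clears the pointer so it cannot be freed twice. */
void release(struct data *d)
{
    free(d->ptr);
    d->ptr = NULL;
}
```

Reading the payload back still needs a cast chosen from the type field, e.g. (struct person *)d.ptr when d.type is DATA_PERSON, so the type tag has to be kept honest by hand.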
However, generally, you would want to stay away from your favorite tricks (T___T). You WILL break your OS, sooner or later, if you use them too often. As long as you don't use *alloc and free, it is safe to say that you are still virgin, and that the code still looks nice.
Sure, if you create an object that has to exist outside of the scope you use it in. Here is a contrived example, written C++-style (bear in mind my syntax may be off; my C is rusty, but this example will still illustrate the concept):
class MyClass
{
SomeOtherClass *myObject;
public:
MyClass()
{
//The object is created when the class is constructed
//sizeof(*myObject) is the size of the object, not of the pointer
myObject = (SomeOtherClass*)malloc(sizeof(*myObject));
}
~MyClass()
{
//The class is destructed
//If you don't free the object here, you leak memory
free(myObject);
}
void SomeMemberFunction()
{
//Some use of the object
myObject->SomeOperation();
}
};
In this example, I'm using an object of type SomeOtherClass during the lifetime of MyClass. The SomeOtherClass object is used in several functions, so I've dynamically allocated the memory: the SomeOtherClass object is created when MyClass is created, used several times over the life of the object, and then freed once MyClass is freed.
Obviously, if this were real code, there would be no reason (aside from possibly stack memory consumption) to create myObject in this way. This type of object creation/destruction becomes useful when you have a lot of objects and want to finely control when they are created and destroyed (so that your application doesn't suck up 1GB of RAM for its entire lifetime, for example). In a windowed environment this is pretty much mandatory, as objects that you create (buttons, say) need to exist well outside of any particular function's (or even class's) scope.