Related
So I'm looking at a solution to some coding interview type questions, and there's an array inside a struct
#define MAX_SIZE 1000000
typedef struct _heap {
int data[MAX_SIZE];
int heap_size;
}heap;
heap* init(heap* h) {
h = (heap*)malloc(sizeof(heap));
h->heap_size = 0;
return h;
}
This heap struct is later created like so
heap* max_heap = NULL;
max_heap = init(max_heap);
First of all, I'd wish this was written in C++ style than C, but secondly if I'm just conscerned about the array, I'm assuming it is equivalent to solely analyze the array portion by changing the code like this
int* data = NULL;
data = (int*)malloc(1000000 * sizeof(int));
Now in that case, is there any problems with declaring the array with the max size if you are probably just using a little bit of it?
I guess this boils down to the question of when an array is created in the heap, how does the system block out that portion of the memory? In which case does the system prevent you from accessing memory that is part of the array? I wouldn't want a giant array holding up space if I'm not using much of it.
is there any problems with declaring the array with the max size if you are probably just using a little bit of it?
Yes. The larger the allocation size the greater the risk of an out-of-memory error. If not here, elsewhere in code.
Yet some memory allocation systems handle this well as real memory allocations do not immediately occur, but later when needed.
I guess this boils down to the question of when an array is created in the heap, how does the system block out that portion of the memory?
That is an implementation defined issue not defined by C. It might happen immediately or deferred.
For maximum portability, code would take a more conservative approach and allocate large memory chunks only as needed, rather than rely on physical allocation occurring in a delayed fashion.
Alternative
In C, consider a struct with a flexible member array.
typedef struct _heap {
size_t heap_size;
int data[];
} heap;
The setup
Let's say I have a struct father which has member variables such as an int, and another struct(so father is a nested struct). This is an example code:
struct mystruct {
int n;
};
struct father {
int test;
struct mystruct M;
struct mystruct N;
};
In the main function, we allocate memory with malloc() to create a new struct of type struct father, then we fill it's member variables and those of it's children:
struct father* F = (struct father*) malloc(sizeof(struct father));
F->test = 42;
F->M.n = 23;
F->N.n = 11;
We then get pointers to those member variables from outside the structs:
int* p = &F->M.n;
int* q = &F->N.n;
After that, we print the values before and after the execution of free(F), then exit:
printf("test: %d, M.n: %d, N.n: %d\n", F->test, *p, *q);
free(F);
printf("test: %d, M.n: %d, N.n: %d\n", F->test, *p, *q);
return 0;
This is a sample output(*):
test: 42, M.n: 23, N.n: 11
test: 0, M.n: 0, N.n: 1025191952
*: Using gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Full code on pastebin: https://pastebin.com/khzyNPY1
The question
That was the test program that I used to test how memory is deallocated using free(). My idea(from reading K&R "8.7 Example - A Storage Allocator", in which a version of free() is implemented and explained) is that, when you free() the struct, you're pretty much just telling the operating system or the rest of the program that you won't be using that particular space in memory that was previously allocated with malloc(). So, after freeing those memory blocks, there should be garbage values in the member variables, right? I can see that happening with N.n in the test program, but, as I ran more and more samples, it was clear that in the overwhelming majority of cases, these member variables are "reset" to 0 more than any other "random" value. My question is: why is that? Is it because the stack/heap is filled with zeroes more frequently than any other value?
As a last note, here are a few links to related questions but which do not answer my particular question:
C - freeing structs
What REALLY happens when you don't free after malloc?
After calling free, the pointers F, p and q no longer point to valid memory. Attempting to dereference those pointers invokes undefined behavior. In fact, the values of those pointers become indeterminate after the call to free, so you may also invoke UB just by reading those pointer values.
Because dereferencing those pointers is undefined behavior, the compiler can assume it will never happen and make optimizations based on that assumption.
That being said, there's nothing that states that the malloc/free implementation has to leave values that were stored in freed memory unchanged or set them to specific values. It might write part of its internal bookkeeping state to the memory you just freed, or it might not. You'd have to look at the source for glibc to see exactly what it's doing.
Apart from undefined behavior and whatever else the standard might dictate, since the dynamic allocator is a program, fixed a specific implementation, assuming it does not make decisions based on external factors (which it does not) the behavior is completely deterministic.
Real answer: what you are seeing here is the effect of the internal workings of glibc's allocator (glibc is the default C library on Ubuntu).
The internal structure of an allocated chunk is the following (source):
struct malloc_chunk {
INTERNAL_SIZE_T mchunk_prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T mchunk_size; /* Size in bytes, including overhead. */
struct malloc_chunk* fd; /* double links -- used only if free. */
struct malloc_chunk* bk;
/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk* bk_nextsize;
};
In memory, when the chunk is in use (not free), it looks like this:
chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if unallocated (P clear) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes |A|M|P| flags
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| User data starts here... |
Every field except mchunk_prev_size and mchunk_size is only populated if the chunk is free. Those two fields are right before the user usable buffer. User data begins right after mchunk_size (i.e. at the offset of fd), and can be arbitrarily large. The mchunk_prev_size field holds the size of the previous chunk if it's free, while the mchunk_size field holds the real size of the chunk (which is at least 16 bytes more than the requested size).
A more thorough explanation is provided as comments in the library itself here (highly suggested read if you want to know more).
When you free() a chunk, there are a lot of decisions to be made as to where to "store" that chunk for bookkeeping purposes. In general, freed chunks are sorted into double linked lists based on their size, in order to optimize subsequent allocations (that can get already available chunks of the right size from these lists). You can see this as a sort of caching mechanism.
Now, depending on your glibc version, they could be handled slightly differently, and the internal implementation is quite complex, but what is happening in your case is something like this:
struct malloc_chunk *victim = addr; // address passed to free()
// Add chunk at the head of the free list
victim->fd = NULL;
victim->bk = head;
head->fd = victim;
Since your structure is basically equivalent to:
struct x {
int a;
int b;
int c;
}
And since on your machine sizeof(struct malloc_chunk *) == 2 * sizeof(int), the first operation (victim->fd = NULL) is effectively wiping out the contents of the first two fields of your structure (remember, user data begins exactly at fd), while the second one (victim->bk = head) is altering the third value.
The Standard specifies nothing about the behavior of a program that uses a pointer to allocated storage after it has been freed. Implementations are free to extend the language by specifying the behavior of more programs than required by the Standard, and the authors of the Standard intended to encourage variety among implementations which would support popular extensions on a quality-of-implementation basis directed by the marketplace. Some operations with pointers to dead objects are widely supported (e.g. given char *x,*y; the Standard would allow conforming implementations to behave in arbitrary fashion if a program executes free(x); y=x; in cases where x had been non-null, without regard for whether anything ever does anything with y after its initialization, but most implementations would extend the language to guarantee that such code would have no effect if y is never used) but dereferencing of such pointers generally isn't.
Note that if one were to pass two copies of the same pointer to a freed object to:
int test(char *p1, char *p2)
{
char *q;
if (*p1)
{
q = malloc(0):
free(q);
return *p1+*p2;
}
else
return 0;
}
it is entirely possible that the act of allocating and freeing q would disturb the bit patterns in the storage that had been allocated to *p1 (and also *p2), but a compiler would not be required to allow for that possibility. A compiler might plausibly return the sum of the value that was read from *p1 before the malloc/free, and a value that was read from *p2 after it; this sum could be an odd number even though if p1 and p2 are equal, *p1+*p2 should always be even.
Two things happen when you call free:
In the C model of computing, any pointer values that point to the freed memory (either its beginning, such as your F, or things within it, such as your p and q) are no longer valid. The C standard does not define what happens when you attempt to use these pointer values, and optimization by the compiler may have unexpected effects on how your program behaves if you attempt to use them.
The freed memory is released for other purposes. One of the most common other purposes for which it is used is tracking memory that is available for allocation. In other words, the software that implements malloc and free needs data structures to record which blocks of memory have been freed and other information. When you free memory, that software often uses some of the memory for this purpose. That can result in the changes you saw.
The freed memory may also be used by other things in your program. In a single-threaded program without signal handlers or similar things, generally no software would run between the free and the preparation of the arguments to the printf you show, so nothing else would reuse the memory so quickly—reuse by the malloc software is the most likely explanation for what you observed. However, in a multithreaded program, the memory might be reused immediately by another thread. (In practice, this may be a bit unlikely, as the malloc software may keep preferentially separate pools of memory for separate threads, to reduce the amount of inter-thread synchronization that is necessary.)
When a dynamically allocated object is freed, it no longer exists. Any subsequent attempt to access it has undefined behavior. The question is therefore nonsense: the members of an allocated struct cease to exist at the end of the host struct's lifetime, so they cannot be set or reset to anything at that point. There is no valid way to attempt to determine any values for such no-longer-existing objects.
This question already has answers here:
How does free know how much to free?
(11 answers)
Closed 9 years ago.
we allocate memory dynamically in C using malloc() and we receive a pointer to a location in the heap.
now we use free() to deallocate the memory, passing the same pointer value as its argumnet.
the Question now is how does free() know how much to deallocate.. considering the fact that we can always resize the memory block allocated by malloc().
is there anything related to Hash Tables here?
A typical implementation will store information just before the address returned by malloc. That information will include the information that realloc or free needs to know to do their work, but the details of what exactly is stored there depends on the implementation.
The original technique was to allocate a slightly larger block and store the size at the beginning, a part the application didn't see. The extra space holds a size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and also create dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
A simplist implementation is the one in the famous K&R C Bible,page 186 - 188.
The memory block we get actually is more (a struct head's or a union head's size) than we apply for.The struct may be like this:
typedef long Align;
union header
{
struct
{
union header* ptr; // next block
unsigned size; // size of this block , times of head size
}s;
Align x;
};
A figure to demonstrate it:
When we call the free function, the behaviour may be like this:
void free(void* ptr)
{
Header *bp, *p;
bp = (Header *)ptr - 1;
/* ..... */
/*return the memory to the linked list */
}
In visual studio, we have two models: release version and debug version,we could even use
the head to store debug message to make debug easier.The header in debug version is called _CrtMemBlockHeader, the definition is as below :
typedef struct _CrtMemBlockHeader
{
struct _CrtMemBlockHeader * pBlockHeaderNext;
struct _CrtMemBlockHeader * pBlockHeaderPrev;
char * szFileName;
int nLine;
size_t nDataSize;
int nBlockUse;
long lRequest;
unsigned char gap[nNoMansLandSize];
} _CrtMemBlockHeader;
Then the memory lalout is:
A memory manager uses tables to store additional data based on a pointer, sometimes right before the pointer, sometimes elsewhere. With C being very simple, the data is most likely pointer-2 or pointer-4, as int or long type. The correct details depend on the compiler.
When we use malloc ,a block will get reserve whose size will be littile more than what we have requested and in return to this malloc we get a pointer to start of this block.
AS i told you size of this block will be littile more than what exactly you needed.This extra space will be used to keep actual requested size of block,pointer to next free block and some data which checks "if you trying to access more than allocated block".
So whenever we call free using the pointer we want to deallocate, this free will search for the extra information given in the block space, Where it gets final size to deallocate.
Well, I can't understand when and why it is needed to allocate memory using malloc.
Here is my code:
#include <stdlib.h>
int main(int argc, const char *argv[]) {
typedef struct {
char *name;
char *sex;
int age;
} student;
// Now I can do two things
student p;
// Or
student *ptr = (student *)malloc(sizeof(student));
return 0;
}
Why is it needed to allocate memory when I can just use student p;?
malloc is used for dynamic memory allocation. As said, it is dynamic allocation which means you allocate the memory at run time. For example, when you don't know the amount of memory during compile time.
One example should clear this. Say you know there will be maximum 20 students. So you can create an array with static 20 elements. Your array will be able to hold maximum 20 students. But what if you don't know the number of students? Say the first input is the number of students. It could be 10, 20, 50 or whatever else. Now you will take input n = the number of students at run time and allocate that much memory dynamically using malloc.
This is just one example. There are many situations like this where dynamic allocation is needed.
Have a look at the man page malloc(3).
You use malloc when you need to allocate objects that must exist beyond the lifetime of execution of the current block (where a copy-on-return would be expensive as well), or if you need to allocate memory greater than the size of that stack (i.e., a 3 MB local stack array is a bad idea).
Before C99 introduced VLAs, you also needed it to perform allocation of a dynamically-sized array. However, it is needed for creation of dynamic data structures like trees, lists, and queues, which are used by many systems. There are probably many more reasons; these are just a few.
Expanding the structure of the example a little, consider this:
#include <stdio.h>
int main(int argc, const char *argv[]) {
typedef struct {
char *name;
char *sex;
char *insurance;
int age;
int yearInSchool;
float tuitionDue;
} student;
// Now I can do two things
student p;
// Or
student *p = malloc(sizeof *p);
}
C is a language that implicitly passes by value, rather than by reference. In this example, if we passed 'p' to a function to do some work on it, we would be creating a copy of the entire structure. This uses additional memory (the total of how much space that particular structure would require), is slower, and potentially does not scale well (more on this in a minute). However, by passing *p, we don't pass the entire structure. We only are passing an address in memory that refers to this structure. The amount of data passed is smaller (size of a pointer), and therefore the operation is faster.
Now, knowing this, imagine a program (like a student information system) which will have to create and manage a set of records in the thousands, or even tens of thousands. If you pass the whole structure by value, it will take longer to operate on a set of data, than it would just passing a pointer to each record.
Let's try and tackle this question considering different aspects.
Size
malloc allows you to allocate much larger memory spaces than the one allocated simply using student p; or int x[n];. The reason being malloc allocates the space on heap while the other allocates it on the stack.
The C programming language manages memory statically, automatically, or dynamically. Static-duration variables are allocated in main memory, usually along with the executable code of the program, and persist for the lifetime of the program; automatic-duration variables are allocated on the stack and come and go as functions are called and return. For static-duration and automatic-duration variables, the size of the allocation must be compile-time constant (except for the case of variable-length automatic arrays[5]). If the required size is not known until run-time (for example, if data of arbitrary size is being read from the user or from a disk file), then using fixed-size data objects is inadequate. (from Wikipedia)
Scope
Normally, the declared variables would get deleted/freed-up after the block in which it is declared (they are declared on the stack). On the other hand, variables with memory allocated using malloc remain till the time they are manually freed up.
This also means that it is not possible for you to create a variable/array/structure in a function and return its address (as the memory that it is pointing to, might get freed up). The compiler also tries to warn you about this by giving the warning:
Warning - address of stack memory associated with local variable 'matches' returned
For more details, read this.
Changing the Size (realloc)
As you may have guessed, it is not possible by the normal way.
Error detection
In case memory cannot be allocated: the normal way might cause your program to terminate while malloc will return a NULL which can easily be caught and handled within your program.
Making a change to string content in future
If you create store a string like char *some_memory = "Hello World"; you cannot do some_memory[0] = 'h'; as it is stored as string constant and the memory it is stored in, is read-only. If you use malloc instead, you can change the contents later on.
For more information, check this answer.
For more details related to variable-sized arrays, have a look at this.
malloc = Memory ALLOCation.
If you been through other programming languages, you might have used the new keyword.
Malloc does exactly the same thing in C. It takes a parameter, what size of memory needs to be allocated and it returns a pointer variable that points to the first memory block of the entire memory block, that you have created in the memory. Example -
int *p = malloc(sizeof(*p)*10);
Now, *p will point to the first block of the consecutive 10 integer blocks reserved in memory.
You can traverse through each block using the ++ and -- operator.
In this example, it seems quite useless indeed.
But now imagine that you are using sockets or file I/O and must read packets from variable length which you can only determine while running. Or when using sockets and each client connection need some storage on the server. You could make a static array, but this gives you a client limit which will be determined while compiling.
I've always heard that in C you have to really watch how you manage memory. And I'm still beginning to learn C, but thus far, I have not had to do any memory managing related activities at all.. I always imagined having to release variables and do all sorts of ugly things. But this doesn't seem to be the case.
Can someone show me (with code examples) an example of when you would have to do some "memory management" ?
There are two places where variables can be put in memory. When you create a variable like this:
int a;
char c;
char d[16];
The variables are created in the "stack". Stack variables are automatically freed when they go out of scope (that is, when the code can't reach them anymore). You might hear them called "automatic" variables, but that has fallen out of fashion.
Many beginner examples will use only stack variables.
The stack is nice because it's automatic, but it also has two drawbacks: (1) The compiler needs to know in advance how big the variables are, and (2) the stack space is somewhat limited. For example: in Windows, under default settings for the Microsoft linker, the stack is set to 1 MB, and not all of it is available for your variables.
If you don't know at compile time how big your array is, or if you need a big array or struct, you need "plan B".
Plan B is called the "heap". You can usually create variables as big as the Operating System will let you, but you have to do it yourself. Earlier postings showed you one way you can do it, although there are other ways:
int size;
// ...
// Set size to some value, based on information available at run-time. Then:
// ...
char *p = (char *)malloc(size);
(Note that variables in the heap are not manipulated directly, but via pointers)
Once you create a heap variable, the problem is that the compiler can't tell when you're done with it, so you lose the automatic releasing. That's where the "manual releasing" you were referring to comes in. Your code is now responsible to decide when the variable is not needed anymore, and release it so the memory can be taken for other purposes. For the case above, with:
free(p);
What makes this second option "nasty business" is that it's not always easy to know when the variable is not needed anymore. Forgetting to release a variable when you don't need it will cause your program to consume more memory that it needs to. This situation is called a "leak". The "leaked" memory cannot be used for anything until your program ends and the OS recovers all of its resources. Even nastier problems are possible if you release a heap variable by mistake before you are actually done with it.
In C and C++, you are responsible to clean up your heap variables like shown above. However, there are languages and environments such as Java and .NET languages like C# that use a different approach, where the heap gets cleaned up on its own. This second method, called "garbage collection", is much easier on the developer but you pay a penalty in overhead and performance. It's a balance.
(I have glossed over many details to give a simpler, but hopefully more leveled answer)
Here's an example. Suppose you have a strdup() function that duplicates a string:
char *strdup(char *src)
{
char * dest;
dest = malloc(strlen(src) + 1);
if (dest == NULL)
abort();
strcpy(dest, src);
return dest;
}
And you call it like this:
main()
{
char *s;
s = strdup("hello");
printf("%s\n", s);
s = strdup("world");
printf("%s\n", s);
}
You can see that the program works, but you have allocated memory (via malloc) without freeing it up. You have lost your pointer to the first memory block when you called strdup the second time.
This is no big deal for this small amount of memory, but consider the case:
for (i = 0; i < 1000000000; ++i) /* billion times */
s = strdup("hello world"); /* 11 bytes */
You have now used up 11 gig of memory (possibly more, depending on your memory manager) and if you have not crashed your process is probably running pretty slowly.
To fix, you need to call free() for everything that is obtained with malloc() after you finish using it:
s = strdup("hello");
free(s); /* now not leaking memory! */
s = strdup("world");
...
Hope this example helps!
You have to do "memory management" when you want to use memory on the heap rather than the stack. If you don't know how large to make an array until runtime, then you have to use the heap. For example, you might want to store something in a string, but don't know how large its contents will be until the program is run. In that case you'd write something like this:
char *string = malloc(stringlength); // stringlength is the number of bytes to allocate
// Do something with the string...
free(string); // Free the allocated memory
I think the most concise way to answer the question in to consider the role of the pointer in C. The pointer is a lightweight yet powerful mechanism that gives you immense freedom at the cost of immense capacity to shoot yourself in the foot.
In C the responsibility of ensuring your pointers point to memory you own is yours and yours alone. This requires an organized and disciplined approach, unless you forsake pointers, which makes it hard to write effective C.
The posted answers to date concentrate on automatic (stack) and heap variable allocations. Using stack allocation does make for automatically managed and convenient memory, but in some circumstances (large buffers, recursive algorithms) it can lead to the horrendous problem of stack overflow. Knowing exactly how much memory you can allocate on the stack is very dependent on the system. In some embedded scenarios a few dozen bytes might be your limit, in some desktop scenarios you can safely use megabytes.
Heap allocation is less inherent to the language. It is basically a set of library calls that grants you ownership of a block of memory of given size until you are ready to return ('free') it. It sounds simple, but is associated with untold programmer grief. The problems are simple (freeing the same memory twice, or not at all [memory leaks], not allocating enough memory [buffer overflow], etc) but difficult to avoid and debug. A hightly disciplined approach is absolutely mandatory in practive but of course the language doesn't actually mandate it.
I'd like to mention another type of memory allocation that's been ignored by other posts. It's possible to statically allocate variables by declaring them outside any function. I think in general this type of allocation gets a bad rap because it's used by global variables. However there's nothing that says the only way to use memory allocated this way is as an undisciplined global variable in a mess of spaghetti code. The static allocation method can be used simply to avoid some of the pitfalls of the heap and automatic allocation methods. Some C programmers are surprised to learn that large and sophisticated C embedded and games programs have been constructed with no use of heap allocation at all.
There are some great answers here about how to allocate and free memory, and in my opinion the more challenging side of using C is ensuring that the only memory you use is memory you've allocated - if this isn't done correctly what you end up with is the cousin of this site - a buffer overflow - and you may be overwriting memory that's being used by another application, with very unpredictable results.
An example:
int main() {
char* myString = (char*)malloc(5*sizeof(char));
myString = "abcd";
}
At this point you've allocated 5 bytes for myString and filled it with "abcd\0" (strings end in a null - \0).
If your string allocation was
myString = "abcde";
You would be assigning "abcde" in the 5 bytes you've had allocated to your program, and the trailing null character would be put at the end of this - a part of memory that hasn't been allocated for your use and could be free, but could equally be being used by another application - This is the critical part of memory management, where a mistake will have unpredictable (and sometimes unrepeatable) consequences.
A thing to remember is to always initialize your pointers to NULL, since an uninitialized pointer may contain a pseudorandom valid memory address which can make pointer errors go ahead silently. By enforcing a pointer to be initialized with NULL, you can always catch if you are using this pointer without initializing it. The reason is that operating systems "wire" the virtual address 0x00000000 to general protection exceptions to trap null pointer usage.
Also you might want to use dynamic memory allocation when you need to define a huge array, say int[10000]. You can't just put it in stack because then, hm... you'll get a stack overflow.
Another good example would be an implementation of a data structure, say linked list or binary tree. I don't have a sample code to paste here but you can google it easily.
(I'm writing because I feel the answers so far aren't quite on the mark.)
The reason you have to memory management worth mentioning is when you have a problem / solution that requires you to create complex structures. (If your programs crash if you allocate to much space on the stack at once, that's a bug.) Typically, the first data structure you'll need to learn is some kind of list. Here's a single linked one, off the top of my head:
typedef struct listelem { struct listelem *next; void *data;} listelem;
listelem * create(void * data)
{
listelem *p = calloc(1, sizeof(listelem));
if(p) p->data = data;
return p;
}
listelem * delete(listelem * p)
{
listelem next = p->next;
free(p);
return next;
}
void deleteall(listelem * p)
{
while(p) p = delete(p);
}
void foreach(listelem * p, void (*fun)(void *data) )
{
for( ; p != NULL; p = p->next) fun(p->data);
}
listelem * merge(listelem *p, listelem *q)
{
while(p != NULL && p->next != NULL) p = p->next;
if(p) {
p->next = q;
return p;
} else
return q;
}
Naturally, you'd like a few other functions, but basically, this is what you need memory management for. I should point out that there are a number tricks that are possible with "manual" memory management, e.g.,
Using the fact that malloc is guaranteed (by the language standard) to return a pointer divisible by 4,
allocating extra space for some sinister purpose of your own,
creating memory pools..
Get a good debugger... Good luck!
#Euro Micelli
One negative to add is that pointers to the stack are no longer valid when the function returns, so you cannot return a pointer to a stack variable from a function. This is a common error and a major reason why you can't get by with just stack variables. If your function needs to return a pointer, then you have to malloc and deal with memory management.
#Ted Percival:
...you don't need to cast malloc()'s return value.
You are correct, of course. I believe that has always been true, although I don't have a copy of K&R to check.
I don't like a lot of the implicit conversions in C, so I tend to use casts to make "magic" more visible. Sometimes it helps readability, sometimes it doesn't, and sometimes it causes a silent bug to be caught by the compiler. Still, I don't have a strong opinion about this, one way or another.
This is especially likely if your compiler understands C++-style comments.
Yeah... you caught me there. I spend a lot more time in C++ than C. Thanks for noticing that.
In C, you actually have two different choices. One, you can let the system manage the memory for you. Alternatively, you can do that by yourself. Generally, you would want to stick to the former as long as possible. However, auto-managed memory in C is extremely limited and you will need to manually manage the memory in many cases, such as:
a. You want the variable to outlive the functions, and you don't want to have global variable. ex:
struct pair{
int val;
struct pair *next;
}
struct pair* new_pair(int val){
struct pair* np = malloc(sizeof(struct pair));
np->val = val;
np->next = NULL;
return np;
}
b. you want to have dynamically allocated memory. Most common example is array without fixed length:
int *my_special_array;
my_special_array = malloc(sizeof(int) * number_of_element);
for(i=0; i
c. You want to do something REALLY dirty. For example, I would want a struct to represent many kind of data and I don't like union (union looks soooo messy):
struct data{
int data_type;
long data_in_mem;
};
struct animal{/*something*/};
struct person{/*some other thing*/};
struct animal* read_animal();
struct person* read_person();
/*In main*/
struct data sample;
sampe.data_type = input_type;
switch(input_type){
case DATA_PERSON:
sample.data_in_mem = read_person();
break;
case DATA_ANIMAL:
sample.data_in_mem = read_animal();
default:
printf("Oh hoh! I warn you, that again and I will seg fault your OS");
}
See, a long value is enough to hold ANYTHING. Just remember to free it, or you WILL regret. This is among my favorite tricks to have fun in C :D.
However, generally, you would want to stay away from your favorite tricks (T___T). You WILL break your OS, sooner or later, if you use them too often. As long as you don't use *alloc and free, it is safe to say that you are still virgin, and that the code still looks nice.
Sure. If you create an object that exists outside of the scope you use it in. Here is a contrived example (bear in mind my syntax will be off; my C is rusty, but this example will still illustrate the concept):
class MyClass
{
SomeOtherClass *myObject;
public MyClass()
{
//The object is created when the class is constructed
myObject = (SomeOtherClass*)malloc(sizeof(myObject));
}
public ~MyClass()
{
//The class is destructed
//If you don't free the object here, you leak memory
free(myObject);
}
public void SomeMemberFunction()
{
//Some use of the object
myObject->SomeOperation();
}
};
In this example, I'm using an object of type SomeOtherClass during the lifetime of MyClass. The SomeOtherClass object is used in several functions, so I've dynamically allocated the memory: the SomeOtherClass object is created when MyClass is created, used several times over the life of the object, and then freed once MyClass is freed.
Obviously if this were real code, there would be no reason (aside from possibly stack memory consumption) to create myObject in this way, but this type of object creation/destruction becomes useful when you have a lot of objects, and want to finely control when they are created and destroyed (so that your application doesn't suck up 1GB of RAM for its entire lifetime, for example), and in a Windowed environment, this is pretty much mandatory, as objects that you create (buttons, say), need to exist well outside of any particular function's (or even class') scope.