I want to ask about malloc and array.
int *x;
x = (int*)malloc(sizeof(int));
and
int x[4];
what is the difference between them?
The most important difference between int *xp; and int xa[4]; is sizeof(xp) != sizeof(xa) the size of the declared object.
You can pass the xa object as int *pparam to a function, but you cannot pass xp as a int aparam[4] to a function, as aparam describes the whole 4 int object and pparam describes a pointer to an object that might have any length.
Also xa will be reserved in the data area of the linked program, while the pointer malloc(sizeof(int)*4) will be allocated by a system call at runtime and on the heap. Check the vast address difference in a debugger!
Well, there are multiple differences.
This allocates a buffer of one int on the heap...
int *x;
x = (int*)malloc(sizeof(int));
And this allocates an array of four ints either on the stack or in global memory, or perhaps declares it as a member of a struct or class, if it appears within the definition of a struct or class...
int x[4];
Other than the location of the allocations, the first allocated space for one int and the second allocated space for four ints. But assuming you meant to do this instead...
int *x;
x = (int*)malloc(sizeof(int) * 4);
...then in that case, both allocations are a block of memory that is four times the size of an int on your platform. And from a usage standpoint, you can then use them both in pretty much the same way; x[0] is the first int in either case, and since neither are declared const, you can read or write to either in the same way.
So now we get to the difference in the allocation characteristics & lifetime of the two different ways of allocating that memory:
In the malloc()'ed case, memory for that request is allocated on the heap, and its lifetime is however long you want to keep it until you call free() on it. In the other case, if you declared it as a local variable inside a method/function, its lifetime is until program execution exits the scope in which it was declared. If you declared it as a global variable, its lifetime is the lifetime of the entire application. And if you declared it as a member variable of a struct or class, then its lifetime is that of its enclosing struct/class, whatever that may be.
Related
What is the difference between declaring an array "dynamically",
[ie. using realloc() or malloc(), etc... ]
vs
declaring an array within main() with Global scope?,
eg.
int main()
{
int array[10];
return 0;
}
I am learning, and at the moment it feels that there is not much differnce between
declaring a variable (array, whatever) -with Global scope,
when compared to a
dynamically allocated variable (array, whatever) -AND never calling free() on it AND allowing it to be 'destoryed' when the program ends'
What are the consequences of either option?
EDIT
Thank you for your responses.
Global scope should have been 'local scope' -local to main()
When you declare an array like int arr[10] in a function, the space for the array is allocated on the stack. The memory will be freed when your function exits.
When you declare an array or any other data structure using malloc() or realloc(), you allocated the space on the heap and the memory will only be freed afer the program exits. So when the program is running, you are responsible for freeing it using free() after you no longer want to use it. If you don't free it and make your array pointer point to something else, you will create a memory leak. However, your computer will always be able to retrieve all the program's used memory after the program ends because of virtual memory.
As kaylum said in comment below your question, the array in your second example does not have global scope. Its scope is limited to main(), and it is inaccessible in other scopes unless main() explicitly makes it available (e.g. passes it by argument to another function).
Dynamic memory allocation means that the programmer explicitly allocates memory when needed, and explicitly releases it when no longer needed. Because of that, the amount of memory allocated can be determined at run time (e.g. calculated from user input). Also, if the programmer forgets to release the memory, or reallocates it inappropriately, memory can be leaked (still allocated by the program, but not accessible by the program). For example;
/* within a function */
char *p = malloc(100);
p = malloc(200);
free(p);
leaks 100 bytes, every time this code is executed, because the result of the first malloc() call is never released, and it is then inaccessible to the program because its value is not stored anywhere.
Your second example is actually an array of automatic storage duration. As far as your program is concerned, it only exists until the end of the scope in which it is created. In your case, as main() returns, the array will cease to exist.
An example of an array with global scope is
int array[10];
void f() {array[0] = 42;}
int main()
{
array[0] = 10;
f();
/* array[0] will be 42 here */
}
The difference is that this array exists and is accessible to every function that has visibility of the declaration, within the same compilation unit.
One other important difference is that global arrays are (usually) zero initialised - a global array of int will have all elements zero. A dynamically allocated array will not have elements initialised (unless created with calloc(), which does initialise to zero). Similarly, an automatic array will not have elements initialised. It is undefined behaviour to access the value of something (including an array element) that is uninitialised.
So
#include <stdio.h>
int array[10];
int main()
{
int *array2;
int array3[10];
array2 = malloc(10*sizeof(*array2));
printf("%d\n", array[0]); /* okay - will print 0 */
printf("%d\n", array2[0]); /* undefined behaviour. array2[0] is uninitialised */
printf("%d\n", array3[0]); /* undefined behaviour. array3[0] uninitialised */
return 0;
}
Obviously the way to avoid undefined behaviour is to initialise array elements to something valid before trying to access their value (e.g. printing them out, in the example above).
Suppose I have the following C code:
#include <stdio.h>
#include <stdlib.h>
#define NUM_PEOPLE 24
typedef struct {
char **name;
int age;
} person_t;
void get_person_info(person_t *person);
int main(int argc, char **argv) {
for (int i = 0; i < NUM_PEOPLE; i++) {
person_t new_person;
get_person_info(&new_person);
}
return 0;
}
where get_person_info() just fills out the person_t struct to which a pointer is passed in. Is it necessary to malloc() memory for new_person within main()? That is, should the line
person_t new_person;
instead be
person_t *new_person = (person_t *) malloc(sizeof(person_t));
and then change get_person_info() to accept a person_t ** instead of a person_t *?
Sorry if this question is confusing -- I'm not sure whether or not this is a case where it is necessary to reserve memory, given that a pointer to that memory is passed into get_person_info() to avoid causing a segmentation fault.
Both are correct, it depends on where you want to use the person_info.
Allocating on the stack :
for (int i = 0; i < NUM_PEOPLE; i++) {
person_t new_person;
get_person_info(&new_person);
}
Creates a person_t object on the stack and fills the new_person object with data, because the loop only does that, the object goes out of scope on the next loop iteration and the data is lost.
Using malloc :
for (int i = 0; i < NUM_PEOPLE; i++) {
person_t *new_person = malloc(sizeof(person_t));
get_person_info(new_person);
}
Creates a person_t object on the heap and fills it with data, because its allocated on the heap the new_person object will outlive the loop scope which currently means that you're leaking memory because you have no pointer pointing at the data of the person_t object of the previous loop cycle.
Both ways are correct !!
person_t *new_person = (person_t *) malloc(sizeof(person_t));
and then change get_person_info() to accept a person_t ** instead of a person_t *?
you don't need to change parameter of function -void get_person_infperson_t *person);.Just pass pointer to it in main like this -
get_person_info(new_person);
But in previous way without allocating memory , you won't be able to use it outside the block it is defined in whereas if your program depend on its life you can allocate memory to it on heap.
In your code you posted new_person is used inside loop only so if you don't intend to use to outside loop you probably won't need dynamic allocation .
But if you want to use it outside loop also you should use dynamic allocation. But don't forget to free it.
Not sure whether or not to malloc memory for a struct?
The short answer is: no need to do it in your case. If you want to use your object outside the forloop you could do it by dynamically allocated memory, namely:
person_t *new_person = malloc(sizeof(person_t));
and then call it with:
get_person_info(new_person);
In you example, the object is used within the loop, thus there is no need to do it.
Note:
when you use dynamically allocated memory you should always free it, at the end to avoid memory leaks.
Edit:
As pointed out by #Johann Gerell, after removing the redundancy of the casting of the return type of malloc, in C, the allocation would look like:
person_t *new_person = malloc(sizeof(person_t));
malloc returns a void pointer (void *), which indicates that it is a pointer to a region of unknown data type. The use of casting is required in C++ due to the strong type system, whereas this is not the case in C.
Your confusion stems from not understanding object storage duration and pointers well. Let's see each one separately to get some clarity.
Storage Duration
An object can have automatic or dynamic storage duration.
Automatic
Automatic, as the name says, would be managed by the compiler for you. You just define a variable, use it and when it goes out of scope the object is destroyed automatically for you. A simple example:
if (flag) {
int i = 0;
/* some calc. involving i */
}
// i is dead here; it cannot be accessed and its storage is reclaimed
When the control enters the if's scope, memory large enough to hold an int will be allocated automatically and assigned the value 0. Once your use of i is over, when the control exits the scope, the name i goes out of scope and thus will no longer be accessible by the program and also its storage area allocated automatically for you would be reclaimed.
Dynamic
Lets say you want to have objects dynamically allocated i.e. you want to manage the storage and thereby the lifetime of the object without the scope or the compiler coming in your way, then you'd go on by requesting storage space from the platform using malloc
malloc(sizeof(int));
Notice that we're not assigning the return value of malloc to any pointer as you're used to seeing. We'll get to pointers in a bit, lets finish dynamic objects now. Here, space large enough to hold an int is handed over to you by malloc. It's up to you to free it when you're done with it. Thus the lifetime of this unnamed int object is in your hands and would live beyond the scope of the code that created it. It would end only when you explicitly call free. Without a matching free call getting called, you'd have the infamous memory leak.
Pointers
A pointer is just what its name says - an object that can refer to another object. A pointer is never what it is pointing at (pointee). A pointer is an object and its pointee is another separate, independent object. You may make a pointer point to another named object, unnamed object, or nothing (NULL).
int i = 0;
int *ptr1 = &i; // ptr1 points to the automatic int object i
int *ptr2 = malloc(sizeof(int)); // ptr2 points to some unnamed int object
int *ptr3 = NULL; // ptr3 points to nothing
Thus the reason most people confuse pointers for dynamically allocated pointees comes from this: the pointee, here, doesn't have a name and hence they're referred to always via their pointers; some people mistake one for the other.
Function Interface
The function taking a pointer is appropriate here, since from the caller's viewpoint it's a flexible function: it can take both automatic and dynamic objects. I can create an automatic variable and pass it in, or I can pass a dynamic variable too:
void get_person_info(person_t *person);
person_t o { };
get_person_info(&a);
person_t *p = malloc(sizeof(person_t));
get_person_info(p);
free(p);
Is it necessary to malloc() memory for new_person within main()?
No. You can define an automatic variable and pass it to the function. In fact it's recommended that you try to minimize your usage of dynamic objects and prefer automatic objects since
It minimizes the chances of memory leaks in your code. Even seasoned programmers miss calling the matching free to a malloc thereby introducing a memory leak.
Dynamic object allocation/deallocation is far slower than automatic variable allocation/deallocation.
A lot of dynamic allocation deallocation causes memory fragmentation.
However, automatic variables are generally allocated in the stack and thus the upper limit on the number and size on how much you can create on the stack is relatively lower than what you can allocate dynamically (generally from the heap).
change get_person_info() to accept a person_t ** instead of a person_t *?
No, if you did so, the option of passing automatic variables would still be possible but cumbersome:
void foo(int **o);
int i = 0;
int *p = &i; // p is redundant
foo(&p);
int *p = malloc(sizeof(int));
foo(&p);
As opposed the simpler
void bar(int *o);
int i = 0;
bar(&i);
int *p = malloc(sizeof(int));
bar(p);
How are variables really stored in memory? I ask this because say you malloc a segment of memory and assign it to a pointer e.g.
int *p = malloc(10 * sizeof(int));
and then run a for loop to assign integers through p - this seems different to declaring an int variable and assigning an integer to it like:
int x = 10;
Because it's a more explicit declaration that you want an int stored in memory, whereas in malloc it's just a chunk of memory you're traversing through pointer arithmetic.
Am I missing something here? Much thanks.
when you need an array of data, for example when you receive numbers from the user but don't know the length you can't use a fixed number of integers, you need a dynamic way to crate memory for those integers. malloc and his friends let you do that. among other things:
malloc let you create memory dynamically in the size you need right now.
while using malloc the memory will not be freed when exiting the scope.
using malloc for let's say array of 10 item or create an array of 10 items on the stack there is no difference in the sense of " explicit declaration that you want an int stored in memory", there is just differences in what i've written here and some more
here is an article on the differences between heap and stack
i'm writing the pros of each way:
Stack
very fast access
don't have to explicitly de-allocate variables
space is managed efficiently by CPU, memory will not become fragmented
local variables only
limit on stack size (OS-dependent)
variables cannot be resized
Heap
variables can be accessed globally
no limit on memory size
(relatively) slower access
no guaranteed efficient use of space, memory may become fragmented over time as blocks of memory are allocated, then freed
you must manage memory (you're in charge of allocating and freeing variables)
variables can be resized using realloc()
Variables declared in C like int x = 10 are located on the stack, which is accessible only until the function it's declared in returns (if it's declared outside functions, we call it global, and it's available through the whole runtime of the application).
Memory allocated with malloc and similar functions are located in the heap, which is accessible either until it's freed explicitly (e.g. calling free(...)) or the application terminates (which in case of servers might take weeks/months/years).
Both the stack and the heap is part of the memory, the main difference is in the method of allocation. In C, the * and & unary operators can blur the line between the two, so for example in case of a declaration like int x = 10 you can get the address like int* y = &x, and at the same time, you can assign a value like *y = 15 in case of a pointer as well.
Well, when you're doing
int x = 10;
Compiler does whatever needs to be done. But when you're using malloc(), you're in charge of maintaining that memory block, you can use it at your wish, this also gives you the burdon of cleaning it up properly.
1.If you know the array size . Use int array[10] is faster and more safe than int *array = malloc(10*sizeof(int)) . Only if you didn.t know the size before run-time then you need malloc thing.
2.The declare int x = 10 , x stored in stack memory. If you declare int *p = malloc(10*sizeof(int)); the p is stored in stack memory but the memory p pointer is in heap.
3.When you use int *p = malloc(10*sizeof(int)); , the function alloc a block memory , it only have the right size. In fact you can store type you want in this memory , thoungh not encourage to do this.
4.If you use int x = 10 , the memory will be freed auto , just after the variable out of its scope. If you use malloc, you should free the memory by yourself,or memory leak!
malloc asigns an memoryblock and returns an pointer to it, its life time is as of dynamicaly allocated memory. You can store any objects of its type in it.
An int
int x = 10;
is automatic storage and is an lvalue not an pointer.
So you dont have to access it by it's adress, as you would have to by an value a pointer is pointing to.
You can access and assign it's value by it's identifyer name.
And its also cleaned up, when you leave it's scope.
I summarize your questions about from the deleted answer here again for you: malloc() returns a raw chunk of data, which isn't preserved for any type. Even if you assign it to an int lvalue the data chunk isn't of type int, untill you derefference it as thoose. (means you'r using it with data of a special type.) The chunk you are getting is described by the parameter parsed to the function in Bytes. sizeof() represents the size of the type.
so you are getting in this case a chunk that has place for 10 integers. but if you havent used it with the int * ptr, you could also asign the address to a pointer of type char, and use the block as memory block for 40 char variables. But the first time you "put something in there" its then preserved for that type.
This is actually a much more concise, much more clear question than the one I had asked here before(for any who cares): C Language: Why does malloc() return a pointer, and not the value? (Sorry for those who initially think I'm spamming... I hope it's not construed as the same question since I think the way I phrased it there made it unintentionally misleading)
-> Basically what I'm trying to ask is: Why does a C programmer need a pointer to a dynamically-allocated variable/object? (whatever the difference is between variable/object...)
If a C programmer has the option of creating just 'int x' or just 'int *x' (both statically allocated), then why can't he also have the option to JUST initialize his dynamically-allocated variable/object as a variable (and NOT returning a pointer through malloc())?
*If there are some obscure ways to do what I explained above, then, well, why does malloc() seem the way that most textbooks go about dynamic-allocation?
Note: in the following, byte refers to sizeof(char)
Well, for one, malloc returns a void *. It simply can't return a value: that wouldn't be feasible with C's lack of generics. In C, the compiler must know the size of every object at compile time; since the size of the memory being allocated will not be known until run time, then a type that could represent any value must be returned. Since void * can represent any pointer, it is the best choice.
malloc also cannot initialize the block: it has no knowledge of what's being allocated. This is in contrast with C++'s operator new, which does both the allocation and the initialization, as well as being type safe (it still returns a pointer instead of a reference, probably for historical reasons).
Also, malloc allocates a block of memory of a specific size, then returns a pointer to that memory (that's what malloc stands for: memory allocation). You're getting a pointer because that's what you get: an unitialized block of raw memory. When you do, say, malloc(sizeof(int)), you're not creating a int, you're allocating sizeof(int) bytes and getting the address of those bytes. You can then decide to use that block as an int, but you could also technically use that as an array of sizeof(int) chars.
The various alternatives (calloc, realloc) work roughly the same way (calloc is easier to use when dealing with arrays, and zero-fills the data, while realloc is useful when you need to resize a block of memory).
Suppose you create an integer array in a function and want to return it. Said array is a local variable to the function. You can't return a pointer to a local variable.
However, if you use malloc, you create an object on the heap whose scope exceeds the function body. You can return a pointer to that. You just have to destroy it later or you will have a memory leak.
It's because objects allocated with malloc() don't have names, so the only way to reference that object in code is to use a pointer to it.
When you say int x;, that creates an object with the name x, and it is referenceable through that name. When I want to set x to 10, I can just use x = 10;.
I can also set a pointer variable to point to that object with int *p = &x;, and then I can alternatively set the value of x using *p = 10;. Note that this time we can talk about x without specifically naming it (beyond the point where we acquire the reference to it).
When I say malloc(sizeof(int)), that creates an object that has no name. I cannot directly set the value of that object by name, since it just doesn't have one. However, I can set it by using a pointer variable that points at it, since that method doesn't require naming the object: int *p = malloc(sizeof(int)); followed by *p = 10;.
You might now ask: "So, why can't I tell malloc to give the object a name?" - something like malloc(sizeof(int), "x"). The answer to this is twofold:
Firstly, C just doesn't allow variable names to be introduced at runtime. It's just a basic restriction of the language;
Secondly, given the first restriction the name would have to be fixed at compile-time: if this is the case, C already has syntax that does what you want: int x;.
You are thinking about things wrong. It is not that int x is statically allocated and malloc(sizeof(int)) is dynamic. Both are allocated dynamically. That is, they are both allocated at execution time. There is no space reserved for them at the time you compile. The size may be static in one case and dynamic in the other, but the allocation is always dynamic.
Rather, it is that int x allocates the memory on the stack and malloc(sizeof(int)) allocates the memory on the heap. Memory on the heap requires that you have a pointer in order to access it. Memory on the stack can be referenced directly or with a pointer. Usually you do it directly, but sometimes you want to iterate over it with pointer arithmetic or pass it to a function that needs a pointer to it.
Everything works using pointers. "int x" is just a convenience - someone, somewhere got tired of juggling memory addresses and that's how programming languages with human-readable variable names were born.
Dynamic allocation is... dynamic. You don't have to know how much space you are going to need when the program runs - before the program runs. You choose when to do it and when to undo it. It may fail. It's hard to handle all this using the simple syntax of static allocation.
C was designed with simplicity in mind and compiler simplicity is a part of this. That's why you're exposed to the quirks of the underlying implementations. All systems have storage for statically-sized, local, temporary variables (registers, stack); this is what static allocation uses. Most systems have storage for dynamic, custom-lifetime objects and system calls to manage them; this is what dynamic allocation uses and exposes.
There is a way to do what you're asking and it's called C++. There, "MyInt x = 42;" is a function call or two.
I think your question comes down to this:
If a C programmer has the option of creating just int x or just int *x (both statically allocated)
The first statement allocates memory for an integer. Depending upon the placement of the statement, it might allocate the memory on the stack of a currently executing function or it might allocate memory in the .data or .bss sections of the program (if it is a global variable or static variable, at either file scope or function scope).
The second statement allocates memory for a pointer to an integer -- it hasn't actually allocated memory for the integer itself. If you tried to assign a value using the pointer *x=1, you would either receive a very quick SIGSEGV segmentation violation or corrupt some random piece of memory. C doesn't pre-zero memory allocated on the stack:
$ cat stack.c
#include <stdio.h>
int main(int argc, char *argv[]) {
int i;
int j;
int k;
int *l;
int *m;
int *n;
printf("i: %d\n", i);
printf("j: %d\n", j);
printf("k: %d\n", k);
printf("l: %p\n", l);
printf("m: %p\n", m);
printf("n: %p\n", n);
return 0;
}
$ make stack
cc stack.c -o stack
$ ./stack
i: 0
j: 0
k: 32767
l: 0x400410
m: (nil)
n: 0x4005a0
l and n point to something in memory -- but those values are just garbage, and probably don't belong to the address space of the executable. If we store anything into those pointers, the program would probably die. It might corrupt unrelated structures, though, if they are mapped into the program's address space.
m at least is a NULL pointer -- if you tried to write to it, the program would certainly die on modern hardware.
None of those three pointers actually point to an integer yet. The memory for those integers doesn't exist. The memory for the pointers does exist -- and is initially filled with garbage values, in this case.
The Wikipedia article on L-values -- mostly too obtuse to fully recommend -- makes one point that represented a pretty significant hurdle for me when I was first learning C: In languages with assignable variables it becomes necessary to distinguish between the R-value (or contents) and the L-value (or location) of a variable.
For example, you can write:
int a;
a = 3;
This stores the integer value 3 into whatever memory was allocated to store the contents of variable a.
If you later write:
int b;
b = a;
This takes the value stored in the memory referenced by a and stores it into the memory location allocated for b.
The same operations with pointers might look like this:
int *ap;
ap=malloc(sizeof int);
*ap=3;
The first ap= assignment stores a memory location into the ap pointer. Now ap actually points at some memory. The second assignment, *ap=, stores a value into that memory location. It doesn't update the ap pointer at all; it reads the value stored in the variable named ap to find the memory location for the assignment.
When you later use the pointer, you can choose which of the two values associated with the pointer to use: either the actual contents of the pointer or the value pointed to by the pointer:
int *bp;
bp = ap; /* bp points to the same memory cell as ap */
int *bp;
bp = malloc(sizeof int);
*bp = *ap; /* bp points to new memory and we copy
the value pointed to by ap into the
memory pointed to by bp */
I found assembly far easier than C for years because I found the difference between foo = malloc(); and *foo = value; confusing. I hope I found what was confusing you and seriously hope I didn't make it worse.
Perhaps you misunderstand the difference between declaring 'int x' and 'int *x'. The first allocates storage for an int value; the second doesn't - it just allocates storage for the pointer.
If you were to "dynamically allocate" a variable, there would be no point in the dynamic allocation anyway (unless you then took its address, which would of course yield a pointer) - you may as well declare it statically. Think about how the code would look - why would you bother with:
int x = malloc(sizeof(int)); *x = 0;
When you can just do:
int x = 0;
1)
For which datatypes must I allocate memory with malloc?
For types like structs, pointers, except basic datatypes, like int
For all types?
2)
Why can I run this code? Why does it not crash? I assumed that I need to allocate memory for the struct first.
#include <stdio.h>
#include <stdlib.h>
typedef unsigned int uint32;
typedef struct
{
int a;
uint32* b;
}
foo;
int main(int argc, char* argv[])
{
foo foo2;
foo2.a = 3;
foo2.b = (uint32*)malloc(sizeof(uint32));
*foo2.b = 123;
}
Wouldn't it be better to use
foo* foo2 = malloc(sizeof(foo));
3)
How is foo.b set? Does is reference random memory or NULL?
#include <stdio.h>
#include <stdlib.h>
typedef unsigned int uint32;
typedef struct
{
int a;
uint32* b;
}
foo;
int main(int argc, char* argv[])
{
foo foo2;
foo2.a = 3;
}
All types in C can be allocated either dynamically, automatically (on the stack) or statically. The issue is not the type, but the lifetime you want - you use malloc when you want an object to exist outside of the scope of the function that created it, or when you don't know in advance how big a thing you need.
Edit to address your numbered questions.
There are no data types you must allocate with malloc. Only if you want a pointer type to point to valid memory must you use the unary & (address-of) operator or malloc() or some related function.
There is nothing wrong with your code - the line:
foo foo2;
Allocates a structure on the stack - then everything works as normal. Structures are no different in this sense than any other variable. It's not better or worse to use automatic variables (stack allocation) or globals or malloc(), they're all different, with different semantics and different reasons to choose them.
In your example in #3, foo2.b's value is undefined. Any automatic variable has an undefined and indeterminate value until you explicitly initialize it.
You must allocate with malloc any memory that you wish to be managed manually, as opposed to automatically. It doesn't matter if what's stored there is an int or a double or a struct or anything; malloc is all about manual memory management.
When you create a variable without malloc, it is stored on the stack and when it falls out of scope, its memory is automatically reclaimed. A variable falls out of scope when the variable is no longer accessible; e.g. when the block or function that the variable was declared in ends.
When you allocate memory with malloc, it is stored on the heap, and malloc returns a pointer to this memory. This memory will not be reclaimed until you call free on it, regardless of whether or not a pointer to it remains accessible (when no pointers remain to heap-allocated memory, this is a memory leak). This means that you gain the ability to continue to use the memory you allocated after the block or function that it was allocated in ends, but on the other hand you now have the responsibility to manually deallocate it when you are finished with it.
In your example, foo2 is on the stack, and will be automatically deallocated when main ends. However, the memory pointed to by foo2.b will not be automatically deallocated, since it is on the heap. This isn't a problem in your example because all memory is returned to the OS when the program ends, but if it were in a function other than main it would have been a memory leak.
foo foo2; automatically allocates the structure on the stack, and it is automatically deallocated when the enclosing function (main in this case) ends.
You only need to allocate memory on the heap, using malloc, if you need the structure to persist after the enclosing scope ends. You may also need to do this when the object is too large to fit on the stack.
2) Why can I run this code? Why does it not crash?
The code never refers to undefined memory or NULL. Why would it crash? (You have a memory leak as written, but it's probably because you're only showing part of the code and in the given program it's not a problem anyway.)
The alternative code you suggest will also work, though memory returned from malloc is also uninitialized by default. (I once worked with a custom memory allocator that filled returned memory blocks with ? characters by default. Still perfectly legal by the rules. Note that calloc returns a pointer to zero-initialized memory; use it if that's what you want.)
3) How is foo.b set? Does is reference random memory or NULL?
Random memory. Stack allocated structures are not initialized for you.
You can do it, but this is not enough.
Because, the second field is a pointer, which has to be set to a valid address. One of the ways to do it is by allocating memory (with malloc).
To your first question -
use malloc when you MUST manage object's lifetime manually.
For instance, the second code could be rewritten as:
int main(int argc, char* argv[])
{
foo foo2; uint32 b;
foo2.a = 3;
foo2.b = &b;
*foo2.b = 123;
}
This is better, because lifetime is the same, and the memory is now on stack - and doesn't need to be freed.
The memory for the struct instance ("foo2") will be allocated on the stack - there's no need to allocate memory for this yourself - if you do allocate using malloc be sure to free off the memory at a later date.