At what size should I malloc structs? - c

Most examples using structs in C use malloc to assign the required size block of memory to a pointer to that struct. However, variables with basic types (int, char etc.) are allocated to the stack and it is assumed that enough memory will be available.
I understand the idea behind this is that memory may not be available for larger structs so we use malloc to ensure we do indeed have enough memory but in the case of our struct being small is this really necessary? For example if a struct only consists of three ints, surely I am always fine to assume there is enough memory?
So really my question boils down to: what are the best practices in C regarding when it is necessary to malloc variables, and what is the justification?

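The only time you don't have to allocate memory yourself is when the compiler allocates it for you (loosely called static allocation here; for a local variable the precise term is automatic storage), which is what happens when you have a statement like: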
int number = 5;
You can always write it as:
int *pNumber = malloc(sizeof(int));
but you have to make sure to free it or you will be leaking memory.
You can do the same thing with a struct (instead of dynamically allocating memory for it, statically allocate):
struct some_struct_t myStruct;
and access members by:
myStruct.member1 = 0;
etc...
The big difference between dynamic allocation and static is whether that data is available outside of your current scope. With static allocation, it's not. With dynamic it is, but you have to make sure to free it.
Where you run into trouble is when you have to return a structure (or a pointer to it) from a function. You either have to dynamically allocate inside the function which is returning it or you have to pass in a pointer to an externally (dynamically or statically) allocated structure which the function can then work with.
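The options described above can be sketched as follows. This is a minimal illustration, not code from the question: the struct point type and the function names are hypothetical, and the third variant (returning a small struct by value) is an extra option the answer does not mention.

```c
#include <stdlib.h>

/* Hypothetical struct with three ints, as in the question. */
struct point {
    int x, y, z;
};

/* Option 1: allocate inside the function; the caller must free() the result. */
struct point *point_new(int x, int y, int z)
{
    struct point *p = malloc(sizeof *p);
    if (p == NULL) {
        return NULL;            /* allocation failed */
    }
    p->x = x;
    p->y = y;
    p->z = z;
    return p;
}

/* Option 2: the caller owns the storage (automatic, static or dynamic)
   and passes a pointer in; nothing has to be freed by this function. */
void point_init(struct point *p, int x, int y, int z)
{
    p->x = x;
    p->y = y;
    p->z = z;
}

/* Option 3 (not in the answer above): a small struct can be returned by value. */
struct point point_make(int x, int y, int z)
{
    struct point p;
    p.x = x;
    p.y = y;
    p.z = z;
    return p;
}
```

For a struct of three ints, options 2 and 3 avoid malloc entirely; option 1 is only needed when the object must outlive the caller's own variables.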

Good code gets re-used. Good code has few size limitations. Write good code.
Use malloc() whenever there is anything more than trivial buffer sizes.
Buffer size to write an int: The needed buffer size is at most sizeof(int)*CHAR_BIT/3 + 3. Use a fixed buffer.
Buffer size to write a double as in sprintf(buf, "%f",...: The needed buffer size could be thousands of bytes: use malloc(). Or use sprintf(buf, "%e",... and use a fixed buffer.
Forming a file path name could involve thousands of char. Use malloc().
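A rough sketch of both cases, assuming a C99 snprintf is available. The names INT_BUF_SIZE, format_int and join_path are made up for this illustration, and the "/" separator is an assumption about the platform.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Upper bound from the answer: room for the sign, the digits and the '\0'. */
#define INT_BUF_SIZE (sizeof(int) * CHAR_BIT / 3 + 3)

/* Fixed buffer: the caller supplies an array of INT_BUF_SIZE chars. */
void format_int(char buf[INT_BUF_SIZE], int value)
{
    snprintf(buf, INT_BUF_SIZE, "%d", value);
}

/* Dynamic buffer: the length is only known at run time.
   Returns NULL on failure; the caller must free() the result. */
char *join_path(const char *dir, const char *file)
{
    char *path;
    int len = snprintf(NULL, 0, "%s/%s", dir, file);   /* measure first */
    if (len < 0) {
        return NULL;
    }
    path = malloc((size_t)len + 1);
    if (path == NULL) {
        return NULL;
    }
    snprintf(path, (size_t)len + 1, "%s/%s", dir, file);
    return path;
}
```

Measuring with snprintf(NULL, 0, ...) before allocating means the buffer is exactly as large as the result needs, however long the inputs are.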

Related

C: How do I initialize a global array when size is not known until runtime?

I am writing some code in C (not C99) and I think I have a need for several global arrays. I am taking in data from several text files I don't yet know the size of, and I need to store these values and have them available in several different methods. I already have written code for reading the text files into an array, but if an array isn't the best choice I am sure I could rewrite it.
If you had encountered this situation, what would you do? I don't necessarily need code examples, just ideas.
Use dynamic allocation:
int* pData;
char* pData2;
int main() {
...
pData = malloc(count * sizeof *pData); // uninitialized
pData2 = calloc(count, sizeof *pData2); // zero-initialized
/* work on your arrays */
free(pData);
free(pData2);
...
}
First of all, try to make sense of the requirement. You cannot possibly initialize a memory of "unknown" size, you can only have it initialized once you have a certain amount of memory (in terms of bytes). So, the first thing is to get the memory allocated.
This is the scenario to use memory allocator functions, malloc() and family, which allows you to allocate memory of a given size at run-time. Define a pointer, then, at run-time, get the memory size and use the allocator functions to allocate the memory of required size.
That said,
calloc() initializes the returned memory to 0.
realloc() is used to re-size the memory at run-time.
Also, while using dynamic memory allocation, you should be careful enough to clean up the allocated memory using free() when you're done using the memory to avoid memory leaks.
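When the number of values is not known until the file has been read, one common approach is to grow the array with realloc() as data arrives. The sketch below is only an illustration: struct int_vec and its functions are hypothetical names, and the starting capacity of 16 is an arbitrary choice. It avoids C99-only features, since the question rules out C99.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints; initialize with {NULL, 0, 0}. */
struct int_vec {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Appends one value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if memory could not be obtained. */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 16;
        /* Use a temporary so the old block is not lost if realloc fails. */
        int *tmp = realloc(v->data, new_cap * sizeof *tmp);
        if (tmp == NULL) {
            return -1;
        }
        v->data = tmp;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Releases the storage and resets the vector to its empty state. */
void int_vec_free(struct int_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = 0;
    v->cap = 0;
}
```

A struct int_vec declared at file scope would give the "global array" the question asks for, with int_vec_push() called once per value read from the file.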

Why must malloc be used?

From what I understand, the malloc function takes a variable and allocates memory as asked. In this case, it will ask the compiler to prepare memory in order to fit the equivalence of twenty double variables. Is my understanding correct, and why must it be used?
double *q;
q=(double *)malloc(20*sizeof(double));
for (i=0;i<20; i++)
{
*(q+i)= (double) rand();
}
You don't have to use malloc() when:
The size is known at compile time, as in your example.
You are using C99 or C2011 with VLA (variable length array) support.
Note that malloc() allocates memory at runtime, not at compile time. The compiler is only involved to the extent that it ensures the correct function is called; it is malloc() that does the allocation.
Your example mentions 'equivalence of ten integers'. It is very seldom that 20 double occupy the same space as 10 int. Usually, 10 double will occupy the same space as 20 int (when sizeof(int) == 4 and sizeof(double) == 8, which is a very commonly found setting).
It's used to allocate memory at run-time rather than compile-time. So if your data arrays are based on some sort of input from the user, database, file, etc. then malloc must be used once the desired size is known.
The variable q is a pointer, meaning it stores an address in memory. malloc is asking the system to create a section of memory and return the address of that section of memory, which is stored in q. So q points to the starting location of the memory you requested.
Care must be taken not to alter q unintentionally. For instance, if you did:
q = (double *)malloc(20*sizeof(double));
q = (double *)malloc(10*sizeof(double));
you will lose access to the first section of 20 double's and introduce a memory leak.
When you use malloc you are asking the system "Hey, I want this many bytes of memory" and then he will either say "Sorry, I'm all out" or "Ok! Here is an address to the memory you wanted. Don't lose it".
It's generally a good idea to put big datasets in the heap (where malloc gets your memory from) and a pointer to that memory on the stack (where code execution takes place). This becomes more important on embedded platforms where you have limited memory. You have to decide how you want to divvy up the physical memory between the stack and heap. Too much stack and you can't dynamically allocate much memory. Too little stack and you can function call your way right out of it (also known as a stack overflow :P)
As the others said, malloc is used to allocate memory. It is important to note that malloc will allocate memory from the heap, and thus the memory is persistent until it is free'd. Otherwise, without malloc, declaring something like double vals[20] will allocate memory on the stack. When you exit the function, that memory is popped off of the stack.
So for example, say you are in a function and you don't care about the persistence of values. Then the following would be suitable:
void some_function() {
double vals[20];
for(int i = 0; i < 20; i++) {
vals[i] = (double)rand();
}
}
Now if you have some global structure or something that stores data, that has a lifetime longer than that of just the function, then using malloc to allocate that memory from the heap is required (alternatively, you can declare it as a global variable, and the memory will be preallocated for you).
In your example, you could have declared double q[20]; without the malloc and it would work.
malloc is a standard way to get dynamically allocated memory (malloc is often built above low-level memory acquisition primitives like mmap on Linux).
You want to get dynamically allocated memory resources, notably when the size of the allocated thing (here, your q pointer) depends upon runtime parameters (e.g. depends upon input). The bad alternative would be to allocate all statically, but then the static size of your data is a strong built-in limitation, and you don't like that.
Dynamic resource allocation enables you to run the same program on a cheap tablet (with half a gigabyte of RAM) and an expensive super-computer (with terabytes of RAM). You can allocate different size of data.
Don't forget to test the result of malloc; it can fail by returning NULL. At the very least, code:
int* q = malloc (10*sizeof(int));
if (!q) {
perror("q allocation failed");
exit(EXIT_FAILURE);
}
and always initialize malloc-ed memory (you could prefer using calloc which zeroes the allocated memory).
Don't forget to later free the malloc-ed memory. On Linux, learn about using valgrind. Be scared of memory leaks and dangling pointers. Recognize that the liveness of some data is a non-modular property of the entire program. Read about garbage collection, and perhaps consider using Boehm's conservative garbage collector (by calling GC_malloc instead of malloc).
You use malloc() to allocate memory dynamically in C. (Allocate the memory at the run time)
You use it because sometimes you don't know how much memory you'll use when you write your program.
You don't have to use it when you know how many elements the array will hold at compile time.
Another important thing to notice is that if you want to return an array from a function, you must not return an array that was defined inside the function on the stack. Instead, you'll want to dynamically allocate an array (on the heap) and return a pointer to this block:
int *returnArray(int n)
{
int i;
int *arr = (int *)malloc(sizeof(int) * n);
if (arr == NULL)
{
return NULL;
}
//...
//fill the array or manipulate it
//...
return arr; //return the pointer
}

using malloc over array

A similar question may already exist on SO, but I didn't find it. Here is the scenario:
Case 1
void main()
{
char g[10];
char a[10];
scanf("%[^\n] %[^\n]",a,g);
swap(a,g);
printf("%s %s",a,g);
}
Case 2
void main()
{
char *g=malloc(sizeof(char)*10);
char *a=malloc(sizeof(char)*10);
scanf("%[^\n] %[^\n]",a,g);
swap(a,g);
printf("%s %s",a,g);
}
I'm getting the same output in both cases. So my question is: when should I prefer malloc() over an array, or vice versa, and why? I found the common definition that malloc() provides dynamic allocation. Is that the only difference between them? Could anyone explain, with an example, what "dynamic" means given that we still specify the size in malloc()?
The principal difference relates to when and how you decide the array length. Using fixed length arrays forces you to decide your array length at compile time. In contrast using malloc allows you to decide the array length at runtime.
In particular, deciding at runtime allows you to base the decision on user input, on information not known at the time you compile. For example, you may allocate the array to be a size big enough to fit the actual data input by the user. If you use fixed length arrays, you have to decide at compile time an upper bound, and then force that limitation onto the user.
Another more subtle issue is that allocating very large fixed length arrays as local variables can lead to stack overflow runtime errors. And for that reason, you sometimes prefer to allocate such arrays dynamically using malloc.
Please any one explain with example, what is the meaning of dynamic although we are specifying the size.
I suspect this was significant before C99. Before C99, you couldn't have dynamically-sized auto arrays:
void somefunc(size_t sz)
{
char buf[sz];
}
is valid C99 but invalid C89. However, using malloc(), you can specify any value, you don't have to call malloc() with a constant as its argument.
Also, to clear up what other purpose malloc() has: you can't return stack-allocated memory from a function, so if your function needs to return allocated memory, you typically use malloc() (or some other member of the malloc family, including realloc() and calloc()) to obtain a block of memory. To understand this, consider the following code:
char *foo()
{
char buf[13] = "Hello world!";
return buf;
}
Since buf is a local variable, it's invalidated at the end of its enclosing function - returning it results in undefined behavior. The function above is erroneous. However, a pointer obtained using malloc() remains valid through function calls (until you call free() on it):
char *bar()
{
char *buf = malloc(13);
strcpy(buf, "Hello World!");
return buf;
}
This is absolutely valid.
I would add that in this particular example, malloc() is very wasteful, as there is more memory allocated for the array than what would appear [due to overhead in malloc] as well as the time it takes to call malloc() and later free() - and there's overhead for the programmer to remember to free it - memory leaks can be quite hard to debug.
Edit: Case in point, your code is missing the free() at the end of main() - may not matter here, but it shows my point quite well.
So small structures (less than 100 bytes) should typically be allocated on the stack. If you have large data structures, it's better to allocate them with malloc (or, if it's the right thing to do, use globals - but this is a sensitive subject).
Clearly, if you don't know the size of something beforehand, and it MAY be very large (kilobytes in size), it is definitely a case of "consider using malloc".
On the other hand, stacks are pretty big these days (for "real computers" at least), so allocating a couple of kilobytes of stack is not a big deal.

When and why to use malloc

Well, I can't understand when and why it is needed to allocate memory using malloc.
Here is my code:
#include <stdlib.h>
int main(int argc, const char *argv[]) {
typedef struct {
char *name;
char *sex;
int age;
} student;
// Now I can do two things
student p;
// Or
student *ptr = (student *)malloc(sizeof(student));
return 0;
}
Why is it needed to allocate memory when I can just use student p;?
malloc is used for dynamic memory allocation. As said, it is dynamic allocation which means you allocate the memory at run time. For example, when you don't know the amount of memory during compile time.
One example should clear this. Say you know there will be maximum 20 students. So you can create an array with static 20 elements. Your array will be able to hold maximum 20 students. But what if you don't know the number of students? Say the first input is the number of students. It could be 10, 20, 50 or whatever else. Now you will take input n = the number of students at run time and allocate that much memory dynamically using malloc.
This is just one example. There are many situations like this where dynamic allocation is needed.
Have a look at the man page malloc(3).
You use malloc when you need to allocate objects that must exist beyond the lifetime of execution of the current block (where a copy-on-return would be expensive as well), or if you need to allocate memory greater than the size of that stack (i.e., a 3 MB local stack array is a bad idea).
Before C99 introduced VLAs, you also needed it to allocate a dynamically-sized array. It is still needed for creating dynamic data structures like trees, lists, and queues, which are used by many systems. There are probably many more reasons; these are just a few.
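As an illustration of such a dynamic data structure, here is a minimal singly linked list in which each node is allocated only when it is needed. The struct node type and the function names are hypothetical, chosen for this sketch.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Pushes a value onto the front of the list.
   Returns the new head, or NULL on allocation failure (the list is unchanged). */
struct node *list_push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->value = value;
    n->next = head;
    return n;
}

/* Counts the nodes in the list. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next) {
        count++;
    }
    return count;
}

/* Frees every node; read the next pointer before freeing the current node. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

The number of nodes is limited only by available memory, which is something neither a fixed-size array nor a VLA can offer.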
Expanding the structure of the example a little, consider this:
#include <stdlib.h>
int main(int argc, const char *argv[]) {
typedef struct {
char *name;
char *sex;
char *insurance;
int age;
int yearInSchool;
float tuitionDue;
} student;
// Now I can do two things
student p;
// Or
student *ptr = malloc(sizeof *ptr);
}
C passes arguments by value rather than by reference. In this example, if we passed the struct p itself to a function to do some work on it, we would be creating a copy of the entire structure. This uses additional memory (as much space as the structure requires), is slower, and potentially does not scale well (more on this in a minute). However, by passing a pointer to the structure (the pointer returned by malloc, or &p), we don't pass the entire structure. We only pass an address in memory that refers to it. The amount of data passed is smaller (the size of a pointer), and therefore the operation is faster.
Now, knowing this, imagine a program (like a student information system) which will have to create and manage a set of records in the thousands, or even tens of thousands. If you pass the whole structure by value, it will take longer to operate on a set of data, than it would just passing a pointer to each record.
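The difference can be seen in a small sketch. The trimmed student struct and the two function names below are hypothetical, written only to show the two calling styles.

```c
/* Hypothetical record, trimmed from the struct above. */
typedef struct {
    char *name;
    int age;
    float tuitionDue;
} student;

/* Receives a copy of the whole struct: the caller's record is not modified. */
void birthday_by_value(student s)
{
    s.age += 1;
}

/* Receives an address: only a pointer is copied, and the caller's record is updated. */
void birthday_by_pointer(student *s)
{
    s->age += 1;
}
```

Note that passing a pointer does not require malloc. With student p; declared as a local variable, birthday_by_pointer(&p) works just as well as passing a pointer obtained from malloc.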
Let's try and tackle this question considering different aspects.
Size
malloc allows you to allocate much larger memory spaces than the one allocated simply using student p; or int x[n];. The reason being malloc allocates the space on heap while the other allocates it on the stack.
The C programming language manages memory statically, automatically, or dynamically. Static-duration variables are allocated in main memory, usually along with the executable code of the program, and persist for the lifetime of the program; automatic-duration variables are allocated on the stack and come and go as functions are called and return. For static-duration and automatic-duration variables, the size of the allocation must be compile-time constant (except for the case of variable-length automatic arrays). If the required size is not known until run-time (for example, if data of arbitrary size is being read from the user or from a disk file), then using fixed-size data objects is inadequate. (from Wikipedia)
Scope
Normally, the declared variables would get deleted/freed-up after the block in which it is declared (they are declared on the stack). On the other hand, variables with memory allocated using malloc remain till the time they are manually freed up.
This also means that it is not possible for you to create a variable/array/structure in a function and return its address (as the memory that it is pointing to, might get freed up). The compiler also tries to warn you about this by giving the warning:
Warning - address of stack memory associated with local variable 'matches' returned
Changing the Size (realloc)
As you may have guessed, an array declared the normal way cannot be resized after it is created. Memory obtained from malloc can be resized with realloc.
Error detection
In case memory cannot be allocated: the normal way might cause your program to terminate while malloc will return a NULL which can easily be caught and handled within your program.
Making a change to string content in future
If you store a string like char *some_memory = "Hello World"; you cannot do some_memory[0] = 'h'; as it is stored as a string constant and the memory it is stored in is read-only. If you use malloc instead, you can change the contents later on.
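A minimal sketch of the malloc variant: copy the literal into heap memory, then modify the copy. The helper name copy_string is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a writable heap copy of s, or NULL on failure.
   The caller must free() the result. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;      /* include the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}
```

After char *some_memory = copy_string("Hello World");, the assignment some_memory[0] = 'h'; is valid. An array such as char some_memory[] = "Hello World"; is also writable without malloc, because the literal's characters are copied into the array.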
malloc = Memory ALLOCation.
If you have used other programming languages, you might have used the new keyword.
malloc plays a similar role in C. It takes one parameter, the number of bytes to allocate, and returns a pointer to the start of the block that was reserved. Example:
int *p = malloc(sizeof(*p)*10);
Now, p will point to the first of the 10 consecutive int-sized blocks reserved in memory.
You can traverse through each block using the ++ and -- operator.
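A short sketch of such a traversal, under the assumption that a second pointer does the walking so that the original pointer is still available for free(). The function name sum_first_n is hypothetical.

```c
#include <stdlib.h>

/* Allocates n ints, fills them with 1..n by walking a pointer,
   and returns their sum. Returns -1 if n is not positive or malloc fails. */
long sum_first_n(int n)
{
    int *p;
    int *cur;
    long sum = 0;
    int i;

    if (n <= 0) {
        return -1;
    }
    p = malloc((size_t)n * sizeof *p);
    if (p == NULL) {
        return -1;
    }

    cur = p;
    for (i = 1; i <= n; i++) {
        *cur = i;      /* write to the current element */
        cur++;         /* step to the next int */
    }
    for (cur = p; cur < p + n; cur++) {
        sum += *cur;
    }

    free(p);           /* free the pointer malloc returned, not the moved one */
    return sum;
}
```

Incrementing p itself would also work for the traversal, but free() must be given the address malloc returned, so a separate cursor is the safer habit.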
In this example, it seems quite useless indeed.
But now imagine that you are using sockets or file I/O and must read packets of variable length, which you can only determine while running. Or when using sockets and each client connection needs some storage on the server. You could make a static array, but this gives you a client limit which is fixed at compile time.

how is dynamic memory allocation better than array?

int *numbers;
numbers = malloc ( sizeof(int) * 10 );
I want to know how this is dynamic memory allocation, if I can store just 10 int items in the memory block? I could just use an array and store elements dynamically using an index. Why is the above approach better?
I am new to C, and this is my 2nd day and I may sound stupid, so please bear with me.
In this case you could replace 10 with a variable that is assigned at run time. That way you can decide how much memory space you need. But with arrays, you have to specify an integer constant during declaration. So you cannot tell whether the user will actually need as many locations as were declared or, even worse, whether the declared size will be enough.
With a dynamic allocation like this, you could assign a larger memory location and copy the contents of the first location to the new one to give the impression that the array has grown as needed.
This helps to ensure optimum memory utilization.
The main reason why malloc() is useful is not because the size of the array can be determined at runtime - modern versions of C allow that with normal arrays too. There are two reasons:
Objects allocated with malloc() have flexible lifetimes;
That is, you get runtime control over when to create the object, and when to destroy it. The array allocated with malloc() exists from the time of the malloc() call until the corresponding free() call; in contrast, declared arrays either exist until the function they're declared in exits, or until the program finishes.
malloc() reports failure, allowing the program to handle it in a graceful way.
On a failure to allocate the requested memory, malloc() can return NULL, which allows your program to detect and handle the condition. There is no such mechanism for declared arrays - on a failure to allocate sufficient space, either the program crashes at runtime, or fails to load altogether.
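One way a program might react to a NULL result, instead of exiting, is to retry with a smaller request. This is only a sketch of that idea: the function alloc_with_fallback and its halving strategy are assumptions made for illustration, not a standard technique named in the answer.

```c
#include <stdlib.h>

/* Tries to allocate `want` bytes; on failure retries with half the size,
   but never less than `min` bytes. Stores the size obtained in *got.
   Returns NULL (and sets *got to 0) if no attempt succeeds. */
void *alloc_with_fallback(size_t want, size_t min, size_t *got)
{
    size_t size = want;

    if (min == 0) {
        min = 1;       /* avoid looping down to a zero-byte request */
    }
    while (size >= min) {
        void *p = malloc(size);
        if (p != NULL) {
            *got = size;
            return p;
        }
        size /= 2;
    }
    *got = 0;
    return NULL;
}
```

A caller could use this for a work buffer whose size affects speed but not correctness, and process the data in smaller chunks when less memory is granted.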
There is a difference with where the memory is allocated. Using the array syntax, the memory is allocated on the stack (assuming you are in a function), while malloc'ed arrays/bytes are allocated on the heap.
/* Allocates 4*1000 bytes on the stack (which might be a bit much depending on your system) */
int a[1000];
/* Allocates 4*1000 bytes on the heap */
int *b = malloc(1000 * sizeof(int));
Stack allocations are fast - and often preferred when:
"Small" amount of memory is required
Pointer to the array is not to be returned from the function
Heap allocations are slower, but have the advantages:
Available heap memory is (normally) >> than available stack memory
You can freely pass the pointer to the allocated bytes around, e.g. returning it from a function -- just remember to free it at some point.
A third option is to use statically allocated arrays if you have some common task that always requires an array of some max size. Given you can spare the memory statically consumed by the array, you avoid the hit for heap memory allocation, gain the flexibility to pass the pointer around, and avoid having to keep track of ownership of the pointer to ensure the memory is freed.
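A small sketch of this third option, using a hypothetical helper int_to_text with a buffer size of 32 chosen for illustration (ample for an int of up to 64 bits).

```c
#include <stdio.h>

/* Formats value into a buffer with static storage duration and returns it.
   The buffer lives for the whole program, so the pointer may be returned,
   but its contents are overwritten by the next call. */
const char *int_to_text(int value)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "%d", value);
    return buf;
}
```

The trade-off is that every call shares one buffer: the result must be used or copied before the next call, and the function is not safe to call from several threads at once.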
Edit: If you are using C99 (default with the gnu c compiler i think?), you can do variable-length stack arrays like
int a = 4;
int b[a*a];
In the example you gave
int *numbers;
numbers = malloc ( sizeof(int) * 10 );
there are no explicit benefits. Though, imagine 10 is a value that changes at runtime (e.g. user input), and that you need to return this array from a function. E.g.
int *aFunction(size_t howMany, ...)
{
int *r = malloc(sizeof(int)*howMany);
// do something, fill the array...
return r;
}
The malloc takes room from the heap, while something like
int *aFunction(size_t howMany, ...)
{
int r[howMany];
// do something, fill the array...
// you can't return r unless you make it static, but this is in general
// not good
return somethingElse;
}
would consume the stack that is not so big as the whole heap available.
More complex example exists. E.g. if you have to build a binary tree that grows according to some computation done at runtime, you basically have no other choices but to use dynamic memory allocation.
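A minimal sketch of such a tree, assuming an unbalanced binary search tree of int keys. The struct tree type and the function names are hypothetical.

```c
#include <stdlib.h>

struct tree {
    int key;
    struct tree *left;
    struct tree *right;
};

/* Inserts key into the search tree rooted at *root.
   Returns 1 if inserted, 0 if already present, -1 on allocation failure. */
int tree_insert(struct tree **root, int key)
{
    while (*root != NULL) {
        if (key == (*root)->key) {
            return 0;
        }
        root = key < (*root)->key ? &(*root)->left : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root == NULL) {
        return -1;
    }
    (*root)->key = key;
    (*root)->left = NULL;
    (*root)->right = NULL;
    return 1;
}

/* Returns 1 if key is in the tree, 0 otherwise. */
int tree_contains(const struct tree *t, int key)
{
    while (t != NULL) {
        if (key == t->key) {
            return 1;
        }
        t = key < t->key ? t->left : t->right;
    }
    return 0;
}

/* Frees the whole tree, children before their parent. */
void tree_free(struct tree *t)
{
    if (t == NULL) {
        return;
    }
    tree_free(t->left);
    tree_free(t->right);
    free(t);
}
```

Each insert allocates exactly one node, so the tree's shape and size are decided entirely by the values seen at run time.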
Array size is defined at compilation time whereas dynamic allocation is done at run time.
Thus, in your case, you can use your pointer as an array : numbers[5] is valid.
If you don't know the size of your array when writing the program, runtime allocation is not a choice but a necessity. Otherwise, you're free to use an array; it might be simpler (less risk of forgetting to free memory, for example).
Example:
to store a 3-D position, you might want to use an array as it's always 3 coordinates
to create a sieve to calculate prime numbers, you might want to use a parameter to give the max value and thus use dynamic allocation to create the memory area
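The sieve case might look like the sketch below, where the working array is sized from a parameter. The function name count_primes is hypothetical. The 3-D position, by contrast, needs nothing more than double position[3];.

```c
#include <stdlib.h>

/* Counts the primes <= max with a sieve whose size is chosen at run time.
   Returns -1 if the working array cannot be allocated. */
long count_primes(size_t max)
{
    unsigned char *composite;
    size_t i, j;
    long count = 0;

    if (max < 2) {
        return 0;
    }
    if (max == (size_t)-1) {
        return -1;                     /* max + 1 would overflow */
    }
    composite = calloc(max + 1, 1);    /* zero-initialized flags */
    if (composite == NULL) {
        return -1;
    }
    for (i = 2; i <= max; i++) {
        if (composite[i]) {
            continue;
        }
        count++;                       /* i is prime */
        for (j = i * 2; j <= max; j += i) {
            composite[j] = 1;          /* mark the multiples of i */
        }
    }
    free(composite);
    return count;
}
```

Because the array is allocated and freed inside the function, the caller never sees the dynamic memory and has nothing to clean up.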
Array is used to allocate memory statically and in one go.
To allocate memory dynamically malloc is required.
e.g. int numbers[10];
This will allocate memory statically and it will be contiguous memory.
If you are not aware of the count of the numbers then use variable like count.
int count;
int *numbers;
scanf("%d", &count);
numbers = malloc ( sizeof(int) * count );
This is not possible in case of arrays.
Dynamic does not refer to the access. What is dynamic is the size passed to malloc. If you just use a constant number, like the 10 in your example, it is no better than an array. The advantage comes when you don't know in advance how big it must be, e.g. because the user enters the size at runtime. Then you can allocate with a variable, like malloc(sizeof(int) * userEnteredNumber). This is not possible with a fixed-size array, because there you have to know the (maximum) size at compile time.

Resources