// Define Person
typedef struct Person {
char *name;
int age;
} person_t;
// Init person in heap
person_t *create_person_v1(char *name, int age)
{
person_t *person = calloc(1, sizeof(person_t));
person->name = name;
person->age = age;
return person;
}
// Init person on stack? a static struct I guess?
person_t create_person_v2(char *name, int age)
{
return (person_t) {name, age};
}
The code above has a definition of Person and two helper functions for its further initialization.
I don't really understand the difference between them, and what are the benefits of having each?
Is it something more than just accessors -> and . ?
The last function does not allocate a static structure; it creates a structure on the stack. Your program has two main kinds of memory for this purpose: the heap and the stack. static is a separate keyword (with more than one meaning depending on where it is used, but that is not the subject here).
Heap allocations cost more resources than stack allocations, but they are not tied to the scope in which they are made. You can pass heap-allocated structures between functions without worrying that they will be "automatically freed" when the scope they were created in disappears (that is, when the creating function returns).
An example would be a linked list. You can't rely on stack allocations to build a linked list of structures.
Heap allocations can fail, so you always have to check that malloc, calloc and friends did not return NULL.
Stack allocations don't fail like that. If you use up all of your stack, your program will most likely just crash (or do something stranger, since it is undefined behavior), and you can't recover from that.
Stack-allocated memory is "freed" when the function returns.
Heap-allocated memory is not: you have to free it by hand, with the free function.
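To make the difference concrete, here is a minimal sketch of both styles with the NULL check added. The names person_new and person_make are made up for this illustration, and the struct only borrows the name pointer (it does not copy the string):

```c
#include <stdlib.h>

typedef struct Person {
    const char *name;   /* borrowed pointer: the struct does not own the string */
    int age;
} person_t;

/* Heap version: the object outlives this function, but the allocation
   can fail and the caller must eventually call free(). */
person_t *person_new(const char *name, int age)
{
    person_t *person = calloc(1, sizeof *person);
    if (person == NULL)
        return NULL;            /* report failure instead of crashing later */
    person->name = name;
    person->age = age;
    return person;
}

/* Value version: the struct is returned by copy and lives in whatever
   storage the caller assigns it to, so there is nothing to free. */
person_t person_make(const char *name, int age)
{
    person_t person = { name, age };
    return person;
}
```

A caller of person_new has to test the result against NULL and call free on it when done; a caller of person_make just uses the returned value until it goes out of scope.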
I've been working on some C projects and was wondering if I create a custom structure, for example, Student, define a variable of the custom structure type, and allocate memory to it using malloc, does it also allocate memory for variables' properties separately or are they all kept in the same space? if yes, will there be any difference if I allocate memory using malloc separately for every property?
For example:
typedef struct {
unsigned int ID;
char *first_name;
char *last_name;
int num_grades;
float *grades;
unsigned short int days_absent;
char *memo;
} Student;
int main() {
// Declare students array
Student *students = NULL;
int students_size = 0;
// Allocate memory for students array
students = (Student *) malloc(sizeof(Student));
return 0;
}
That allocates enough memory for the struct, which includes enough memory for ID, first_name, etc and all padding requirements.
Note that while it allocates memory for the pointer first_name, it doesn't allocate a buffer to hold the name. It just allocates memory for first_name, a pointer. If you want memory in which to store the names, you will need to allocate it.
If the struct had a char first_name[40]; field, it would be a different story. To allocate enough memory for first_name, it needs to allocate enough memory for an array of 40 char instead of enough for a pointer. This does provide a space in which a string could be stored.
No, it doesn't. malloc allocates uninitialized memory for the number of bytes you ask for, which is usually calculated using sizeof.
If you want it to allocate memory to store values that your struct has pointers to, you'll have to add that after having allocated the memory for the struct.
You'll also have to "go backwards" when you free such a struct.
Example:
typedef struct {
char *data;
} foo;
foo *foo_create() {
foo *retval = malloc(sizeof *retval ); // try allocation
if(retval == NULL) return NULL; // check that it worked
retval->data = malloc(10) ; // allocate something for a member
if(retval->data == NULL) { // check that it worked
free(retval); // oh, it didn't, free what you allocated
return NULL; // and return something to indicate failure
}
return retval; // all successful
}
void foo_free(foo *elem) {
if(elem != NULL) { // just a precaution
free(elem->data); // free the member's memory
free(elem); // then the memory for the object
}
}
Does it also allocate memory for variables' properties separately
No. After allocating for students, allocate for students->first_name, students->last_name, etc.
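As a rough sketch of what "allocate the members afterwards" could look like for an array of students: the helper names students_alloc and students_free are invented here, only the grades member is allocated, and the code assumes (as on mainstream platforms) that the zero-filled memory from calloc reads back as NULL pointers.

```c
#include <stdlib.h>

typedef struct {
    unsigned int ID;
    char *first_name;
    char *last_name;
    int num_grades;
    float *grades;
    unsigned short int days_absent;
    char *memo;
} Student;

/* Free an array of `count` students: members first, then the array. */
void students_free(Student *students, size_t count)
{
    if (students == NULL)
        return;
    for (size_t i = 0; i < count; i++) {
        free(students[i].first_name);   /* free(NULL) is a no-op */
        free(students[i].last_name);
        free(students[i].grades);
        free(students[i].memo);
    }
    free(students);
}

/* Allocate `count` students, each with room for `num_grades` grades.
   Returns NULL on failure, after releasing anything already allocated. */
Student *students_alloc(size_t count, int num_grades)
{
    if (count == 0 || num_grades <= 0)
        return NULL;
    /* calloc zero-fills the array; assumed to make pointer members NULL */
    Student *students = calloc(count, sizeof *students);
    if (students == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        students[i].grades = calloc((size_t)num_grades, sizeof(float));
        if (students[i].grades == NULL) {
            students_free(students, count);
            return NULL;
        }
        students[i].num_grades = num_grades;
    }
    return students;
}
```

The name and memo pointers stay NULL until you allocate buffers for them in the same way, and students_free releases whatever was allocated in the reverse order.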
does it also allocate memory for variables' properties separately or are they all kept in the same space?
No. malloc() is given a size that says how much contiguous memory to allocate, and it returns a pointer to that memory. malloc() knows nothing about what you are going to do with the pointer. Assigning the result to a pointer to Student is what, in a sense, dresses that raw block of memory with structure. But whatever the char * fields defined inside point to (or any other fields that point to other structured data) has to be allocated separately. Alternatively, you can ask for more memory and place everything in the same returned block, but this requires practice and knowledge of the alignment issues that arise from it.
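One way to get "everything in the same block" without fighting alignment is a C99 flexible array member, which works nicely when the extra data is a single string. This is only a sketch; struct label and label_create are hypothetical names, not something from the question:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a length plus the characters, stored in one block. */
struct label {
    size_t len;
    char text[];        /* flexible array member (C99), must come last */
};

/* One malloc covers the header and the string, so one free releases both. */
struct label *label_create(const char *text)
{
    size_t len = strlen(text);
    struct label *l = malloc(sizeof *l + len + 1);  /* +1 for the '\0' */
    if (l == NULL)
        return NULL;
    l->len = len;
    memcpy(l->text, text, len + 1);
    return l;
}
```

Because the characters live inside the same allocation, a single free on the returned pointer is enough, and there is no separate member to forget.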
Let's say we have a struct :
struct Person {
char *name;
};
struct Person *Person_create(char *name){
struct Person *who = malloc(sizeof(struct Person));
assert(who != NULL);
who->name = strdup(name);
return who;
}
void Person_destroy(struct Person *who){
assert(who != NULL);
free(who->name);
free(who);
}
And the main function :
int main(int argc,char *argv[]){
struct Person *mike = Person_create("mike");
Person_print(mike);
Person_destroy(mike);
return 0;
}
The above code won't work properly without the strdup() function. Valgrind says that the address you try to free with free(who->name) is not malloc'd. What's the story behind this? Didn't I malloc that memory when I malloc'd the struct? And what difference does strdup() make?
In your code each Person object involves two independent blocks of memory: the struct Person object itself and the separate memory block for the name. That separate memory block is pointed by the name pointer inside the struct. When you allocate struct Person by using malloc, you are not allocating any memory for the name. It is, again, an independent block of memory that should be allocated independently.
How you are planning to allocate memory for the name is entirely up to you. If you use strdup (which is essentially a wrapper for malloc) as in your code above, then you will have to free it by free in the end.
You did not explain what you mean by "without the strdup() function". What did your code look like without it? If you simply did
who->name = name;
then you made the who->name pointer point directly at the memory occupied by the string literal "mike". String literals reside in static memory. You are not allowed to free them. This is what valgrind is telling you.
Mallocing the struct allocates memory for the pointer name but it doesn't allocate any memory for name to point to. At that point who->name will be some random garbage value so freeing it makes no sense.
strdup uses malloc internally to allocate memory for the string it copies. Once you've got a pointer back from strdup you can, and should, free it when you're done.
strdup does call malloc: it allocates a new block and copies the string into it. When you malloc the struct, you allocate only the struct itself, including the pointer member, not the memory that member points to.
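To see why a pointer returned by strdup may be freed while a string literal may not, here is a sketch of strdup-like behaviour. dup_string is a name made up for this illustration and is not the library's actual implementation:

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of strdup-like behaviour: allocate, copy, and hand ownership
   of the new block to the caller (who must free it). */
char *dup_string(const char *src)
{
    size_t size = strlen(src) + 1;      /* include the terminating '\0' */
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, size);
    return copy;
}
```

The returned pointer refers to a fresh heap block, so it is writable and can be passed to free, which is exactly what Person_destroy relies on.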
I have a generic linked list that holds data of type void*. I am trying to populate my list with type struct employee, and eventually I would like to destruct the struct employee objects as well.
Consider this generic linked-list header file (I have tested it with type char*):
struct accListNode //the nodes of a linked-list for any data type
{
void *data; //generic pointer to any data type
struct accListNode *next; //the next node in the list
};
struct accList //a linked-list consisting of accListNodes
{
struct accListNode *head;
struct accListNode *tail;
int size;
};
void accList_allocate(struct accList *theList); //allocate the accList and set to NULL
void appendToEnd(void *data, struct accList *theList); //append data to the end of the accList
void removeData(void *data, struct accList *theList); //removes data from accList
--------------------------------------------------------------------------------------
Consider the employee structure
struct employee
{
char name[20];
float wageRate;
};
Now consider this sample testcase that will be called from main():
void test2()
{
struct accList secondList;
struct employee *emp = Malloc(sizeof(struct employee));
emp->name = "Dan";
emp->wageRate =.5;
struct employee *emp2 = Malloc(sizeof(struct employee));
emp2->name = "Stan";
emp2->wageRate = .3;
accList_allocate(&secondList);
appendToEnd(emp, &secondList);
appendToEnd(emp2, &secondList);
printf("Employee: %s\n", ((struct employee*)secondList.head->data)->name); //cast to type struct employee
printf("Employee2: %s\n", ((struct employee*)secondList.tail->data)->name);
}
Why does the answer that I posted below solve my problem? I believe it has something to do with pointers and memory allocation. The function Malloc() that I use is a custom malloc that checks for NULL being returned.
Here is a link to my entire generic linked list implementation: https://codereview.stackexchange.com/questions/13007/c-linked-list-implementation
The problem is this accList_allocate() and your use of it.
struct accList secondList;
accList_allocate(&secondList);
In the original test2() secondList is memory on the stack. &secondList is a pointer to that memory. When you call accList_allocate() a copy of the pointer is passed in pointing at the stack memory. Malloc() then returns a chunk of memory and assigns it to the copy of the pointer, not the original secondList.
Coming back out, secondList is still pointing at uninitialised memory on the stack so the call to appendToEnd() fails.
The same happens with the answer except secondList just happens to be free of junk. Possibly by chance, possibly by design of the compiler. Either way it is not something you should rely on.
Either:
struct accList *secondList = NULL;
accList_allocate(&secondList);
And change accList_allocate()
void accList_allocate(struct accList **theList) {
*theList = Malloc(sizeof(struct accList));
(*theList)->head = NULL;
(*theList)->tail = NULL;
(*theList)->size = 0;
}
OR
struct accList secondList;
accList_initialise(&secondList);
With accList_allocate() changed to accList_initialise() because it does not allocate
void accList_initialise(struct accList *theList) {
theList->head = NULL;
theList->tail = NULL;
theList->size = 0;
}
I think that your problem is this:
You've allocated secondList on the stack in your original test2 function.
The stack memory is probably dirty, so secondList requires initialization
Your accList_allocate function takes a pointer to the list, but then overwrites it with the Malloc call. This means that the pointer you passed in is never initialized.
When test2 tries to run, it hits a bad pointer (because the memory isn't initialized).
The reason that it works when you allocate it in main is that your C compiler probably zeros the stack when the program starts. When main allocates a variable on the stack, that allocation is persistent (until the program ends), so secondList is actually, and accidentally, properly initialized when you allocate it in main.
Your current accList_allocate doesn't actually initialize the pointer that's been passed in, and the rest of your code will never see the pointer that it allocates with Malloc. To solve your problem, I would create a new function: accList_initialize whose only job is to initialize the list:
void accList_initialize(struct accList* theList)
{
// NO malloc
theList->head = NULL;
theList->tail = NULL;
theList->size = 0;
}
Use this, instead of accList_allocate in your original test2 function. If you really want to allocate the list on the heap, then you should do so (and not mix it with a struct allocated on the stack). Have accList_allocate return a pointer to the allocated structure:
struct accList* accList_allocate(void)
{
struct accList* theList = Malloc( sizeof(struct accList) );
accList_initialize(theList);
return theList;
}
A few things I see wrong here, based on the original code in the question above:
What you saw was undefined behaviour, and the bus error arose from it: you were assigning a string literal to the variable when you should have been using the strcpy function. You've edited your original code accordingly, so this is just something to keep in mind in the future :)
Using the name Malloc is going to cause confusion, especially in peer review: the reviewers are going to have a brain fart and say "whoa, what's this, should that not be malloc?" and very likely flag it. (Basically, do not give custom functions names that sound like C standard library functions.)
You're not checking for NULL. If your souped-up version of Malloc failed, then emp is going to be NULL! Always check, no matter how trivial it seems or whether your thinking is "Ah sure, the platform has heaps of memory on it, 4GB RAM, no problem, I won't bother to check for NULL".
Have a look at this question posted elsewhere for an explanation of what a bus error is.
Edit: With linked list structures, how the parameters are passed to the function is crucial to understanding the code. Notice the use of &, which takes the address of the variable that holds the linked list structure and passes that address, rather than passing by value, which hands the function a copy of the variable. The same rule applies to pointers in general :)
You've got the parameters slightly out of place in the first code in your question. If you were using double pointers in the parameter list then yes, using &secondList would have worked.
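The point about copies of pointers can be shown in a few lines. This is only a sketch with invented function names (alloc_into_copy, alloc_into_caller); it mirrors the single-pointer versus double-pointer difference discussed above:

```c
#include <stdlib.h>

/* The parameter is a copy of the caller's pointer, so assigning to it
   changes nothing in the caller. */
void alloc_into_copy(int *p)
{
    p = malloc(sizeof *p);  /* only the local copy now points here */
    free(p);                /* released at once so the sketch does not leak */
}

/* Taking the address of the caller's pointer lets us update it.
   Returns 0 on success, -1 if the allocation failed. */
int alloc_into_caller(int **out)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    *p = 42;
    *out = p;
    return 0;
}
```

After alloc_into_copy(ptr) the caller's ptr is unchanged, while alloc_into_caller(&ptr) leaves ptr pointing at the new block, which the caller then owns and must free.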
It may depend on how your Employee structure is designed, but you should note that
strcpy(emp->name, "Dan");
and
emp->name = "Dan";
function differently. In particular, the latter is a likely source of bus errors because you generally cannot write to string literals in this way. Especially if your code has something like
name = "NONE"
or the like.
EDIT: Okay, so with the design of the employee struct, the problem is this:
You can't assign to arrays. The C Standard includes a list of modifiable lvalues and arrays are not one of them.
char name[20];
name = "JAMES" //illegal
strcpy is fine: it goes to the memory address of name[0] and copies "JAMES" plus the terminating '\0' into the memory there, one byte at a time.
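One caveat with strcpy is that it does not know the array is only 20 bytes long. A possible bounded alternative, sketched here with a made-up helper employee_set_name, is to let snprintf do the copy and report truncation:

```c
#include <stdio.h>

struct employee {
    char name[20];
    float wageRate;
};

/* Copy `name` into the fixed-size array, truncating if it does not fit.
   Returns 0 if the whole name was stored, 1 if it was truncated
   (or if snprintf reported an error). */
int employee_set_name(struct employee *emp, const char *name)
{
    int needed = snprintf(emp->name, sizeof emp->name, "%s", name);
    return (needed < 0 || (size_t)needed >= sizeof emp->name) ? 1 : 0;
}
```

With this, employee_set_name(emp, "Dan") stores the name as strcpy would, while an over-long name is cut to 19 characters plus the terminator instead of overrunning the array.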
Please bear with me, I'm coming from another language, new to C, and learning it from http://c.learncodethehardway.org/book/learn-c-the-hard-way.html
struct Person {
char *name;
int age;
int height;
int weight;
};
struct Person *Person_create(char *name, int age, int height, int weight)
{
struct Person *who = malloc(sizeof(struct Person));
assert(who != NULL);
who->name = strdup(name);
who->age = age;
who->height = height;
who->weight = weight;
return who;
}
I understand that the Person_create function returns a pointer to struct Person. What I don't understand (maybe because I'm from other languages, Erlang and Ruby) is why it is defined as
struct Person *Person_create(char *name, int age, int height, int weight)
not
struct Person Person_create(char *name, int age, int height, int weight)
And is there another way to define a function that returns a structure?
Sorry if this question is too basic.
It is defined so because it returns a pointer to a struct, not a struct. You assign the return value to a struct Person *, not to struct Person.
It is possible to return a full struct, like this:
struct Person Person_create(char *name, int age, int height, int weight)
{
struct Person who;
who.name = strdup(name);
who.age = age;
who.height = height;
who.weight = weight;
return who;
}
But it is not used very often.
The Person_create function returns a pointer to a struct Person so you have to define the return value to be a pointer (by adding the *). To understand the reason for returning a pointer to a struct and not the struct itself one must understand the way C handles memory.
When you call a function in C you add a record for it on the call stack. At the bottom of the call stack is the main function of the program you're running, at the top is the currently executing function. The records on the stack contain information such as the values of the parameters passed to the functions and all the local variables of the functions.
There is another type of memory your program has access to: heap memory. This is where you allocate space using malloc, and it is not connected to the call stack.
When you return from a function the call stack is popped and all the information associated with the function call is lost. If you want to return a struct you have two options: copy the data inside the struct before it is popped from the call stack, or keep the data in heap memory and return a pointer to it. Copying the data byte for byte is more expensive than simply returning a pointer, so you would normally return a pointer to save resources (both memory and CPU cycles). However, it doesn't come without cost: when you keep your data in heap memory you have to remember to free it when you stop using it, otherwise your program will leak memory.
The function returns who, which is a struct Person * - a pointer to a structure. The memory to hold the structure is allocated by malloc(), and the function returns a pointer to that memory.
If the function were declared to return struct Person and not a pointer, then who could also be declared as a structure. Upon return, the structure would be copied and returned to the caller. Note that the copy is less efficient than simply returning a pointer to the memory.
Structs are not pointers (or references) by default in C/C++, as they are in Java, for example. struct Person Function() would therefore return the struct itself (by value, making a copy), not a pointer.
You often don't want to create copies of objects (shallow copies by default, or copies created using copy constructors in C++), as this can get pretty time-consuming quickly.
Copying the whole struct rather than just a pointer is less efficient because the sizeof of a pointer is usually much smaller than the sizeof of the whole struct.
Also, a struct might contain pointers to other data in memory, and blindly copying that could be dangerous for dynamically allocated data (if a code handling one copy would free it, the other copy would be left with invalid pointer).
So shallow copy is almost always a bad idea, unless you're sure that the original goes out of scope - and then why wouldn't you just return a pointer to the struct instead (a struct dynamically allocated on heap of course, so it won't be destroyed like the stack-allocated entities are destroyed, on return from a function).
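If you do need two independent copies of a struct that owns heap memory, the usual remedy is a deep copy. The following is a minimal sketch; person_copy is a hypothetical helper and it assumes name was allocated with malloc:

```c
#include <stdlib.h>
#include <string.h>

struct Person {
    char *name;     /* owned: allocated with malloc, released with free */
    int age;
};

/* Deep copy: duplicate the string too, so each struct owns its own name.
   Returns 0 on success, -1 if the allocation failed (dst is untouched). */
int person_copy(struct Person *dst, const struct Person *src)
{
    size_t size = strlen(src->name) + 1;
    char *name = malloc(size);
    if (name == NULL)
        return -1;
    memcpy(name, src->name, size);
    *dst = *src;        /* shallow copy of every member... */
    dst->name = name;   /* ...then replace the shared pointer */
    return 0;
}
```

After a successful person_copy, freeing the name of one struct no longer invalidates the other, at the cost of one extra allocation per copy.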
I'm just reading about malloc() in C.
The Wikipedia article provides an example, however it just allocates enough memory for an array of 10 ints in comparison with int array[10]. Not very useful.
When would you decide to use malloc() over C handling the memory for you?
Dynamic data structures (lists, trees, etc.) use malloc to allocate their nodes on the heap. For example:
/* A singly-linked list node, holding data and pointer to next node */
struct slnode_t
{
struct slnode_t* next;
int data;
};
typedef struct slnode_t slnode;
/* Allocate a new node with the given data and next pointer */
slnode* sl_new_node(int data, slnode* next)
{
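slnode* node = malloc(sizeof *node);
if (node == NULL) return NULL; /* allocation failed; the caller must check */
node->data = data;
node->next = next;
return node;
}
/* Insert the given data at the front of the list specified by a
** pointer to the head node. The list is left unchanged if the
** allocation fails.
*/
void sl_insert_front(slnode** head, int data)
{
slnode* node = sl_new_node(data, *head);
if (node != NULL) *head = node;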
}
Consider how new data is added to the list with sl_insert_front. You need to create a node that will hold the data and the pointer to the next node in the list. Where are you going to create it?
Maybe on the stack! - NO - where will that stack space be allocated? In which function? What happens to it when the function exits?
Maybe in static memory! - NO - you'll then have to know in advance how many list nodes you have because static memory is pre-allocated when the program loads.
On the heap? YES - because there you have all the required flexibility.
malloc is used in C to allocate stuff on the heap - memory space that can grow and shrink dynamically at runtime, and the ownership of which is completely under the programmer's control. There are many more examples where this is useful, but the one I'm showing here is a representative one. Eventually, in complex C programs you'll find that most of the program's data is on the heap, accessible through pointers. A correct program always knows which pointer "owns" the data and will carefully clean-up the allocated memory when it's no longer needed.
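The "carefully clean up" part deserves its own example. Below is a sketch of releasing a whole list; it redefines the node type so that it stands alone, and sl_push_front and sl_free_all are names invented here rather than part of the code above:

```c
#include <stdlib.h>

typedef struct slnode_t {
    struct slnode_t *next;
    int data;
} slnode;

/* Push a value at the front. Returns 0 on success, -1 on allocation
   failure, in which case the list is left unchanged. */
int sl_push_front(slnode **head, int data)
{
    slnode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->data = data;
    node->next = *head;
    *head = node;
    return 0;
}

/* Release every node and reset the caller's head pointer. */
void sl_free_all(slnode **head)
{
    slnode *node = *head;
    while (node != NULL) {
        slnode *next = node->next;  /* read the link before freeing */
        free(node);
        node = next;
    }
    *head = NULL;
}
```

Here the head pointer "owns" the list: whoever holds it is responsible for calling sl_free_all exactly once, after which the pointer is NULL and cannot be freed twice by accident.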
What if you don't know the size of the array when you write your program ?
As an example, imagine you want to load an image. At first you don't know its size, so you have to read the size from the file, allocate a buffer of that size, and then read the file into that buffer. Obviously, you could not have used a fixed-size array.
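The allocation step for such an image might look roughly like this. It is a sketch only: struct image, image_init, and the one-byte-per-pixel layout are assumptions made for the example, and reading the dimensions from a real file format is left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical image type: one byte per pixel. */
struct image {
    size_t width, height;
    unsigned char *pixels;  /* width * height bytes, owned by the struct */
};

/* Allocate a zeroed pixel buffer for dimensions known only at run time.
   Returns 0 on success, -1 on bad dimensions or allocation failure. */
int image_init(struct image *img, size_t width, size_t height)
{
    /* reject empty images and sizes whose product would overflow size_t */
    if (width == 0 || height == 0 || width > SIZE_MAX / height)
        return -1;
    unsigned char *pixels = calloc(width * height, 1);
    if (pixels == NULL)
        return -1;
    img->width = width;
    img->height = height;
    img->pixels = pixels;
    return 0;
}
```

The caller would read width and height from the file header, call image_init, fill img.pixels from the file, and call free(img.pixels) when the image is no longer needed.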
EDIT:
Another point is: when you use dynamic allocation, memory is allocated on the heap, while local arrays are allocated on the stack. This is quite important when you are programming on an embedded device, as the stack can have a limited size compared to the heap.
I recommend that you google Stack and Heap.
int* heapArray = (int*)malloc(10 * sizeof(int));
int stackArray[10];
Both are very similar in the way you access the data. They are very different in the way that the data is stored behind the scenes. The heapArray is allocated on the heap and is only deallocated when the application exits or when free(heapArray) is called. The stackArray is allocated on the stack and is deallocated when the stack unwinds.
In the example you described int array[10] goes away when you leave your stack frame. If you would like the used memory to persist beyond local scope you have to use malloc();
Although you can do variable length arrays as of C99, there's still no decent substitute for the more dynamic data structures. A classic example is the linked list. To get an arbitrary size, you use malloc to allocate each node so that you can insert and delete without massive memory copying, as would be the case with a variable length array.
For example, an arbitrarily sized stack using a simple linked list:
#include <stdio.h>
#include <stdlib.h>
typedef struct sNode {
int payLoad;
struct sNode *next;
} tNode;
void stkPush (tNode **stk, int val) {
tNode *newNode = malloc (sizeof (tNode));
if (newNode == NULL) return;
newNode->payLoad = val;
newNode->next = *stk;
*stk = newNode;
}
int stkPop (tNode **stk) {
tNode *oldNode;
int val;
if (*stk == NULL) return 0;
oldNode = *stk;
*stk = oldNode->next;
val = oldNode->payLoad;
free (oldNode);
return val;
}
int main (void) {
tNode *top = NULL;
stkPush (&top, 42);
printf ("%d\n", stkPop (&top));
return 0;
}
Now, it's possible to do this with variable length arrays but, like writing an operating system in COBOL, there are better ways to do it.
malloc() is used whenever:
You need dynamic memory allocation
If you need to create an array of size n, where n is calculated during program execution, the usual way to do it is with malloc().
You need to allocate memory in heap
Variables defined in a function live only until the end of that function. So, if some "callstack-independent" data is needed, it must either be passed/returned as a function parameter (which is not always suitable) or stored on the heap. The way to store data on the heap is to use malloc() or one of its relatives (calloc(), realloc()). There are variable-length arrays, but they are allocated on the stack.
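Sometimes n is not even known when you first allocate, for example when values arrive one at a time. A common pattern for that case is a growable array built on realloc. The sketch below uses made-up names (struct int_vec, int_vec_push) and an arbitrary starting capacity of 8, and it does not guard against the capacity arithmetic overflowing for extremely large arrays:

```c
#include <stdlib.h>

struct int_vec {
    int *data;
    size_t len;     /* elements in use */
    size_t cap;     /* elements allocated */
};

/* Append a value, doubling the capacity when full.
   Returns 0 on success, -1 on failure (the vector stays valid). */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        /* assign to a temporary: on failure realloc returns NULL and
           leaves the old block untouched */
        int *grown = realloc(v->data, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        v->data = grown;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}
```

Start from a vector whose data is NULL and whose len and cap are 0, push as many values as needed, and release everything with a single free(v.data) at the end.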