define a function returning struct pointer - c

Please bear with me; I come from other languages and am new to C. I am learning it from http://c.learncodethehardway.org/book/learn-c-the-hard-way.html
struct Person {
char *name;
int age;
int height;
int weight;
};
struct Person *Person_create(char *name, int age, int height, int weight)
{
struct Person *who = malloc(sizeof(struct Person));
assert(who != NULL);
who->name = strdup(name);
who->age = age;
who->height = height;
who->weight = weight;
return who;
}
I understand that Person_create, the second definition above, returns a pointer to a struct Person. What I don't understand (maybe because I come from other languages, Erlang and Ruby) is why it is defined as
struct Person *Person_create(char *name, int age, int height, int weight)
not
struct Person Person_create(char *name, int age, int height, int weight)
Is there another way to define a function that returns a structure?
Sorry if this question is too basic.

It is defined so because it returns a pointer to a struct, not a struct. You assign the return value to a struct Person *, not to struct Person.
It is possible to return a full struct, like this:
struct Person Person_create(char *name, int age, int height, int weight)
{
struct Person who;
who.name = strdup(name);
who.age = age;
who.height = height;
who.weight = weight;
return who;
}
But it is not used very often.

The Person_create function returns a pointer to a struct Person so you have to define the return value to be a pointer (by adding the *). To understand the reason for returning a pointer to a struct and not the struct itself one must understand the way C handles memory.
When you call a function in C you add a record for it on the call stack. At the bottom of the call stack is the main function of the program you're running, at the top is the currently executing function. The records on the stack contain information such as the values of the parameters passed to the functions and all the local variables of the functions.
There is another type of memory your program has access to: heap memory. This is where you allocate space using malloc, and it is not connected to the call stack.
When you return from a function, its record is popped off the call stack and all the information associated with the call is lost. If you want to return a struct you have two options: copy the data out of the struct before its record is popped, or keep the data in heap memory and return a pointer to it. Copying the data byte for byte is more expensive than returning a pointer, so you would normally return a pointer to save resources (both memory and CPU cycles). That choice has a cost of its own: when you keep data in heap memory you have to remember to free it when you are done with it, otherwise your program leaks memory.
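The two options described above can be sketched side by side. This is a minimal illustration, not the book's code: the names person_new, person_free and person_make are hypothetical, and the struct is cut down to two members.

```c
#include <stdlib.h>
#include <string.h>

struct Person {
    char *name;
    int age;
};

/* Heap version: the struct outlives the call, so the caller owns it
   and must release it with person_free(). Returns NULL on failure. */
struct Person *person_new(const char *name, int age)
{
    struct Person *who = malloc(sizeof *who);
    if (who == NULL)
        return NULL;
    who->name = malloc(strlen(name) + 1);
    if (who->name == NULL) {
        free(who);
        return NULL;
    }
    strcpy(who->name, name);
    who->age = age;
    return who;
}

/* Releases the copied name and then the struct itself. */
void person_free(struct Person *who)
{
    if (who != NULL) {
        free(who->name);
        free(who);
    }
}

/* Value version: the local struct is copied out to the caller on
   return, so nothing has to be freed. The name pointer is stored
   as given, not duplicated. */
struct Person person_make(char *name, int age)
{
    struct Person who = { name, age };
    return who;
}
```

A caller of person_new has to pair it with person_free, while a caller of person_make simply lets the returned value go out of scope.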

The function returns who, which is a struct Person * - a pointer to a structure. The memory to hold the structure is allocated by malloc(), and the function returns a pointer to that memory.
If the function were declared to return struct Person and not a pointer, then who could also be declared as a structure. Upon return, the structure would be copied and returned to the caller. Note that the copy is less efficient than simply returning a pointer to the memory.

Structs are not pointers (or references) by default in C or C++, as objects are in Java, for example. struct Person Function() would therefore return the struct itself (by value, making a copy), not a pointer.
You often don't want to create copies of objects (shallow copies by default, or copies created using copy constructors) as this can quickly become time-consuming.

Copying the whole struct rather than just a pointer is less efficient because sizeof a pointer is usually much smaller than sizeof the whole struct.
Also, a struct might contain pointers to other data in memory, and blindly copying those could be dangerous for dynamically allocated data (if the code handling one copy frees the data, the other copy is left with an invalid pointer).
So a shallow copy is almost always a bad idea, unless you're sure that the original goes out of scope. And in that case, why wouldn't you just return a pointer to the struct instead? (The struct must be dynamically allocated on the heap, of course, so that it isn't destroyed on return from the function the way stack-allocated entities are.)
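To make the shallow-copy hazard concrete, here is a small sketch. The function names person_shallow_copy and person_deep_copy are hypothetical, and the code assumes name points to a valid, dynamically allocated string.

```c
#include <stdlib.h>
#include <string.h>

struct Person {
    char *name;   /* points to dynamically allocated storage */
    int age;
};

/* Shallow copy: struct assignment copies the name pointer, so both
   structs refer to the same string. Freeing it through one copy
   leaves the other with an invalid pointer. */
struct Person person_shallow_copy(struct Person src)
{
    return src;
}

/* Deep copy: duplicates the string as well. On allocation failure
   the returned struct has name == NULL. */
struct Person person_deep_copy(struct Person src)
{
    struct Person dst = src;
    dst.name = malloc(strlen(src.name) + 1);
    if (dst.name != NULL)
        strcpy(dst.name, src.name);
    return dst;
}
```

After a shallow copy only one of the two structs may free name; after a deep copy each struct owns its own string and must free it separately.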

Related

What is the difference between stack struct vs heap struct?

// Define Person
typedef struct Person {
char *name;
int age;
} person_t;
// Init person in heap
person_t *create_person_v1(char *name, int age)
{
person_t *person = calloc(1, sizeof(person_t));
person->name = name;
person->age = age;
return person;
}
// Init person on stack? a static struct I guess?
person_t create_person_v2(char *name, int age)
{
return (person_t) {name, age};
}
The code above has a definition of Person and two helper functions for its further initialization.
I don't really understand the difference between them, and what are the benefits of having each?
Is it something more than just accessors -> and . ?
The last function does not allocate a static structure; it creates a structure on the stack. There are two kinds of memory for your program here: heap and stack. static is another keyword (with more than one use, depending on where it appears, but that is not the subject here).
Heap allocations use more resources than stack allocations, but they are not tied to the scope in which they are made. You can pass heap-allocated structures between functions without worrying that they will be "automatically freed" because the scope they were tied to has disappeared (that is, the function that created them has returned).
An example would be a linked list. You can't rely on stack allocations to build a linked list of structures.
Heap allocations can fail; you always have to check that malloc, calloc and the like do not return NULL.
Stack allocations don't fail like that. If you exhaust your stack, your program will just crash (or do something stranger; it is undefined behavior), and you can't recover from that.
Stack-allocated memory is "freed" when the function returns.
Heap-allocated memory is not; you have to free it by hand (with the function free).
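The linked-list case mentioned above can be sketched as follows. The names node, list_push, list_sum and list_free are hypothetical. Each node is heap-allocated so that it survives the return of the function that created it, the malloc result is checked, and the nodes are freed by hand at the end.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Prepends a heap-allocated node. Returns the new head, or NULL if
   malloc fails (in which case the old list is left untouched). */
struct node *list_push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = head;
    return n;
}

/* Adds up the values stored in the list. */
int list_sum(const struct node *head)
{
    int sum = 0;
    for (; head != NULL; head = head->next)
        sum += head->value;
    return sum;
}

/* Heap memory is not released automatically: free every node. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Had list_push used a local struct node and returned its address, the node would be gone as soon as the function returned.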

struct inside struct : to point or not to point?

I'd like to understand the difference between using a pointer and a value when it comes to referencing a struct inside another struct.
By that I mean, I can have those two declarations:
struct foo {
int bar;
};
struct fred {
struct foo barney;
struct foo *wilma;
};
It appears I can get the same behavior from both barney and wilma entries, as long as I de-reference accordingly when I access them. The barney case intuitively feels “wrong” but I cannot say why.
Am I just relying on some C undefined behavior? If not, what would be the reason(s) to opt for one style over the other?
The following code shows how I came to the conclusion that both use cases are equivalent; neither clang nor gcc complains about anything.
#include <stdio.h>
#include <stdlib.h>
struct a_number {
int i;
};
struct s_w_ptr {
struct a_number *n;
};
struct s_w_val {
struct a_number n;
};
void store_via_ptr(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number *i = malloc(sizeof(i));
i->i = 1;
swp->n = i;
swv->n = *i;
}
void store_via_val(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number j;
j.i = 2;
swp->n = &j;
swv->n = j;
}
int main(void) {
struct s_w_ptr *swp = malloc(sizeof(swp));
struct s_w_val *swv = malloc(sizeof(swv));
store_via_ptr(swp, swv);
printf("p: %d | v: %d\n", swp->n->i, swv->n.i);
store_via_val(swp, swv);
printf("p: %d | v: %d\n", swp->n->i, swv->n.i);
}
It's perfectly valid to have both struct members in a struct and have pointers to struct in a struct. They must be used differently but both are legal.
Why have a struct in a struct ?
One reason is to group things together. For instance:
struct car
{
struct motor motor; // a struct with several members describing the motor
struct wheel wheel; // a struct with several members describing the wheels
...
};
struct car myCar = {....initializer...};
myCar.wheel = SomeOtherWheelModel; // Replace wheels in a single assign
myCar.wheel.pressure = 2.1; // Change a single wheel member
Why have a struct pointer in a struct?
One very obvious reason is that it can be used as an array of N structs by dynamically allocating N times the struct size.
Another typical example is linked lists where you have a pointer to a struct of the same type as the struct containing the pointer.
There are several advantages of having a struct in a struct instead of having a pointer to struct in a struct:
It requires less memory allocation. In the case where you have a pointer to a struct in a struct, memory must be provided both for the pointer inside the parent struct and, separately, for the child struct itself.
Additional instructions are typically required to access the contents of the child struct. For example consider that the program is reading the contents of the child struct. If a struct within a struct is used, the program will apply an offset to the address of the variable and read the contents of that memory location. In the case of a pointer to a struct in a struct, the program will actually apply an offset to the parent struct variable address, fetch the address of the child struct, then read from memory the contents of the child struct.
A separate variable needs to be declared for both the parent and child struct and if an initializer is used, then a separate initializer is needed. In the case of a struct in a struct only one variable must be declared and a single initializer is used.
In cases where dynamic memory allocation is used, the developer must remember to deallocate memory for both the child and parent objects before the variables fall out of scope. In the case of struct in a struct the memory must be freed for only one variable.
Lastly, as is shown in the example, if a pointer is used, NULL checking may be necessary to ensure that the pointer to the child struct has been initialized.
The primary advantage of having a pointer to a struct in a struct appears when you need to replace the child struct with another struct within the program, as in a linked list. A less common case might be a child struct that can be of more than one type; there you might use a void * type for the child. You might also use a pointer within a struct to point to an array when the array's size may vary between instances.
Based on my understanding of the case shown in the example above, I would be inclined to use a struct in a struct, since both objects are of fixed size and type and since it appears that they would not need to be separated.
C structures can be used to group related data, such as the title of a book, its author, its assigned book number, and so on. But much of what we use structures for is creating data structures (in a different sense of the word “structure”) in memory.
Consider that the book’s author has a name, a date of birth, other biographical information, a list of books they have written, and more. We could include in the struct book a struct author that would contain all this information. But, if the author has written a hundred books, we could have 100 copies of all that information, one copy in each struct book. Further, we cannot continue the “contain the data inside the structure directly” model with the struct author, because it cannot contain a struct book for each book the author publishes if those struct book members also have to contain the struct author for the author—every object would have to contain itself.
It is more efficient to create one struct author and have each struct book for that author link to it.
Another example is that we use pointers to create data structures for efficient access to data. If we are reading data for thousands of items and want to keep them sorted by name, one option is to allocate memory for some number of structures, read the data, and sort the data. When new data is read and we have used all the memory we allocated, we allocate new memory, copy all the old data to the new memory if necessary, and move some of the data so we can insert the new data in its proper place. However, we have many better options than that. We can use linked lists, binary trees, other kinds of trees, and hash tables.
These data structures effectively require using pointers. A binary tree will have a root node, and each node contains two pointers, one to a subtree of nodes that are earlier than it in the sorting order and another to a subtree of nodes that are later than it. We can look up items in the tree by following pointers to earlier or later nodes to find the right position. And we can insert items by changing a few pointers. If the tree happens to become unbalanced, we can rearrange nodes in the tree by changing pointers. The bulk of the data in the nodes does not have to be changed or copied, just some pointers.
We can also use pointers to have multiple structures for the same data. All the data about books could be stored in one place, and a tree ordered by name could contain nodes in which each node contained a pointer to the book structure and two pointers to subtrees. We could have one tree like this ordered by title of the book and another tree ordered by the name of the author and another tree ordered by the assigned book number. Then we can efficiently look up a book by title or author or number, but there is only one master copy of the complete book data, in the struct book objects. The look-up data is in the tree, which contains only pointers. That is much more efficient than copying all of the struct book data for each tree.
So the reason we choose between structures and pointers as members is not whether the C syntax allows us to refer to the data; we can get to the data in both cases. The reason is that one method embeds the data, which is inflexible and requires copying, while the other is flexible and efficient.
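The book and author relationship described above might look like this in code. It is only a sketch: the member names and the helper count_books_by are hypothetical, and real book data would carry many more fields.

```c
#include <stddef.h>

struct author {
    const char *name;
    int birth_year;
};

struct book {
    const char *title;
    const struct author *author;   /* a link to shared data, not a copy */
};

/* Counts the books in an array that link to the given author.
   Comparing pointers is enough because every book by the same
   author points at the same struct author object. */
size_t count_books_by(const struct book *books, size_t n,
                      const struct author *who)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (books[i].author == who)
            count++;
    return count;
}
```

With this layout, correcting an author's details in one place is seen by every book that links to that author, and each struct book stays the same small size however much biographical data the author record holds.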
Let's first consider this function
void store_via_ptr(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number *i = malloc(sizeof(i));
i->i = 1;
swp->n = i;
swv->n = *i;
}
This declaration
struct a_number *i = malloc(sizeof(i));
is equivalent to the following declaration
struct a_number *i = malloc(sizeof( struct a_number * ));
So in general the function can invoke undefined behavior when sizeof( struct a_number ) is greater than sizeof( struct a_number * ).
It seems you mean
struct a_number *i = malloc(sizeof( *i ) );
^^^
If you split the function into two functions, one for each parameter, like
void store_via_ptr1( struct s_w_ptr *swp ) {
struct a_number *i = malloc(sizeof( *i ) );
i->i = 1;
swp->n = i;
}
and
void store_via_ptr2( struct s_w_val *swv ) {
struct a_number *i = malloc(sizeof( *i));
i->i = 1;
swv->n = *i;
}
then after the first function the owner of the object pointed to by swp must remember to free the memory allocated within the function. Otherwise there will be a memory leak.
The second function always produces a memory leak because the allocated memory is never freed.
Now let's consider the second function
void store_via_val(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number j;
j.i = 2;
swp->n = &j;
swv->n = j;
}
Here the pointer swp->n will point to the local object j. After the function exits, this pointer is invalid because the pointed-to object is no longer alive.
So both functions are incorrect. Instead you could write the following functions
int store_via_ptr(struct s_w_ptr *swp ) {
swp->n = malloc( sizeof( *swp->n ) );
int success = swp->n != NULL;
if ( success ) swp->n->i = 1;
return success;
}
and
void store_via_val( struct s_w_val *swv ) {
swv->n.i = 2;
}
Whether to include a whole object of a structure type in another structure object, or to use a pointer to such an object instead, depends on the design and the context in which the objects are used.
For example consider a structure struct Point
struct Point
{
int x;
int y;
};
In this case if you want to declare a structure struct Rectangle then it is natural to define it like
struct Rectangle
{
struct Point top_left;
struct Point bottom_right;
};
On the other hand, if you have a two-sided singly-linked list then it can look like
struct Node
{
int value;
struct Node *next;
};
struct List
{
struct Node *head;
struct Node *tail;
};
Two problems:
In store_via_ptr you allocate memory for i dynamically. For s_w_val you copy the structure and then abandon the pointer, which means the pointer is lost and can never be passed to free.
In store_via_val you make swp->n point to the local variable j, a variable whose lifetime ends when the function returns, leaving you with an invalid pointer.
The first problem might lead to a memory leak (something you would hardly notice in your simple example program).
The second problem is worse, since it will lead to undefined behavior when you dereference the pointer swp->n.
Unrelated to that, in the main function you don't need to allocate memory dynamically for the structures. You could just have defined them as plain structure objects and used the address-of operator & when calling the functions.
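Putting these points together, a corrected version of the example might look like the sketch below. The driver run_example is hypothetical and stands in for the body of main. The structures are plain objects passed with &, the pointer member gets its own correctly sized allocation, and that allocation is freed before returning.

```c
#include <stdlib.h>

struct a_number { int i; };
struct s_w_ptr  { struct a_number *n; };
struct s_w_val  { struct a_number n; };

/* Returns 1 on success, 0 if the allocation failed.
   On success the caller is responsible for freeing swp->n. */
int store_via_ptr(struct s_w_ptr *swp)
{
    swp->n = malloc(sizeof *swp->n);   /* size of the struct, not of the pointer */
    if (swp->n == NULL)
        return 0;
    swp->n->i = 1;
    return 1;
}

/* The embedded struct needs no allocation at all. */
void store_via_val(struct s_w_val *swv)
{
    swv->n.i = 2;
}

/* Hypothetical driver: stores the two values through the out
   parameters and returns 1, or returns 0 if allocation failed. */
int run_example(int *ptr_result, int *val_result)
{
    struct s_w_ptr swp = { NULL };     /* plain objects, no malloc needed */
    struct s_w_val swv = { { 0 } };

    if (!store_via_ptr(&swp))
        return 0;
    store_via_val(&swv);

    *ptr_result = swp.n->i;
    *val_result = swv.n.i;

    free(swp.n);                       /* release the one heap allocation */
    return 1;
}
```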

allocating memory to a struct using malloc

I have a struct called State:
typedef struct State{
char alphabets[2][6];
struct State *PREV; /*this points to the previous state it came from*/
struct State *NEXT; /*this points to the next state in the linked list*/
int cost; /*Number of moves done to get to this position*/
int zero_index;/*this holds the index to the empty postion*/
} State;
Here's my memAllocator() method:
memAllocator(){
struct State *p = (State*) malloc(sizeof(State));
if (p==NULL){
printf("Malloc for a new position failed");
exit(1);
}
return p;
}
Here's my main method.
main(){
State *start_state_pointer=memAllocator();
State start_state;
start_state.zero_index=15;
start_state.PREV = NULL;
start_state.alphabets[0][0]='C';
start_state.alphabets[0][1]='A';
start_state.alphabets[0][2]='N';
start_state.alphabets[0][3]='A';
start_state.alphabets[0][4]='M';
start_state.alphabets[0][5]='A';
start_state.alphabets[1][0]='P';
start_state.alphabets[1][1]='A';
start_state.alphabets[1][2]='N';
start_state.alphabets[1][3]='A';
start_state.alphabets[1][4]='L';
start_state.alphabets[1][5]='_';
start_state_pointer=&(start_state);
/*start_state=*start_state_pointer;*/
}
I think the statement start_state_pointer=&(start_state); just points start_state_pointer at the small amount of temporary space created by State start_state, rather than at the space I allocated.
But when I try the commented-out statement start_state=*start_state_pointer to dereference the pointer and assign the allocated space to start_state, it gives me a segmentation fault.
I am just starting out in C. Can someone help me with this?
Your memAllocator and main functions don't have explicit return types. This style of code has long been obsolete: the implicit int rule was removed from the language in C99. Functions in C should always have a return type. For main, the return type should be int, and for your memAllocator function, it should be State *.
The second issue is that you allocate space for a State struct, but then fill a different State struct and overwrite the pointer to the previously allocated State struct using start_state_pointer = &(start_state);.
To use the memory that you just allocated, you want to use something like this:
State *start_state = memAllocator();
start_state->zero_index = 15;
start_state->PREV = NULL;
start_state->alphabets[0][0] = 'C';
// etc.
There is no need to create two State structs. When you use State start_state; in your original code, you are creating a struct with something called automatic storage. This means the space for this struct is allocated automatically and is deallocated automatically for you at the end of the scope it is declared in. If you take the address of this struct and pass it around other parts of your program, then once that scope has ended you will be passing around a pointer to a deallocated struct, and this could be why your program is crashing.
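A fuller sketch of the corrected approach follows. The helper make_start_state is hypothetical, as is the choice to initialize NEXT and cost, which the original code leaves unset. The field values are taken from the question.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct State {
    char alphabets[2][6];
    struct State *PREV;   /* the previous state it came from */
    struct State *NEXT;   /* the next state in the linked list */
    int cost;             /* number of moves done to get to this position */
    int zero_index;       /* index of the empty position */
} State;

/* Explicit return type: a pointer to the newly allocated State. */
State *memAllocator(void)
{
    State *p = malloc(sizeof *p);
    if (p == NULL) {
        fprintf(stderr, "Malloc for a new position failed\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Hypothetical helper: fills the heap-allocated state directly
   instead of filling a second, automatic State. */
State *make_start_state(void)
{
    State *s = memAllocator();
    s->zero_index = 15;
    s->PREV = NULL;
    s->NEXT = NULL;       /* assumption: no successor yet */
    s->cost = 0;          /* assumption: no moves made yet */
    memcpy(s->alphabets[0], "CANAMA", 6);   /* 6 chars, no terminator */
    memcpy(s->alphabets[1], "PANAL_", 6);
    return s;
}
```

The caller owns the returned State and releases it with free when it is no longer needed.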

tsearch function does not maintain struct pointer within struct. Loss of information issue

I am using a tree in C to keep track of an undefined and varying number of input fields. I have a struct with a set number of fields as follows:
struct mystruct {
int id, mpid;
char *name;
struct myotherstruct *myostr;
};
I have a pointer instance of this type ( mystruct ) which I allocate memory for, then I fill these struct values in with input I've read from a file. I then use the tsearch function from search.h to add my mystruct object to the mystruct tree. The problem I am having, is that if I use the tfind function to retrieve a mystruct pointer from the tree, the mystruct memory that is returned has no recollection of the myotherstruct pointer data I allocated and pointed to when creating the value prior to adding to the tree.
The general sequence is as follows:
struct mystruct {
int id, mpid;
char *name;
struct myotherstruct *myostr;
};
struct myotherstruct {
int spid;
};
// allocate memory to temporary mystruct pointer
// add mpid field to mystruct pointer ( used in comparison function )
if( // tfind == NULL )
{
// set id value
// allocate char memory and strncpy correct value in
// allocate myotherstruct memory and assign all values
// tsearch for this newly created mystruct memory chunk so that it is added to tree
}
else
{
// fails here when attempting to access the data returned by mytfind
}
...
Using gdb, the program very clearly enters the if branch the first time (since the tree is empty) and creates a valid and full mystruct pointer with proper memory allocation. When tsearch returns its output, the memory location is different from that of the mystruct pointer I filled in, and I am then unable to print mystruct->myotherstruct variables such as spid. The exact output I get when attempting to print in gdb is: Cannot access memory at address 0x----- where the -'s are various locations in memory that I apparently cannot access.
I suspect that there may be a problem with my comparison function since I am only comparing mystruct mpid fields to determine whether a tree node exists yet for a mystruct object, but my inexperience with both C and tsearch/tfind functionality are showing a bit here. Hopefully someone with more experience is able to help me since the examples provided on various tsearch.h webpages don't handle very sophisticated examples. Thanks in advance for the help!
PS: the code must remain in C, so language swapping doesn't suffice :(
EDIT:
here is my compare function:
int distcmp(const void *a, const void *b){
return ((int)((struct mystruct *)a)->mpid) != (int)(((struct mystruct *)b)->mpid);
}
Also, I use tfind initially because I want to know whether a particular value exists in the tree yet. If it does not exist (it returns NULL), the code enters the if branch, fills in the new mystruct object and adds it to the tree (using tsearch). If it already exists, tfind returns a pointer to that object, which I assign to a mystruct pointer and use in the else portion of the code. Hopefully this helps. Thanks again.
SOLVED:
Just to update, the issue is that the pointer returned by tsearch and tfind is not of type struct mystruct *. It points to the tree's stored element, which is itself the pointer to the mystruct that matched my search. In this case, the issue is resolved by dereferencing the returned pointer with * and assigning the result to a mystruct pointer. Thanks to those who commented.
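The extra dereference can be sketched with the POSIX search.h functions as below. The wrappers tree_insert and tree_lookup are hypothetical, and the struct is reduced to the two integer fields. The sketch also uses a three-way comparison, because tsearch and tfind expect the comparison function to return a negative, zero or positive value; a function built on != only ever returns 0 or 1, which does not give the tree a usable ordering.

```c
#include <search.h>
#include <stdlib.h>

struct mystruct {
    int id, mpid;
};

/* Three-way comparison on mpid: negative, zero or positive. */
static int cmp_mpid(const void *a, const void *b)
{
    const struct mystruct *x = a;
    const struct mystruct *y = b;
    return (x->mpid > y->mpid) - (x->mpid < y->mpid);
}

/* Inserts item, or finds an existing element with the same mpid.
   Returns the element stored in the tree, or NULL if out of memory. */
struct mystruct *tree_insert(void **root, struct mystruct *item)
{
    void *node = tsearch(item, root, cmp_mpid);
    if (node == NULL)
        return NULL;
    /* tsearch returns a pointer to the stored key pointer */
    return *(struct mystruct **)node;
}

/* Looks up by mpid; returns the stored element or NULL if absent. */
struct mystruct *tree_lookup(void *const *root, int mpid)
{
    struct mystruct key = { 0, mpid };
    void *node = tfind(&key, root, cmp_mpid);
    if (node == NULL)
        return NULL;
    return *(struct mystruct **)node;
}
```

The tree stores only the pointers it is given, so the mystruct objects themselves must stay alive for as long as they are in the tree.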

C generic linked-list

I have a generic linked list that holds data of type void*. I am trying to populate my list with objects of type struct employee, and eventually I would like to destroy the struct employee objects as well.
Consider this generic linked-list header file (I have tested it with type char*):
struct accListNode //the nodes of a linked-list for any data type
{
void *data; //generic pointer to any data type
struct accListNode *next; //the next node in the list
};
struct accList //a linked-list consisting of accListNodes
{
struct accListNode *head;
struct accListNode *tail;
int size;
};
void accList_allocate(struct accList *theList); //allocate the accList and set to NULL
void appendToEnd(void *data, struct accList *theList); //append data to the end of the accList
void removeData(void *data, struct accList *theList); //removes data from accList
--------------------------------------------------------------------------------------
Consider the employee structure
struct employee
{
char name[20];
float wageRate;
};
Now consider this sample testcase that will be called from main():
void test2()
{
struct accList secondList;
struct employee *emp = Malloc(sizeof(struct employee));
emp->name = "Dan";
emp->wageRate =.5;
struct employee *emp2 = Malloc(sizeof(struct employee));
emp2->name = "Stan";
emp2->wageRate = .3;
accList_allocate(&secondList);
appendToEnd(emp, &secondList);
appendToEnd(emp2, &secondList);
printf("Employee: %s\n", ((struct employee*)secondList.head->data)->name); //cast to type struct employee
printf("Employee2: %s\n", ((struct employee*)secondList.tail->data)->name);
}
Why does the answer that I posted below solve my problem? I believe it has something to do with pointers and memory allocation. The function Malloc() that I use is a custom malloc that checks for NULL being returned.
Here is a link to my entire generic linked list implementation: https://codereview.stackexchange.com/questions/13007/c-linked-list-implementation
The problem is this accList_allocate() and your use of it.
struct accList secondList;
accList_allocate(&secondList);
In the original test2() secondList is memory on the stack. &secondList is a pointer to that memory. When you call accList_allocate() a copy of the pointer is passed in pointing at the stack memory. Malloc() then returns a chunk of memory and assigns it to the copy of the pointer, not the original secondList.
Coming back out, secondList is still pointing at uninitialised memory on the stack so the call to appendToEnd() fails.
The same happens with the answer except secondList just happens to be free of junk. Possibly by chance, possibly by design of the compiler. Either way it is not something you should rely on.
Either:
struct accList *secondList = NULL;
accList_allocate(&secondList);
And change accList_allocate()
void accList_allocate(struct accList **theList) {
*theList = Malloc(sizeof(struct accList));
(*theList)->head = NULL;
(*theList)->tail = NULL;
(*theList)->size = 0;
}
OR
struct accList secondList;
accList_initialise(&secondList);
With accList_allocate() changed to accList_initialise() because it does not allocate
void accList_initialise(struct accList *theList) {
theList->head = NULL;
theList->tail = NULL;
theList->size = 0;
}
I think that your problem is this:
You've allocated secondList on the stack in your original test2 function.
The stack memory is probably dirty, so secondList requires initialization
Your accList_allocate function takes a pointer to the list, but then overwrites it with the Malloc call. This means that the pointer you passed in is never initialized.
When test2 tries to run, it hits a bad pointer (because the memory isn't initialized).
The reason that it works when you allocate it in main is probably that the stack happens to be zero-filled when the program starts. When main allocates a variable on the stack, that allocation persists until the program ends, so secondList is actually, and accidentally, properly initialized when you allocate it in main.
Your current accList_allocate doesn't actually initialize the pointer that's been passed in, and the rest of your code will never see the pointer that it allocates with Malloc. To solve your problem, I would create a new function: accList_initialize whose only job is to initialize the list:
void accList_initialize(struct accList* theList)
{
// NO malloc
theList->head = NULL;
theList->tail = NULL;
theList->size = 0;
}
Use this, instead of accList_allocate in your original test2 function. If you really want to allocate the list on the heap, then you should do so (and not mix it with a struct allocated on the stack). Have accList_allocate return a pointer to the allocated structure:
struct accList* accList_allocate(void)
{
struct accList* theList = Malloc( sizeof(struct accList) );
accList_initialize(theList);
return theList;
}
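To show how the initialized list would then be used, here is a sketch with a minimal append and clean-up. accList_append and accList_clear are hypothetical stand-ins for the poster's appendToEnd and removal code, which are not shown in the question.

```c
#include <stdlib.h>

struct accListNode {
    void *data;
    struct accListNode *next;
};

struct accList {
    struct accListNode *head;
    struct accListNode *tail;
    int size;
};

/* No malloc here: only puts an existing list into a known state. */
void accList_initialize(struct accList *theList)
{
    theList->head = NULL;
    theList->tail = NULL;
    theList->size = 0;
}

/* Hypothetical append: returns 0 on success, -1 if malloc fails. */
int accList_append(struct accList *theList, void *data)
{
    struct accListNode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->data = data;
    node->next = NULL;
    if (theList->tail == NULL)
        theList->head = node;        /* first node in an empty list */
    else
        theList->tail->next = node;
    theList->tail = node;
    theList->size++;
    return 0;
}

/* Frees the nodes only; the data pointers belong to the caller. */
void accList_clear(struct accList *theList)
{
    struct accListNode *node = theList->head;
    while (node != NULL) {
        struct accListNode *next = node->next;
        free(node);
        node = next;
    }
    accList_initialize(theList);
}
```

A list declared as struct accList secondList; on the stack only needs accList_initialize(&secondList); before the first append. The append relies on tail being NULL for an empty list, which is exactly what uninitialized stack memory does not guarantee.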
I see a few things wrong here, based on the original code in the question above:
What you've seen is undefined behaviour, and the bus error message arose from it: you were assigning a string literal to the variable when you should have been using the strcpy function. You've since edited your original code accordingly, so this is just something to keep in mind in the future :)
The name Malloc is going to cause confusion, especially in peer review: reviewers will stop and ask "whoa, what's this, should that not be malloc?" and very likely raise it. (Basically, do not give custom functions names that sound like C standard library functions.)
You're not checking for NULL. If your souped-up version of Malloc fails, then emp is going to be NULL! Always check, no matter how trivial the code is or how tempting it is to think "the platform has heaps of memory, 4GB RAM, no problem, I will not bother to check for NULL".
Have a look at this question posted elsewhere to explain what is a bus error.
Edit: With linked-list structures, how the parameters are passed to the function is crucial. Notice the use of &, which takes the address of the variable holding the linked-list structure and so passes it by reference, rather than passing it by value, which would pass a copy of the variable. The same rule applies to pointers in general :)
You've got the parameters slightly out of place in the first code in your question; if the function took a double pointer in its parameter list (and secondList were declared as a pointer), then yes, passing &secondList would have worked.
It may depend on how your Employee structure is designed, but you should note that
strcpy(emp->name, "Dan");
and
emp->name = "Dan";
function differently. In particular, the latter is a likely source of bus errors because you generally cannot write to string literals in this way. Especially if your code has something like
name = "NONE"
or the like.
EDIT: Okay, so with the design of the employee struct, the problem is this:
You can't assign to arrays. In the C Standard an array is not a modifiable lvalue, so it cannot appear on the left of an assignment.
char name[20];
name = "JAMES"; //illegal
strcpy is fine: it just goes to the address of name[0] and copies "JAMES\0" into the memory there, one byte at a time.
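Since strcpy does not check whether the destination is large enough, a bounded copy is a safer way to fill the fixed-size name array. The sketch below uses snprintf for that; the helper name employee_set_name is hypothetical.

```c
#include <stdio.h>

struct employee {
    char name[20];
    float wageRate;
};

/* Copies name into the fixed-size array, truncating if necessary.
   The array is always left null-terminated.
   Returns 1 if the whole name fit, 0 if it was truncated. */
int employee_set_name(struct employee *emp, const char *name)
{
    int needed = snprintf(emp->name, sizeof emp->name, "%s", name);
    return needed >= 0 && (size_t)needed < sizeof emp->name;
}
```

With a 20-byte array, at most 19 characters of the name are stored, followed by the terminating null byte.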

Resources