I've got a question about array initialization/usage in C for storing structs.
Suppose I have a Person struct:
struct person {
char *name;
int age;
};
and I declare an array of person structs:
struct person people[1000];
My question is, given this array at some future point in the program where n struct persons have been added to people[], what is the correct way to tell where to put the n + 1th struct person?
In something like Java, a for loop with if(people[i] == null) could tell you whether that index was not yet holding any value, but in C this is an array of actual values, so I know I can't check if(people[i] == NULL), since NULL is a pointer.
Is there a reliable/correct way to do this in C?
Just use a variable to keep track of the number of elements in the array.
int people_count = 0;
struct person people[1000];
void add(struct person p) {
people[people_count] = ...
people_count++;
}
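A slightly fuller sketch of the counter approach, with a bounds check added. The names MAX_PEOPLE and add_person are made up for illustration; this is one way to do it, not the only one.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PEOPLE 1000   /* assumed capacity, matching people[1000] */

struct person {
    char *name;
    int age;
};

static struct person people[MAX_PEOPLE];
static size_t people_count = 0;   /* number of slots currently in use */

/* Copies p into the next free slot; returns false if the array is full. */
bool add_person(struct person p)
{
    if (people_count >= MAX_PEOPLE)
        return false;
    people[people_count++] = p;
    return true;
}
```

The n + 1th person always goes to people[people_count], so no scan of the array is needed.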
In C, arrays cannot have empty positions: every element of an array always exists. (In Java you would essentially have an array of pointers, whereas in C you have an array of values, each as large as your struct.)
One common solution is to add an id field to your struct that indicates whether the struct at a particular position of the array has been initialized, i.e.:
struct person {
char *name;
int age;
int id; // meaning: is initialized
};
I am assuming you would like to put something in your array not only at the end (in that case the other answers are better) but also in some positions that haven't been initialized for some reason.
There are multiple solutions:
Just store the value of n and update it when necessary.
Allocate the items of the people array on the heap, so the type of the items will be person*. Then you can check for null pointers, and it is also more memory friendly, as you do not have to store unnecessary elements. However, for performance I would still not recommend finding the first empty array element with a linear search; instead, store the index of the first free item.
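A minimal sketch of the pointer-array variant, assuming a file-scope array so that every slot starts out as a null pointer. The helper names slot_is_empty, put_person and remove_person are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_PEOPLE 1000   /* assumed capacity */

struct person {
    char *name;
    int age;
};

/* File-scope arrays are zero-initialised, so every slot starts as NULL. */
static struct person *people[MAX_PEOPLE];

/* True if index i is in range and holds no person yet. */
bool slot_is_empty(size_t i)
{
    return i < MAX_PEOPLE && people[i] == NULL;
}

/* Allocates a person into slot i; returns 0 on success, -1 otherwise. */
int put_person(size_t i, char *name, int age)
{
    if (!slot_is_empty(i))
        return -1;
    struct person *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->name = name;
    p->age = age;
    people[i] = p;
    return 0;
}

/* Frees slot i and marks it empty again. */
void remove_person(size_t i)
{
    if (i >= MAX_PEOPLE)
        return;
    free(people[i]);   /* free(NULL) is a no-op */
    people[i] = NULL;
}
```

Here the Java-style null test works, because the array elements are pointers rather than structs.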
I have a problem with lists and pointers, let me explain.
let's define a list like this:
typedef struct list *LIST;
typedef struct node *link;
struct list{link head; int n;};
struct node{Item val;link next;};
where n is the number of nodes and LIST is a pointer to struct list (I have to do that because I want to make an opaque pointer to the list, in my homework the LIST pointer would go in the .h and the structs would go in the .c but that's not the problem, and I know I should avoid declaring pointer like that).
The Item type is also declared via an opaque pointer, so I have something like this:
typedef struct info *Item;
struct info{char *name;int N;};
My problem is that I don't understand how to insert items into this list. I would like to add an Item to the list, but Item is a pointer, so, for example, if I try to do this:
//the lists is already initialized and let's say we want to add 3 nodes
Item x=malloc(sizeof(*x));
x->name=calloc(10,sizeof(char));
for(int i=0;i<3;i++){
fscanf("%s",x->name);
fscanf("%d",x->N);//this is a random number
ListInsert(L,x);
}
this is what i have in ListInsert():
void ListInsert(LIST L, Item x){
link z,p;
if(L->head==NULL)
L->head=newNode(x,L->head);
else{
for(z=L->head->next, p=L->head;z!=NULL;p=x, z=z->next);//i know a tail would help
p->next=newNode(x,z);
}
}
And this is what I have in newNode():
link newNode(item x,link next){
link z=malloc(sizeof(*z));//should control the allocation was successful I know
z->val=x;
z->next=next;
return z;
}
Whenever I modify the value x, I'm actually modifying what the head and everything points to, that's my problem, what could be a solution? maybe make an array? pointers can sometimes be so hard to understand, for example should I allocate z->val->name?
When you say ...
Whenever I modify the value x, I'm actually modifying what the head and everything points to, that's my problem, what could be a solution?
... I think you're talking about this code:
Item x=malloc(sizeof(*x));
x->name=calloc(10,sizeof(char));
for(int i=0;i<3;i++){
fscanf("%s",x->name);
fscanf("%d",x->N);//this is a random number
ListInsert(L,x);
}
Indeed, you have allocated only one struct info and assigned x to point to it. You have added that one struct info to your linked list three times, and also modified it several times.
Supposing that your objective is to add three distinct objects to the list, the solution starts with allocating three distinct objects (else where would they come from?). Since each one has a pointer to a dynamically allocated array, you will also want to allocate a separate array for each of those. The easiest way to achieve that would be simply to move the allocations into the loop:
for (int i = 0; i < 3; i++) {
Item x = malloc(sizeof(*x));
x->name = calloc(10, sizeof(char));
scanf("%9s", x->name); // at most 9 characters plus the terminator
scanf("%d", &x->N); // this is a random number; note the & to pass its address
ListInsert(L, x);
}
If you are permitted to modify the structures involved then you could also consider making the name element of struct info an array of suitable length instead of a pointer. That's a little less flexible, but it would mean that you need only one allocation for each item, not two.
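A sketch of that single-allocation variant, assuming the structure may be changed. NAME_LEN and make_item are names invented here, and the capacity of 10 simply mirrors the calloc(10, ...) call.

```c
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 10   /* assumed capacity, including the terminator */

struct info {
    char name[NAME_LEN];   /* storage lives inside the struct itself */
    int N;
};
typedef struct info *Item;

/* Allocates one item and copies name (truncated to fit) and n into it.
   Returns NULL if allocation fails; the caller releases it with free(). */
Item make_item(const char *name, int n)
{
    Item x = malloc(sizeof *x);
    if (x == NULL)
        return NULL;
    strncpy(x->name, name, NAME_LEN - 1);
    x->name[NAME_LEN - 1] = '\0';   /* strncpy may not terminate */
    x->N = n;
    return x;
}
```

Each call produces a distinct object, so inserting three results into the list gives three independent nodes, and a single free() per item is enough.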
I'd like to understand the difference between using a pointer and a value when it comes to referencing a struct inside another struct.
By that I mean, I can have those two declarations:
struct foo {
int bar;
};
struct fred {
struct foo barney;
struct foo *wilma;
};
It appears I can get the same behavior from both barney and wilma entries, as long as I de-reference accordingly when I access them. The barney case intuitively feels “wrong” but I cannot say why.
Am I just relying on some C undefined behavior? If not, what would be the reason(s) to opt for one style over the other?
The following code shows how I come to the conclusion both use cases are equivalent; neither clang nor gcc complain about anything.
#include <stdio.h>
#include <stdlib.h>
struct a_number {
int i;
};
struct s_w_ptr {
struct a_number *n;
};
struct s_w_val {
struct a_number n;
};
void store_via_ptr(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number *i = malloc(sizeof(i));
i->i = 1;
swp->n = i;
swv->n = *i;
}
void store_via_val(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number j;
j.i = 2;
swp->n = &j;
swv->n = j;
}
int main(void) {
struct s_w_ptr *swp = malloc(sizeof(swp));
struct s_w_val *swv = malloc(sizeof(swv));
store_via_ptr(swp, swv);
printf("p: %d | v: %d\n", swp->n->i, swv->n.i);
store_via_val(swp, swv);
printf("p: %d | v: %d\n", swp->n->i, swv->n.i);
}
It's perfectly valid to have both struct members in a struct and have pointers to struct in a struct. They must be used differently but both are legal.
Why have a struct in a struct ?
One reason is to group things together. For instance:
struct car
{
struct motor motor; // a struct with several members describing the motor
struct wheel wheel; // a struct with several members describing the wheels
...
};
struct car myCar = {....initializer...};
myCar.wheel = SomeOtherWheelModel; // Replace wheels in a single assign
myCar.wheel.pressure = 2.1; // Change a single wheel member
Why have a struct pointer in a struct?
One very obvious reason is that it can be used as an array of N structs, by dynamically allocating N times the struct size.
Another typical example is linked lists where you have a pointer to a struct of the same type as the struct containing the pointer.
There are several advantages of having a struct in a struct instead of having a pointer to a struct in a struct:
It requires less memory allocation. With a pointer to a struct in a struct, memory is needed for the pointer inside the parent struct and, separately, for the child struct itself.
Additional instructions are typically required to access the contents of the child struct. For example consider that the program is reading the contents of the child struct. If a struct within a struct is used, the program will apply an offset to the address of the variable and read the contents of that memory location. In the case of a pointer to a struct in a struct, the program will actually apply an offset to the parent struct variable address, fetch the address of the child struct, then read from memory the contents of the child struct.
A separate variable needs to be declared for both the parent and child struct and if an initializer is used, then a separate initializer is needed. In the case of a struct in a struct only one variable must be declared and a single initializer is used.
In cases where dynamic memory allocation is used, the developer must remember to deallocate memory for both the child and parent objects before the variables fall out of scope. In the case of struct in a struct the memory must be freed for only one variable.
Lastly, as is shown in the example, if a pointer is used, Null checking may be necessary to ensure that the pointer to the child struct has been initialized.
The primary advantages of having a pointer to a struct in a struct would be if you needed to replace the child struct with another struct within the program, such as a linked list. A less common case might be if the child struct can be of more than one type. In this case you might use a void * type for the child. I may also use a pointer within a struct to point to an array in case where the array pointed to may vary in size between instances.
Based on my knowledge of the case shown in the example above, I would be inclined to use a struct in a struct, since both objects are of fixed size and type and since it appears that they would not need to be separated.
C structures can be used to group related data, such as the title of a book, its author, its assigned book number, and so on. But much of what we use structures for is creating data structures (in a different sense of the word “structure”) in memory.
Consider that the book’s author has a name, a date of birth, other biographical information, a list of books they have written, and more. We could include in the struct book a struct author that would contain all this information. But, if the author has written a hundred books, we could have 100 copies of all that information, one copy in each struct book. Further, we cannot continue the “contain the data inside the structure directly” model with the struct author, because it cannot contain a struct book for each book the author publishes if those struct book members also have to contain the struct author for the author—every object would have to contain itself.
It is more efficient to create one struct author and have each struct book for that author link to their struct author.
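As a rough illustration of that layout, here is a sketch in which several books point at one author record. The member names and the helper count_books_by are assumptions made for the example.

```c
#include <stddef.h>

struct author {
    const char *name;
    int birth_year;
};

struct book {
    const char *title;
    const struct author *author;   /* link to the single shared record */
};

/* Counts the books whose author pointer refers to the record who. */
size_t count_books_by(const struct book *books, size_t n,
                      const struct author *who)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (books[i].author == who)
            count++;
    return count;
}
```

Because the books store only a pointer, a hundred books by the same author cost a hundred pointers rather than a hundred copies of the biography.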
Another example is that we use pointers to create data structures for efficient access to data. If we are reading data for thousands of items and want to keep them sorted by name, one option is to allocate memory for some number of structures, read the data, and sort the data. When new data is read and we have used all the memory we allocated, we allocate new memory, copy all the old data to the new memory if necessary, and move some of the data so we can insert the new data in its proper place. However, we have many better options than that. We can use linked lists, binary trees, other kinds of trees, and hash tables.
These data structures effectively require using pointers. A binary tree will have a root node, and each node contains two pointers, one to a subtree of nodes that are earlier than it in the sorting order and another to a subtree of nodes that are later than it. We can look up items in the tree by following pointers to earlier or later nodes to find the right position. And we can insert items by changing a few pointers. If the tree happens to become unbalanced, we can rearrange nodes in the tree by changing pointers. The bulk of the data in the nodes does not have to be changed or copied, just some pointers.
We can also use pointers to have multiple structures for the same data. All the data about books could be stored in one place, and a tree ordered by name could contain nodes in which each node contained a pointer to the book structure and two pointers to subtrees. We could have one tree like this ordered by title of the book and another tree ordered by the name of the author and another tree ordered by the assigned book number. Then we can efficiently look up a book by title or author or number, but there is only one master copy of the complete book data, in the struct book objects. The look-up data is in the tree, which contains only pointers. That is much more efficient than copying all of the struct book data for each tree.
So the choice between using structures or pointers as members does not depend on whether the C syntax allows us to refer to the data; we can get to the data in both cases. It depends on the trade-off: embedding the data is inflexible and requires copying, while pointing to it is flexible and efficient.
Let's consider at first this function
void store_via_ptr(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number *i = malloc(sizeof(i));
i->i = 1;
swp->n = i;
swv->n = *i;
}
This declaration
struct a_number *i = malloc(sizeof(i));
is equivalent to the following declaration
struct a_number *i = malloc(sizeof( struct a_number * ));
So in general the function can invoke undefined behavior when sizeof( struct a_number ) is greater than sizeof( struct a_number * ).
It seems you mean
struct a_number *i = malloc(sizeof( *i ) );
^^^
If you split the function into two functions, one for each parameter, like
void store_via_ptr1( struct s_w_ptr *swp ) {
struct a_number *i = malloc(sizeof( *i ) );
i->i = 1;
swp->n = i;
}
and
void store_via_ptr2( struct s_w_val *swv ) {
struct a_number *i = malloc(sizeof( *i));
i->i = 1;
swv->n = *i;
}
then in the first function, whoever owns the object pointed to by swp must remember to free the memory allocated within the function. Otherwise there will be a memory leak.
The second function already produces a memory leak, because the allocated memory is never freed and the only pointer to it is lost when the function returns.
Now let's consider the second function
void store_via_val(struct s_w_ptr *swp, struct s_w_val *swv) {
struct a_number j;
j.i = 2;
swp->n = &j;
swv->n = j;
}
Here the pointer swp->n will point to a local object j. So after exiting the function this pointer will be invalid because the pointed object will not be alive.
So both functions are incorrect. Instead you could write the following functions
int store_via_ptr(struct s_w_ptr *swp ) {
swp->n = malloc( sizeof( *swp->n ) );
int success = swp->n != NULL;
if ( success ) swp->n->i = 1;
return success;
}
and
void store_via_val( struct s_w_val *swv ) {
swv->n.i = 2;
}
Whether to include a whole object of a structure type in another structure, or to use a pointer to such an object instead, depends on the design and the context in which the objects are used.
For example consider a structure struct Point
struct Point
{
int x;
int y;
};
In this case if you want to declare a structure struct Rectangle then it is natural to define it like
struct Rectangle
{
struct Point top_left;
struct Point bottom_right;
};
On the other hand, if you have a two-sided singly-linked list then it can look like
struct Node
{
int value;
struct Node *next;
};
struct List
{
struct Node *head;
struct Node *tail;
};
Two problems:
In store_via_ptr you allocate memory for i dynamically. When you use s_w_val you copy the structure, and then leave the pointer. Which means the pointer will be lost and can't be passed to free later.
In store_via_val you make swp->n point to the local variable j. A variable whose life-time will end when the function returns, leaving you with an invalid pointer.
The first problem might lead to a memory leak (something you don't need to care about in your simple example program).
The second problem is worse, since it will lead to undefined behavior when you dereference the pointer swp->n.
Unrelated to that, in the main function you don't need to allocate memory dynamically for the structures. You could just have defined them as plain structure objects and used the address-of operator & when calling the functions.
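A sketch of that suggestion, using plain structure objects and &. It assumes single-purpose versions of the two store functions (one parameter each) and wraps the usage in a hypothetical helper, run_demo, instead of main.

```c
#include <stdlib.h>

struct a_number { int i; };
struct s_w_ptr { struct a_number *n; };
struct s_w_val { struct a_number n; };

/* Gives swp its own heap-allocated number; returns 1 on success, 0 on failure.
   The caller is responsible for freeing swp->n. */
int store_via_ptr(struct s_w_ptr *swp)
{
    swp->n = malloc(sizeof *swp->n);
    if (swp->n == NULL)
        return 0;
    swp->n->i = 1;
    return 1;
}

/* Writes directly into the embedded struct; nothing to allocate or free. */
void store_via_val(struct s_w_val *swv)
{
    swv->n.i = 2;
}

/* Uses automatic (stack) objects and passes their addresses with &.
   Returns the sum of both stored numbers, or -1 if allocation failed. */
int run_demo(void)
{
    struct s_w_ptr swp = { NULL };
    struct s_w_val swv = { { 0 } };

    if (!store_via_ptr(&swp))
        return -1;
    store_via_val(&swv);

    int sum = swp.n->i + swv.n.i;
    free(swp.n);   /* only the pointed-to child needs freeing */
    return sum;
}
```

Only the child object reached through the pointer lives on the heap here; the two parent structs disappear automatically when run_demo returns.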
This is a structure that I am using:
struct nodeList//a node structure
{
int jump;
int config;
int level;
int shifts[200];
int shift_diff[200];
struct nodeList *next;
};
I want to create a 2D array of pointers that can be used to reference such a structure variable, ie, any element of that array can be assigned a pointer to a structure variable. I would prefer to create the array dynamically using malloc, if possible. Any pointers (pun unintended) would be appreciated.
First of all, please think twice about the program design. Do you really need a 2D array of pointers, each pointing to a struct, each struct containing a number of items? Those requirements are rather complex: if you can simplify them, your program will turn out much better.
With the current requirements, you'll notice that the pointer and array syntax turns quite complex; the requirements are to blame for that, more so than the C language.
Consider things like using a 2D array of structs, or to use some sort of pointer-based ADT which makes sense for your given case (linked list, queue, graph, binary tree? etc etc).
That being said, a 2D array of pointers to struct:
struct nodeList* array[X][Y];
To allocate this dynamically, you need a pointer to a 2D array of pointers to struct:
struct nodeList* (*array_ptr)[X][Y];
Then assign this to a 2D array of pointers to struct, allocated dynamically:
array_ptr = malloc( sizeof(struct nodeList*[X][Y]) );
...
free(array_ptr);
Note that unless the 2D array is allocated like above, in adjacent memory cells, it is not an array.
EDIT. Btw, if you wish to avoid the weird syntax that an array pointer will yield, there is a trick. With the code above you will have to address the array as
(*array_ptr)[i][j];
Meaning: "in the 2D array, give me item number [i][j]".
Had you omitted the outermost dimension of the type, you could simplify this syntax:
struct nodeList* (*array_ptr)[Y]; // pointer to a 1D array
The malloc will be the same but you can now use it like this instead, which may be more intuitive:
array_ptr[i][j];
Meaning: "in array number i, give me item number j". You are here assuming there is an array of arrays in adjacent memory, which is true.
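Put together, the pointer-to-row form might look like the sketch below. The dimensions X and Y, the abbreviated struct, the typedef grid_row and the helper alloc_grid are all assumptions made for the example.

```c
#include <stdlib.h>

enum { X = 4, Y = 3 };   /* assumed dimensions */

/* Abbreviated stand-in for the real struct nodeList. */
struct nodeList {
    int level;
    struct nodeList *next;
};

/* One row: an array of Y pointers to struct nodeList. */
typedef struct nodeList *grid_row[Y];

/* Allocates X rows in one contiguous block, with every pointer set to NULL.
   Returns NULL on failure; release the whole grid with a single free(). */
grid_row *alloc_grid(void)
{
    grid_row *grid = malloc(X * sizeof *grid);
    if (grid == NULL)
        return NULL;
    for (size_t i = 0; i < X; i++)
        for (size_t j = 0; j < Y; j++)
            grid[i][j] = NULL;
    return grid;
}
```

After grid_row *g = alloc_grid(); an element is addressed as g[i][j], and the whole block is released with free(g).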
NODELIST ***pppNode, Node;
size_t Row = 5, Col = 5, i;
pppNode = malloc( sizeof(NODELIST **) * Row );
for(i = 0; i < Row; i++)
pppNode[i] = malloc( sizeof(NODELIST *) * Col );
pppNode[1][0] = &Node;
This is another way of dynamic allocation, but as @Lundin said, change the design if this is not really necessary.
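For completeness, a sketch of the row-by-row approach with error handling and cleanup added. It assumes NODELIST is a typedef for the node struct (abbreviated here), and alloc_rows and free_rows are hypothetical helper names. Rows and columns are assumed to be greater than zero.

```c
#include <stdlib.h>

/* Abbreviated stand-in for the real node structure. */
struct nodeList {
    int level;
    struct nodeList *next;
};
typedef struct nodeList NODELIST;

/* Allocates rows separate arrays of cols pointers each, all set to NULL.
   On failure everything allocated so far is released and NULL is returned. */
NODELIST ***alloc_rows(size_t rows, size_t cols)
{
    NODELIST ***grid = malloc(rows * sizeof *grid);
    if (grid == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        grid[i] = malloc(cols * sizeof *grid[i]);
        if (grid[i] == NULL) {
            while (i > 0)
                free(grid[--i]);
            free(grid);
            return NULL;
        }
        for (size_t j = 0; j < cols; j++)
            grid[i][j] = NULL;
    }
    return grid;
}

/* Releases every row and then the row table itself. */
void free_rows(NODELIST ***grid, size_t rows)
{
    if (grid == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(grid[i]);
    free(grid);
}
```

Unlike the single-malloc version, the rows here are separate allocations, so each one has to be freed individually.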
I want to create a struct with 2 variables, such as
struct myStruct {
char charVar;
int intVar;
};
and I will name the structs as:
struct myStruct name1;
struct myStruct name2;
etc.
The problem is, I don't know how many variables will be entered, so there must be infinite nameX structures.
So, how can I name these structures with variables?
Thanks.
You should use an array and a pointer.
struct myStruct *p = NULL;
p = malloc(N * sizeof *p); // where N is the number of entries.
int index = 1; /* or any other number - from 0 to N-1*/
p[index].member = x;
Then you can add elements to it by using realloc if you need to add additional entries.
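A minimal sketch of growing the array with realloc as entries arrive. The wrapper struct vec, the function vec_push and the doubling strategy are choices made for this example.

```c
#include <stdlib.h>

struct myStruct {
    char charVar;
    int intVar;
};

/* A growable array: items points to capacity slots, count of them in use. */
struct vec {
    struct myStruct *items;
    size_t count;
    size_t capacity;
};

/* Appends value, doubling the storage when it is full.
   Returns 0 on success, -1 if memory could not be obtained. */
int vec_push(struct vec *v, struct myStruct value)
{
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        struct myStruct *tmp = realloc(v->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;   /* v->items is untouched and still valid */
        v->items = tmp;
        v->capacity = new_cap;
    }
    v->items[v->count++] = value;
    return 0;
}
```

Assigning the result of realloc to a temporary first avoids losing the original block if the call fails. Start from struct vec v = { NULL, 0, 0 }; and call free(v.items) when done.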
Redefine myStruct as
struct myStruct {
char charVar;
int intVar;
struct myStruct *next;
};
Keep track of the last structure you have as well as the start of the list. When adding new elements, append them to the end of your linked list.
/* To initialize the list */
struct myStruct *start, *end;
start = malloc(sizeof(struct myStruct));
start->next = NULL;
end = start;
/* To add a new structure at the end */
end->next = malloc(sizeof(struct myStruct));
end = end->next;
end->next = NULL;
This example does not do any error checking. Here is how you would step along the list to print all the values in it:
struct myStruct *ptr;
for(ptr = start; ptr != NULL; ptr = ptr->next)
printf("%d %c\n", ptr->intVar, ptr->charVar);
You do not have to have a distinct name for each structure in a linked list (or any other kind of list, in general). You can assign any of the unnamed structures to the pointer ptr as you use them.
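The same idea with error checking, as a sketch. It differs from the snippet above in that the list starts empty rather than with a pre-allocated first node; struct list and the list_* helper names are invented here.

```c
#include <stdlib.h>

struct myStruct {
    char charVar;
    int intVar;
    struct myStruct *next;
};

/* Start and end of the list; both NULL while the list is empty. */
struct list {
    struct myStruct *start;
    struct myStruct *end;
};

/* Appends a new node at the end; returns 0 on success, -1 on failure. */
int list_append(struct list *l, char c, int n)
{
    struct myStruct *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->charVar = c;
    node->intVar = n;
    node->next = NULL;
    if (l->end == NULL)
        l->start = node;       /* first node in an empty list */
    else
        l->end->next = node;
    l->end = node;
    return 0;
}

/* Walks the list and counts its nodes. */
size_t list_length(const struct list *l)
{
    size_t n = 0;
    for (const struct myStruct *p = l->start; p != NULL; p = p->next)
        n++;
    return n;
}

/* Frees every node and resets the list to empty. */
void list_free(struct list *l)
{
    struct myStruct *p = l->start;
    while (p != NULL) {
        struct myStruct *next = p->next;
        free(p);
        p = next;
    }
    l->start = l->end = NULL;
}
```

None of the nodes has a name of its own; they are reached only through start, end and the next pointers.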
So, how can I name these structures with variables?
I think every beginner starts out wanting to name everything. It's not surprising -- you learn about using variables to store data, so it seems natural that you'd always use variables. The answer, however, is that you don't always use variables for storing data. Very often, you store data in structures or objects that are created dynamically. It may help to read about dynamic allocation. The idea is that when you have a new piece of data to store, you ask for a piece of memory (using a library call like malloc or calloc). You refer to that piece of memory by its address, i.e. a pointer.
There are a number of ways to keep track of all the pieces of memory that you've obtained, and each one constitutes a data structure. For example, you could keep a number of pieces of data in a contiguous block of memory -- that's an array. See Devolus's answer for an example. Or you could have lots of little pieces of memory, with each one containing the address (again, a pointer) of the next one; that's a linked list. Mad Physicist's answer is a fine example of a linked list.
Each data structure has its own advantages and disadvantages -- for example, arrays allow fast access but are slow for inserting and deleting, while linked lists are relatively slow for access but are fast for inserting and deleting. Choosing the right data structure for the job at hand is an important part of programming.
It usually takes a little while to get comfortable with pointers, but it's well worth the effort as they open up a lot of possibilities for storing and manipulating data in your program. Enjoy the ride.
Learning C through "Learning C the hard way", and doing some of my own exercises. I stumbled upon the following problem.
Let's say I have the following structure:
struct Person {
char name[MAX_INPUT];
int age;
};
In main(), I have declared the following array:
int main(int argc, char *argv[]) {
struct Person personList[MAX_SIZE];
return 0;
}
Now let's say 2 functions away (main calls function1 which calls function2) I want to save a person inside the array I declared in the main function like so:
int function2(struct Person *list) {
struct Person *prsn = malloc(sizeof(struct Person));
assert(prsn != NULL); // Why is this line necessary?
// User input code goes here ...
// Now to save the Person created
strcpy(prsn->name, nameInput);
prsn->age = ageInput;
list = prsn; // list was passed by reference by function1, does main need to pass the array by
// reference to function1 before?
// This is where I get lost:
// I want to increment the array's index, so next time this function is called and a
// new person needs to be saved, it is saved in the correct order in the array (next index)
}
So if I return to my main function and wanted to print the first three persons saved in it like so:
...
int i = 0;
for(i = 0; i < 3; i++) {
printf("%s is %d old", personList[i].name, personList[i].age);
}
...
Basically, how do I reference the array across the application while keeping it persistent? Keep in mind that main does not necessarily directly call the function that makes use of the array. I suspect someone might suggest declaring it as a global variable; what would be the alternative? Double pointers? How do double pointers work?
Thank you for your time.
Here are a few pointers (no pun intended!) to help you along:
As it stands, the line struct Person personList[MAX_SIZE]; allocates memory for MAX_SIZE number of Person structs. You don't actually need to allocate more memory using malloc if this is what you are doing.
However, you could save some memory by only allocating memory when you actually need a person. In this case, you want the personList array to contain pointers to Person structs, not the structs themselves (which you create using malloc).
That is: struct Person * personList[MAX_SIZE];
When you create the person:
struct Person * person = (struct Person *) malloc(sizeof(struct Person));
personList[index] = person;
And when you use the person list: printf("%s", personList[index]->name);
Arrays don't magically keep a record of any special index. You have to do this yourself. One way is to always pass the length of the array to each function that needs it.
void function1(struct Person * personList, int count);
If you wanted to modify the count variable when you returned back to the calling function, you could pass it by reference:
void function1(struct Person * personList, int * count);
A possibly more robust way would be to encapsulate the count and the array together into another structure.
struct PersonList { struct Person * list[MAX_SIZE]; int count; };
This way you can write a set of functions that always deal with the list data coherently -- whenever you add a new person, you always increment the count, and so on.
int addNewPerson(struct PersonList * personList, char * name, int age);
I think that much should be helpful to you. Just leave a comment if you would like something to be explained in more detail.
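One possible shape for addNewPerson, as a sketch. NAME_LEN stands in for MAX_INPUT, the value of MAX_SIZE is assumed, the name parameter is made const, and freePersonList is a hypothetical companion function.

```c
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 64    /* assumed; plays the role of MAX_INPUT */
#define MAX_SIZE 100   /* assumed capacity */

struct Person {
    char name[NAME_LEN];
    int age;
};

struct PersonList {
    struct Person *list[MAX_SIZE];
    int count;
};

/* Allocates a person, stores it in the next free slot and bumps the count.
   Returns the index used, or -1 if the list is full or malloc fails. */
int addNewPerson(struct PersonList *personList, const char *name, int age)
{
    if (personList->count >= MAX_SIZE)
        return -1;
    struct Person *person = malloc(sizeof *person);
    if (person == NULL)
        return -1;
    strncpy(person->name, name, NAME_LEN - 1);
    person->name[NAME_LEN - 1] = '\0';
    person->age = age;
    personList->list[personList->count] = person;
    return personList->count++;
}

/* Frees every stored person and resets the count. */
void freePersonList(struct PersonList *personList)
{
    for (int i = 0; i < personList->count; i++)
        free(personList->list[i]);
    personList->count = 0;
}
```

Since the count only changes inside these functions, the array and its length cannot drift out of sync.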
First of all, malloc does not guarantee to allocate new space from the memory and return it. If it cannot allocate the requested memory, it returns a NULL value. That's why it is necessary to check the pointer.
When calling function2, you can pass the address of the next element, using a variable in function1 that holds the current count of the array:
function2(&personList[count++]);
Then you return the current count from function1 to the main function:
int size=function1(personList);
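A sketch of how the pieces could fit together without malloc or globals. To keep it self-contained, function2 takes the name and age as parameters instead of reading user input, function1 gets an extra capacity parameter and stores three made-up sample people, and NAME_LEN and MAX_SIZE are assumed values.

```c
#include <string.h>

#define NAME_LEN 64    /* assumed; plays the role of MAX_INPUT */
#define MAX_SIZE 100   /* assumed capacity */

struct Person {
    char name[NAME_LEN];
    int age;
};

/* Fills in the one array element it is handed; no allocation needed. */
void function2(struct Person *slot, const char *name, int age)
{
    strncpy(slot->name, name, NAME_LEN - 1);
    slot->name[NAME_LEN - 1] = '\0';
    slot->age = age;
}

/* Stores sample people starting at list[0] and returns how many were
   stored, never more than capacity. */
int function1(struct Person *list, int capacity)
{
    static const char *names[] = { "Ann", "Bob", "Eve" };
    static const int ages[] = { 30, 41, 25 };
    int count = 0;

    for (int i = 0; i < 3 && count < capacity; i++) {
        function2(&list[count], names[i], ages[i]);
        count++;
    }
    return count;
}
```

In main, struct Person personList[MAX_SIZE]; int size = function1(personList, MAX_SIZE); leaves the first size elements filled in, because the array decays to a pointer to its first element and the callee writes into the caller's storage.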