The most efficient way to initialize array member of struct? - c

I have declared the struct
struct wnode {
char *word;
int lines[MAXLINES];
struct wnode *left;
struct wnode *right;
};
and the pointer
struct wnode *p;
The pointer is passed to a function.
In that function, I first allocate memory for the pointer with malloc. Then I want to zero out the struct member lines.
An array initialization method will not work as it is interpreted as assignment:
p->lines[MAXLINES] = {0};
The compiler throws the error:
error: expected expression before '{' token
In the end, I'm just using a for loop to zero out the lines array:
for (i = 0; i < MAXLINES; i++)
p->lines[i] = 0;
Is there a better way?

Arrays cannot be assigned to directly. You need to either use a loop to set all fields to 0 or you can use memset:
memset(p->lines, 0, sizeof(p->lines));
Note that for non-char types you can only do this to set all elements to 0. For any other value you need a loop.
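To see why a non-zero fill value does not work for int, recall that memset writes the same byte into every byte of the object. The sketch below is illustrative only; the helper names uint_after_memset and fill_ints are hypothetical and not part of the question's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the value an unsigned int holds after
   every one of its bytes has been set to `byte` by memset. */
unsigned int uint_after_memset(unsigned char byte)
{
    unsigned int value;
    memset(&value, byte, sizeof value);
    return value;
}

/* Hypothetical helper: a plain loop is what you need for any value other than 0. */
void fill_ints(int *dst, size_t n, int value)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = value;
}
```

On a typical platform with a 4-byte unsigned int and 8-bit bytes, uint_after_memset(1) gives 0x01010101 rather than 1, while uint_after_memset(0) gives 0.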

If you want to use the = operator, you can do it this way:
struct wnode wn, *p;
/* ........ */
wn = (struct wnode){.word = wn.word, .lines = {0,}, .left = wn.left, .right = wn.right};
*p = (struct wnode){.word = p ->word, .lines = {0,}, .left = p -> left, .right = p -> right};

= {0} works only on initialization. You can't use it with assignment as such which is why you get the error.
You can either use a for loop as you said or use memset to zero out the array:
memset(p -> lines, 0, sizeof(p -> lines));

The only time an array variable can be initialized in this manner:
int someInt[MAXLINES] = {0};
Is during declaration.
But because this particular variable int lines[MAXLINES]; is declared within the confines of struct, which does not allow members to be initialized, the opportunity is lost to that method, requiring it to be initialized after the fact, and using a different method.
The most common (and preferred) way to initialize after declaration in this case is to use:
//preferred
memset(p->lines, 0, sizeof(p->lines));
A more arduous method, and one that is seen often, sets each element to the desired value in a loop:
for(int i=0;i<MAXLINES;i++)
{
p->lines[i] = 0;
}
As noted in comments, a good optimizing compiler will reduce this loop to the equivalent of a memset() call anyway.

This declaration of a pointer
struct wnode *p;
either zero-initializes the pointer p (making it a null pointer) if the pointer has static storage duration or leaves the pointer uninitialized if the pointer has automatic storage duration. Either way, applying the operator -> to the pointer invokes undefined behavior because the pointer does not point to a valid object.
Assuming that the pointer points to a valid object, for example
struct wnode w;
struct wnode *p = &w;
then within the function you can initialize the data member lines of the object w using the standard C function memset. For example
memset( p->lines, 0, MAXLINES * sizeof( int ) );
You may not write in the function just
p->lines = {0};
because the pointed-to object has already been created and such an initializer is allowed only in a declaration of an array. Moreover, arrays cannot be assigned to.

That kind of initialization can only be done on declaration. Notice that in p->lines[MAXLINES] = {0}; the expression p->lines[MAXLINES] means the integer one past the end of p->lines.
You could write p->lines[MAXLINES] = 0;. Not correct, but would compile.
Outside of a declaration you cannot assign to the array as a whole. In most expressions p->lines decays to a pointer to int, and p->lines[index] is an int.
Yes, you have the allocated space, but that's all. memset will do the trick.
By the way, I hope your function (or the caller) does allocate the wnode element...

{0} works only in initialization, not in assignment.
Since you can't initialize a pointer target, only assign to it, you can work around the problem by assigning a just-initialized compound literal.
Compilers will usually optimize the compound literal out and assign directly to the target.
{0} initialization of a largish array will frequently compile to a call to memset or equivalent assembly, so another option is to call memset directly on p->lines manually.
Example:
#define MAXLINES 100
struct wnode {
char *word;
int lines[MAXLINES];
struct wnode *left;
struct wnode *right;
};
//(hopefully) elided compound literal
void wnode__init(struct wnode *X)
{
*X = (struct wnode){"foo",{0},X,X};
}
//explicit memset
#include <string.h>
void wnode__init2(struct wnode *X)
{
X->word = "foo";
memset(X->lines,0,sizeof(X->lines));
X->left = X;
X->right = X;
}
https://gcc.godbolt.org/z/TMgGqV

void *memset(void *s, int c, size_t n)
The memset() function fills the first n bytes of the memory area
pointed to by s with the constant byte c.
The memset() function returns a pointer to the memory area s.
Although (sizeof(int)*MAXLINES) and sizeof(p->lines) yield the same result in bytes and both are correct, the second option (sizeof(p->lines)) is better: if we decide to change the array type or the array size, we don't need to change the expression inside the sizeof operator. We change it in one place only and let the compiler do the work for us.
#include <string.h> /*for the memset*/
memset(p->lines, 0x0, sizeof(p->lines));
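A minimal sketch of that point, reusing the struct from the question. MAXLINES is set to 100 here purely as an assumption, and wnode_clear_lines is a hypothetical helper name.

```c
#include <string.h>

#define MAXLINES 100   /* assumed value, not given in the question */

struct wnode {
    char *word;
    int lines[MAXLINES];
    struct wnode *left;
    struct wnode *right;
};

/* Zeroes the whole lines array. sizeof p->lines follows the member's
   declaration, so it stays correct if the element type or MAXLINES changes. */
void wnode_clear_lines(struct wnode *p)
{
    memset(p->lines, 0, sizeof p->lines);
}
```

This only works because p->lines is a real array member. If lines were a pointer, sizeof would give the size of the pointer instead.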

@ikegami's comment to the original question needs to be an answer.
Use calloc() rather than malloc()
struct wnode *p;
// presumably you have N as the number of elements and
p = malloc(N * sizeof *p);
// replace with
//p = calloc(N, sizeof *p);
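A sketch of that suggestion for a single node. calloc zero-fills the bytes of the allocation, so every element of lines starts out as 0. That word, left and right then compare equal to NULL holds on common platforms but is not strictly guaranteed by the C standard. MAXLINES = 100 and the name wnode_new_zeroed are assumptions for illustration.

```c
#include <stdlib.h>

#define MAXLINES 100   /* assumed value, not given in the question */

struct wnode {
    char *word;
    int lines[MAXLINES];
    struct wnode *left;
    struct wnode *right;
};

/* Returns a zero-filled node, or NULL if the allocation fails.
   The caller releases it with free(). */
struct wnode *wnode_new_zeroed(void)
{
    return calloc(1, sizeof(struct wnode));
}
```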

Related

Pointers, structs and memset in C [duplicate]

I have been learning C for a few days now without any other programming experience, so I might not be clear when asking my question. It is mostly about pointers. For convenience purposes I named the variables so no one gets confused.
#include <stdio.h>
#include <string.h>
struct car {
char* name;
int speed;
float price;
};
void fun(struct car* p);
int main(void) {
struct car myStruct = { .name = "FORD", .speed = 55, .price = 67.87 };
fun(&myStruct);
printf("%d\n", myStruct.name == NULL);
printf("%d\n", myStruct.speed == 0);
printf("%d\n", myStruct.price == 0.0);
return(0);
}
void fun(struct car* p) {
memset(p, 0, sizeof(p));
}
This is my code.
I declare the struct car type globally, so it can be seen by other functions.
I write a function prototype that takes an argument of type struct car* and stores a copy of the argument into the parameter p that is local to the function.
Later, I write the actual function body. As you can see, I call the memset function that is in the string.h header. According to Linux man pages, it looks like this void* memset(void* s, int c, size_t n);.
What the memset function does in this case, is it fills the first sizeof(struct car* p) bytes of the memory area pointed to by the struct car* p with the constant byte c, which in this case is 0.
In the main function I initialize the myStruct variable of type struct car and then call the function fun and pass the address of myStruct into the function. Then I want to check whether all of the struct car "data members" were set to 0 by calling the printf function.
The output I get is
1
0
0
It means that only the first "data member" was set to NULL and the rest weren't.
On the other hand, if I call the memset function inside the main function, the output I get is
1
1
1
If I understand pointers correctly (it's been a few days since I've first heard of them, so my knowledge is not optimal), struct car myStruct has its own address in memory, let's say 1 for convenience.
The parameter struct car* p also has its own address in memory, let's say 2 and it stores (points to) the address of the variable struct car myStruct, so to the 1 address, because I passed it to the function here fun(&myStruct);
So by dereferencing the parameter p, for example (*p).name, I can change the value of the "data member" variable and the effects will be seen globally, because even though the p parameter is only a copy of the original myStruct variable, it points to the same address as the myStruct variable and by dereferencing the pointer struct car* p, I retrieve the data that is stored at the address the pointer points to.
So (*p).name will give me "FORD" and (*p).name = "TOYOTA" will change the data both locally in the function fun and globally in other functions as well, which is impossible without creating a pointer variable, if I do p.name = "TOYOTA", it changes only the value of the copy, that has its own address in the memory that is different from the address of the original struct variable, of the "data member" variable name locally, inside the function fun. It happens, because in this case I operate only on the copy of the original myStruct variable and not on the original one.
I think that in C there is only pass by value, so essentially every parameter is only a copy of the original variable, but pointers make it so that you can pass the address of the original variable (so it's like "passing by reference", but the copy is still made regardless, the thing is that then the function operates on the original address instead of on the parameter's address).
What I don't know is, why the memset function only changes the first "data member" variable to NULL and not all of them ?
void fun(struct car* p) {
memset(p, 0, sizeof(p));
p->name = NULL;
p->speed = 0;
p->price = 0.0;
}
If I do this then it changes all the values to NULL, 0, 0, but I don't know, if it is a good practice to do that as it is unnecessary in this case, because I explicitly initialize all the "data members" in the struct with some value.
void fun(struct car* p) {
memset(&p, 0, sizeof(p));
}
This also works and gives NULL, 0, 0. So maybe I should actually pass &s into the function instead of s, but I don't know how this works. The function void* memset(void* s, int c, size_t n); takes void* as the argument and not void**, the latter is understandable, because:
struct car myStruct = { .name = "FORD", .speed = 55, .price = 67.87 }; // It has its own address in memory and stores the struct at this address
struct car* p = &myStruct; // It points to the address of myStruct and retrieves the data from there when dereference happens, so first it goes to address 1 and then gets the data from this address
void** s = &p; // It points to the address of p and retrieves the data from there when double dereference happens, so it first goes to address 2 and gets the data and the data is address 1, then it goes to address 1 and gets the data, so the struct
But void* means pointer to void, so to any data type. It confuses me why void* s = &p; works, even though p itself is a pointer, so s should be a pointer to pointer to void, so void** s instead of void* s.
Also the memset function returns a pointer to the memory area s, so if s = &p and p = &myStruct, then it returns a pointer to the memory area of the struct, so a pointer to &myStruct. Maybe that's why it works.
In this call
memset(p, 0, sizeof(p));
you are setting to 0 only the first sizeof(p) bytes of the structure object, that is, only as many bytes as the pointer p itself occupies.
Instead you need to write
memset(p, 0, sizeof(*p));
that is to set the whole object of the structure type with 0.
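A short sketch of the corrected function, assuming the struct car from the question. The name car_clear is hypothetical. sizeof *p is the size of the pointed-to structure, whereas sizeof p is only the size of the pointer.

```c
#include <string.h>

struct car {
    char *name;
    int speed;
    float price;
};

/* Zeroes every byte of the structure that p points to. */
void car_clear(struct car *p)
{
    memset(p, 0, sizeof *p);
}
```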
Note that because the variable p is a pointer, this statement
p.name = "TOYOTA";
is simply invalid: the . operator requires a structure, not a pointer.
This function
void fun(struct car* p) {
memset(&p, 0, sizeof(p));
}
does not set the passed object of the structure type through the pointer p to zeroes. Instead it sets to zeroes the memory occupied by the local variable p itself.
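This can be checked with a small sketch (the function name clear_local_pointer_only is hypothetical): after the call, the caller's structure is unchanged because only the function's own copy of the pointer was overwritten.

```c
#include <string.h>

struct car {
    char *name;
    int speed;
    float price;
};

/* Overwrites the local parameter p, not the object it points to. */
void clear_local_pointer_only(struct car *p)
{
    memset(&p, 0, sizeof p);
}
```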
As for this question
But void* means pointer to void, so to any data type. It confuses me
why void* s = &p; works, even though p itself is a pointer, so s
should be a pointer to pointer to void, so void** s instead of void*
s.
then according to the C Standard (6.3.2.3 Pointers)
1 A pointer to void may be converted to or from a pointer to any
object type. A pointer to any object type may be converted to a
pointer to void and back again; the result shall compare equal to the
original pointer.
So you can write for example
struct car myStruct =
{
.name = "FORD", .speed = 55, .price = 67.87
};
struct car *p = &myStruct;
void *q = &p;
and then
( *( struct car ** )q )->name = "My Ford";

How to access second member of struct via pointer?

I have seen the first address of struct is simultaneously the first address of first member of that struct. Now what I would like to understand is, why I need always double pointer to move around in the struct:
#include <stdio.h>
#include <stdlib.h>
struct foo
{
char *s;
char *q;
};
int main()
{
struct foo *p = malloc(sizeof(struct foo));
char ar[] = "abcd\n";
char ar2[] = "efgh\n";
*(char**)p = ar;
*(char**)((char**)p+1) = ar2; //here pointer arithmetic (char**)p+1
printf("%s\n",p->q);
}
the question is, why do I need char** instead of simple char*?
What I saw in assembler is in case of simple char*, the arithmetic would behave like normal char. That is -> the expression of (char*)p+1 would move the address p just by one byte (instead of 8 as address are 8 bytes long). But yet the type char* is address, so I don't get why the arithmetic behave like the dereference type instead (plain char -> one byte).
So the only solution for me was to add another indirection char**, where the pointer-arithmetic magically takes 8 as size. So why in structs is needed such bizarre conversion?
You are doing funny things. You should just do:
struct foo *p = malloc(sizeof(struct foo));
char ar[] = "abcd\n";
char ar2[] = "efgh\n";
p->s = ar;
p->q = ar2;
First of all, what you are doing is slightly bizarre. It's also unsafe, since there may be padding between struct members and your address calculation may be off (that's likely not true in this particular case, but it's something to keep in mind).
As to why you need multiple pointers...
The type of p is struct foo * - it's already a pointer type. Each of the members s and q have type char *. To access the s or q members, you need to dereference p:
(*p).s = ar; // char * == char *
(*p).q = ar2; // char * == char *
So if you're trying to access the first character pointed to by s through p, you're trying to access a character through a pointer (s) through another pointer (p). p does not store the address of the first character of s, it stores the address of the thing that stores the address of the first character of s. Hence the need to cast p to char ** instead of char *.
And at this point I must emphasize DON'T DO THIS. You can't safely iterate through struct members using a pointer.
The -> operator was introduced to make accessing struct members through a pointer a little less eye-stabby:
p->s = ar; // equivalent to (*p).s = ar
p->q = ar2; // equivalent to (*p).q = ar2
As the address of an object of a structure type is equal to the address of its first member then you could write for example
( void * )&p->s == ( void * )p
Here is a demonstrative program
#include <stdio.h>
#include <stdlib.h>
struct foo
{
char *s;
char *q;
};
int main(void)
{
struct foo *p = malloc(sizeof(struct foo));
printf( "( void * )p == ( void * )&p->s is %s\n",
( void * )p == ( void * )&p->s ? "true" : "false" );
return 0;
}
Its output is
true
So the value of the pointer p is equal to the address of the data member s.
In other words a pointer to the data member s is equal to the pointer p.
As the type of the data member s is char * then pointer to s has the type char **.
To assign the pointed object you need to cast the pointer p of the type struct foo * to the type char **. To access the pointed object that is the data member s you have to dereference the pointer of the type char **.
As a result you have
*(char**)p = ar;
Now the data member s (that is the pointer of the type char *) is assigned with the address of the first element of the array ar.
In the second expression the left most casting is redundant
*(char**)((char**)p+1) = ar2;
^^^^^^^^
because the expression (char**)p+1 already has the type char **. So you could just write
*((char**)p+1) = ar2;
why do I need char** instead of simple char*
With pointer usage on the left side of the assignment, the code needs the address of the object.
*address_of_the_object = object
As the object is a char *, the type on the left side, the address of the object, needs to be type char **.
How to access second member of struct via pointer?
Better to instead use the sensible:
p->q = ar2;
... than the convoluted:
// |-- address of p->q as a char * ----|
*((char **) ((char *)p + offsetof(struct foo, q))) = ar2;
//|------------ address of p->q as a char ** ---|
OP's *(char**)((char**)p+1) = ar2; is amiss as it does the wrong pointer math and assumes no padding.
Convoluted approach details.
To portably find the offset within a struct, use offsetof(struct foo, q) from <stddef.h>. It returns the byte offset and accounts for potential padding. Add that to a char * version of the struct address to do the proper pointer addition to form the address of p->q. That sum is a char *. Convert it to the type of the address of the object, char **. Lastly, dereference it on the LHS as part of the assignment.
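The same steps, spelled out one at a time as a sketch. The function name set_q_via_offset is hypothetical, and p->q = value remains the sensible way to write this.

```c
#include <stddef.h>

struct foo {
    char *s;
    char *q;
};

/* Stores value in p->q by computing the member's address by hand. */
void set_q_via_offset(struct foo *p, char *value)
{
    char *base = (char *)p;                        /* struct address as char * */
    char *q_bytes = base + offsetof(struct foo, q); /* byte address of p->q */
    char **q_addr = (char **)q_bytes;               /* same address, proper type */
    *q_addr = value;                                /* equivalent to p->q = value */
}
```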

Copying char arr[64] to char arr[] can cause a segmentation fault?

typedef struct {
int num;
char arr[64];
} A;
typedef struct {
int num;
char arr[];
} B;
I declared A* a; and then put some data into it. Now I want to cast it to a B*.
A* a;
a->num = 1;
strcpy(a->arr, "Hi");
B* b = (B*)a;
Is this right?
I get a segmentation fault sometimes (not always), and I wonder if this could be the cause of the problem.
I got a segmentation fault even though I didn't try to access to char arr[] after casting.
This defines a pointer variable
A* a;
It does not point to anything valid; the pointer is uninitialised.
This accesses whatever it is pointing to
a->num = 1;
strcpy(a->arr, "Hi");
Without allocating anything to the pointer beforehand (by e.g. using malloc()) this is asking for segfaults as one possible consequence of the undefined behaviour it invokes.
This is an addendum to Yunnosch's answer, which identifies the problem correctly. Let's assume you do it correctly and either write just
A a;
which gives you an object of automatic storage duration when declared inside a function, or you dynamically allocated an instance of A like this:
A *a = malloc(sizeof *a);
if (!a) return -1; // or whatever else to do in case of allocation error
Then, the next thing is your cast:
B* b = (B*)a;
This is not correct, types A and B are not compatible. Here, it will probably work in practice because the struct members are compatible, but beware that strange things can happen because the compiler is allowed to assume a and b point to different objects because their types are not compatible. For more information, read on the topic of what's commonly called the strict aliasing rule.
You should also know that an incomplete array type (without a size) is only allowed as the very last member of a struct. With a definition like yours:
typedef struct {
int num;
char arr[];
} B;
the member arr is allowed to have any size, but it's your responsibility to allocate it correctly. The size of B (sizeof(B)) doesn't include this member. So if you just write
B b;
you can't store anything in b.arr, it has a size of 0. This last member is called a flexible array member and can only be used correctly with dynamic allocation, adding the size manually, like this:
B *b = malloc(sizeof *b + 64);
This gives you an instance *b with an arr of size 64. If the array doesn't have the type char, you must multiply manually with the size of your member type -- it's not necessary for char because sizeof(char) is by definition 1. So if you change the type of your array to something different, e.g. int, you'd write this to allocate it with 64 elements:
B *b = malloc(sizeof *b + 64 * sizeof *(b->arr));
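A sketch of that allocation wrapped in a function, for a variant whose flexible array member holds int. The type IntBuf and the function intbuf_new are hypothetical names introduced only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical variant of B whose flexible array member holds int. */
typedef struct {
    int num;
    int arr[];   /* flexible array member, must be the last member */
} IntBuf;

/* Allocates the header plus count ints in one block and zeroes the elements.
   Returns NULL on allocation failure; the caller releases it with free(). */
IntBuf *intbuf_new(size_t count)
{
    IntBuf *b = malloc(sizeof *b + count * sizeof *b->arr);
    if (b == NULL)
        return NULL;
    b->num = (int)count;
    for (size_t i = 0; i < count; i++)
        b->arr[i] = 0;
    return b;
}
```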
It appears that you are confusing two different topics. In C99/C11 char arr[]; as the last member of a structure is a Flexible Array Member (FAM) and it allows you to allocate for the structure itself and N number of elements for the flexible array. However -- you must allocate storage for it. The FAM provides the benefit of allowing one-allocation and one-free where there would normally be two required. (In C89 a similar implementation went by the name struct hack, but it was slightly different).
For example, B *b = malloc (sizeof *b + 64 * sizeof *b->arr); would allocate storage for b plus 64-characters of storage for b->arr. You could then copy the members of a to b using the proper '.' and '->' syntax.
A short example can illustrate:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NCHAR 64 /* if you need a constant, #define one (or more) */
typedef struct {
int num;
char arr[NCHAR];
} A;
typedef struct {
int num;
char arr[]; /* flexible array member */
} B;
int main (void) {
A a = { 1, "Hi" };
B *b = malloc (sizeof *b + NCHAR * sizeof *b->arr);
if (!b) {
perror ("malloc-b");
return 1;
}
b->num = a.num;
strcpy (b->arr, a.arr);
printf ("b->num: %d\nb->arr: %s\n", b->num, b->arr);
free (b);
return 0;
}
Example Use/Output
$ ./bin/struct_fam
b->num: 1
b->arr: Hi
Look things over and let me know if that helps clear things up. Also let me know if you were asking something different. It is a little unclear exactly where your confusion lies.

pointer to pointer for struct

I have the following code
#include <stdlib.h>
#include <stdio.h>
typedef struct {
int age;
} data;
int storage (data **str) {
*str = malloc(4 * sizeof(**str));
(*str)[0].age = 12;
return 0;
}
int main() {
data *variable = NULL;
storage(&variable);
return 0;
}
I took it from a website source. I think I have a misunderstanding about a basic pointer to pointer concept because here in this code, we have a pointer to a struct, variable, and we are passing this to storage function, which expects pointer to pointer of struct type. After memory was malloced, I don't understand this assignment
(*str)[0].age = 12
It was assigned as if str were of (*)[] type. I don't understand how this assignment works. Is str now a pointer to an array of structs?
First, a note about C syntax for dereferencing pointers:
a[b] is equivalent to *(a + b), is equivalent to *(b + a), is equivalent to b[a].
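A tiny sketch to confirm the equivalence; the function name index_forms_agree is hypothetical.

```c
/* Returns 1 if all four spellings read the same element, 0 otherwise. */
int index_forms_agree(const int *a, int b)
{
    return a[b] == *(a + b)
        && *(a + b) == *(b + a)
        && *(b + a) == b[a];
}
```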
Now, in
int main() {
data *variable = NULL;
storage(&variable);
return 0;
}
variable is of type "pointer to data", therefore its address &variable is of type "pointer to pointer to data". This is passed to int storage(data **str), and is the correct type for the argument str.
int storage (data **str) {
*str = malloc(4 * sizeof(**str));
(*str)[0].age = 12;
return 0;
}
Here, str is dereferenced, yielding an lvalue of type data * designating the same object as main()'s variable. Since it is an lvalue, it can be assigned to.
malloc() allocates memory without declared type, large enough (and sufficiently aligned) to contain four contiguous objects of type data. It returns a pointer to the beginning of the allocation.
(*str)[0] is now an lvalue designating an object of type data, and by accessing the memory malloc() allocated through this expression, the effective type of the memory becomes data. (*str)[0].age = 12; assigns the value 12 to the age-member of this object, leaving the other members of the struct (and the rest of the allocated memory) uninitialized.
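The original code neither checks malloc nor frees the block. A hedged variant that does both might look like the sketch below; the name storage_checked and the count parameter are assumptions added for illustration, not part of the code being discussed.

```c
#include <stdlib.h>

typedef struct {
    int age;
} data;

/* Allocates count elements, stores the block in *str, and sets the first age.
   Returns 0 on success and -1 on failure, in which case *str is left untouched.
   The caller releases the block with free(). */
int storage_checked(data **str, size_t count)
{
    if (count == 0)
        return -1;
    data *block = malloc(count * sizeof *block);
    if (block == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        block[i].age = 0;      /* give every element a defined value */
    block[0].age = 12;
    *str = block;              /* the caller's pointer now refers to the block */
    return 0;
}
```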
It can be illustrated like this
main:
data* variable = NULL; //variable is a pointer
storage(&variable) //the address of the pointer is &variable
the storage(data**) allows the function to take the address
of the pointer variable
this allows storage to change what variable points to
In storage, the following statement changes what variable points to by dereferencing (since we did pass the address of variable):
*str = malloc(4 * sizeof(**str) )
The malloc allocates a memory block containing four structs (each of which has the size sizeof(data) bytes)
A struct is just a convenient way to access a part of memory, the
struct describes the layout of the memory. The statement
(*str)[0].age = 12;
is the equivalent of
data* d = *str;
d[0].age = 12;
or you can write it as a ptr with offset:
data* d = *str;
(*(d + 0)).age = 12;
edit: a clarification about malloc
malloc returns a block of memory allocated in bytes, the return type of malloc is void* so per definition it has no type and can be assigned to a pointer of arbitrary type:
T* ptr = malloc(n * sizeof(T));
After the assignment to ptr, the memory is treated as one or more elements of type T by using the pointer T*
Well, I think your code is simply identical to:
#include <stdlib.h>
#include <stdio.h>
typedef struct
{
int age;
} data;
int main()
{
data *variable = NULL;
variable = malloc(4 * sizeof(*variable));
(*(variable + 0)).age = 12;
return 0;
}
So variable is given a block of memory from malloc that is large enough to hold 4 objects of type data (from variable[0] to variable[3]). That's all.
This piece of code might help illustrate what's happening, the really interesting line is
assert(sizeof(**str2) == sizeof(data));
Your numbers may vary from mine, but first let's create a struct with a rather dull but hard to fake size for testing purposes.
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
typedef struct {
uint8_t age;
uint8_t here_as_illustartion_only[1728];
} data;
int main() {
data str;
data * str1 = &str;
data ** str2 = &str1;
printf("sizeof(str) =%*zu\n", 5, sizeof(str));
printf("sizeof(str1) =%*zu\n", 5, sizeof(str1));
printf("sizeof(str2) =%*zu\n", 5, sizeof(str2));
printf("sizeof(*str2) =%*zu\n", 5, sizeof(*str2));
printf("sizeof(**str2) =%*zu\n", 5, sizeof(**str2));
assert(sizeof(**str2) == sizeof(data));
return 0;
}
On my machine this prints the following
sizeof(str) = 1729
sizeof(str1) = 8
sizeof(str2) = 8
sizeof(*str2) = 8
sizeof(**str2) = 1729
Note that the size of the doubly dereferenced pointer, i.e. sizeof(**str2), is the dull number we're looking for.

C functions to create dynamic array of structs

Can someone help with this piece of code? I left out the allocation checks to keep it brief.
#include <stdlib.h>
typedef struct {
int x;
int y;
} MYSTRUCT;
void init(MYSTRUCT **p_point);
void plusOne(MYSTRUCT **p_point, int *p_size);
int main()
{
MYSTRUCT *point;
int size = 1;
init(&point);
plusOne(&point, &size);
plusOne(&point, &size);
// point[1]->x = 47; // this was the problem (does not compile: point[1] is not a pointer)
point[1].x = 47; // this is solution
return 0;
}
void init(MYSTRUCT **p_point)
{
*p_point = (MYSTRUCT *) malloc( sizeof(MYSTRUCT) );
}
void plusOne(MYSTRUCT **p_point, int *p_size)
{
(*p_size)++;
*p_point = realloc(*p_point, *p_size * sizeof(MYSTRUCT) ); // also calling to the function is fixed
}
I don't understand why index notation doesn't work after calling to functions.
This is because you are not multiplying the p_size by sizeof(MYSTRUCT) in the call of realloc, and not assigning the results back to p_point:
*p_point = realloc(*p_point, *p_size * sizeof(MYSTRUCT));
Notes:
You do not need to cast the result of malloc or realloc in C.
For consistency, consider passing &size to init, and set it to 1 there.
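Since the question left out the allocation checks, here is a sketch of how the growing step could be written with them. The name plusOneChecked is hypothetical. Assigning realloc's result to a temporary first keeps the original block reachable if the call fails.

```c
#include <stdlib.h>

typedef struct {
    int x;
    int y;
} MYSTRUCT;

/* Grows the array by one element.
   Returns 0 on success; on failure returns -1 and leaves both
   *p_point and *p_size unchanged. */
int plusOneChecked(MYSTRUCT **p_point, int *p_size)
{
    size_t new_count = (size_t)*p_size + 1;
    MYSTRUCT *tmp = realloc(*p_point, new_count * sizeof **p_point);
    if (tmp == NULL)
        return -1;             /* the old block is still valid */
    *p_point = tmp;
    (*p_size)++;
    return 0;
}
```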
You have some type confusion going on... Here:
MYSTRUCT *point;
you declare point to be a pointer to a MYSTRUCT structure (or an array of them).
The syntax point[i] is equivalent to *(point + i) - in other words, it already dereferences the pointer after the addition of the appropriate offset, yielding a MYSTRUCT object, not a pointer to one.
The syntax p->x is equivalent to (*p).x. In other words, it also expects p to be a pointer, which it dereferences, and then yields the requested field from the structure.
However, since point[i] is no longer a pointer to a MYSTRUCT, using -> on it is wrong. What you are looking for is point[i].x. You could alternatively use (point + i) -> x, but that's considerably less readable...
