initialising structs and passing by reference - c

I'm fairly new to C and I am having trouble working with structs. I have the following code:
typedef struct uint8array {
uint8 len;
uint8 data[];
} uint8array;
int compare_uint8array(uint8array* arr1, uint8array* arr2) {
printf("%i %i\n data: %i, %i\n", arr1->len, arr2->len, arr1->data[0], arr2->data[0]);
if (arr1->len != arr2->len) return 1;
return 0;
}
int compuint8ArrayTest() {
printf("--compuint8ArrayTest--\n");
uint8array arr1;
arr1.len = 2;
arr1.data[0] = 3;
arr1.data[1] = 5;
uint8array arr2;
arr2.len = 4;
arr2.data[0] = 3;
arr2.data[1] = 5;
arr2.data[2] = 7;
arr2.data[3] = 1;
assert(compare_uint8array(&arr1, &arr2) != 0);
}
Now the output of this program is:
--compuint8ArrayTest--
3 4
data: 5, 3
Why are the values not what I initialized them to? What am I missing here?

In your case, uint8 data[]; is a flexible array member. You need to allocate memory to data before you can actually access it.
In your code, you're trying to access invalid memory location, causing undefined behavior.
Quoting C11, chapter §6.7.2.1 (emphasis mine)
As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. However, when a . (or ->) operator has a left operand that is
(a pointer to) a structure with a flexible array member and the right operand names that
member, it behaves as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object being accessed; the
offset of the array shall remain that of the flexible array member, even if this would differ
from that of the replacement array. If this array would have no elements, it behaves as if
it had one element but the behavior is undefined if any attempt is made to access that
element or to generate a pointer one past it.
A proper usage example can also be found in chapter §6.7.2.1
EXAMPLE 2 After the declaration:
struct s { int n; double d[]; };
the structure struct s has a flexible array member d. A typical way to use this is:
int m = /* some value */;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
and assuming that the call to malloc succeeds, the object pointed to by p behaves, for most purposes, as if
p had been declared as:
struct { int n; double d[m]; } *p;

Related

Does the stb lib violate Strict Aliasing rules in C?

I'm interested in the technique used by Sean Barrett to make a dynamic array in C for any type. Comments in the current version claim the code is safe to use with strict-aliasing optimizations:
https://github.com/nothings/stb/blob/master/stb_ds.h#L332
You use it like:
int *array = NULL;
arrput(array, 2);
arrput(array, 3);
The allocation it does holds both the array data + a header struct:
typedef struct
{
size_t length;
size_t capacity;
void * hash_table;
ptrdiff_t temp;
} stbds_array_header;
The macros/functions all take a void* to the array and access the header by casting the void* array and moving back one:
#define stbds_header(t) ((stbds_array_header *) (t) - 1)
I'm sure Sean Barrett is far more knowledgeable than the average programmer. I'm just having trouble following how this type of code is not undefined behavior because of the strict aliasing rules in modern C. If this does avoid problems I'd love to understand why it does so I can incorporate it myself (maybe with fewer macros).
Let's follow the expansions of arrput in https://github.com/nothings/stb/blob/master/stb_ds.h :
#define STBDS_REALLOC(c,p,s) realloc(p,s)
#define arrput stbds_arrput
#define stbds_header(t) ((stbds_array_header *) (t) - 1)
#define stbds_arrput(a,v) (stbds_arrmaybegrow(a,1), (a)[stbds_header(a)->length++] = (v))
#define stbds_arrmaybegrow(a,n) ((!(a) || stbds_header(a)->length + (n) > stbds_header(a)->capacity) \
? (stbds_arrgrow(a,n,0),0) : 0)
#define stbds_arrgrow(a,b,c) ((a) = stbds_arrgrowf_wrapper((a), sizeof *(a), (b), (c)))
#define stbds_arrgrowf_wrapper stbds_arrgrowf
void *stbds_arrgrowf(void *a, size_t elemsize, size_t addlen, size_t min_cap)
{
...
b = STBDS_REALLOC(NULL, (a) ? stbds_header(a) : 0, elemsize * min_cap + sizeof(stbds_array_header));
//if (num_prev < 65536) prev_allocs[num_prev++] = (int *) (char *) b;
b = (char *) b + sizeof(stbds_array_header);
if (a == NULL) {
stbds_header(b)->length = 0;
stbds_header(b)->hash_table = 0;
stbds_header(b)->temp = 0;
} else {
STBDS_STATS(++stbds_array_grow);
}
stbds_header(b)->capacity = min_cap;
return b;
}
how this type of code is not undefined behavior because of the strict aliasing
Strict aliasing is about accessing data that has different effective type than data stored there. I would argue that the data stored in the memory region pointed to by stbds_header(array) has the effective type of the stbds_array_header structure, so accessing it is fine. The structure members are allocated by realloc and initialized one by one inside stbds_arrgrowf by stbds_header(b)->length = 0; lines.
how this type of code is not undefined behavior
I think the pointer arithmetic is fine. You can say that the result of realloc points to an array of one stbds_array_header structure. In other words, the first stbds_header(b)->length = assignment inside stbds_arrgrowf makes the memory returned by realloc "become" an array of one stbds_array_header element, because "If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access" (from https://port70.net/~nsz/c/c11/n1570.html#6.5p6 ).
int *array is assigned inside stbds_arrgrow to point to "one past the last element of an array" of one stbds_array_header structure. (Well, this is also the same place where an int array starts). ((stbds_array_header *) (array) - 1) calculates the address of the last array element by subtracting one from "one past the last element of an array". I would rewrite it as (char *)(void *)t - sizeof(stbds_array_header) anyway, as (stbds_array_header *) (array) sounds like it would generate a compiler warning.
Assigning (char *)result_of_realloc + sizeof(stbds_array_header) to int *array in the expansion of stbds_arrgrow could, in theory, produce a pointer that is not properly aligned for the int array type, which would violate "If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined" (from https://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p7 ). This is very theoretical: the stbds_array_header structure has size_t, void * and ptrdiff_t members, so on any normal architecture the address right after it is suitably aligned for int (or any other normal type).
I have only inspected the code in the expansions of arrput. The file is about 2000 lines of code, so there may be other undefined behavior elsewhere.
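The header-before-the-data layout discussed above can be sketched in a much smaller form. This is not stb's code: vec_header, vec_hdr, vec_push, vec_len and vec_free are hypothetical names, the array is fixed to int, and the sketch assumes that sizeof(vec_header) is a multiple of the alignment of int (true on common platforms, since the header holds two size_t members).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header, much smaller than stbds_array_header. */
typedef struct {
    size_t length;
    size_t capacity;
} vec_header;

/* Step back from the element pointer to the header stored just before it. */
static vec_header *vec_hdr(int *data) {
    return (vec_header *)((char *)data - sizeof(vec_header));
}

/* Append value, growing the block when it is full. Pass NULL to start a new
   array. Returns the (possibly moved) element pointer, or NULL on failure,
   in which case the old block is left untouched. */
int *vec_push(int *data, int value) {
    size_t len = data ? vec_hdr(data)->length : 0;
    size_t cap = data ? vec_hdr(data)->capacity : 0;
    assert(len <= cap);
    if (len == cap) {
        size_t new_cap = cap ? cap * 2 : 4;
        void *block = realloc(data ? (void *)vec_hdr(data) : NULL,
                              sizeof(vec_header) + new_cap * sizeof(int));
        if (block == NULL) return NULL;
        data = (int *)((char *)block + sizeof(vec_header));
        vec_hdr(data)->length = len;       /* header is written as a vec_header */
        vec_hdr(data)->capacity = new_cap;
    }
    data[vec_hdr(data)->length++] = value; /* elements are written as int */
    return data;
}

size_t vec_len(int *data) { return data ? vec_hdr(data)->length : 0; }

void vec_free(int *data) { if (data) free(vec_hdr(data)); }
```

The header bytes are only ever accessed through vec_header lvalues and the element bytes only through int lvalues, which is the property the answer relies on for the strict-aliasing argument.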

The most efficient way to initialize array member of struct?

I have declared the struct
struct wnode {
char *word;
int lines[MAXLINES];
struct wnode *left;
struct wnode *right;
};
and the pointer
struct wnode *p;
The pointer is passed to a function.
In that function, I first allocate memory for the pointer with malloc. Then I want to zero out the struct member lines.
An array initialization method will not work as it is interpreted as assignment:
p->lines[MAXLINES] = {0};
The compiler throws the error:
error: expected expression before '{' token
In the end, I'm just using a for loop to zero out the lines array:
for (i = 0; i < MAXLINES; i++)
p->lines[i] = 0;
Is there a better way?
Arrays cannot be assigned to directly. You need to either use a loop to set all fields to 0 or you can use memset:
memset(p->lines, 0, sizeof(p->lines));
Note that for non-char types you can only do this to set all members to 0. For any other value you need a loop.
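The reason is that memset writes the same value into every byte, not into every element. A small sketch of the difference, where fill_bytes and fill_elements are hypothetical helper names and MAXLINES is given a small value just for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXLINES 8  /* small value chosen for the example */

/* Fill every BYTE of lines with byte_value, as memset does. The size is
   computed from MAXLINES because lines is a pointer inside this function. */
void fill_bytes(int lines[MAXLINES], unsigned char byte_value) {
    assert(lines != NULL);
    memset(lines, byte_value, MAXLINES * sizeof lines[0]);
}

/* Set every ELEMENT of lines to value; needed for any value other than 0. */
void fill_elements(int lines[MAXLINES], int value) {
    assert(lines != NULL);
    for (size_t i = 0; i < MAXLINES; ++i)
        lines[i] = value;
}
```

With a 4-byte int, fill_bytes(lines, 1) leaves each element equal to 0x01010101 (16843009), not 1, while fill_bytes(lines, 0) and fill_elements(lines, 0) give the same result.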
If you want to use the = operator, you can do it this way:
struct wnode wn, *p;
/* ........ */
wn = (struct wnode){.word = wn.word, .lines = {0,}, .left = wn.left, .right = wn.right};
*p = (struct wnode){.word = p ->word, .lines = {0,}, .left = p -> left, .right = p -> right};
= {0} works only on initialization. You can't use it with assignment as such which is why you get the error.
You can either use a for loop as you said or use memset to zero out the array:
memset(p -> lines, 0, sizeof(p -> lines));
The only time an array variable can be initialized in this manner:
int someInt[MAXLINES] = {0};
Is during declaration.
But because this particular variable int lines[MAXLINES]; is declared within the confines of struct, which does not allow members to be initialized, the opportunity is lost to that method, requiring it to be initialized after the fact, and using a different method.
The most common (and preferred) way to initialize after declaration in this case is to use:
//preferred
memset(p->lines, 0, sizeof(p->lines));
A more arduous method, and one that is seen often, sets each element to the desired value in a loop:
for(int i=0;i<MAXLINES;i++)
{
p->lines[i] = 0;
}
As noted in comments, this method will be reduced by a good optimizing compiler to the equivalent of an memset() statement anyway.
This declaration of a pointer
struct wnode *p;
either zero-initializes the pointer p if the pointer has static storage duration or leaves the pointer uninitialized if the pointer has automatic storage duration. In either case, applying the operator -> to the pointer invokes undefined behavior because the pointer does not point to a valid object.
Assuming that the pointer points to a valid object, for example
struct wnode w;
struct wnode *p = &w;
then within the function you can initialize the data member lines of the object w using the standard C function memset. For example
memset( p->lines, 0, MAXLINES * sizeof( int ) );
You may not write in the function just
p->lines = {0};
because the pointed-to object has already been created and such an initialization is only allowed in the declaration of an array. Moreover, arrays do not have an assignment operator.
That kind of initialization can only be done on declaration. Notice that in p->lines[MAXLINES] = {0}; the expression p->lines[MAXLINES] means the integer one past the end of p->lines.
You could write p->lines[MAXLINES] = 0;. Not correct, but would compile.
You can no longer initialize the array as a whole. You have either p->lines, which is an array that decays to a pointer to int in most expressions, or p->lines[index], which is an int.
Yes, you have the allocated space, but that's all. memset will do the trick.
By the way, I hope your function (or the caller) does allocate the wnode element...
{0} works only in initialization, not in assignment.
Since you can't initialize a pointer target, only assign to it, you can work around the problem by assigning a just-initialized compound literal.
Compilers will usually optimize the compound literal out and assign directly to the target.
{0} initialization of a largish array will frequently compile to a call to memset or equivalent assembly, so another option is to call memset directly on p->lines manually.
Example:
#define MAXLINES 100
struct wnode {
char *word;
int lines[MAXLINES];
struct wnode *left;
struct wnode *right;
};
//(hopefully) elided compound literal
void wnode__init(struct wnode *X)
{
*X = (struct wnode){"foo",{0},X,X};
}
//explicit memset
#include <string.h>
void wnode__init2(struct wnode *X)
{
X->word = "foo";
memset(X->lines,0,sizeof(X->lines));
X->left = X;
X->right = X;
}
https://gcc.godbolt.org/z/TMgGqV
void *memset(void *s, int c, size_t n)
The memset() function fills the first n bytes of the memory area
pointed to by s with the constant byte c.
The memset() function returns a pointer to the memory area s.
Although (sizeof(int)*MAXLINES) and sizeof(p->lines) yield the same result in bytes and both are correct, the second option (sizeof(p->lines)) is the better one to use: if we later change the array type or the array size, we don't need to change the expression inside the sizeof operator. We make the change in one place only and let the compiler do the work for us.
#include <string.h> /*for the memset*/
memset(p->lines, 0x0, sizeof(p->lines));
@ikegami's comment to the original question needs to be an answer.
Use calloc() rather than malloc()
struct wnode *p;
// presumably you have N as the number of elements and
p = malloc(N * sizeof *p);
// replace with
//p = calloc(N, sizeof *p);
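A sketch of the calloc approach for a single node follows. wnode_alloc is a hypothetical helper and MAXLINES is an assumed value. calloc sets all bytes to zero, which gives 0 for every int in lines; the pointer members are still assigned explicitly, because the C standard does not strictly guarantee that all-bits-zero is a null pointer, even though it is on common platforms.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXLINES 100  /* assumed value, the question does not state it */

struct wnode {
    char *word;
    int lines[MAXLINES];
    struct wnode *left;
    struct wnode *right;
};

/* Hypothetical helper: allocate one node with lines already zeroed. */
struct wnode *wnode_alloc(char *word) {
    assert(word != NULL);
    struct wnode *p = calloc(1, sizeof *p);  /* all bytes zero, so lines[] is 0 */
    if (p == NULL) return NULL;
    p->word = word;
    p->left = NULL;   /* set explicitly rather than relying on zeroed bytes */
    p->right = NULL;
    return p;
}
```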

Understanding-pointer to a structure

I want to understand how a pointer to a structure is passed as a function argument and used. How does the avrg_stpc[idx_u16].sum_f32 array access work?
typedef struct
{
const float * input_f32p;
float avg_f32;
float sum_f32;
float factor_f32;
unsigned int rs_u16;
} avgminmax_avg_t;
void avgminmax_AvgCalculate_vd(
avgminmax_avg_t * const avrg_stpc,
const unsigned int numOfEntrys_u16c)
{
unsigned int idx_u16 = 0u;
do
{
avrg_stpc[idx_u16].sum_f32 += (*avrg_stpc[idx_u16].input_f32p
- avrg_stpc[idx_u16].avg_f32);
avrg_stpc[idx_u16].avg_f32 = (avrg_stpc[idx_u16].sum_f32 *
avrg_stpc[idx_u16].factor_f32);
idx_u16++;
}while(idx_u16 < numOfEntrys_u16c);
}
A few points that could help you understand arrays and pointers and their relationship:
A pointer really only points to one "object", but that object might be the first in an array.
Arrays naturally decay to pointers to their first element.
And array indexing is equivalent to pointer arithmetic (for any pointer or array a and index i, the expression a[i] is exactly equal to *(a + i)).
As for your specific example code, perhaps it would be easier if you thought of it similar to this:
avgminmax_avg_t *temp_ptr = &avrg_stpc[idx_u16];
temp_ptr->sum_f32 += ...;
temp_ptr->avg_f32 = ...;
Or perhaps like:
avgminmax_avg_t temp_object = avrg_stpc[idx_u16];
temp_object.sum_f32 += ...;
temp_object.avg_f32 = ...;
avrg_stpc[idx_u16] = temp_object;
Both snippets above lead to the same result as your existing code, but they require an extra temporary variable, and the latter snippet also copies the structure twice.
avrg_stpc is treated as an array (possibly allocated on the heap via one of the *alloc functions); since its bounds can't be known from the pointer alone, the function takes the number of entries as its second argument. See here: https://en.cppreference.com/w/c/language/operator_member_access
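To see the caller's side, here is a sketch that passes a two-element array to the function. The function body is rewritten with a temporary pointer as suggested above; demo_second_average, the input values and the factor 0.5 are assumptions made up for the example, and the assert documents that the do/while loop needs at least one entry.

```c
#include <assert.h>

typedef struct {
    const float *input_f32p;
    float avg_f32;
    float sum_f32;
    float factor_f32;
    unsigned int rs_u16;
} avgminmax_avg_t;

void avgminmax_AvgCalculate_vd(avgminmax_avg_t *const avrg_stpc,
                               const unsigned int numOfEntrys_u16c)
{
    assert(numOfEntrys_u16c > 0u);  /* do/while always processes one entry */
    unsigned int idx_u16 = 0u;
    do {
        /* &avrg_stpc[idx_u16] is the same address as avrg_stpc + idx_u16 */
        avgminmax_avg_t *entry = &avrg_stpc[idx_u16];
        entry->sum_f32 += (*entry->input_f32p - entry->avg_f32);
        entry->avg_f32 = entry->sum_f32 * entry->factor_f32;
        idx_u16++;
    } while (idx_u16 < numOfEntrys_u16c);
}

/* Hypothetical caller: the array name decays to a pointer to entries[0]. */
float demo_second_average(void) {
    static const float inputs[2] = {10.0f, 4.0f};
    avgminmax_avg_t entries[2] = {
        { &inputs[0], 0.0f, 0.0f, 0.5f, 0u },
        { &inputs[1], 0.0f, 0.0f, 0.5f, 0u },
    };
    avgminmax_AvgCalculate_vd(entries, 2u);
    return entries[1].avg_f32;  /* sum = 4.0, avg = 4.0 * 0.5 = 2.0 */
}
```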

how can one get the size of an array via a pointer? [duplicate]

For the following scenario, how can I get the size (3) of the array a via the pointer c? What is the pattern for solving this sort of problems?
struct struct_point {
int x;
int y;
int z;
};
typedef struct struct_point point;
int test_void_pointer () {
point a[3] = {{1, 1, 1}, {2, 2, 2}};
void * b;
point * c;
b = a;
c = b;
/* get_size_p (c) */
}
You can't. The pointer is just an address, a number, and it doesn't hold any information about the data it points to except its type.
Side note: that's why they say "arrays decay to pointers". They "decay" because inherently a pointer holds less information compared to an array.
As nims points out in the comments when passing an array to a function, it automatically decays to a pointer to the first element - and doing sizeof in the function doesn't yield the expected result. Incidentally there's also a C FAQ about this.
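A short sketch of that decay, using the point type from the question; size_seen_by_callee and count_in_scope are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y, z; } point;

/* Inside the callee only a pointer is visible, so sizeof reports the size
   of the pointer itself, not of the caller's array. */
size_t size_seen_by_callee(point *c) {
    assert(c != NULL);
    return sizeof c;
}

/* Where the array is declared, sizeof still sees the whole array. */
size_t count_in_scope(void) {
    point a[3] = {{1, 1, 1}, {2, 2, 2}};
    return sizeof a / sizeof a[0];  /* 3 */
}
```

Calling size_seen_by_callee(a) with the three-element array returns sizeof(point *), whatever the length of a is.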
In C, no information about the size of the array is stored with the array. You have to know how big it is to work with it safely.
There are a few techniques for working around this. If the array is declared statically in the current scope, you can determine the size as:
size_t size = sizeof(a) / sizeof(a[0]);
This is useful if you don't want to have to update the size every time you add an element:
struct point a[] = {{1, 1, 1}, {2, 2, 2}};
size_t size = sizeof(a) / sizeof(a[0]);
But if you have an arbitrary array that has been passed in from somewhere else, or converted to a pointer as in your example, you'll need some way of determining its size. The usual ways to do this are to pass the size in along with the array (either as a separate parameter, or as a struct containing the array), or, if the array is of a type which can contain a sentinel value (a value of the given type that is not valid), to allocate an array one bigger than you need, add a sentinel to the end of the array, and use that to determine when you've reached the end.
Here's how you might pass in a length as a separate argument:
void myfunction(struct point array[], size_t n) {
for (size_t i = 0; i < n; ++i) {
struct point p = array[i];
// do something with p ...
}
}
Or as a structure containing the length:
struct point_array {
size_t n;
struct point elems[];
};
// pass a pointer: copying the struct by value would not copy elems
void myfunction(const struct point_array *a) {
for (size_t i = 0; i < a->n; ++i) {
struct point p = a->elems[i];
// do something with p ...
}
}
It would probably be hard to use sentinel values with an array of struct point directly, as there is no obvious invalid value that is still of the same type, but they are commonly used for strings (arrays of char which are terminated by a '\0' character), and arrays of pointers which are terminated by a null pointer. We can use that with struct point by storing pointers to our structures rather than storing them inline in the array:
void myfunction(struct point *a[]) {
for (size_t i = 0; a[i] != NULL; ++i) {
struct point *p = a[i];
// do something with p ...
}
}
There's a way to determine the length of an array, but for that you would have to mark the end of the array with another element, such as -1. Then just loop through it and find this element. The position of this element is the length.
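For an array of pointers, the sentinel search described above might be sketched as follows, with NULL as the end marker. count_until_null is a hypothetical name, and the caller is assumed to have stored the terminating NULL.

```c
#include <assert.h>
#include <stddef.h>

struct point { int x, y, z; };

/* Count the entries before the NULL sentinel. The position of the sentinel
   is the length of the array. */
size_t count_until_null(struct point *const items[]) {
    assert(items != NULL);
    size_t n = 0;
    while (items[n] != NULL)
        ++n;
    return n;
}
```

The cost is a linear scan on every call, so code that needs the length often caches it or passes it alongside the pointer instead.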

The difference between a pointer and an array

I have the piece of code below. What is the difference between the two versions?
In the first one, the address of the buf member is 4 bytes higher than the address of the struct, while in the second one it is not.
First
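#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct A
{
int i;
char buf[]; //Here
}A;
int main()
{
A *pa = malloc(sizeof(A) + 13); /* leave room for the flexible array */
char *p = malloc(13);
memcpy(p, "helloworld", 11); /* 10 characters plus the terminating '\0' */
memcpy(pa->buf, p, 11);
printf("%p %p %td %s\n", (void *)pa->buf, (void *)pa, (char *)pa->buf - (char *)pa, pa->buf);
}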
Second
typedef struct A
{
int i;
char *buf; //Here
}A;
The first is a C99 'flexible array member'. The second is the reliable fallback for when you don't have C99 or later.
With a flexible array member, you allocate the space you need for the array along with the main structure:
A *pa = malloc(sizeof(A) + strlen(string) + 1);
pa->i = index;
strcpy(pa->buf, string);
...use pa...
free(pa);
As far as the memory allocation goes, the buf member has no size (so sizeof(A) == sizeof(int) unless there are padding issues because of array alignment — eg if you had a flexible array of double).
The alternative requires either two allocations (and two releases), or some care in the setup:
typedef struct A2
{
int i;
char *buf;
} A2;
A2 *pa2 = malloc(sizeof(A2));
pa2->buf = strdup(string);
...use pa2...
free(pa2->buf);
free(pa2);
Or:
A2 *pa2 = malloc(sizeof(A2) + strlen(string) + 1);
pa2->buf = (char *)pa2 + sizeof(A2);
...use pa2...
free(pa2);
Note that using A2 requires more memory, either by the size of the pointer (single allocation), or by the size of the pointer and the overhead for the second memory allocation (double allocation).
You will sometimes see something known as the 'struct hack' in use; this predates the C99 standard and is obsoleted by flexible array members. The code for this looks like:
typedef struct A3
{
int i;
char buf[1];
} A3;
A3 *pa3 = malloc(sizeof(A3) + strlen(string) + 1);
strcpy(pa3->buf, string);
This is almost the same as a flexible array member, but the structure is bigger. In the example, on most machines, the structure A3 would be 8 bytes long (instead of 4 bytes for A).
GCC has some support for zero length arrays; you might see the struct hack with an array dimension of 0. That is not portable to any compiler that is not mimicking GCC.
It's called the 'struct hack' because it is not guaranteed to be portable by the language standard (because you are accessing outside the bounds of the declared array). However, empirically, it has 'always worked' and probably will continue to do so. Nevertheless, you should use flexible array members in preference to the struct hack.
ISO/IEC 9899:2011 §6.7.2.1 Structure and union specifiers
¶3 A structure or union shall not contain a member with incomplete or function type (hence,
a structure shall not contain an instance of itself, but may contain a pointer to an instance
of itself), except that the last member of a structure with more than one named member
may have incomplete array type; such a structure (and any union containing, possibly
recursively, a member that is such a structure) shall not be a member of a structure or an
element of an array.
¶18 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. However, when a . (or ->) operator has a left operand that is
(a pointer to) a structure with a flexible array member and the right operand names that
member, it behaves as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object being accessed; the
offset of the array shall remain that of the flexible array member, even if this would differ
from that of the replacement array. If this array would have no elements, it behaves as if
it had one element but the behavior is undefined if any attempt is made to access that
element or to generate a pointer one past it.
struct A {
int i;
char buf[];
};
does not reserve any space for the array, or for a pointer to an array. What this says is that an array can directly follow the body of A and be accessed via buf, like so:
struct A *a = malloc(sizeof(*a) + 6);
strcpy(a->buf, "hello");
assert(a->buf[0] == 'h');
assert(a->buf[5] == '\0');
Note I reserved 6 bytes following a for "hello" and the nul terminator.
The pointer form uses an indirection (the memory could be contiguous, but this is neither depended on nor required)
struct B {
int i;
char *buf;
};
/* requiring two allocations: */
struct B *b1 = malloc(sizeof(*b1));
b1->buf = strdup("hello");
/* or some pointer arithmetic */
struct B *b2 = malloc(sizeof(*b2) + 6);
b2->buf = (char *)((&b2->buf)+1);
The second is now laid out the same as a above, except with a pointer between the integer and the char array.
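The two layouts can be compared side by side with a small sketch. make_a, make_b and free_b are hypothetical helpers; make_b uses malloc plus strcpy instead of strdup only to stay within the standard C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct A { int i; char buf[]; };  /* characters stored inside the same block */
struct B { int i; char *buf; };   /* only a pointer stored in the struct */

/* One allocation: the characters live directly after the fixed part. */
struct A *make_a(const char *s) {
    assert(s != NULL);
    struct A *a = malloc(sizeof *a + strlen(s) + 1);
    if (a == NULL) return NULL;
    a->i = 0;
    strcpy(a->buf, s);
    return a;
}

/* Two allocations: the struct and the characters are separate blocks. */
struct B *make_b(const char *s) {
    assert(s != NULL);
    struct B *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->buf = malloc(strlen(s) + 1);
    if (b->buf == NULL) { free(b); return NULL; }
    b->i = 0;
    strcpy(b->buf, s);
    return b;
}

void free_b(struct B *b) { if (b) { free(b->buf); free(b); } }
```

For struct A, a->buf is always at the fixed distance offsetof(struct A, buf) from the start of the struct, which is the constant offset the question observed. For struct B, b->buf holds whatever address the second malloc returned.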
