Pointer in structure memory allocation in initialization (c99) - c

Let's assume that we've got a type:
typedef struct __BUFF_T__
{
u_int8_t *buf;
u_int32_t size;
}buff_t;
Is it correct to allocate memory in the following way in C99?
buff_t a = {.size = 20,.buf = calloc(a.size,1)};
Compiler shows warning
Variable 'data' is uninitialized when used within its own initialization
Memory's available and all, but are there some other non-warning options to do the same?

From 6.7.9p23:
The evaluations of the initialization list expressions are indeterminately sequenced with
respect to one another [...] (152) In particular, the evaluation order need not be the same as the order of subobject initialization.
So there is no guarantee that a.size is initialized at the point calloc(a.size, 1) is evaluated for the initialization of a.buf.
In this case, a suitable initializer would be a creation function:
static inline buff_t create_buff(u_int32_t size) {
return (buff_t) {.size = size, .buf = calloc(size, 1)};
}
buff_t a = create_buff(20);
This can't be used for static or file-scope objects; in that case a macro would be necessary (or, for example, a gcc statement expression, which could be used in a macro).

The structure is not fully initialized until after the assignment of a, because you don't know in which order the expressions will be evaluated.
If you need to use a structure field to initialize another field in the same structure, you have to do it in separate steps.
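The "separate steps" advice can be sketched as follows. This is only an illustration: it assumes `uint8_t`/`uint32_t` from `<stdint.h>` in place of the BSD-style `u_int8_t`/`u_int32_t`, and `buff_init` is a hypothetical helper name that does not appear in the original post.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint8_t *buf;
    uint32_t size;
} buff_t;

/* Hypothetical helper: store the size first, then allocate using the
 * already-stored value, so no member is read before it is written.
 * Returns 0 on success and -1 if the allocation failed. */
int buff_init(buff_t *b, uint32_t size)
{
    assert(b != NULL);
    b->size = size;
    b->buf = calloc(b->size, 1);
    return b->buf != NULL ? 0 : -1;
}
```

Because the two stores are separate statements, their order is fixed, which is exactly what a single initializer list does not guarantee.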

Related

Is accessing an array within a temporary variable undefined behaviour?

typedef struct test{
char c_arr[1];
} test;
test array[1] = {{1}};
test get(int index){
return array[index];
}
int main(){
char* a = get(0).c_arr;
return *a;
}
In this post the behaviour is explained in C++: Accessing an array within a struct causes warnings with clang
Above code does not cause any warnings or errors when compiling with gcc or clang.
Is get(0).c_arr returning a pointer into a temporary object that is destroyed at the end of the expression? If so, is dereferencing it and returning the value UB? And if it is, what would be a good way to fix it, maybe this?
test* get(int index){
return &array[index];
}
char* a = get(0).c_arr;
return *a;
This is clearly UB: the temporary that a points into no longer exists by the time the return statement runs, and I would be really annoyed if it were otherwise. Imagine if somebody wrote:
while (true) {
char a = get(0).c_arr[0];
//...
a = get(0).c_arr[0];
//...
a = get(0).c_arr[0];
//...
a = get(0).c_arr[0];
//...
a = get(0).c_arr[0];
//...
}
The stack waste stinks and embedded programmers and the kernel programmers would be up in arms about it.
However, the following is valid in C: char a = get(0).c_arr[0]; This is because the temporary persists long enough to use in the expression.
Yes, it's undefined behavior. The relevant section of the C17 standard is "6.2.4 Storage durations of objects":
The lifetime of an object is the portion of program execution during which storage is guaranteed
to be reserved for it. An object exists, has a constant address, and retains its last-stored value
throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined.
A non-lvalue expression with structure or union type, where the structure or union contains a
member with array type (including, recursively, members of all contained structures and unions)
refers to an object with automatic storage duration and temporary lifetime. Its lifetime begins
when the expression is evaluated and its initial value is the value of the expression. Its lifetime ends
when the evaluation of the containing full expression ends.
The expression get(0) is not an lvalue, and struct test contains c_arr, a member with array type, so it has temporary lifetime. This means that return *a; is UB because it accesses it outside of its lifetime.
Also, a big hint that this isn't allowed is that had you used char c; instead of char c_arr[1];, then char* a = &get(0).c; would have been a compile-time error, since you can't take the address of an rvalue, and what you wrote is basically morally equivalent to trying to do that.
I filed GCC bug 101358 and LLVM bug 51002 about not receiving warnings for this.
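Two ways to stay within the temporary's lifetime are sketched below, reusing the types from the question. The function names `first_via_copy` and `first_direct` are hypothetical, and the sketch assumes C11 temporary-lifetime rules as quoted above.

```c
#include <assert.h>

typedef struct test {
    char c_arr[1];
} test;

static test array[1] = {{1}};

/* Returns the element by value, as in the question. */
test get(int index)
{
    assert(index == 0);   /* the array has a single element */
    return array[index];
}

/* Option 1: copy the result into a named object first. Its lifetime is
 * the enclosing block, so a pointer into it stays valid until return. */
char first_via_copy(void)
{
    test t = get(0);
    char *a = t.c_arr;
    return *a;
}

/* Option 2: read from the temporary inside the same full expression,
 * before its lifetime ends. */
char first_direct(void)
{
    return get(0).c_arr[0];
}
```

Returning `test *` from `get`, as proposed in the question, also works, since it points at `array`, which has static storage duration.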

Can the address of a variable with automatic storage duration be taken in its definition?

Is it allowed to take the address of an object on the right hand-side of its definition, as happens in foo() below:
typedef struct { char x[100]; } chars;
chars make(void *p) {
printf("p = %p\n", p);
chars c;
return c;
}
void foo(void) {
chars b = make(&b);
}
If it is allowed, is there any restriction on its use, e.g., is printing it OK, can I compare it to another pointer, etc?
In practice it seems to compile on the compilers I tested, with the expected behavior most of the time (but not always), but that's far from a guarantee.
To answer the question in the title, with your code sample in mind, yes it may. The C standard says as much in §6.2.4:
The lifetime of an object is the portion of program execution during
which storage is guaranteed to be reserved for it. An object exists,
has a constant address, and retains its last-stored value throughout
its lifetime.
For such an object that does not have a variable length array type,
its lifetime extends from entry into the block with which it is
associated until execution of that block ends in any way.
So yes, you may take the address of a variable from the point of declaration, because the object has the address at this point and it's in scope. A condensed example of this is the following:
void *p = &p;
It serves very little purpose, but is perfectly valid.
As for your second question, what you can do with it: I can mostly say I wouldn't use that address to access the object until initialization is complete, because the order of evaluation for expressions in initializers is left unspecified (§6.7.9). You can easily find your foot shot off.
One place where this does come in handy is when defining all sorts of tabular data structures that need to be self-referential. For instance:
typedef struct tab_row {
// Useful data
struct tab_row *p_next;
} row;
row table[3] = {
[1] = { /*Data 1*/, &table[0] },
[2] = { /*Data 2*/, &table[1] },
[0] = { /*Data 0*/, &table[2] },
};
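The table above uses comment placeholders for its data, so it does not compile as written. A compilable sketch of the same idea follows, assuming a single `int data` member as the payload and a hypothetical `walk` helper to follow the links.

```c
#include <assert.h>
#include <stddef.h>

typedef struct tab_row {
    int data;                 /* placeholder payload (an assumption) */
    struct tab_row *p_next;
} row;

/* File-scope table: &table[i] is an address constant, so it is a valid
 * initializer even for an object with static storage duration. */
row table[3] = {
    [1] = { 10, &table[0] },
    [2] = { 20, &table[1] },
    [0] = { 0,  &table[2] },
};

/* Follows p_next `steps` times, starting from `start`. */
const row *walk(const row *start, size_t steps)
{
    assert(start != NULL);
    while (steps-- > 0)
        start = start->p_next;
    return start;
}
```

Only the addresses of the elements are used in the initializers; no element is read before the whole table is initialized.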
6.2.1 Scopes of identifiers
Structure, union, and enumeration tags have scope that begins just after the appearance of
the tag in a type specifier that declares the tag. Each enumeration constant has scope that
begins just after the appearance of its defining enumerator in an enumerator list. Any
other identifier has scope that begins just after the completion of its declarator.
In
chars b = make(&b);
// ^^
the declarator is b, so it is in scope in its own initializer.
6.2.4 Storage durations of objects
For such an [automatic] object that does not have a variable length array type, its lifetime extends
from entry into the block with which it is associated until execution of that block ends in
any way.
So in
{ // X
chars b = make(&b);
}
the lifetime of b starts at X, so by the time the initializer executes, it is both alive and in scope.
As far as I can tell, this is effectively identical to
{
chars b;
b = make(&b);
}
There's no reason you couldn't use &b there.
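To illustrate that the address is already meaningful inside the initializer, here is a small sketch. It is a hypothetical variant of `make` that records the pointer instead of printing it, plus a made-up checker `address_matches`; neither is from the original post.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char x[100]; } chars;

/* Remembers the last pointer handed to make(). */
static const void *last_seen;

/* Hypothetical variant of make(): it only records the pointer and never
 * dereferences it, because the caller's object is not initialized yet. */
chars make(void *p)
{
    assert(p != NULL);
    last_seen = p;
    chars c = { {0} };   /* fully initialize before returning by value */
    return c;
}

/* Returns 1 if the address passed in the initializer is the address of
 * the object being defined. */
int address_matches(void)
{
    chars b = make(&b);
    return last_seen == (const void *)&b;
}
```

Comparing or storing the pointer is fine; reading through it before `b` has been initialized is what should be avoided.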
The question has already been answered, but for reference, it doesn't make much sense. This is how you would write the code:
typedef struct { char x[100]; } chars;
chars make (void) {
chars c;
/* init c */
return c;
}
void foo(void) {
chars b = make();
}
Or perhaps preferably in case of an ADT or similar, return a pointer to a malloc:ed object. Passing structs by value is usually not a good idea.

Can someone please explain the difference between these two initializers?

I was wondering if someone could provide a detailed, simple explanation of the differences between the two of the following pieces of code. Given the following definition:
typedef struct {
stuff;
stuff_2;
} variable_t;
What is the difference between:
variable_t my_variable;
variable_t my_variable = {};
And if I do the first one, and then never fully initialize it, why does the compiler not throw an error?
Note: I am compiling with gcc -std=gnu99, so the second is valid and wound up being the solution to a problem that I had. I was wondering as to why.
It depends a little on where you place the respective variable definition, and it also seems to depend on the compiler in use.
Automatic storage duration
Let's discuss the difference when the variables have automatic storage duration (which is the case if you define them at function or block scope without the static keyword):
void someFunction() {
variable_t my_variable; // (1)
variable_t my_variable = {}; // (2)
}
(1) denotes a variable definition without an explicit initialization. According to this online C standard draft, its value is indeterminate:
If an object that has automatic storage duration is not initialized
explicitly, its value is indeterminate.
(2) is a variable definition with explicit initialization through an initializer list without designators, i.e. without associating values to members through their names but only through the order of values (cf. 6.7.9 p17..21).
The interesting paragraph is 6.7.9 p21, which states that if the initializer list has fewer entries than the number of struct members, the members are initialized according to the initialization rule of static storage duration (i.e. to 0 or NULL as explained later):
If there are fewer initializers in a brace-enclosed list than there
are elements or members of an aggregate, ... , the remainder of the
aggregate shall be initialized implicitly the same as objects that
have static storage duration.
So it seems that if you write variable_t my_variable = {}, then all members are initialized to 0 or NULL.
However, as mentioned by aschepler in the comments, C initialization list grammar states that initializer lists must not be empty (cf. also cppreference.com):
... the initializer must be a non-empty, brace-enclosed,
comma-separated list of initializers for the members
So according to the standard, an initializer list in C needs at least one entry. When I tested it in my Xcode 8.3 environment with -std=gnu99, an empty initializer list seemed to be supported, but I am aware that this is not a valid reference. So to be safe and not depend on particular compiler extensions, you should actually write:
variable_t my_variable = {0};
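A short sketch of what 6.7.9p21 implies for automatic objects. The members of `variable_t` here are hypothetical stand-ins for `stuff` and `stuff_2`, and the helper names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical members standing in for `stuff` and `stuff_2`. */
typedef struct {
    int count;
    double ratio;
    char *name;
} variable_t;

/* Returns 1 if every member of *v has its zero value. */
int is_zeroed(const variable_t *v)
{
    assert(v != NULL);
    return v->count == 0 && v->ratio == 0.0 && v->name == NULL;
}

/* {0} names only the first member; the rest follow 6.7.9p21. */
int zero_init_works(void)
{
    variable_t my_variable = {0};
    return is_zeroed(&my_variable);
}

/* The same rule applies to designated initializers: members that are
 * not named become 0 or NULL. */
int partial_init_works(void)
{
    variable_t v = { .ratio = 1.5 };
    return v.count == 0 && v.ratio == 1.5 && v.name == NULL;
}
```

Some compilers warn about missing field initializers for `{0}` at high warning levels; the initialization itself is still well defined.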
Static storage duration
At file scope, your variable definitions will have static storage duration, and then other rules apply (cf. 6.7.9 (10)):
(10) ... If an object that has static or thread storage duration is
not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized
to zero bits;
...
(21) If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, ... the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
So if you write...
#include <stdio.h>
variable_t my_variable; // (1)
variable_t my_variable = {}; // (2)
then (1) and (2) actually yield the same result because for the not explicitly initialized variable (1), paragraph (10) applies, and for the explicitly but empty initialized variable (2), according to paragraph (21), every member falls back to the initialization rule of (10).
Again, compilers may not support empty initializer lists, as discussed above.
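The static-storage rule of paragraph (10) can be checked with a sketch like the one below. The struct members and the function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical members standing in for `stuff` and `stuff_2`. */
typedef struct {
    int count;
    char *name;
} variable_t;

variable_t file_scope_var;   /* no initializer, static storage duration */

/* Returns 1 if every member of *v has its zero value. */
int is_zeroed(const variable_t *v)
{
    assert(v != NULL);
    return v->count == 0 && v->name == NULL;
}

int static_objects_are_zeroed(void)
{
    static variable_t local_static;   /* also static storage duration */
    return is_zeroed(&file_scope_var) && is_zeroed(&local_static);
}
```

No initializer is needed in either definition; the zeroing is guaranteed by the storage duration, not by the syntax.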
Hope it helps (because it has been a lot of typing :-) )
When you declare:
variable_t my_variable; // a variable_t that is uninitialized
variable_t my_variable = {}; // a variable_t initialized with zeroes.
Note that for variables declared at file scope, this doesn't really matter, since the data is normally zeroed out before program start.
Used on the stack, the second line fills my_variable with zeroes, much as if there were a call to:
memset(&my_variable, 0, sizeof(my_variable));
This works because, in C, you can copy a struct using =.
Here's a little game the computer is sure to win.
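#include <stdio.h>
#include <string.h>
struct A { /*...*/ };
int main(void)
{
struct A a; // indeterminate contents
struct A b = {}; // zeroed; empty braces are a GNU extension before C23
if (memcmp(&a, &b, sizeof(struct A)) == 0)
{
printf("You win\n");
return 0;
}
a = b; // struct assignment copies every member
if (memcmp(&a, &b, sizeof(struct A)) == 0)
{
printf("I win\n");
}
return 0;
}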

Using malloc with static pointers

I know that declaring a static variable and initializing it this way, static int *st_ptr = malloc(sizeof(int));, generates a compile error (initializer element is not constant), and that this can be solved by using separate statements, like this: static int *st_ptr;
st_ptr = malloc(5*sizeof(int));
I need to understand the difference between initialization and assignment in this case. Why does this solve the problem?
First, let's have a brief on initialization vs. assignment.
Initialization:
This is used to specify the initial value of an object. Usually this means that initialization takes place only at the time a variable is defined. The value used to initialize the object is called an initializer. From C11, chapter 6.7.9,
An initializer specifies the initial value stored in an object.
Assignment:
Assignment is assigning (or setting) the value of a variable, at any (valid) given point of time of execution. Quoting the standard, chapter 6.5.16,
An assignment operator stores a value in the object designated by the left operand.
In case of simple assignment (= operator),
In simple assignment (=), the value of the right operand is converted to the type of the
assignment expression and replaces the value stored in the object designated by the left
operand.
That said, I think, your query has to do with the initialization of static object.
For the first case,
static int *st_ptr = malloc(sizeof(int));
Quoting from C11 standard document, chapter §6.7.9, Initialization, paragraph 4,
All the expressions in an initializer for an object that has static or thread storage duration
shall be constant expressions or string literals.
and regarding the constant expression, from chapter 6.6 of the same document, (emphasis mine)
Constant expressions shall not contain assignment, increment, decrement, function-call,
or comma operators, except when they are contained within a subexpression that is not
evaluated.
Clearly, malloc(sizeof(int)) is not a constant expression, so we cannot use it to initialize a static object.
For the second case,
static int *st_ptr;
st_ptr = malloc(5*sizeof(int));
you are not explicitly initializing the static object, so it is implicitly initialized to a null pointer. In the next statement, you assign the return value of malloc() to it, so your compiler does not complain.
When a variable is declared static inside a function, it is typically placed in either the data segment or the BSS segment, depending on whether it has an initializer. The variable is laid out in the binary and must have a constant initial value. Remember that static variables inside a function are created before main() starts, so they can't be initialized by a function call: the program is not running yet, and no evaluation or function call can take place. The initializer must therefore be constant, or be omitted altogether.
static int *st_ptr = malloc(sizeof(int));
Here you tie the creation of st_ptr to malloc. But malloc is a function that needs to run, and st_ptr must be created before any function runs, so this is an impossible state.
static int *st_ptr;
st_ptr = malloc(5*sizeof(int));
Here st_ptr is created without an explicit initializer; its creation is not tied to any function.
Each time the function runs, malloc is called, so the call to malloc and the creation of st_ptr are independent.
But as I stated in the comment, this is an extremely dangerous practice: every call allocates more memory and overwrites the same variable, leaking the previous block. The only way to avoid that is to free(st_ptr) at the end of every call. That said, in that case you don't need it to be static in the first place.
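One common way to avoid the repeated allocation, if the pointer really should persist across calls, is to allocate only on the first call. The sketch below is an assumption about what the asker might want, and `get_buffer` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical accessor: allocates the buffer on the first call only and
 * returns the same pointer afterwards, so repeated calls do not leak.
 * Note that `count` is only honoured by the call that allocates. */
int *get_buffer(size_t count)
{
    static int *st_ptr;   /* implicitly initialized to a null pointer */

    assert(count > 0);
    if (st_ptr == NULL)
        st_ptr = malloc(count * sizeof *st_ptr);
    return st_ptr;        /* still NULL if the allocation failed */
}
```

The null check relies on the implicit initialization of static objects, which is why no explicit initializer is needed.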
Roughly, initialization in C is when the compiler outputs binary data to the executable file; assignment is an operation performed by actual executable code.
So static int i = 5 makes the compiler emit the data word 5 into the executable's data section, while int i = func() makes the compiler generate several CPU instructions, such as a call to invoke the subroutine and a mov to store the result.
Thus static int i = func() would require both 1) being computed before main() (as it is an initialization) and 2) a piece of user code to execute (which only makes sense in the context of a running program instance). It's possible to solve that by creating a hidden initialization subroutine that executes before main(); C++ actually does this. But C has no such feature, so static variables may only be initialized with constants.

Is there anything wrong with `something_t* x = malloc(sizeof(*x))`?

I'm writing some extremely repetitive code in C (reading XML), and I found that writing my code like this makes it easier to copy and paste code in a constructor*:
something_t* something_new(void)
{
something_t* obj = malloc(sizeof(*obj));
/* initialize */
return obj;
}
What I'm wondering is, is it safe to use sizeof(*obj) like this, when I just defined obj? GCC isn't showing any warnings and the code works fine, but GCC tends to have "helpful" extensions so I don't trust it.
* And yes, I realize that I should have just written a Python program to write my C program, but it's almost done already.
something_t* obj = malloc(sizeof(*obj));
What I'm wondering is, it is safe to use sizeof(*obj)
like this, when I just defined obj?
You have a declaration consisting of:
type-specifier something_t
declarator * obj
=
initializer malloc(sizeof(*obj))
;
The C standard says in section Scopes of identifiers:
Structure, union, and enumeration tags have scope that begins just
after the appearance of the tag in a type specifier that declares the
tag. Each enumeration constant has scope that begins just after the
appearance of its defining enumerator in an enumerator list. Any other
identifier has scope that begins just after the completion of its
declarator.
Since obj has scope that begins just after the completion of its declarator, it is guaranteed by the standard that the identifier used in the initializer refers to the just defined object.
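A compilable sketch of the constructor idiom follows. The members of `something_t` and the helpers `something_reset` and `something_free` are assumptions added for illustration, as is the null check on the result of `malloc`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type standing in for something_t. */
typedef struct {
    int id;
    double value;
} something_t;

/* Puts an existing object into its default state. */
void something_reset(something_t *obj)
{
    assert(obj != NULL);
    obj->id = 0;
    obj->value = 0.0;
}

something_t *something_new(void)
{
    /* sizeof does not evaluate *obj; it only uses its type, so the
     * still-uninitialized pointer is never dereferenced here. */
    something_t *obj = malloc(sizeof(*obj));
    if (obj == NULL)
        return NULL;
    something_reset(obj);
    return obj;
}

void something_free(something_t *obj)
{
    free(obj);
}
```

The benefit of `sizeof(*obj)` over `sizeof(something_t)` is that the allocation size stays correct if the type of `obj` changes later.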
Compare what happens when sizeof is given the pointer itself:
something_t* obj = malloc(sizeof(obj));
This allocates only as many bytes as the pointer variable occupies (typically four on a 32-bit platform, eight on a 64-bit one).
something_t* obj = malloc(sizeof(*obj));
This uses the size of the type the pointer points to.
For example,
char *p;
printf("%zu\n",sizeof(p));
This prints the size of a pointer (four on a typical 32-bit platform).
printf("%zu\n",sizeof(*p));
This prints one, the size of a char. So when *p is the operand, sizeof uses the pointed-to type. I don't know whether this is the answer you were expecting.
It's safe. For example,
n = sizeof(*(int *)NULL);
In this case, no null pointer access occurs, because the compiler can calculate the size of the operand without knowing the value of "*(int *)NULL" at run time.
The C89/90 standard guarantees that a 'sizeof' expression is a constant expression; it is translated into a constant (e.g. 0x04) and embedded into the binary at compile time.
In the C99 standard, a 'sizeof' expression is not always a compile-time constant, because of the introduction of variable length arrays. For example,
n = sizeof(int [*(int *)NULL]);
In this case, the value of "*(int *)NULL" needs to be known at run time to calculate the size of 'int[]'.
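The difference between the two cases can be sketched as below. The function names are hypothetical, and the example assumes a compiler that supports variable length arrays (they are optional from C11 onwards).

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of an int array with n elements, computed from a
 * variable length array type; n is only known at run time. */
size_t vla_bytes(size_t n)
{
    assert(n > 0);   /* a VLA must have a positive length */
    return sizeof(int[n]);
}

/* For a non-VLA operand, sizeof does not evaluate the expression,
 * so the increment below never happens. */
int sizeof_leaves_operand_alone(void)
{
    int i = 0;
    size_t s = sizeof(i++);
    return i == 0 && s == sizeof(int);
}
```

Some compilers warn that the operand of `sizeof(i++)` is unevaluated, which is precisely the behaviour being shown.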

Resources