I was very surprised when I saw this notation. What does it do and what kind of C notion is it?
This is a compound literal as defined in section 6.5.2.5 of the C99 standard.
It's not part of the C++ language, so it's not surprising that C++ compilers don't compile it. (or Java or Ada compilers for that matter)
The value of the compound literal is that of an unnamed object initialized by the
initializer list. If the compound literal occurs outside the body of a function, the object
has static storage duration; otherwise, it has automatic storage duration associated with
the enclosing block.
So no, it won't destroy the stack. The compiler allocates storage for the object.
Parentheses are put around the type, which is then followed by an initializer list. It's not a cast, as a bare initialiser list has no meaning in C99 syntax; instead, it is a postfix operator applied to a type which yields an object of the given type. You are not creating { 0, 3 } and casting it to an array, you're initialising an int[2] with the values 0 and 3.
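A minimal sketch of that distinction, assuming a C99 or later compiler: the literal designates a real array object, so sizeof sees the whole array and the address of the object can be taken. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of elements in the unnamed array created by the
   compound literal (int[]){0, 3}. */
size_t literal_length(void)
{
    /* sizeof sees the complete array type int[2], not a pointer. */
    return sizeof((int[]){0, 3}) / sizeof(int);
}

/* Shows that the literal designates an object whose address can be taken. */
int second_element(void)
{
    int (*whole)[2] = &(int[2]){0, 3}; /* pointer to the entire array object */
    int *a = (int[2]){0, 3};           /* array decays to a pointer to its first element */
    assert((*whole)[0] == a[0]);
    return a[1];
}
```

Neither operation would be possible on the result of a cast, which is a plain value rather than an object.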
As to why it's used, I can't see a good reason for it in your single line, although it might be that a could be reassigned to point at some other array, and so it's a shorter way of doing the first two lines of:
int default_a[] = { 0, 2 };
int *a = default_a;
if (some_test) a = get_another_array();
I've found it useful for passing temporary unions to functions
// fills an array of unions with a value
kin_array_fill ( array, ( kin_variant_t ) { .ref = value } );
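The same idea as a self-contained sketch. The variant_t type and variant_fill function below are hypothetical stand-ins, not the kin_* API from the answer above, whose definitions are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant type; only the idea matches the answer above. */
typedef union {
    int   num;
    void *ref;
} variant_t;

/* Fills every slot of an array of unions with the same value. */
void variant_fill(variant_t *array, size_t count, variant_t value)
{
    for (size_t i = 0; i < count; i++)
        array[i] = value;
}

/* Fills a small array using a compound literal argument and returns one slot. */
int fill_demo(void)
{
    variant_t slots[3];
    variant_fill(slots, 3, (variant_t){ .num = 7 });
    assert(slots[0].num == 7);
    return slots[2].num;
}
```

The compound literal saves declaring a named temporary just to pass it once.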
This is a C99 construct, called a compound literal.
From the May 2005 committee draft section 6.5.2.5:
A postfix expression that consists of
a parenthesized type name followed by
a brace-enclosed list of initializers
is a compound literal. It provides an
unnamed object whose value is given by
the initializer list.
...
EXAMPLE 1 The file scope definition
int *p = (int []){2, 4};
initializes p
to point to the first element of an
array of two ints, the first having
the value two and the second, four.
The expressions in this compound
literal are required to be constant.
The unnamed object has static storage
duration.
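The storage-duration rule quoted above can be sketched as follows. The variable and function names are illustrative assumptions, not part of the standard's example.

```c
#include <assert.h>

/* File scope: the unnamed array has static storage duration, and its
   initializers must be constant expressions. */
static int *file_scope = (int[]){2, 4};

/* Adds one to the first element; the change persists between calls
   because the object lives for the whole program. */
int bump_file_scope(void)
{
    return ++file_scope[0];
}

/* Block scope: the unnamed array has automatic storage duration, so the
   initializers may be non-constant and the object is gone once the
   enclosing block exits. */
int sum_block_scope(int x)
{
    int *local = (int[]){x, x + 1};
    assert(local[1] == x + 1);
    return local[0] + local[1];
}
```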
Allocates, on the stack, space for [an array of] two ints.
Populates [the array of] the two ints with the values 0 and 2, respectively.
Declares a local variable of type int* and assigns to that variable the address of [the array of] the two ints.
(int[2]) tells the compiler that the following expression should be cast to int[2]. This is required since {0, 2} can be cast to different types, like long[2]. The cast occurs at compile time, not at runtime.
The entire expression creates an array in memory and sets a to point to this array.
{0, 2} is the notation for an array consisting of 0 and 2.
(int[2]) casts it to an array (don't know why).
int * a = assigns it to the int pointer a.
Related
Given that this is legal
uint8_t bytes[4] = { 1, 2, 3, 4 };
And this is not:
uint8_t bytes2[4];
bytes2 = { 1, 2, 3, 4 };
what does { 1, 2, 3, 4 } represent?
I am assuming it's neither an rvalue nor an lvalue. Is it preprocessor candy that expands to something?
A syntax like {1,2,3,4}; is called a brace-enclosed initializer list; it's an initializer. It can only be used for initialization (here, of an array type).
Quoting C11, chapter §6.7.9
P11
The initializer for a scalar shall be a single expression,
[an array is not a scalar type, so not applicable for us]
P14
An array of character type may be initialized by a character string literal or UTF−8 string
literal, optionally enclosed in braces.
[We are not using a string literal here, so also not applicable for us]
P16
Otherwise, the initializer for an object that has aggregate or union type shall be a brace-enclosed
list of initializers for the elements or named members.
[This is the case of our interest]
and, P17,
Each brace-enclosed initializer list has an associated current object. When no
designations are present, subobjects of the current object are initialized in order according
to the type of the current object: array elements in increasing subscript order, structure
members in declaration order, and the first named member of a union.[....]
So, here, the values from the brace enclosed list are not directly "assigned" to the array, they are used to initialize individual members of the array.
OTOH, an array type, is not a modifiable lvalue, so it cannot be assigned. In other words, an array variable cannot be used as the LHS of the assignment operator.
To elaborate, from C11, chapter §6.5.16
assignment operator shall have a modifiable lvalue as its left operand.
{1,2,3,4} is an initializer-list, a particular syntax token that can only be used on the line where the array is declared.
This is regulated purely by the C standard syntax. There's no particular rationale behind it, this is just how the language is defined. In the C syntax, arrays cannot be assigned to, nor copied by assignment.
You can however dodge the syntax restrictions in several ways, to overwrite all values at once. The simplest way is just to create a temporary array and memcpy:
uint8_t tmp[] = {5,6,7,8};
memcpy(bytes, tmp, sizeof bytes);
Alternatively, use a compound literal:
memcpy(bytes, (uint8_t[]){5,6,7,8}, sizeof bytes);
If it makes sense for the specific application, you can also wrap the array in a struct to bypass the syntax restrictions:
typedef struct
{
uint8_t data [4];
} array_t;
...
array_t bytes = { .data = {1,2,3,4} };
array_t tmp = { .data = {5,6,7,8} };
bytes = tmp; // works just fine, structs can be copied this way
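Two of the workarounds above, put side by side as a rough sketch. The helper names are invented for illustration; array_t mirrors the typedef from the answer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t data[4];
} array_t;

/* Overwrites a plain array in one step by copying from a compound literal.
   The parameter is really a pointer, so the size is spelled out. */
void overwrite_with_memcpy(uint8_t bytes[4])
{
    memcpy(bytes, (uint8_t[4]){5, 6, 7, 8}, 4 * sizeof bytes[0]);
}

/* Overwrites a wrapped array by plain struct assignment. */
void overwrite_with_struct(array_t *bytes)
{
    *bytes = (array_t){ .data = {5, 6, 7, 8} };
}

/* Returns 1 when both approaches leave the same four bytes behind. */
int overwrite_demo(void)
{
    uint8_t plain[4] = {1, 2, 3, 4};
    array_t wrapped  = { .data = {1, 2, 3, 4} };

    overwrite_with_memcpy(plain);
    overwrite_with_struct(&wrapped);

    assert(plain[0] == 5 && plain[3] == 8);
    return memcmp(plain, wrapped.data, sizeof plain) == 0;
}
```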
Initialization and assignment are fundamentally different things. As for the language C, you just have to accept the fact that they are distinguished, but of course, there's a technical reason it's defined this way:
In many systems, you can have a data segment in your executable file. This segment could be read/write and given an initialized array like
uint8_t foo[] = {1, 2, 3, 4}; // assume this has static storage duration
the compiler could just decide to output this exact byte sequence directly into your executable. So there's no code at all doing an assignment, the data is already there in memory when your program starts.
OTOH, arrays cannot be assigned to (only their individual members). That's how C is defined, and it's sometimes unfortunate.
{1,2,3,4} is an initializer list. It can be used to specify the initial value of an object with at least 4 elements, be they array elements or structure members including those of nested objects.
You cannot use this syntax to assign values to an array as this:
bytes2 = {1,2,3,4};
Because the syntax is not supported, and arrays are not modifiable lvalues.
You can use initializer lists as part of a C99 syntax known as compound literals to create objects and use them as rvalues for assignment, return values or function arguments:
struct quad { int x, y, z, t; };
struct quad p;
p = (struct quad){1,2,3,4};
You still cannot use this for arrays because they are not modifiable lvalues, but you can achieve the same effect with a call to memcpy():
uint8_t bytes2[4];
memcpy(bytes2, (uint8_t[4]){1,2,3,4}, sizeof(bytes2));
This statement is compiled by clang as a single Intel instruction, as can be seen on Godbolt's Compiler Explorer.
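The three struct uses named in this answer (assignment, return value, function argument) might look like this. make_quad, quad_sum and quad_demo are hypothetical names added for the sketch.

```c
#include <assert.h>

struct quad { int x, y, z, t; };

/* Compound literal as a return value: the struct is copied out by value. */
struct quad make_quad(int base)
{
    return (struct quad){base, base + 1, base + 2, base + 3};
}

/* Takes the struct by value, so a compound literal works as an argument. */
int quad_sum(struct quad q)
{
    return q.x + q.y + q.z + q.t;
}

int quad_demo(void)
{
    struct quad p;
    p = (struct quad){1, 2, 3, 4};      /* assignment */
    assert(quad_sum(p) == 10);
    return quad_sum(make_quad(10));     /* 10 + 11 + 12 + 13 */
}
```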
In C99 we can use compound literals as unnamed arrays.
But are these literals constants, like for example 100, 'c', 123.4f, etc.?
I noticed that I can do:
((int []) {1,2,3})[0] = 100;
and I get no compilation error, so presumably the first element of that unnamed array is set to 100.
So it seems that an array created as a compound literal is an lvalue and not a constant value.
It is an lvalue, we can see this if we look at the draft C99 standard section 6.5.2.5 Compound literals it says (emphasis mine):
If the type name specifies an array of unknown size, the size is
determined by the initializer list as specified in 6.7.8, and the type
of the compound literal is that of the completed array type. Otherwise
(when the type name specifies an object type), the type of the
compound literal is that specified by the type name. In either case,
the result is an lvalue.
If you want a const version, later on in the same section it gives the following example:
EXAMPLE 4 A read-only compound literal can be specified through
constructions like:
(const float []){1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6}
We can find an explanation of the terminology in the Dr Dobb's article The New C: Compound Literals, which says:
Compound literals are not true constants in that the value of the
literal might change, as is shown later. This brings us to a bit of
terminology. The C99 and C90 Standards [2, 3] use the word “constant”
for tokens that represent truly unchangeable values that are
impossible to modify in the language. Thus, 10 and 3.14 are an integer
decimal constant and a floating constant of type double, respectively.
The word “literal” is used for the representation of a value that
might not be so constant. For example, early C implementations
permitted the values of quoted strings to be modified. C90 and C99
banned the practice by saying that any program that modified a string
literal had undefined behavior, which is the Standard’s way of saying
it might work, or the program might fail in a mysterious way. [...]
As far as I remember you are right, compound literals are lvalues*; you can also take a pointer to such a literal (which points to its first element):
int *p = (int []){1, 2, 3};
*p = 5; /* modified first element */
It is also possible to apply the const qualifier to such a compound literal, so that its elements are read-only:
const int *p = (const int []){1, 2, 3};
*p = 5; /* wrong, violation of `const` qualifier */
*Note that this does not mean it is automatically a modifiable lvalue (so that it can be used as the left operand of the assignment operator), since it has array type. Referring to C99 draft 6.3.2.1 Lvalues, arrays, and function designators:
A modifiable lvalue is an lvalue that does not have array type, [...]
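A small sketch of the difference between modifying an element and assigning to the array as a whole. The lines that a conforming compiler should reject are left as comments so the rest compiles; the function names are illustrative.

```c
#include <assert.h>

int modify_through_pointer(void)
{
    int *p = (int[]){1, 2, 3};
    *p = 5;                     /* allowed: the elements are modifiable ints */
    assert(p[1] == 2);
    return p[0];
}

int read_only_literal(void)
{
    const int *q = (const int[]){1, 2, 3};
    /* *q = 5;  would be rejected: assignment to a const-qualified int */
    /* (int[]){1, 2, 3} = (int[]){4, 5, 6};  also rejected: an array is
       never a modifiable lvalue, const or not */
    return q[0] + q[1] + q[2];
}
```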
Referring to the C11 standard draft N1570:
Section 6.5.2.5p4:
In either case, the result is an lvalue.
An "lvalue" is, roughly, an expression that designates an object -- but it's important to note that not all lvalues are modifiable. A simple example:
const int x = 42;
The name x is an lvalue, but it's not a modifiable lvalue. (Expressions of array type cannot be modifiable lvalues, because you can't assign to an array object, but the elements of an array may be modifiable.)
Paragraph 5 of the same section:
The value of the compound literal is that of an unnamed object
initialized by the initializer list. If the compound literal occurs
outside the body of a function, the object has static storage
duration; otherwise, it has automatic storage duration associated with
the enclosing block.
The section describing compound literals doesn't specifically say whether the unnamed object is modifiable or not. In the absence of such a statement, the object is taken to be modifiable unless the type is const-qualified.
The example in the question:
((int []) {1,2,3})[0] = 100;
is not particularly useful, since there's no way to refer to the unnamed object after the assignment. But a similar construct can be quite useful. A contrived example:
#include <stdio.h>
int main(void) {
int *ptr = (int[]){1, 2, 3};
ptr[0] = 100;
printf("%d %d %d\n", ptr[0], ptr[1], ptr[2]);
}
As mentioned above, the array has automatic storage duration, which means that if it's created inside a function, it will cease to exist when the function returns. Compound literals are not a replacement for malloc.
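To make the lifetime point concrete, here is one possible way to hand such data to a caller safely: copy the literal into heap storage before the block ends. make_defaults and defaults_demo are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of three defaults; the caller must free() it.
   Returning the compound literal's address directly would leave the caller
   with a dangling pointer, since the unnamed array dies with this block. */
int *make_defaults(void)
{
    int *copy = malloc(3 * sizeof *copy);
    if (copy != NULL)
        memcpy(copy, (int[]){1, 2, 3}, 3 * sizeof *copy);
    return copy;
}

/* Returns the middle default, or -1 if the allocation failed. */
int defaults_demo(void)
{
    int *d = make_defaults();
    if (d == NULL)
        return -1;
    assert(d[0] == 1 && d[2] == 3);
    int middle = d[1];
    free(d);
    return middle;
}
```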
Compound literals are lvalues and their elements can be modifiable. You can assign values to the elements, and pointers to compound literals are allowed as well.
I'm programming in C99 and use variable length arrays in one portion of my code. I know in C89 zero-length arrays are not allowed, but I'm unsure of C99 and variable length arrays.
In short, is the following well defined behavior?
int main()
{
int i = 0;
char array[i];
return 0;
}
No, zero-length arrays are explicitly prohibited by the C language, even if they are created as a VLA through a run-time size value (as in your code sample).
6.7.5.2 Array declarators
...
5 If the size is an expression that is not an integer constant
expression: if it occurs in a declaration at function prototype scope,
it is treated as if it were replaced by *; otherwise, each time it is
evaluated it shall have a value greater than zero.
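Since the size must be greater than zero every time it is evaluated, code that takes a run-time size usually needs a guard before the declaration. A minimal sketch, assuming the compiler supports VLAs and that n is small enough for the stack; the function names and the -1 return convention are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the values 0, 1, ..., n-1 using a VLA as scratch space.
   Returns -1 without declaring the array when n is zero, because a VLA
   size that evaluates to zero is undefined behavior. */
long sum_with_vla(size_t n)
{
    if (n == 0)
        return -1;

    long scratch[n];            /* n > 0 is guaranteed here */
    for (size_t i = 0; i < n; i++)
        scratch[i] = (long)i;

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}

int vla_demo(void)
{
    assert(sum_with_vla(0) == -1);
    return (int)sum_with_vla(4);    /* 0 + 1 + 2 + 3 */
}
```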
Zero-length arrays are not allowed in C. Statically typed arrays must have a fixed, non-zero size that is a constant expression, and variable-length-arrays must have a size which evaluates non-zero; C11 6.7.6.2/5:
each time it [the size expression] is evaluated it shall have a value greater than zero
However, C99 and C11 have a notion of a flexible array member of a struct:
struct foo
{
int a;
int data[];
};
From C11, 6.7.2.1/18:
As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. However, when a . (or ->) operator has a left operand that is
(a pointer to) a structure with a flexible array member and the right operand names that
member, it behaves as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object being accessed;
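A flexible array member only becomes usable once the struct is allocated with extra room behind it. One possible allocation pattern, with foo_create and foo_demo as hypothetical helper names:

```c
#include <assert.h>
#include <stdlib.h>

struct foo {
    int a;
    int data[];     /* flexible array member, must be last */
};

/* Allocates a struct foo with room for count ints in data.
   Returns NULL on a negative count or allocation failure;
   the caller frees the result. */
struct foo *foo_create(int count)
{
    if (count < 0)
        return NULL;
    struct foo *f = malloc(sizeof *f + (size_t)count * sizeof f->data[0]);
    if (f == NULL)
        return NULL;
    f->a = count;
    for (int i = 0; i < count; i++)
        f->data[i] = i * i;
    return f;
}

/* Returns the last stored square, or -1 if the allocation failed. */
int foo_demo(void)
{
    struct foo *f = foo_create(4);
    if (f == NULL)
        return -1;
    assert(f->a == 4);
    int last = f->data[3];
    free(f);
    return last;
}
```

Note that sizeof *f does not count the data member, which is why the element storage is added explicitly.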
Zero-length arrays are not allowed in standard C (not even C99 or C11). But gcc does provide an extension to allow it. See http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)
malloc (sizeof (struct line) + this_length);
thisline->length = this_length;
In C we are allowed to assign the value of one structure variable to another if they are of the same type. Accordingly, in the following program I am allowed to use s1=s2 when both are struct variables of the same type. But why am I then not allowed to use s1={59,3.14} after that?
I know we can't assign a string "Test" to a character array arr other than in the initialization statement, because the string "Test" decays to type char* during assignment and hence there is a type mismatch error. But in my program, {59,3.14} doesn't decay to any pointer, does it? Why then is it not allowed to be assigned to s1 even though it is of the same type, especially since it is allowed during initialization? What is the difference between s2 and {59,3.14} such that one is allowed to be assigned to s1 but the other is not?
#include<stdio.h>
int main(void)
{
struct test1
{
int a;
float b;
} s1= {25,3.5},s2= {38,9.25};
printf("%d,%f\n",s1.a,s1.b);
s1=s2; // Successful
printf("%d,%f\n",s1.a,s1.b);
s1= {59,3.14}; //ERROR:expected expression before '{' token|
printf("%d,%f\n",s1.a,s1.b);
}
The C grammar strictly distinguishes between assignment and initialization.
For initialization it is clear what the type on the right side ought to be: the type of the object that is declared. So the initializer notation is unambiguous; { a, b, c } are the fields in declaration order.
For assignment things are less clear. An assignment expression X = Y first evaluates both subexpressions (X and Y), looks at their types and then does the necessary conversions, if possible, from the type of Y to the type of X. An expression of the form { a, b, c } has no type, so the mechanism doesn't work.
The construct that yoones uses in his answer is yet another animal, called a compound literal. This is a way of creating an unnamed auxiliary object of the specified type. You may use it in initializations or any other place where you'd want to use a temporary object. The storage class and lifetime of a compound literal are deduced from the context where it is used. If it is in function scope, it is automatic (on the "stack"), as would be a normal variable declared in the same block, only that it doesn't have a name. If it is used in file scope (initialization of a "global" variable, e.g.) it has static storage duration and a lifetime that is the whole duration of the program execution.
You need to cast it this way: s1 = (struct test1){59, 3.14}; to let the compiler know that it should consider your {...} to be of type struct test1.
Put another way, the data gathered between the braces doesn't have a type, which is why you need to specify one using a cast.
Edit:
The compiler needs to know the expected type for each struct's fields. This is needed to know the right number of bytes for each argument, for padding, etc. Otherwise it could as well copy the value 59 (which is meant to be an int) as a char since it's a value that fits in one byte.
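A compact sketch of the working form, reusing struct test1 from the question. test1_reset and test1_demo are names invented here, and designated initializers are used only to make the field mapping explicit.

```c
#include <assert.h>

struct test1 {
    int   a;
    float b;
};

/* Replaces *s with new values using a compound literal on the right-hand
   side; the parenthesized type name is what gives the braces a type. */
void test1_reset(struct test1 *s)
{
    *s = (struct test1){ .a = 59, .b = 3.14f };
}

int test1_demo(void)
{
    struct test1 s1 = {25, 3.5f}, s2 = {38, 9.25f};
    s1 = s2;                 /* struct-to-struct assignment */
    assert(s1.a == 38);
    test1_reset(&s1);        /* assignment from a compound literal */
    return s1.a;
}
```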
The following compiles and prints "string" as an output.
#include <stdio.h>
struct S { int x; char c[7]; };
struct S bar() {
struct S s = {42, "string"};
return s;
}
int main()
{
printf("%s", bar().c);
}
Apparently this invokes undefined behavior according to
C99 6.5.2.2/5 If an attempt is made to modify the result of a function
call or to access it after the next sequence point, the behavior is
undefined.
I don't understand where it says about "next sequence point". What's going on here?
You've run into a subtle corner of the language.
An expression of array type is, in most contexts, implicitly converted to a pointer to the first element of the array object. The exceptions, none of which apply here, are:
When the array expression is the operand of a unary & operator (which yields the address of the entire array);
When it's the operand of a unary sizeof or (as of C11) _Alignof operator (sizeof arr yields the size of the array, not the size of a pointer); and
When it's a string literal in an initializer used to initialize an array object (char str[6] = "hello"; doesn't convert "hello" to a char*.)
(The N1570 draft incorrectly adds _Alignof to the list of exceptions. In fact, for reasons that are not clear, _Alignof can only be applied to a type name, not to an expression.)
Note that there's an implicit assumption: that the array expression refers to an array object in the first place. In most cases, it does (the simplest case is when the array expression is the name of a declared array object) -- but in this one case, there is no array object.
If a function returns a struct, the struct result is returned by value. In this case, the struct contains an array, giving us an array value with no corresponding array object, at least logically. So the array expression bar().c decays to a pointer to the first element of ... er, um, ... an array object that doesn't exist.
The 2011 ISO C standard addresses this by introducing "temporary lifetime", which applies only to "A non-lvalue expression with structure or union type, where the structure or union
contains a member with array type" (N1570 6.2.4p8). Such an object may not be modified, and its lifetime ends at the end of the containing full expression or full declarator.
So as of C2011, your program's behavior is well defined. The printf call gets a pointer to the first element of an array that's part of a struct object with temporary lifetime; that object continues to exist until the printf call finishes.
But as of C99, the behavior is undefined -- not necessarily because of the clause you quote (as far as I can tell, there is no intervening sequence point), but because C99 doesn't define the array object that would be necessary for the printf to work.
If your goal is to get this program to work, rather than to understand why it might fail, you can store the result of the function call in an explicit object:
const struct S result = bar();
printf("%s", result.c);
Now you have a struct object with automatic, rather than temporary, storage duration, so it exists during and after the execution of the printf call.
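The fix as a complete sketch, assuming the struct S and bar() from the question. bar_text_length is a hypothetical wrapper that measures the string instead of printing it.

```c
#include <assert.h>
#include <string.h>

struct S { int x; char c[7]; };

struct S bar(void)
{
    struct S s = {42, "string"};   /* six characters plus the terminator fill c[7] */
    return s;
}

/* Copies the returned struct into a named object before touching the array,
   so the pointer passed to strlen refers to an object that outlives the call. */
size_t bar_text_length(void)
{
    const struct S result = bar();
    assert(result.x == 42);
    return strlen(result.c);
}
```

This form behaves the same under C99 and C11, since it never relies on the lifetime of the function's return value.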
The sequence point occurs at the end of the full expression, i.e., when printf returns in this example. There are other cases where sequence points occur.
Effectively, this rule states that function temporaries do not live beyond the next sequence point, which in this case occurs well after its use, so your program has quite well-defined behaviour.
Here's a simple example of not well-defined behaviour:
char* c = bar().c; *c = 5; // UB
Here, the sequence point is met after c is created, and the memory it points to is destroyed, but we then attempt to access c, resulting in UB.
In C99 there is a sequence point at the call to a function, after the arguments have been evaluated (C99 6.5.2.2/10).
So, when bar().c is evaluated, it results in a pointer to the first element in the char c[7] array in the struct returned by bar(). However, that pointer gets copied into an argument (a nameless argument as it happens) to printf(), and by the time the call is actually made to the printf() function the sequence point mentioned above has occurred, so the member that the pointer was pointing to may no longer be alive.
As Keith Thompson mentions, C11 (and C++) make stronger guarantees about the lifetime of temporaries, so the behavior under those standards would not be undefined.