Given that this is legal
uint8_t bytes[4] = { 1, 2, 3, 4 };
And this is not:
uint8_t bytes2[4];
bytes2 = { 1, 2, 3, 4 };
what does { 1, 2, 3, 4 } represent?
I'm assuming it's neither an rvalue nor an lvalue. Is it some preprocessor candy that expands to something?
Syntax like {1,2,3,4} is called a brace-enclosed initializer list; it is an initializer, not an expression. It can only be used for initialization (here, of an array type).
Quoting C11, chapter §6.7.9
P11
The initializer for a scalar shall be a single expression,
[an array is not a scalar type, so not applicable for us]
P14
An array of character type may be initialized by a character string literal or UTF-8 string
literal, optionally enclosed in braces.
[We are not using a string literal here, so also not applicable for us]
P16
Otherwise, the initializer for an object that has aggregate or union type shall be a brace-enclosed
list of initializers for the elements or named members.
[This is the case of our interest]
and, P17,
Each brace-enclosed initializer list has an associated current object. When no
designations are present, subobjects of the current object are initialized in order according
to the type of the current object: array elements in increasing subscript order, structure
members in declaration order, and the first named member of a union.[....]
So here, the values from the brace-enclosed list are not "assigned" to the array; they are used to initialize the individual elements of the array.
OTOH, an expression of array type is not a modifiable lvalue, so it cannot be assigned to. In other words, an array variable cannot be used as the left operand of the assignment operator.
To elaborate, from C11, chapter §6.5.16
An assignment operator shall have a modifiable lvalue as its left operand.
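To make the difference concrete, here is a minimal sketch. The helper names (fill_bytes, overwrite_and_sum) are made up for illustration and are not from the question. The braces appear only in the declarations; every later change goes through individual elements, each of which is a modifiable lvalue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: overwrite the elements of dst one at a time. */
static void fill_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];              /* each dst[i] is a modifiable lvalue */
    }
}

/* Returns the sum of the array after it has been overwritten. */
static unsigned overwrite_and_sum(void)
{
    uint8_t bytes[4] = { 1, 2, 3, 4 };        /* initialization: braces allowed */
    const uint8_t next[4] = { 5, 6, 7, 8 };

    /* bytes = next;  would not compile: an array is not a modifiable lvalue */
    fill_bytes(bytes, next, sizeof bytes);    /* uint8_t elements, so 4 bytes */

    unsigned sum = 0;
    for (size_t i = 0; i < sizeof bytes; i++) {
        sum += bytes[i];
    }
    assert(sum == 26u);                       /* 5 + 6 + 7 + 8 */
    return sum;
}
```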
{1,2,3,4} is an initializer list: a syntactic construct (not an expression) that can only appear in a declaration, as the initializer of the object being declared.
This is regulated purely by the C standard syntax. There's no particular rationale behind it, this is just how the language is defined. In the C syntax, arrays cannot be assigned to, nor copied by assignment.
You can however dodge the syntax restrictions in several ways, to overwrite all values at once. The simplest way is just to create a temporary array and memcpy:
uint8_t tmp[] = {5,6,7,8};
memcpy(bytes, tmp, sizeof bytes);
Alternatively, use a compound literal:
memcpy(bytes, (uint8_t[]){5,6,7,8}, sizeof bytes);
If it makes sense for the specific application, you can also wrap the array in a struct to bypass the syntax restrictions:
typedef struct
{
uint8_t data [4];
} array_t;
...
array_t bytes = { .data = {1,2,3,4} };
array_t tmp = { .data = {5,6,7,8} };
bytes = tmp; // works just fine, structs can be copied this way
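A slightly fuller sketch of the struct-wrapper idea, in case it helps. make_array and wrapper_demo are hypothetical names added here; the typedef is the one from the answer. Because the struct is assignable and returnable by value, the array inside travels with it.

```c
#include <assert.h>
#include <stdint.h>

/* Wrapping the array in a struct makes the whole thing assignable. */
typedef struct {
    uint8_t data[4];
} array_t;

/* Hypothetical helper: builds a value and returns it by copy. */
static array_t make_array(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (array_t){ .data = { a, b, c, d } };
}

static void wrapper_demo(void)
{
    array_t bytes = { .data = { 1, 2, 3, 4 } };

    bytes = make_array(5, 6, 7, 8);   /* struct assignment copies all four elements */

    assert(bytes.data[0] == 5);
    assert(bytes.data[3] == 8);
}
```

The cost is that every assignment copies the whole struct, which is fine for four bytes but worth a thought for large arrays.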
Initialization and assignment are fundamentally different things. As for the language C, you just have to accept the fact that they are distinguished, but of course, there's a technical reason it's defined this way:
In many systems, you can have a data segment in your executable file. This segment could be read/write and given an initialized array like
uint8_t foo[] = {1, 2, 3, 4}; // assume this has static storage duration
the compiler could just decide to output this exact byte sequence directly into your executable. So there's no code at all doing an assignment, the data is already there in memory when your program starts.
OTOH, arrays cannot be assigned to (only their individual elements can). That's how C is defined, and it's sometimes unfortunate.
{1,2,3,4} is an initializer list. It can be used to specify the initial value of an object with at least 4 elements, be they array elements or structure members, including those of nested objects.
You cannot use this syntax to assign values to an array as this:
bytes2 = {1,2,3,4};
Because the syntax is not supported, and arrays are not modifiable lvalues.
You can use initializer lists as part of a C99 syntax known as compound literals to create objects and use them as values for assignment, return values or function arguments:
struct quad { int x, y, z, t; };
struct quad p;
p = (struct quad){1,2,3,4};
You still cannot use this for arrays because they are not modifiable lvalues, but you can achieve the same effect with a call to memcpy():
uint8_t bytes2[4];
memcpy(bytes2, (uint8_t[4]){1,2,3,4}, sizeof(bytes2));
Clang compiles this statement to a single x86 instruction, as can be seen on Godbolt's Compiler Explorer.
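Here is a small sketch pulling the three uses together: a compound literal as the right-hand side of an assignment, as a function argument, and as a return value, plus the memcpy() route for arrays. struct quad comes from the answer; quad_sum, quad_unit, reset_bytes and quad_demo are hypothetical names added for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct quad { int x, y, z, t; };

/* Takes the struct by value, so a compound literal can be passed directly. */
static int quad_sum(struct quad q)
{
    return q.x + q.y + q.z + q.t;
}

/* A compound literal used as a return value. */
static struct quad quad_unit(void)
{
    return (struct quad){ 1, 1, 1, 1 };
}

/* Arrays need memcpy() instead of '='. */
static void reset_bytes(uint8_t out[4])
{
    memcpy(out, (uint8_t[4]){ 1, 2, 3, 4 }, 4 * sizeof out[0]);
}

static void quad_demo(void)
{
    struct quad p;
    p = (struct quad){ 1, 2, 3, 4 };          /* assignment from a compound literal */
    assert(quad_sum(p) == 10);
    assert(quad_sum(quad_unit()) == 4);

    uint8_t bytes2[4];
    reset_bytes(bytes2);
    assert(bytes2[0] == 1 && bytes2[3] == 4);
}
```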
This seems like a hole in my knowledge. As far as I am aware, in C99 if you initialise a single element of a struct and no others, the others are zero initialised. Does the following code zero initialise all the members of a struct though?
typedef struct
{
int foo;
int bar;
char* foos;
double dar;
} some_struct_t;
some_struct_t mystructs[100] = {};
Update: There are some comments indicating that this syntax is an extension. If that is the case, is there any way of doing this that is pure C99 compliant?
As per C11, chapter §6.7.9, Initialization syntax, (for the sake of completeness, same mentioned in chapter §6.7.8 in C99)
initializer:
    assignment-expression
    { initializer-list }
    { initializer-list , }
initializer-list:
    designation_opt initializer
    initializer-list , designation_opt initializer
designation:
    designator-list =
designator-list:
    designator
    designator-list designator
designator:
    [ constant-expression ]
    . identifier
This implies that a brace-enclosed initializer list must contain at least one initializer.
In your code, the empty initializer list
some_struct_t mystructs[100] = {}; //empty list
is not valid syntax in C99 or C11; there it is a compiler extension (empty braces were only standardized later, in C23).
You need to mention a single element in the list to make it standard conforming, like
some_struct_t mystructs[100] = {0};
which meets the criteria, from paragraph 21 of same standard(s),
If there are fewer initializers in a brace-enclosed list than there are elements or members
of an aggregate, or fewer characters in a string literal used to initialize an array of known
size than there are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage duration.
So, in this case, you have one explicit 0, and everything else is implicitly initialized the way static objects are: arithmetic members to zero and pointer members to null.
Because it's array initialisation, you would need
some_struct_t mystructs[100] = { 0 }; // ensure all array elements (struct) being zero initialisation
For structs/unions (and arrays) there is a rule saying that if it is partially initialized, the rest of the items that didn't get initialized explicitly by the programmer are set to zero.
So by typing some_struct_t mystructs[100] = {0}; you tell the compiler to explicitly set foo of the first array element to zero. The rest of that struct's members, and all the remaining array elements, are then set to zero implicitly.
This has nothing to do with C99, but works for all C standard versions. In C99/C11, though, a designated initializer such as {.foo=0} would achieve the very same result for a single struct.
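A quick way to convince yourself is a check like the following sketch. The struct is the one from the question; all_zero and check_zero_init are hypothetical helpers added here. It compares members rather than raw bytes, since padding bytes and the representation of a null pointer are not what the rule is about.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int foo;
    int bar;
    char *foos;
    double dar;
} some_struct_t;

/* Returns 1 if every member of every element is zero or null, else 0. */
static int all_zero(const some_struct_t *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s[i].foo != 0 || s[i].bar != 0 ||
            s[i].foos != NULL || s[i].dar != 0.0) {
            return 0;
        }
    }
    return 1;
}

static void check_zero_init(void)
{
    /* One explicit 0; everything else is zeroed implicitly. */
    some_struct_t a[100] = { 0 };

    /* Designated form (C99 and later): element 0, member foo. */
    some_struct_t b[100] = { [0] = { .foo = 0 } };

    assert(all_zero(a, 100));
    assert(all_zero(b, 100));
}
```

Some compilers warn about missing braces on the { 0 } form; the code is still conforming.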
In C language, an array cannot be copied to another array directly by assignment operator.
int arr1[]={1,3,2};
int arr2[]={0};
arr2=arr1;//not possible
Also, we cannot assign values to an array that is already defined, if I am not wrong...
int a[3];
a[]={1,3,2}; //this is not possible
In the code above, a[] and {1,3,2} act as two different arrays, and an assignment operator is used between them. So, is this following the same case mentioned at the first?
Please clarify.
Thanks.
is this following the same case mentioned at the first?
No, they are different.
In the first case, what you try to do is array assignment, which is not possible in C directly. The language defines an array name to be a non-modifiable lvalue, so
arr2 = arr1;
is invalid. The relevant excerpt from the C11 standard draft (§6.3.2.1):
A modifiable lvalue is an lvalue that does not have array type, ...
In the second case
int a[3];
a[]={1,3,2};
you are trying to assign the initializer list {1,3,2} to an invalid sub-expression, a[], and hence the statement is illegal. What is assigned to it is immaterial; a[] = 1 is invalid too.
It's not quite the same.
{1,2,3} is an initializer list. The compiler sees it and hard-codes its values into the array. Assigning an initializer list to an array that already has a definition (int array[3]; array={1,2,3};) is a syntax error: clang says "expected expression", because a brace-enclosed list is not an expression and is only accepted as the initializer in a declaration such as int array[]={1,2,3};.
Assigning one array to another is impossible in your case because, in clang's words, array type 'int [3]' is not assignable. I'd call that something like a logic error (the syntax is valid, but the programmer didn't know the language forbids it).
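For completeness, a sketch of what does work in both cases. copy_demo is a hypothetical name, and arr2 is given three elements here (unlike the one-element arr2 in the question) so that the copy fits.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if both arrays end up holding 1, 3, 2. */
static int copy_demo(void)
{
    int arr1[] = { 1, 3, 2 };
    int arr2[3] = { 0 };              /* same size as arr1, so the copy fits */

    /* arr2 = arr1;  is rejected: an array is not a modifiable lvalue */
    assert(sizeof arr2 == sizeof arr1);
    memcpy(arr2, arr1, sizeof arr2);  /* copy the bytes instead */

    /* Giving an existing array new values: one element at a time. */
    int a[3];
    a[0] = 1;
    a[1] = 3;
    a[2] = 2;

    return memcmp(a, arr2, sizeof a) == 0;
}
```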
The other posts and comments are correct in that you cannot assign a bare array after initialization.
However, you can assign to a struct that contains such an array, even after initialization, using a compound literal,
typedef struct {
int v[3];
} A;
A a = {{1,2,3}}; // Valid.
a = (A){{4,5,6}}; // Also valid.
In C99 we can use compound literals as unnamed arrays.
But are these literals constants, like for example 100, 'c', 123.4f, etc.?
I noticed that I can do:
((int []) {1,2,3})[0] = 100;
and I get no compilation error, so presumably the first element of that unnamed array is set to 100.
So it seems that an array compound literal is an lvalue and not a constant value.
It is an lvalue. We can see this in the draft C99 standard, section 6.5.2.5 Compound literals, which says (emphasis mine):
If the type name specifies an array of unknown size, the size is
determined by the initializer list as specified in 6.7.8, and the type
of the compound literal is that of the completed array type. Otherwise
(when the type name specifies an object type), the type of the
compound literal is that specified by the type name. In either case,
the result is an lvalue.
If you want a const version, later on in the same section it gives the following example:
EXAMPLE 4 A read-only compound literal can be specified through
constructions like:
(const float []){1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6}
We can find an explanation of the terminology in the Dr Dobb's article The New C: Compound Literals, which says:
Compound literals are not true constants in that the value of the
literal might change, as is shown later. This brings us to a bit of
terminology. The C99 and C90 Standards [2, 3] use the word “constant”
for tokens that represent truly unchangeable values that are
impossible to modify in the language. Thus, 10 and 3.14 are an integer
decimal constant and a floating constant of type double, respectively.
The word “literal” is used for the representation of a value that
might not be so constant. For example, early C implementations
permitted the values of quoted strings to be modified. C90 and C99
banned the practice by saying that any program than modified a string
literal had undefined behavior, which is the Standard’s way of saying
it might work, or the program might fail in a mysterious way. [...]
As far as I remember, you are right: compound literals are lvalues*. You can also take a pointer to such a literal (which points to its first element):
int *p = (int []){1, 2, 3};
*p = 5; /* modified first element */
It is also possible to apply the const qualifier to such a compound literal, so that its elements are read-only:
const int *p = (const int []){1, 2, 3};
*p = 5; /* wrong, violation of `const` qualifier */
*Note that this does not mean it is automatically a modifiable lvalue (one that can be used as the left operand of the assignment operator), since it has array type. Referring to C99 draft 6.3.2.1, Lvalues, arrays, and function designators:
A modifiable lvalue is an lvalue that does not have array type, [...]
Referring to the C11 standard draft N1570:
Section 6.5.2.5p4:
In either case, the result is an lvalue.
An "lvalue" is, roughly, an expression that designates an object -- but it's important to note that not all lvalues are modifiable. A simple example:
const int x = 42;
The name x is an lvalue, but it's not a modifiable lvalue. (Expressions of array type cannot be modifiable lvalues, because you can't assign to an array object, but the elements of an array may be modifiable.)
Paragraph 5 of the same section:
The value of the compound literal is that of an unnamed object
initialized by the initializer list. If the compound literal occurs
outside the body of a function, the object has static storage
duration; otherwise, it has automatic storage duration associated with
the enclosing block.
The section describing compound literals doesn't specifically say whether the unnamed object is modifiable or not. In the absence of such a statement, the object is taken to be modifiable unless the type is const-qualified.
The example in the question:
((int []) {1,2,3})[0] = 100;
is not particularly useful, since there's no way to refer to the unnamed object after the assignment. But a similar construct can be quite useful. A contrived example:
#include <stdio.h>
int main(void) {
int *ptr = (int[]){1, 2, 3};
ptr[0] = 100;
printf("%d %d %d\n", ptr[0], ptr[1], ptr[2]);
}
As mentioned above, the array has automatic storage duration associated with the enclosing block, which means that if it's created inside a function, it ceases to exist when execution leaves that block (at the latest, when the function returns). Compound literals are not a replacement for malloc.
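To illustrate the lifetime point, here is a sketch of one way to hand the values to a caller safely: copy the literal into heap storage. make_triplet and triplet_demo are hypothetical names; the broken variant is shown only inside a comment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* WRONG (comment only): the unnamed array dies with the enclosing block,
 * so the caller would receive a dangling pointer.
 *
 *     int *bad(void) { return (int[]){ 1, 2, 3 }; }
 */

/* Copies the literal into heap storage that outlives the call.
 * The caller owns the result and must free() it. Returns NULL on failure. */
static int *make_triplet(void)
{
    int *p = malloc(3 * sizeof *p);
    if (p != NULL) {
        memcpy(p, (int[]){ 1, 2, 3 }, 3 * sizeof *p);
    }
    return p;
}

static void triplet_demo(void)
{
    int *t = make_triplet();
    if (t == NULL) {
        return;                       /* allocation failed, nothing to show */
    }
    t[0] = 100;
    assert(t[0] + t[1] + t[2] == 105);
    free(t);
}
```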
Compound literals are lvalues and their elements can be modifiable. You can assign values to those elements, and pointers to compound literals are allowed as well.
I'm programming in C99 and use variable length arrays in one portion of my code. I know in C89 zero-length arrays are not allowed, but I'm unsure of C99 and variable length arrays.
In short, is the following well defined behavior?
int main()
{
int i = 0;
char array[i];
return 0;
}
No, zero-length arrays are explicitly prohibited by the C language, even if they are created as a VLA through a run-time size value (as in your code sample).
6.7.5.2 Array declarators
...
5 If the size is an expression that is not an integer constant
expression: if it occurs in a declaration at function prototype scope,
it is treated as if it were replaced by *; otherwise, each time it is
evaluated it shall have a value greater than zero.
Zero-length arrays are not allowed in C. Fixed-size arrays must have a non-zero size given by a constant expression, and variable-length arrays must have a size that evaluates to a non-zero value; C11 6.7.6.2/5:
each time it [the size expression] is evaluated it shall have a value greater than zero
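In practice that means guarding the size before the declaration is reached. A minimal sketch, assuming the compiler supports VLAs at all (they are optional in C11; an implementation without them defines __STDC_NO_VLA__). zeroed_scratch and scratch_demo are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the zero bytes in a freshly cleared scratch VLA of n bytes.
 * Returns -1 without declaring the array when n is 0, because a VLA
 * whose size evaluates to 0 has undefined behavior. */
static long zeroed_scratch(size_t n)
{
    if (n == 0) {
        return -1;
    }

    char array[n];                    /* n > 0 is guaranteed here */
    memset(array, 0, n);              /* a VLA cannot take a brace initializer */

    long zeros = 0;
    for (size_t i = 0; i < n; i++) {
        zeros += (array[i] == 0);
    }
    return zeros;
}

static void scratch_demo(void)
{
    assert(zeroed_scratch(0) == -1);
    assert(zeroed_scratch(8) == 8);
}
```

A real program would also want an upper bound on n, since a VLA that is too large for the stack fails without any diagnostic.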
However, C99 and C11 have a notion of a flexible array member of a struct:
struct foo
{
int a;
int data[];
};
From C11, 6.7.2.1/18:
As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. However, when a . (or ->) operator has a left operand that is
(a pointer to) a structure with a flexible array member and the right operand names that
member, it behaves as if that member were replaced with the longest array (with the same
element type) that would not make the structure larger than the object being accessed;
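A sketch of how such a struct is typically allocated and used. struct foo is the one shown above; foo_new and foo_demo are hypothetical names, and the overflow check on the size computation is left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>

struct foo {
    int a;
    int data[];                       /* flexible array member, must be last */
};

/* Allocates a struct foo with room for n ints in data.
 * Returns NULL on allocation failure; the caller frees the result. */
static struct foo *foo_new(size_t n)
{
    /* sizeof *f does not count data, so its storage is added explicitly. */
    struct foo *f = malloc(sizeof *f + n * sizeof f->data[0]);
    if (f == NULL) {
        return NULL;
    }
    f->a = (int)n;
    for (size_t i = 0; i < n; i++) {
        f->data[i] = (int)i * 10;
    }
    return f;
}

static void foo_demo(void)
{
    struct foo *f = foo_new(3);
    if (f == NULL) {
        return;
    }
    assert(f->a == 3);
    assert(f->data[2] == 20);
    free(f);
}
```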
Zero-length arrays are not allowed in standard C (not even C99 or C11), but gcc provides an extension that allows them. See http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)
malloc (sizeof (struct line) + this_length);
thisline->length = this_length;
I was very surprised when I saw this notation. What does it do and what kind of C notion is it?
This is a compound literal as defined in section 6.5.2.5 of the C99 standard.
It's not part of the C++ language, so it's not surprising that C++ compilers don't compile it. (or Java or Ada compilers for that matter)
The value of the compound literal is that of an unnamed object initialized by the
initializer list. If the compound literal occurs outside the body of a function, the object
has static storage duration; otherwise, it has automatic storage duration associated with
the enclosing block.
So no, it won't destroy the stack. The compiler allocates storage for the object.
Parentheses are put around the type and it is then followed by an initializer list. It's not a cast, as a bare initialiser list has no meaning in C99 syntax; instead, it is a postfix operator applied to a type which yields an object of the given type. You are not creating { 0, 2 } and casting it to an array, you're initialising an int[2] with the values 0 and 2.
As to why it's used, I can't see a good reason for it in your single line, although it might be that a could be reassigned to point at some other array, and so it's a shorter way of doing the first two lines of:
int default_a[] = { 0, 2 };
int *a = default_a;
if (some_test) a = get_another_array();
I've found it useful for passing temporary unions to functions
// fills an array of unions with a value
kin_array_fill ( array, ( kin_variant_t ) { .ref = value } )
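Since kin_variant_t and kin_array_fill are not shown in the answer, here is a sketch with stand-ins. variant_t, variant_fill and variant_demo are hypothetical and only guess at the general shape; the explicit length parameter is an assumption too.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kin_variant_t union in the answer. */
typedef union {
    int   num;
    void *ref;
} variant_t;

/* Hypothetical stand-in for kin_array_fill: copies v into every slot. */
static void variant_fill(variant_t *array, size_t n, variant_t v)
{
    for (size_t i = 0; i < n; i++) {
        array[i] = v;                 /* unions, like structs, are assignable */
    }
}

static void variant_demo(void)
{
    int value = 42;
    variant_t slots[4];

    /* The designated initializer picks which union member is set;
     * no named temporary is needed at the call site. */
    variant_fill(slots, 4, (variant_t){ .ref = &value });

    assert(slots[0].ref == &value);
    assert(slots[3].ref == &value);
}
```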
This is a C99 construct, called a compound literal.
From the May 2005 committee draft section 6.5.2.5:
A postfix expression that consists of a parenthesized type name followed by a brace-enclosed list of initializers is a compound literal. It provides an unnamed object whose value is given by the initializer list.
...
EXAMPLE 1 The file scope definition
int *p = (int []){2, 4};
initializes p to point to the first element of an array of two ints, the first having the value two and the second, four. The expressions in this compound literal are required to be constant. The unnamed object has static storage duration.
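The standard's example can be tried directly. In this sketch the definition of p is the one from EXAMPLE 1 (with static added to keep it file-local); get_pair and pair_demo are hypothetical names. Because the unnamed array has static storage duration, handing out a pointer to it from a function is fine.

```c
#include <assert.h>

/* File scope: the unnamed array has static storage duration, and its
 * initializers must be constant expressions. */
static int *p = (int[]){ 2, 4 };

/* Returning p is safe because the array lives for the whole program. */
static int *get_pair(void)
{
    return p;
}

static void pair_demo(void)
{
    int *q = get_pair();
    assert(q[0] == 2 && q[1] == 4);

    q[1] = 40;                        /* not const-qualified, so modifiable */
    assert(p[1] == 40);               /* same object, seen through p */
}
```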
Allocates, on the stack, space for [an array of] two ints.
Populates [the array of] the two ints with the values 0 and 2, respectively.
Declares a local variable of type int* and assigns to that variable the address of [the array of] the two ints.
(int[2]) tells the compiler the type of the object that the following brace-enclosed list initializes. This is required since {0, 2} has no type of its own; the same list could just as well initialize a long[2]. Despite the parentheses, this is not a cast: it is part of the compound-literal syntax.
The entire expression creates an array in memory and sets a to point to this array.
{0, 2} is the initializer list giving the array's two elements, 0 and 2.
(int[2]) gives the unnamed array its type; it looks like a cast, but it is part of the compound-literal syntax.
int * a = assigns it to the int pointer a.