int q = {1,2}; peculiar initialization list - c

I came across the initialization below. VS2012 reports an error
complaining about too many initializers, whereas GCC seems to
use the first element as the value.
Why is this peculiar initialization supported in GCC?
#include <stdio.h>
int main()
{
int q = {1,2};
char c = {'s','t','\0'}; /* c is 's' */
printf("%d\n",q); /* prints 1*/
}

C11: 6.7.9 Initialization (p11):
The initializer for a scalar shall be a single expression, optionally enclosed in braces.
Therefore, this is allowed
int q = {1};
You can enclose the initializer for scalar objects in braces ({}). Note the verb shall is used here. The standard says:
5.1.1.3 Diagnostics (P1):
A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined
So, it is up to the compiler how it handles
int q = {1,2};
Compiled on GCC 4.8.1 with flags -pedantic -Wall -Wextra and it raised a warning
[Warning] excess elements in scalar initializer [enabled by default]
Now the question is: What happened to the remaining initializers?
It's a bug.
Note: C11: 6.5.17 (p3) says that the comma operator cannot appear in contexts where a comma is used to separate items in a list (such as arguments to functions or lists of initializers).
Do not confuse the , in {1,2} with the comma operator. As Keith Thompson pointed out, the expression in an initializer must be an assignment-expression, so it must not contain a comma operator at the top level. The comma operator can still be used within a parenthesized expression or within the second expression of a conditional operator in such contexts. In the function call
f(a, (t=3, t+2), c)
the function has three arguments, the second of which has the value 5.
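The distinction between a comma that separates list items and the comma operator can be sketched as follows. This is only an illustration under stated assumptions: the function names (second_of_three, braced_scalar, parenthesized_comma, comma_in_argument) are hypothetical and do not come from the answer, and the assertions record the values the rules above lead one to expect.

```c
#include <assert.h>

/* Hypothetical helper: returns its second argument so the value can be observed. */
int second_of_three(int a, int b, int c)
{
    (void)a;
    (void)c;
    return b;
}

/* One expression, optionally enclosed in braces, initializes a scalar. */
int braced_scalar(void)
{
    int q = {1};
    assert(q == 1);
    return q;
}

/* Inside parentheses the comma is the comma operator, so the braces
   hold a single expression, (1, 2), whose value is 2. */
int parenthesized_comma(void)
{
    int q = {(1, 2)};
    assert(q == 2);
    return q;
}

/* Mirrors f(a, (t=3, t+2), c): three arguments, the second of which is 5. */
int comma_in_argument(void)
{
    int t;
    int result = second_of_three(0, (t = 3, t + 2), 0);
    assert(result == 5);
    return result;
}
```

A compiler with warnings enabled may note that the left operand of (1, 2) has no effect. That is expected here, since the expression exists only to show which comma is an operator.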

Related

MSVC: why "extern void x;" is "illegal use of type 'void'"?

Why does this code:
extern void x;
lead to:
$ cl t555.c /std:c11 /Za
t555.c(1): error C2182: 'x': illegal use of type 'void'
What is illegal here?
Update. Use case:
$ cat t555a.c t555.p.S
#include <stdio.h>
extern void x;
int main(void)
{
printf("%p\n", &x);
return 0;
}
.globl x
x:
.space 4
$ gcc t555a.c -std=c11 -pedantic -Wall -Wextra -c && as t555.p.S -o t555.p.o && gcc t555a.o t555.p.o && ./a.exe
t555a.c: In function ‘main’:
t555a.c:7:20: warning: taking address of expression of type ‘void’
7 | printf("%p\n", &x);
| ^
0x1004010c0
$ clang t555a.c -std=c11 -pedantic -Wall -Wextra -c && as t555.p.S -o t555.p.o && clang t555a.o t555.p.o && ./a.exe
t555a.c:7:20: warning: ISO C forbids taking the address of an expression of type 'void' [-Wpedantic]
printf("%p\n", &x);
^~
1 warning generated.
00007FF76E051120
This is an interesting case. It does not appear to violate any constraints to declare an identifier x of type void with external linkage, but it is nearly unusable.
void “is an incomplete object type that cannot be completed” (C 2018 6.2.5 19). When an identifier for an object is declared with no linkage, the type must “be complete by the end of its declarator” (6.7 7). But the same is not true for identifiers with external linkage; we can declare extern int a[]; extern struct foo b; and define a and b later, even in another translation unit.
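The point about declaring identifiers with external linkage before their types are complete can be sketched like this. The names a, b and struct foo are taken from the paragraph above; the function names and the member name are assumptions made for illustration.

```c
#include <assert.h>

extern int a[];        /* incomplete array type: size unknown here */
extern struct foo b;   /* incomplete structure type: members unknown here */

/* Using the identifiers is fine as long as the missing parts are not needed. */
int *first_element_of_a(void)
{
    return a;          /* the array decays to int * */
}

struct foo *address_of_b(void)
{
    return &b;         /* b is an lvalue even though its type is incomplete */
}

/* The types are completed and the objects defined later; this could
   equally well happen in another translation unit. */
struct foo { int member; };
int a[3] = {10, 20, 30};
struct foo b = {7};

int read_through_pointers(void)
{
    int sum = first_element_of_a()[2] + address_of_b()->member;   /* 30 + 7 */
    assert(sum == 37);
    return sum;
}
```

Unlike int[] or struct foo, the type void can never be completed, which is why no later definition of x is possible in C.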
If x is not used, I do not see that it violates any constraint. If the program attempted to use it, then 6.9 5 would apply:
… If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
But we cannot define x in C code because it has an incomplete type, and its type cannot be completed. As long as it is not defined, we cannot use x in an expression other than as the operand of sizeof or _Alignof, due to the above paragraph, and neither can we use it with sizeof or _Alignof, because those operators require a complete type.
We could imagine that x is defined outside of C and linked with this C code. So some assembly module might provide a definition for x that is unknown to the C code. Of course, the C code cannot use the value of the object without having a definition for the type. But it could use the address of x. For example, it could serve as a sentinel or other token for pointer values. E.g., we could pass a list of lists of pointers to another routine as a list of pointers where the sublists were separated by &x and the end of the whole list was marked by a null pointer. (So two sublists (&a, &b, &c) and (&d, &e, &f) would be passed as (void *[]) { &a, &b, &c, &x, &d, &e, &f, NULL };.)
However, compiling printf("%p\n", &x); with Clang and using -pedantic produces the warning “ISO C forbids taking the address of an expression of type 'void'”. The core reason for this appears to be that 6.3.2.1 1 excludes an object of void type from being an lvalue:
An lvalue is an expression (with an object type other than void) that potentially designates an object;…
and 6.5.3.2 1 requires the operand of unary & to be an lvalue:
The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object…
This is likely an incompletely designed part of the C standard, as it does not preclude a const void from being an lvalue, and Clang compiles extern const void x; printf("%p\n", &x); without complaint, but there seems to be no reason for the standard to treat const void and void differently in this regard.
On the one hand, Microsoft may have concluded there is no way to use this x and so are issuing a diagnostic for it as soon as the extern void x is found rather than letting an error happen when code attempts to use this x. However, while a compiler is free to issue additional diagnostic messages, it ought to accept a conforming program. That is, for a compiler that conforms to the C standard, the diagnostic may be a warning but may not be an error that prevents compilation.
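The sentinel idea described earlier in this answer (sublists separated by &x, with a null pointer ending the whole list) can be sketched in portable C. This is a variant under a stated assumption: an ordinary char object named separator stands in for the externally defined x, so that taking its address raises no diagnostic. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: an ordinary object replaces the externally defined x.
   Its value is never read; only its address is used as a marker. */
static char separator;

/* Counts the sublists in a NULL-terminated list whose sublists are
   separated by &separator. */
size_t count_sublists(void *const list[])
{
    size_t count = 0;
    int inside = 0;                       /* nonzero while inside a sublist */

    for (size_t i = 0; list[i] != NULL; i++) {
        if (list[i] == (void *)&separator) {
            inside = 0;                   /* the current sublist ends here */
        } else if (!inside) {
            inside = 1;                   /* first element of a new sublist */
            count++;
        }
    }
    return count;
}

/* Builds the two-sublist example from the text and counts its sublists. */
size_t count_example(void)
{
    int a, b, c, d, e, f;
    void *list[] = { &a, &b, &c, &separator, &d, &e, &f, NULL };
    size_t n = count_sublists(list);
    assert(n == 2);
    return n;
}
```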
Supplementary Note
Noting that the constraint for unary & allows “the result of a [] or unary * operator”, I tested this:
static void foo(void *p)
{
printf("%p\n", &*p);
}
Here, *p by itself is an lvalue of type void, and this is allowed for & because the constraint specifically allows it, while &x would seem to be a very similar expression, taking the address of a void, but the constraint does not allow it since x is neither an lvalue nor a result of *. Curious.

C Pre-Processor Macro code with () and {}

#include <stdio.h>
#define a (1,2,3)
#define b {1,2,3}
int main()
{
unsigned int c = a;
unsigned int d = b;
printf("%d\n",c);
printf("%d\n",d);
return 0;
}
The C code above prints 3 and 1.
But how do #define a (1,2,3) and #define b {1,2,3} end up giving c=3 and d=1 without a build warning, and why do () and {} give different values?
Remember, the preprocessor just replaces macros. So in your case your code will be converted to this:
#include <stdio.h>
int main()
{
unsigned int c = (1,2,3);
unsigned int d = {1,2,3};
printf("%d\n",c);
printf("%d\n",d);
return 0;
}
In the first case, you get the result of the , operator, so c will be equal to 3. In the second case, you get the first member of the initializer list for d, so you will get 1 as the result.
The second line produces an error if you compile the code as C++, but it seems that you can compile this code as C.
In addition to other answers,
unsigned int d = {1,2,3};
(after macro substitution)
is not valid in C. It violates 6.7.9 Initialization:
No initializer shall attempt to provide a value for an object not contained within the entity being initialized.
With stricter compilation options (gcc -std=c17 -Wall -Wextra -pedantic test.c), gcc produces:
warning: excess elements in scalar initializer
unsigned int d = {1,2,3};
^
However, note that
unsigned int d = {1};
is valid because initializing a scalar with braces is allowed. It is just the extra initializer values that are the problem with the former snippet.
For c, the initializer is an expression, and its value is 3. For d, the initializer is a list in braces, and it provides too many values, of which only the first is used.
After macro expansion, the definitions of c and d are:
unsigned int c = (1,2,3);
unsigned int d = {1,2,3};
In the C grammar, the initializer that appears after unsigned int c = or unsigned int d = may be either an assignment-expression or { initializer-list } (and may have a final comma in that list). (This comes from C 2018 6.7.9 1.)
In the first line, (1,2,3) is an assignment-expression. In particular, it is a primary-expression of the form ( expression ). In that, the expression uses the comma operator; it has the form expression , assignment-expression. I will omit the continued expansion of the grammar. Suffice it to say that 1,2,3 is an expression built with comma operators, and the value of the comma operator is simply its right-hand operand. So the value of 1,2 is 2, and the value of 1,2,3 is 3. And the value of the parentheses expression is the value of the expression inside it, so the value of (1,2,3) is 3. Therefore, c is initialized to 3.
In contrast, in the second line, {1,2,3} is { initializer-list }. According to the text in C clause 6.7.9, the initializer-list provides values used to initialize the object being defined. The { … } form is provided to initialize arrays and structures, but it can be used to initialize scalar objects too. If we wrote unsigned int d = {1};, this would initialize d to 1.
However, 6.7.9 2 is a constraint that says “No initializer shall attempt to provide a value for an object not contained within the entity being initialized.” This means you may not provide more initial values than there are things to be initialized. Therefore, unsigned int d = {1,2,3}; violates the constraint. A compiler is required to produce a diagnostic message. Additionally, your compiler seems to have gone on and used only the first value in the list to initialize d. The others were superfluous and were ignored.
(Additionally, 6.7.9 11 says “The initializer for a scalar shall be a single expression, optionally enclosed in braces.”)
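As a small sketch of the two forms discussed above: the parenthesized macro is a single expression built from comma operators, while the braced macro is an initializer list, which fits an array of three elements without any excess. The macro names PAREN_LIST and BRACE_LIST and the function names are assumptions chosen for this illustration; the question used the names a and b.

```c
#include <assert.h>

#define PAREN_LIST (1,2,3)   /* expands to a parenthesized comma expression */
#define BRACE_LIST {1,2,3}   /* expands to a brace-enclosed initializer list */

unsigned int scalar_from_parens(void)
{
    unsigned int c = PAREN_LIST;      /* comma operators: ((1,2),3) yields 3 */
    assert(c == 3);
    return c;
}

unsigned int sum_of_array_from_braces(void)
{
    unsigned int d[] = BRACE_LIST;    /* one initializer per element, none in excess */
    unsigned int sum = 0;

    for (unsigned int i = 0; i < sizeof d / sizeof d[0]; i++)
        sum += d[i];
    assert(sum == 6);
    return sum;
}
```

Used this way, the braced list violates no constraint, because every initializer has an array element to go to.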

Precedence of assignment [duplicate]

In C,
main() {
int a = (1,2,3,4);
printf("%d",a);
}
yields an output of
4
This is because the comma (,) operator evaluates its operands from left to right and yields the value of the last one.
But
main() {
int a = {1,2,3,4};
printf("%d",a);
}
yields an output
1
Can anyone please explain the logic behind this?
{1,2,3,4} is the syntax for an initializer, used to initialize something that has more than one value, like an array or a struct. So the commas inside it are not operators, they are just part of the initializer syntax.
When initializing, C uses the values from the initializer from the left. As you initialize a single scalar variable, only one element is needed.
Your compiler is supposed to tell you that this code doesn't make sense:
x.c:2:12: warning: excess elements in scalar initializer
int a = {1,2,3,4};
^
I should add that this code violates a constraint of the standard, see C11 draft N1570, § 6.7.9 -- 2:
No initializer shall attempt to provide a value for an object not contained within the entity being initialized.
This requires the compiler to emit a diagnostic when compiling such broken code.
From the same section comes the following rule (paragraph 17):
Each brace-enclosed initializer list has an associated current object. When no
designations are present, subobjects of the current object are initialized in order according
to the type of the current object: array elements in increasing subscript order, structure
members in declaration order, and the first named member of a union.
So your compiler decides to do the "next closest" thing to the standard and just use the first value you provide.
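The ordering rule quoted above can be sketched for the three aggregate and union cases it names. The type names (struct pair, union number) and the function names are hypothetical, introduced only for this illustration.

```c
#include <assert.h>

struct pair { int first; int second; };
union number { int i; float f; };

/* Array elements are initialized in increasing subscript order. */
int array_in_subscript_order(void)
{
    int v[3] = {1, 2, 3};
    assert(v[0] == 1 && v[1] == 2 && v[2] == 3);
    return v[0] * 100 + v[1] * 10 + v[2];   /* 123 */
}

/* Structure members are initialized in declaration order. */
int struct_in_declaration_order(void)
{
    struct pair p = {1, 2};
    assert(p.first == 1 && p.second == 2);
    return p.first * 10 + p.second;         /* 12 */
}

/* For a union, the initializer goes to the first named member. */
int union_first_named_member(void)
{
    union number n = {7};
    assert(n.i == 7);
    return n.i;
}
```

A scalar has no subobjects to walk through, so only a single initializer has anywhere to go.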
That's an initializer (usually used for arrays), but you are not using it wisely. Didn't your compiler tell you?
Georgioss-MacBook-Pro:~ gsamaras$ gcc -Wall main.c
main.c:2:13: warning: excess elements in scalar initializer
int a = {1,2,3,4};
^
The initializers are consumed from left to right, and since you have only one element to initialize, one element is taken from the initializer (that is, 1 in this case).
The curly braces mean initialization of the variable, mostly useful for arrays.
In your case, compiling with gcc yields:
test.c:6:12: warning: excess elements in scalar initializer
int a = {1,2,3,4};
^
(and the same warning for 3 and 4)
This means that only the value 1 is used in your case, since a is a scalar.

Declare and use variable in same statement

In C is it valid to use a variable in the same statement in which it is declared?
In both gcc 4.9 and clang 3.5 the following program compiles and runs without error:
#include "stdio.h"
int main() {
int x = x;
printf("%d\n", x);
}
In gcc it outputs 0 and in clang 32767 (which is the largest positive value of a signed 2-byte integer).
Why does this not cause a compilation error? Is this valid in any particular C specification? Is its behavior explicitly undefined?
int x = x;
This is "valid" in the sense that it doesn't violate a constraint or syntax rule, so no compile-time diagnostic is required. The name x is visible within the initializer, and refers to the object being declared. The scope is defined in N1570 6.2.1 paragraph 7:
Any other identifier [other than a struct, union, or enum tag, or
an enum constant] has scope that begins just after the completion of
its declarator.
The declarator in this case is x (int is the declaration specifier), so the name is in scope by the time the initializer is reached.
This allows for things like:
int x = 10, y = x + 1;
But the declaration has undefined behavior, because the initializer refers to an object that hasn't been initialized.
The explicit statement that the behavior is undefined is in N1570 6.3.2.1 paragraph 2, which describes the "conversion" of an lvalue (an expression that designates an object) to the value stored in that object.
Except when [list of cases that don't apply here], an
lvalue that does not have array type is converted to the value stored
in the designated object (and is no longer an lvalue); this is called
lvalue conversion.
[...]
If the lvalue designates an object of automatic storage duration that
could have been declared with the register storage class (never
had its address taken), and that object is uninitialized (not declared
with an initializer and no assignment to it has been performed prior
to use), the behavior is undefined.
The object in question is x, referenced in the initializer. At that point, no value has been assigned to x, so the expression has undefined behavior.
In practice, you'll probably get a compile-time warning if you enable a high enough warning level. The actual behavior might be the same as if you had omitted the initializer:
int x;
but don't count on it.
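The scope rule itself is harmless and even useful. The following sketch shows uses of a name inside its own declaration that do not read an uninitialized value and are therefore well defined. The struct node type and the function names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *next; int value; };

/* A later declarator in the same declaration sees the earlier one. */
int earlier_declarator_is_visible(void)
{
    int x = 10, y = x + 1;
    assert(y == 11);
    return y;
}

/* The initializer uses only the address of n, never its stored value. */
int address_in_own_initializer(void)
{
    struct node n = { &n, 42 };
    assert(n.next == &n);
    return n.next->value;   /* 42, read after initialization is complete */
}

/* The operand of sizeof is not evaluated here, so no value is read. */
size_t size_in_own_initializer(void)
{
    size_t n = sizeof n;
    assert(n == sizeof(size_t));
    return n;
}
```

What makes int x = x; undefined is the lvalue conversion of the uninitialized x, not the fact that the name appears in its own initializer.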
According to the language specification
6.7.8.10 If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
Further, it says
6.7.8.11 The initializer for a scalar shall be a single expression, optionally enclosed in braces. The initial value of the object is that of the expression (after conversion).
Hence, the value of the initializer expression (x to the right of =) is indeterminate, so we are dealing with undefined behavior, because the initializer reads from the variable x, which has an indeterminate value.
Various compilers provide warning settings to catch such conditions.
int x = x;
is cause for undefined behavior. Don't count on any predictable behavior.
Clang does warn about this:
$ clang -c -Wall ub_or_not_ub.c
ub_or_not_ub.c:4:11: warning: variable 'x' is uninitialized when used within its own initialization [-Wuninitialized]
int x = x;
~ ^
So I guess it's undefined behavior.

static struct initialization in c99

I have encountered a strange behaviour when using compound literals for static struct initialization in GCC in c99/gnu99 modes.
Apparently this is fine:
struct Test
{
int a;
};
static struct Test tt = {1}; /* 1 */
However, this is not:
static struct Test tt = (struct Test) {1}; /* 2 */
This triggers following error:
initializer element is not constant
This does not help either:
static struct Test tt = (const struct Test) {1}; /* 3 */
I do understand that the initializer value for a static struct should be a compile-time constant. But I do not understand why this simplest of initializer expressions is no longer considered constant. Is this defined by the standard?
The reason I'm asking is that I have encountered some legacy code written for GCC in gnu90 mode that used such a compound literal construct for static struct initialization (2). Apparently this was a GNU extension at the time, which was later adopted by C99.
As a result, code that compiled successfully with GNU90 can be compiled with neither C99 nor even GNU99.
Why would they do this to me?
This is/was a gcc bug (HT to cremno), the bug report says:
I believe we should just allow initializing objects with static
storage duration with compound literals even in gnu99/gnu11. [...]
(But warn with -pedantic.)
We can see from the gcc document on compound literals that initialization of objects with static storage duration should be supported as an extension:
As a GNU extension, GCC allows initialization of objects with static
storage duration by compound literals (which is not possible in ISO
C99, because the initializer is not a constant).
This is fixed in gcc 5.2. So, in gcc 5.2 you will only get a warning when using the -pedantic flag; without -pedantic it does not complain.
Using -pedantic means that gcc should provide diagnostics as the standard requires:
to obtain all the diagnostics required by the standard, you should
also specify -pedantic (or -pedantic-errors if you want them to be
errors rather than warnings)
A compound literal is not a constant expression as covered by the C99 draft standard section 6.6 Constant expressions. We see from section 6.7.8 Initialization that:
All the expressions in an initializer for an object that has static storage duration shall be
constant expressions or string literals.
gcc is allowed to accept other forms of constant expressions as an extension, from section 6.6:
An implementation may accept other forms of constant expressions.
It is interesting to note that clang does not complain about this when using -pedantic.
The C language relies on an exact definition of what a constant expression is. Just because something looks "known at compile time" does not mean that it satisfies the formal definition of a constant expression.
The C language does not define constant expressions of non-scalar types. It allows implementations to introduce their own kinds of constant expressions, but the ones defined by the standard are restricted to scalar types only.
In other words, C language does not define the concept of constant expression for your type struct Test. Any value of struct Test is not a constant. Your compound literal (struct Test) {1} is not a constant (and is not a string literal) and, for this reason, it cannot be used as an initializer for objects with static storage duration. Adding a const qualifier to it will not change anything since in C const qualifier has no relation whatsoever to the concept of constant expression. It will never make any difference in such contexts.
Note that your first variant does not involve a compound literal at all. It uses a raw { ... } initializer syntax with constant expressions inside. This is explicitly allowed for objects with static storage duration.
So, in the most restrictive sense, the initialization with a compound literal is illegal, while the initialization with ordinary { ... } initializer is fine. Some compilers might accept compound literal initialization as an extension. (By extending the concept of constant expression or by taking some other extension path. Consult compiler documentation to figure out why it compiles.)
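The difference between "const" and "constant expression" can be sketched as follows. The struct Test type comes from the question; the names LIMIT, limit_object, from_enum, from_address and sum_of_statics are hypothetical. The rejected form is left in a comment so that the example stays compilable.

```c
#include <assert.h>

struct Test { int a; };

enum { LIMIT = 5 };                    /* enumeration constant: an integer constant expression */
static const int limit_object = 5;    /* const-qualified object, not a constant expression */

static struct Test tt = {1};                       /* plain brace initializer holding a constant */
static int from_enum = LIMIT + 1;                  /* arithmetic constant expression */
static const int *from_address = &limit_object;    /* address constant */

/* static int from_object = limit_object;
   ISO C does not require this to be accepted: the initializer reads an
   object, so it is not a constant expression, const-qualified or not. */

int sum_of_statics(void)
{
    int sum = tt.a + from_enum + *from_address;    /* 1 + 6 + 5 */
    assert(sum == 12);
    return sum;
}
```

Some compilers accept the commented-out line as an extension, which the standard permits, so portable code should not rely on either outcome.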
Interestingly, clang does not complain about this code, even with the -pedantic-errors flag.
This is most certainly about C11 §6.7.9/p4 Initialization (emphasis mine going forward)
All the expressions in an initializer for an object that has static or
thread storage duration shall be constant expressions or string
literals.
Another subclause to look into is §6.5.2.5/p5 Compound literals:
The value of the compound literal is that of an unnamed object
initialized by the initializer list. If the compound literal occurs
outside the body of a function, the object has static storage
duration; otherwise, it has automatic storage duration associated with
the enclosing block.
and (for completeness) §6.5.2.5/p4:
In either case, the result is an lvalue.
but this does not mean that such an unnamed object can be treated as a constant expression. §6.6 Constant expressions says, inter alia:
2) A constant expression can be evaluated during translation rather
than runtime, and accordingly may be used in any place that a constant
may be.
3) Constant expressions shall not contain assignment, increment,
decrement, function-call, or comma operators, except when they are
contained within a subexpression that is not evaluated.
10) An implementation may accept other forms of constant expressions.
There is no explicit mention of compound literals though, so I would interpret this to mean that they are invalid as constant expressions in a strictly conforming program (and thus I'd say that clang has a bug).
Section J.2 Undefined behavior (informative) also clarifies that:
A constant expression in an initializer is not, or does not evaluate
to, one of the following: an arithmetic constant expression, a null
pointer constant, an address constant, or an address constant for a
complete object type plus or minus an integer constant expression
(6.6).
Again, no mention about compound literals.
Nevertheless, there is light at the end of the tunnel. Another way, which is fully sanctioned, is to convey such an unnamed object as an address constant. The standard states in §6.6/p9 that:
An address constant is a null pointer, a pointer to an lvalue
designating an object of static storage duration, or a pointer to a
function designator; it shall be created explicitly using the unary &
operator or an integer constant cast to pointer type, or implicitly by
the use of an expression of array or function type. The
array-subscript [] and member-access . and -> operators, the address &
and indirection * unary operators, and pointer casts may be used in
the creation of an address constant, but the value of an object shall
not be accessed by use of these operators.
Hence you can safely initialize it with a constant expression in this form, because such a compound literal indeed designates an lvalue of an object that has static storage duration:
#include <stdio.h>
struct Test
{
int a;
};
static struct Test *tt = &((struct Test) {1}); /* 2 */
int main(void)
{
printf("%d\n", tt->a);
return 0;
}
As checked it compiles fine with -std=c99 -pedantic-errors flags on both gcc 5.2.0 and clang 3.6.
Note that, as opposed to C++, in C the const qualifier has no effect on constant expressions.
ISO C99 does support compound literals. However, currently only the GNU extension provides for initialization of objects with static storage duration by compound literals, and only for C90 and C++.
A compound literal looks like a cast containing an initializer. Its value is an object of the type specified in the cast, containing the elements specified in the initializer; it is an lvalue. As an extension, GCC supports compound literals in C90 mode and in C++, though the semantics are somewhat different in C++.
Usually, the specified type is a structure. Assume that struct foo and structure are declared as shown:
struct foo {int a; char b[2];} structure;
Here is an example of constructing a struct foo with a compound literal:
structure = ((struct foo) {x + y, 'a', 0});
This is equivalent to writing the following:
{
struct foo temp = {x + y, 'a', 0};
structure = temp;
}
GCC Extension:
As a GNU extension, GCC allows initialization of objects with static storage duration by compound literals (which is not possible in ISO C99, because the initializer is not a constant). It is handled as if the object is initialized only with the bracket enclosed list if the types of the compound literal and the object match. The initializer list of the compound literal must be constant. If the object being initialized has array type of unknown size, the size is determined by compound literal size.
static struct foo x = (struct foo) {1, 'a', 'b'};
static int y[] = (int []) {1, 2, 3};
static int z[] = (int [3]) {1};
Note:
The compiler tags on your post include only GCC; however, you make comparisons to C99 (and multiple GCC versions). It is important to note that GCC is quicker to add extended capabilities to its compilers than the larger C standard groups are. This has sometimes led to buggy behavior and inconsistencies between versions. It is also important to note that extensions to a well-known and popular compiler that do not comply with an accepted C standard lead to potentially non-portable code. It is always worth considering target customers when deciding to use an extension that has not yet been accepted by the larger C working groups and standards organizations such as ISO and ANSI.
There are several examples where the smaller more nimble Open Source C working groups or committees have responded to user base expressed interest by adding extensions. For example, the switch case range extension.
Quoting the C11 standard, chapter §6.5.2.5, Compound literals, paragraph 3, (emphasis mine)
A postfix expression that consists of a parenthesized type name followed by a brace-enclosed list of initializers is a compound literal. It provides an unnamed object whose value is given by the initializer list.
So, a compound literal is treated as an unnamed object, which is not considered a compile-time constant.
Just as you cannot use another variable to initialize a static variable, from C99 onward you cannot use this compound literal to initialize a static variable either.
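For legacy code in this situation, one possible workaround is to keep the bare initializer list in a macro and add the parenthesized type name only where a compound literal is actually wanted. This is a sketch under stated assumptions, not the fix used by any particular project: the macro name TEST_INIT and the function name are hypothetical, and struct Test is the type from the question.

```c
#include <assert.h>

struct Test { int a; };

/* Hypothetical macro: the bare initializer list, without the parenthesized type name. */
#define TEST_INIT {1}

/* Static storage duration: a plain brace initializer, valid ISO C. */
static struct Test tt = TEST_INIT;

int static_and_literal_agree(void)
{
    /* Automatic storage duration: a compound literal is fine here. */
    struct Test local = (struct Test)TEST_INIT;
    assert(tt.a == local.a);
    return tt.a + local.a;   /* 1 + 1 */
}
```

The static object never sees a compound literal, so this form does not depend on the GNU extension.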

Resources