This question is out of mere curiosity. It is not about the empty struct.
I just stumbled over an interesting typo of the kind
struct {
int member1; /*comment*/ ; /* <-- note the ';' */
int member2;
} variable[] = { /* initializers */ };
which the compiler (xc32, derived from gcc) accepted without any
complaints. Of course, I corrected this, but the software was running
smoothly before and after the correction, and the additional ; seemingly
causes no problems. I then tried various lengths of ;;; in the struct
definition, and they seem to make no difference, neither to functionality
nor to sizeof. So in a struct any sequence of ;;;; seems to be
equivalent to a single ;.
I couldn't find anything about such "empty members" of a struct/union
in the specification, neither that they are allowed nor that they are
disallowed. To me it seems as if the grammar rejects them. This contrasts
with "empty declarations" ; at the top level of a compilation unit, which the
standard clearly forbids, and with the "null" statement ; in functions, which is
a clearly allowed language feature.
Does anyone know about this behaviour? Is it compiler specific or does the
C specification somehow tolerate such empty struct members?
The syntax is specified in C11 6.7.2.1
struct-declaration:
specifier-qualifier-list struct-declarator-listopt ;
static_assert-declaration
There is exactly one semicolon at the end, so that is the only allowed syntax. You cannot skip the semicolon, and you cannot add extra ones. And that's that.
(You can however have a static assert inside a struct declaration, from C11.)
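A minimal sketch of what the grammar does allow, assuming a C11 compiler (the struct and function names are made up for illustration): each member declaration ends in exactly one semicolon, and a static assertion may sit between members without adding storage.

```c
#include <stddef.h>

/* Hypothetical struct: one ';' per member declaration, plus a C11
   static assertion, which the struct-declaration grammar permits. */
struct sample {
    int member1;
    _Static_assert(sizeof(int) >= 2, "int must be at least 16 bits wide");
    int member2;
};

/* Offset of the second member; the assertion itself occupies no space. */
size_t sample_member2_offset(void)
{
    return offsetof(struct sample, member2);
}
```

This should compile cleanly with -std=c11 -Wpedantic, unlike the version with stray semicolons.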
The standard doesn't talk about that, it's just a gcc tolerance. See 6.7.2.1:
struct-or-union-specifier:
struct-or-union identifieropt { struct-declaration-list }
struct-or-union identifier
struct-or-union:
struct
union
struct-declaration-list:
struct-declaration
struct-declaration-list struct-declaration
struct-declaration:
specifier-qualifier-list struct-declarator-listopt ;
static_assert-declaration
specifier-qualifier-list:
type-specifier specifier-qualifier-listopt
type-qualifier specifier-qualifier-listopt
(type-specifier and type-qualifier can't be empty, see the related sections in the standard for details.)
Some compilers, like gcc, tolerate extra semicolons, but the -Wpedantic option reveals that it is only a tolerance:
struct foo {
int a;
;;;
};
int main() {
;;;
}
With the -pedantic option, gcc complains not about the empty statements in main, but about the extra semicolons in the structure declaration:
<source>:3:5: warning: extra semicolon in struct or union specified [-Wpedantic]
;;;
^
<source>:3:6: warning: extra semicolon in struct or union specified [-Wpedantic]
;;;
^
<source>:3:7: warning: extra semicolon in struct or union specified [-Wpedantic]
;;;
Other compilers may not be that friendly, so the typo should be fixed, since it adds nothing useful.
When looking through C's BNF grammar, I thought it was weird that the production rule for a declaration looked like this (according to https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm):
<declaration> ::= {<declaration-specifier>}+ {<init-declarator>}* ;
Why use an * quantifier (meaning zero or more occurrences) for the init-declarator? This allows statements such as int; or void; to be syntactically valid, even though they're semantically invalid. Couldn't they have just used a + quantifier (one or more occurrences) instead of * in the production rule?
I tried compiling a simple program to see what the compiler outputs and all it does is issue a warning.
Input:
int main(void) {
int;
}
Output:
test.c: In function ‘main’:
test.c:2:5: warning: useless type name in empty declaration
int;
^~~
declaration-specifier includes type-specifier, which includes enum-specifier. A construct like
enum stuff {x, y};
is a valid declaration with no init-declarator.
Constructs like int; are ruled out by constraints beyond the grammar:
A declaration other than a static_assert declaration shall declare at least a declarator (other than the parameters of a function or the members of a structure or union), a tag, or the members of an enumeration.
I would guess that there are backward compatibility reasons behind your compiler only issuing a warning.
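To make the constraint concrete, here is a small sketch (all identifiers besides the enum from the answer are invented) of declarations that have no init-declarator and are still valid, because each one declares a tag or the members of an enumeration:

```c
/* Each declaration below has no init-declarator, yet declares something. */
enum stuff { x, y };         /* declares a tag and two enumeration constants */
struct point;                /* declares a tag (incomplete type) */
struct pair { int a, b; };   /* declares a tag and defines the type */

/* By contrast, a bare "int;" declares nothing and violates the constraint. */

/* Returns the value of the last enumeration constant. */
int stuff_last(void)
{
    return y;
}
```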
A declaration without an init declarator:
<declaration> ::= {<declaration-specifier>}+ {<init-declarator>}* ;
is harmless for declaration specifier lists that aren't a single enum/struct/union specifier and it usefully matches those that are.
In any case, the presented grammar will also erroneously match declarations like int struct foo x; or double _Bool y; (it allows multiple specifiers in order to match things like long long int), but all these can be detected later, in a semantic check.
The BNF grammar itself won't weed out all illegal constructs.
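As a rough illustration of such a later semantic check, the sketch below validates a multiset of type specifiers after parsing. It is not how any real compiler does it; the enum, the function name, and the tiny subset of specifiers covered are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* A tiny, made-up subset of type specifiers, for illustration only. */
enum spec { SPEC_INT, SPEC_LONG, SPEC_DOUBLE, SPEC_BOOL, SPEC_STRUCT, SPEC_COUNT };

/* Returns true if the specifiers form a combination C accepts
   (within this subset); the grammar alone would accept all of them. */
bool specifiers_ok(const enum spec *list, size_t n)
{
    size_t count[SPEC_COUNT] = {0};

    for (size_t i = 0; i < n; i++)
        count[list[i]]++;

    if (n == 0)
        return false;                       /* at least one specifier needed */
    if (count[SPEC_STRUCT] || count[SPEC_BOOL])
        return n == 1;                      /* these must stand alone */
    if (count[SPEC_INT] > 1 || count[SPEC_DOUBLE] > 1 || count[SPEC_LONG] > 2)
        return false;                       /* e.g. "long long long" */
    if (count[SPEC_DOUBLE])
        return count[SPEC_INT] == 0 && count[SPEC_LONG] <= 1; /* double, long double */
    return true;                            /* int, long, long int, long long, ... */
}
```

With this, long long int passes, while int struct foo and double _Bool are rejected even though the BNF production matches them.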
I expected it to be possible to apply alignas/_Alignas to an entire struct declaration, like this:
#include <stddef.h>
#include <stdalign.h>
struct alignas(max_align_t) S {
int field;
};
struct S s = { 0 };
but both gcc and clang reject the declaration:
(gcc 6.3)
test.c:4:8: error: expected ‘{’ before ‘_Alignas’
struct alignas(max_align_t) S {
^
(clang 3.8)
test.c:4:1: error: declaration of anonymous struct must be a definition
struct alignas(max_align_t) S {
^
What gives? Note that both compilers accept this construct if I compile the file as C++, or if alignas is replaced with the equivalent GCC extension,
struct __attribute__((aligned(__alignof__(max_align_t)))) S {
int field;
};
Also note that the other plausible placements of alignas,
alignas(max_align_t) struct S { ... };
struct S alignas(max_align_t) { ... };
struct S { ... } alignas(max_align_t);
also throw syntax errors (albeit different ones).
C11 is not very clear on these things, but a consensus has emerged on how this is to be interpreted. C17 will have some of this clarified. The idea behind not allowing types to be aligned is that compatible types should never have different alignment requirements in different compilation units. If you want to force the alignment of a struct type, you have to impose an alignment on its first member. By doing that you create an incompatible type.
The start of the "Constraint" section as voted by the committee reads:
An alignment specifier shall appear only in the declaration specifiers
of a declaration, or in the specifier-qualifier list of a member
declaration, or in the type name of a compound literal. An alignment
specifier shall not be used in conjunction with either of the
storage-class specifiers typedef or register, nor in a declaration of
a function or bit-field.
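A minimal sketch of the member-level placement described above, assuming a C11 compiler with <stdalign.h> (the helper function is made up for the example): the alignment specifier goes in the specifier-qualifier list of the first member, and the struct type inherits the stricter alignment.

```c
#include <stddef.h>
#include <stdalign.h>

/* The specifier is attached to the member, not to the struct keyword. */
struct S {
    alignas(max_align_t) int field;
};

/* Hypothetical helper: nonzero if struct S ended up with the
   alignment of max_align_t. */
int s_is_max_aligned(void)
{
    return alignof(struct S) == alignof(max_align_t);
}
```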
With GCC 7.2.0 (compiled on macOS Sierra, running on macOS High Sierra), and also with GCC 6.3.0 on the same platform, and with Clang from Xcode 9, I get:
$ cat align17.c
#include <stddef.h>
#include <stdalign.h>
alignas(max_align_t) struct S
{
int field;
}; // Line 7
struct S s = { 0 };
// Alternative
struct S1
{
int field;
};
alignas(max_align_t) struct S1 s1 = { 1 };
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -c align17.c
align17.c:7:1: error: useless ‘_Alignas’ in empty declaration [-Werror]
};
^
cc1: all warnings being treated as errors
$ clang -O3 -g -std=c11 -Wall -Wextra -Werror -c align17.c
align17.c:4:1: error: attribute '_Alignas' is ignored, place it after "struct" to apply attribute to type
declaration [-Werror,-Wignored-attributes]
alignas(max_align_t) struct S
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/9.0.0/include/stdalign.h:28:17: note:
expanded from macro 'alignas'
#define alignas _Alignas
^
1 error generated.
$
It appears that these compilers only allow you to apply the _Alignas (or alignas) alignment specifier to a variable declaration, not to a type definition. Your code — the code which produces the error (warning converted to error) — attempts to apply the alignment to the type.
Note that putting alignas(max_align_t) between struct and S clearly violates the standard. Putting it in other places does not seem to work unless there is actually a variable defined, despite the message hinting "place it after "struct" to apply attribute to type declaration". Thus, this code compiles because it defines variable s0. Omit s0 and it fails to compile.
struct S
{
int field;
} alignas(max_align_t) s0;
ISO/IEC 9899:2011 — The C11 Standard
6.7 Declarations
Syntax
1 declaration:
declaration-specifiers init-declarator-listopt ;
static_assert-declaration
declaration-specifiers:
storage-class-specifier declaration-specifiersopt
type-specifier declaration-specifiersopt
type-qualifier declaration-specifiersopt
function-specifier declaration-specifiersopt
alignment-specifier declaration-specifiersopt
6.7.2 Type specifiers
Syntax
1 type-specifier:
void
char
short
int
long
float
double
signed
unsigned
_Bool
_Complex
atomic-type-specifier
struct-or-union-specifier
enum-specifier
typedef-name
6.7.2.1 Structure and union specifiers
Syntax
1 struct-or-union-specifier:
struct-or-union identifieropt { struct-declaration-list }
struct-or-union identifier
struct-or-union:
struct
union
6.7.5 Alignment specifier
Syntax
1 alignment-specifier:
_Alignas ( type-name )
_Alignas ( constant-expression )
Constraints
2 An alignment attribute shall not be specified in a declaration of a typedef, or a bit-field, or a function, or a parameter, or an object declared with the register storage-class specifier.
Semantics
6 The alignment requirement of the declared object or member is taken to be the specified
alignment. An alignment specification of zero has no effect.141) When multiple
alignment specifiers occur in a declaration, the effective alignment requirement is the
strictest specified alignment.
7 If the definition of an object has an alignment specifier, any other declaration of that
object shall either specify equivalent alignment or have no alignment specifier. If the
definition of an object does not have an alignment specifier, any other declaration of that
object shall also have no alignment specifier. If declarations of an object in different
translation units have different alignment specifiers, the behavior is undefined.
141) An alignment specification of zero also does not affect other alignment specifications in the same
declaration.
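As a small sketch of paragraph 6 and footnote 141 (the buffer and function names are invented, and it assumes the implementation supports 16-byte alignment for objects with static storage duration): with several alignment specifiers on one object the strictest should win, and a zero specifier is ignored.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical buffer: of the three specifiers, 16 is the strictest,
   and alignas(0) has no effect on the others. */
static alignas(8) alignas(16) alignas(0) unsigned char buffer[32];

/* Checks the address, since standard C has no alignof for objects. */
bool buffer_is_16_aligned(void)
{
    return (uintptr_t)buffer % 16 == 0;
}
```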
Interpretation
These rules do not allow the alignment specifier between struct and its tag, or between struct tag and the { … } structure definition (even if the GCC __attribute__ notation worked in such contexts).
The rules in §6.7.5 ¶2 do not self-evidently preclude the alignment specifier in a structure type definition, though other wording implies that it could be used with a structure member (inside the { … } section), or associated with a variable declaration or definition. However, the fact that an alignment specifier is not allowed in a typedef has implications that it shouldn't be allowed in a type definition either.
In the short term, you will, I think, have to accept that what you tried to do (attach the alignment specifier to the structure type declaration) doesn't work. You can consider whether it is worth looking for a GCC (or Clang) bug report, and if there is none, whether it is worth creating one. It is marginal, to my way of thinking, whether it is actually a bug, but I can't see anything in the quoted standard material that precludes what you attempted directly. However, if you can't apply alignment to a typedef, it also makes sense not to be able to apply it to a structure type declaration — which is why I consider it marginal (I wouldn't be surprised to find that any attempted bug report is rejected).
I have the following struct types:
typedef struct PG_Point PG_Point;
struct PG_Point
{
int x;
int y;
};
typedef struct PG_Size PG_Size;
struct PG_Size
{
int width;
int height;
};
typedef struct PG_Bounds PG_Bounds;
struct PG_Bounds
{
union
{
struct
{
PG_Point topLeft;
PG_Size size;
};
struct
{
struct
{
int x;
int y;
};
struct
{
int width;
int height;
};
};
};
};
with the following initializers:
#define PG_Point_init(ix, iy) {.x=(ix), .y=(iy)}
#define PG_Size_init(iwidth, iheight) {.width=(iwidth), .height=(iheight)}
#define PG_Bounds_init(ix, iy, iwidth, iheight) { \
.topLeft=PG_Point_init((ix),(iy)), \
.size=PG_Size_init((iwidth),(iheight)) }
From what I understand, it's correct in c11 to initialize the fields of an anonymous struct as if they were directly fields of the containing struct? But with gcc 4.9.2, this gives the following warning:
warning: missing initializer for field ‘size’ of ‘struct <anonymous>’ [-Wmissing-field-initializers]
It works if I change the initializer to this version:
#define PG_Bounds_init(ix, iy, iwidth, iheight) {{{ \
.topLeft=PG_Point_init((ix),(iy)), \
.size=PG_Size_init((iwidth),(iheight)) }}}
That is, explicitly having the union and struct as sub aggregates.
Is this even allowed? Do I have to expect other compilers to reject this?
From what I understand, it's correct in c11 to initialize the fields of an anonymous struct as if they were directly fields of the containing struct?
There are two parts to that. First of all, we need to tackle the question of whether such members can be initialized at all, because Paragraph 6.7.2.1/13 identifies anonymous structure and union members as specific kinds of "unnamed members", and paragraph 6.7.9/9 says
Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization.
The rest of section 6.7.9 (Initialization) nowhere says anything that I would interpret as explicitly applying to anonymous structure and anonymous union members themselves, but I don't think the intent is to prevent initialization of the named members of anonymous members, especially given that they are considered members of the containing structure or union (see below). Thus, I do not interpret the standard to forbid the initialization you are trying to perform.
So yes, I read C11 to allow your initializer and to specify that it has the effect you appear to intend. In particular, paragraph 6.7.2.1/13 of the standard says, in part,
The members of an anonymous structure or union are considered to be members of the containing structure or union. This applies recursively if the containing structure or union is also anonymous.
Your initializer therefore satisfies the constraint in paragraph 6.7.9/7, that the designators within specify names of members of the current object (in your case, a struct PG_Bounds). The following paragraphs of section 6.7.9 present the semantics for initializers, and I see no reason to interpret them to specify anything other than initialization of the overall object with the values you have provided.
At this point, I reiterate that gcc is issuing a warning, not rejecting your code, and in this case I think the warning is spurious. I wrote a test program such as I suggested in comments that you do, and tried it on gcc 4.8.5 in C11 mode. Although gcc emitted the same warning you presented (but only with -Wextra enabled), I was able to demonstrate that your initializer initialized all members of a subject struct PG_Bounds to the intended values.
You also observe that gcc does not warn if you change the initializer to a version that uses nested brace-enclosed initializers, and ask
Is this even allowed? Do I have to expect other compilers to reject this?
This could be viewed as more problematic with respect to paragraph 6.7.9/9, so in that sense it is perhaps riskier. I am uncertain whether there is any compiler that actually rejects it or does the wrong thing with it. I think the intent of the standard is to allow this initializer, but I would prefer the other form, myself.
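A test along those lines can be sketched as follows. This is a condensed restatement of the types from the question (the typedefs are folded into the struct definitions, and make_bounds is a made-up helper standing in for the PG_Bounds_init macro), assuming a compiler with C11 anonymous struct/union support:

```c
typedef struct PG_Point { int x, y; } PG_Point;
typedef struct PG_Size { int width, height; } PG_Size;

typedef struct PG_Bounds {
    union {
        struct { PG_Point topLeft; PG_Size size; };
        struct {
            struct { int x, y; };
            struct { int width, height; };
        };
    };
} PG_Bounds;

/* Initializes through the designators of the first anonymous struct,
   without spelling out the enclosing union and struct braces. */
PG_Bounds make_bounds(int x, int y, int w, int h)
{
    PG_Bounds b = {
        .topLeft = { .x = x, .y = y },
        .size = { .width = w, .height = h }
    };
    return b;
}
```

Reading the result back through the other view (b.x, b.y, b.width, b.height) is a quick way to confirm that every member received the intended value.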
I'm trying to write a lex/yacc grammar for C11 based off of N1570. Most of my grammar is copied verbatim from the informative syntax summary, but some yacc conflicts arose. I've managed to resolve all of them except for one: there seems to be some ambiguity between when '_Atomic' is used as a type specifier and when it's used as a type qualifier.
In the specifier form, _Atomic is followed immediately by parentheses, so I'm assuming it has something to do with C's little-used syntax which allows declarators to be in parentheses, thus allowing parentheses to immediately follow a qualifier. But my grammar already knows how to differentiate typedef names from other identifiers, so yacc should know the difference, shouldn't it?
I can't for the life of me think of a case when it would actually be ambiguous.
I doubt it helps, but here's the relevant state output I get when I use yacc's -v flag. "ATOMIC" is obviously my token name for "_Atomic"
state 23
152 atomic_type_specifier: ATOMIC . '(' type_name ')'
156 type_qualifier: ATOMIC .
'(' shift, and go to state 49
'(' [reduce using rule 156 (type_qualifier)]
$default reduce using rule 156 (type_qualifier)
Okay, whether or not we can come up with a grammatically ambiguous case doesn't matter. Section 6.7.2.4 paragraph 4 of N1570 states that:
If the _Atomic keyword is immediately followed by a left parenthesis, it is interpreted as a type specifier (with a type name), not as a type qualifier.
To enforce this, I simply made _Atomic as a specifier and _Atomic as a qualifier separate tokens in my lex rules.
"_Atomic"/{WHITESPACE}*"(" {return ATOMIC_SPECIFIER;}
"_Atomic" {return ATOMIC_QUALIFIER;}
I'm relatively new to lex/yacc and parser generators in general, but my gut says this is kind of a hack. At the same time, what else would the trailing context syntax in lex be for?
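For a hand-written lexer, the same trailing-context idea can be sketched in plain C. The function name is hypothetical, and the sketch only skips whitespace; comments between the keyword and the parenthesis are not handled.

```c
#include <ctype.h>
#include <stdbool.h>

/* 'after_keyword' points just past an already matched "_Atomic".
   Returns true when the next non-whitespace character is '(', i.e.
   the keyword must be treated as a type specifier (6.7.2.4p4). */
bool atomic_is_specifier(const char *after_keyword)
{
    const char *p = after_keyword;

    while (isspace((unsigned char)*p))
        p++;
    return *p == '(';
}
```

The lexer would then return the specifier token when this yields true and the qualifier token otherwise, mirroring the two lex rules above.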
Yes, I think there is ambiguity in the specification. Take
_Atomic int (*f)(int);
Here the _Atomic is a type-qualifier. (As the return type of a function it does not make much sense, but it is valid, I think.) Now take this alternative form
int _Atomic (*f)(int);
Normally type-qualifiers can come after the int, and this should be equivalent to the other declaration. But now _Atomic is followed by a parenthesis, so it must be interpreted as a type-specifier, which is then a syntax error. I think it would even be possible to cook up an example where *f could be replaced by a valid typedef.
Have a look at the first phrase of 6.7.2.4 p4
The properties associated with atomic types are meaningful only for
expressions that are lvalues.
This clearly indicates that they don't expect return types of functions to be _Atomic qualified.
Edit:
The same ambiguity would occur for
_Atomic int (*A)[3];
which makes perfect sense (a pointer to an array of three atomic integers) and which we should be able to rewrite as
int _Atomic (*A)[3];
Edit 2: To see that the criterion of having a type in the parentheses is not disambiguating, take the following valid C99 code:
typedef int toto;
int main(void) {
const int toto(void);
int const toto(void);
const int (toto)(void);
int const (toto)(void);
return toto();
}
This redeclares toto inside main as a function. And all four lines are valid prototypes for the same function. Now use the _Atomic as a qualifier
typedef int toto;
int main(void) {
int _Atomic (toto)(void);
return toto();
}
This should be as valid as the version with const. But here we have a case where _Atomic is followed by parentheses with a type inside, and yet it is not a type-specifier.
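For contrast, here is a sketch of the spellings that stay on the safe side of the rule in 6.7.2.4p4, assuming an implementation that provides <stdatomic.h> (all object and function names are invented): the qualifier written before the type, and the specifier written with parentheses.

```c
#include <stdatomic.h>

static _Atomic int storage[3];          /* qualifier form, keyword first */
static _Atomic(int) single;             /* specifier form */
static _Atomic int (*A)[3] = &storage;  /* pointer to array of three atomic ints */

/* Stores through both forms and reads the values back. */
int sum_after_store(void)
{
    atomic_store(&(*A)[0], 1);
    atomic_store(&(*A)[2], 2);
    atomic_store(&single, 4);
    return atomic_load(&storage[0]) + atomic_load(&storage[2])
         + atomic_load(&single);
}
```

Writing the third declaration as int _Atomic (*A)[3]; instead is exactly the case discussed above, where the parenthesis rule forces the specifier reading.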
As far as I can tell, it is zero, but there seems to be a bit of confusion here.
I have tested it with the gcc compiler and it gives me zero as output. I know that in C++, the size of an empty class is 1. Let me know if I am missing anything here.
A struct cannot be empty in C because the syntax forbids it. Furthermore, there is a semantic constraint that makes behavior undefined if a struct has no named member:
struct-or-union-specifier:
struct-or-union identifieropt { struct-declaration-list }
struct-or-union identifier
struct-or-union:
struct
union
struct-declaration-list:
struct-declaration
struct-declaration-list struct-declaration
struct-declaration:
specifier-qualifier-list struct-declarator-list ;
/* type-specifier or qualifier required here! */
specifier-qualifier-list:
type-specifier specifier-qualifier-listopt
type-qualifier specifier-qualifier-listopt
struct-declarator-list:
struct-declarator
struct-declarator-list , struct-declarator
struct-declarator:
declarator
declaratoropt : constant-expression
If you write
struct identifier { };
you will get a diagnostic message, because this violates the syntax rules. If you write
struct identifier { int : 0; };
then you have a non-empty struct with no named members, which makes the behavior undefined and does not require a diagnostic:
If the struct-declaration-list contains no named members, the behavior is undefined.
Notice that the following is also disallowed, because a flexible array member is only permitted as the last member of a struct that has at least one other named member:
struct identifier { type ident[]; };
The C grammar doesn't allow the contents of a struct to be empty - there has to be at least an unnamed bitfield or a named member (as far as the grammar is concerned - I'm not sure if a struct that contains only an unnamed bitfield is otherwise valid).
Support for empty structs in C is an extension in GCC.
In C++, an empty struct/class member-specification is explicitly permitted, but the size is nonzero (typically 1), except that under the empty base optimization the compiler is allowed to make an empty base class take no space in the derived class.
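If the goal is just a placeholder type in portable C, one workaround is to give the struct a single dummy member. This is only a sketch; the type, member, and function names are made up.

```c
#include <stddef.h>

/* Hypothetical placeholder type: one named member keeps it valid C
   without relying on the GCC empty-struct extension. */
struct placeholder {
    char unused;
};

/* Size of the placeholder; at least 1, since it holds a char. */
size_t placeholder_size(void)
{
    return sizeof(struct placeholder);
}
```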
In C99: "If the struct-declaration-list contains no named members, the behavior is undefined."
The syntax doesn't really allow it anyway, though I don't see anything that says a diagnostic is required, which puts it pretty much back in the "undefined behavior" camp.
VC 8 gives an error if we try to take the sizeof of an empty struct. On Linux, gcc accepts it through a GCC extension rather than following the C language specification, which says this is undefined behaviour.
#include <stdio.h>
struct node
{
    // empty struct.
};
int main(void)
{
    printf("%zu", sizeof(struct node));
    return 0;
}
On Windows, VC 2005 gives a compilation error.
On Linux, gcc compiles it thanks to the GCC extension; per the documentation below, the size is 0 in C (g++ reports 1 when the file is compiled as C++):
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Empty-Structures.html#Empty-Structures
(As Pointed out by Michael Burr)