Does union support flexible array members? - c

I have declared a flexible array member in union, like this:
#include <stdio.h>
union ut
{
int i;
int a[]; // flexible array member
};
int main(void)
{
union ut s;
return 0;
}
and compiler gives an error :
source_file.c:8:9: error: flexible array member in union
int a[];
But, Declared array zero size like this :
union ut
{
int i;
int a[0]; // Zero length array
};
And it's working fine.
Why does zero length array work fine union?

No, unions do not support flexible array members, only structs. C11 6.7.2.1 §18
As a special case, the last element of a structure with more than one
named member may have an incomplete array type; this is called a
flexible array member.
In addition, zero-length arrays is not valid C, that's a gcc non-standard extension. The reason why you get this to work is because your compiler, gcc, is configured to compile code for the "non-standard GNU language". If you would rather have it compile code for the C programming language, you need to add compiler options -std=c11 -pedantic-errors.

int a[] is the C standard notation (since C99).
int a[0] is GNU C syntax, which predates C99. Other compilers might also support it, I don't know.
Your compiler seems to default to C90 standard with GNU extensions, which is why latter compiles, but first one does.
Furthermore, as stated in Lundin's answer, standard C does not support flexible array members in union at all.
Try adding -std=c99 or -std=c11 to your compiler options (gcc docs here).
Also -pedantic or -pedantic-errors is probably a good idea too, it'll enforce more strict standard compliance.
And, obligatory aside, -Wall -Wextra won't hurt either...

I'm not sure what the standard would say about this, but G++' unions seems to accept flexible arrays just fine. If you wrap them in an anonymous struct first, like so:
union {
unsigned long int ul;
char fixed[4][2];
struct {
char flexible[][2];
};
};

Related

C: Reading 8 bytes from a region of size 0 [-Wstringop-overread] [duplicate]

Just curious, what actually happens if I define a zero-length array int array[0]; in code? GCC doesn't complain at all.
Sample Program
#include <stdio.h>
int main() {
int arr[0];
return 0;
}
Clarification
I'm actually trying to figure out if zero-length arrays initialised this way, instead of being pointed at like the variable length in Darhazer's comments, are optimised out or not.
This is because I have to release some code out into the wild, so I'm trying to figure out if I have to handle cases where the SIZE is defined as 0, which happens in some code with a statically defined int array[SIZE];
I was actually surprised that GCC does not complain, which led to my question. From the answers I've received, I believe the lack of a warning is largely due to supporting old code which has not been updated with the new [] syntax.
Because I was mainly wondering about the error, I am tagging Lundin's answer as correct (Nawaz's was first, but it wasn't as complete) -- the others were pointing out its actual use for tail-padded structures, while relevant, isn't exactly what I was looking for.
An array cannot have zero size.
ISO 9899:2011 6.7.6.2:
If the expression is a constant expression, it shall have a value greater than zero.
The above text is true both for a plain array (paragraph 1). For a VLA (variable length array), the behavior is undefined if the expression's value is less than or equal to zero (paragraph 5). This is normative text in the C standard. A compiler is not allowed to implement it differently.
gcc -std=c99 -pedantic gives a warning for the non-VLA case.
As per the standard, it is not allowed.
However it's been current practice in C compilers to treat those declarations as a flexible array member (FAM) declaration:
C99 6.7.2.1, §16: As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
The standard syntax of a FAM is:
struct Array {
size_t size;
int content[];
};
The idea is that you would then allocate it so:
void foo(size_t x) {
Array* array = malloc(sizeof(size_t) + x * sizeof(int));
array->size = x;
for (size_t i = 0; i != x; ++i) {
array->content[i] = 0;
}
}
You might also use it statically (gcc extension):
Array a = { 3, { 1, 2, 3 } };
This is also known as tail-padded structures (this term predates the publication of the C99 Standard) or struct hack (thanks to Joe Wreschnig for pointing it out).
However this syntax was standardized (and the effects guaranteed) only lately in C99. Before a constant size was necessary.
1 was the portable way to go, though it was rather strange.
0 was better at indicating intent, but not legal as far as the Standard was concerned and supported as an extension by some compilers (including gcc).
The tail padding practice, however, relies on the fact that storage is available (careful malloc) so is not suited to stack usage in general.
In Standard C and C++, zero-size array is not allowed..
If you're using GCC, compile it with -pedantic option. It will give warning, saying:
zero.c:3:6: warning: ISO C forbids zero-size array 'a' [-pedantic]
In case of C++, it gives similar warning.
It's totally illegal, and always has been, but a lot of compilers
neglect to signal the error. I'm not sure why you want to do this.
The one use I know of is to trigger a compile time error from a boolean:
char someCondition[ condition ];
If condition is a false, then I get a compile time error. Because
compilers do allow this, however, I've taken to using:
char someCondition[ 2 * condition - 1 ];
This gives a size of either 1 or -1, and I've never found a compiler
which would accept a size of -1.
Another use of zero-length arrays is for making variable-length object (pre-C99). Zero-length arrays are different from flexible arrays which have [] without 0.
Quoted from gcc doc:
Zero-length arrays are allowed in GNU C. They are very useful as the last element of a structure that is really a header for a variable-length object:
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)
malloc (sizeof (struct line) + this_length);
thisline->length = this_length;
In ISO C99, you would use a flexible array member, which is slightly different in syntax and semantics:
Flexible array members are written as contents[] without the 0.
Flexible array members have incomplete type, and so the sizeof operator may not be applied.
A real-world example is zero-length arrays of struct kdbus_item in kdbus.h (a Linux kernel module).
I'll add that there is a whole page of the online documentation of gcc on this argument.
Some quotes:
Zero-length arrays are allowed in GNU C.
In ISO C90, you would have to give contents a length of 1
and
GCC versions before 3.0 allowed zero-length arrays to be statically initialized, as if they were flexible arrays. In addition to those cases that were useful, it also allowed initializations in situations that would corrupt later data
so you could
int arr[0] = { 1 };
and boom :-)
Zero-size array declarations within structs would be useful if they were allowed, and if the semantics were such that (1) they would force alignment but otherwise not allocate any space, and (2) indexing the array would be considered defined behavior in the case where the resulting pointer would be within the same block of memory as the struct. Such behavior was never permitted by any C standard, but some older compilers allowed it before it became standard for compilers to allow incomplete array declarations with empty brackets.
The struct hack, as commonly implemented using an array of size 1, is dodgy and I don't think there's any requirement that compilers refrain from breaking it. For example, I would expect that if a compiler sees int a[1], it would be within its rights to regard a[i] as a[0]. If someone tries to work around the alignment issues of the struct hack via something like
typedef struct {
uint32_t size;
uint8_t data[4]; // Use four, to avoid having padding throw off the size of the struct
}
a compiler might get clever and assume the array size really is four:
; As written
foo = myStruct->data[i];
; As interpreted (assuming little-endian hardware)
foo = ((*(uint32_t*)myStruct->data) >> (i << 3)) & 0xFF;
Such an optimization might be reasonable, especially if myStruct->data could be loaded into a register in the same operation as myStruct->size. I know nothing in the standard that would forbid such optimization, though of course it would break any code which might expect to access stuff beyond the fourth element.
Definitely you can't have zero sized arrays by standard, but actually every most popular compiler gives you to do that. So I will try to explain why it can be bad
#include <cstdio>
int main() {
struct A {
A() {
printf("A()\n");
}
~A() {
printf("~A()\n");
}
int empty[0];
};
A vals[3];
}
I am like a human would expect such output:
A()
A()
A()
~A()
~A()
~A()
Clang prints this:
A()
~A()
GCC prints this:
A()
A()
A()
It is totally strange, so it is a good reason not to use empty arrays in C++ if you can.
Also there is extension in GNU C, which gives you to create zero length array in C, but as I understand it right, there should be at least one member in structure prior, or you will get very strange examples as above if you use C++.

Union Zero Initialization with clang vs gcc

Given the following C code:
union Test {
struct {
int f1;
int f2;
};
struct {
int f3;
int f4;
int f5;
};
};
union Test test = {.f1 = 1, .f2 = 2};
When I compile this with gcc 6.1.1 f5 will be zero initialized. When I do with clang 3.8.0 it is not. I tried with -O0 and -O2 for both compilers which did not make any difference. This is on Linux x64.
Which is the correct behavior and can I tell clang to behave like gcc in this case? Reason is I try to compile some code with clang that assumes zero initialization in this case.
Update
Since the answers so far cite C11. Were there any changes in the standard that changed the behavior in later versions?
C11 specifies at section 6.2.6.1.7 :
When a value is stored in a member of an object of union type, the bytes of the object
representation that do not correspond to that member but do correspond to other members
take unspecified values.
You access the union via the first struct, accessing members of the second struct can produce unspecified values, so clang is not wrong neither is gcc.
Update: anonymous members were added in C11. Designated inits appeared in C99.

Initialize an Array Literal Without a Size

I'm curious about the following expression:
int ints[] = { 1, 2, 3 };
This seems to compile fine even in c89 land with clang. Is there documentation about this? I can't seem to figure out the correct terminology to use when searching for it (and I'd rather not go through and read the entire c89 spec again).
Is this an extension? Is the compiler simply inferring the size of the array?
EDIT: I just remembered you guys like chunks of code that actually compile so here it is:
/* clang tst.c -o tst -Wall -Wextra -Werror -std=c89 */
int main(int argc, const char *argv[]) {
int ints[] = { 1, 2, 3 };
(void)(ints); (void)(argc); (void)(argv);
return 0;
}
It's part of standard C since C89:
§3.5.7 Initialization
If an array of unknown size is initialized, its size is determined by the number of initializers provided for its members. At the end of its initializer list, the array no longer has incomplete type.
In fact, there is an almost exact example:
Example:
The declaration
int x[] = { 1, 3, 5 };
defines and initializes x as a one-dimensional array object that has three members, as no size was specified and there are three initializers.
Is this an extension?
no, this is standard, for all versions of the C standard
by the = the array type is "incomplete" and then is completed by means of the initialization
Is the compiler simply inferring the size of the
array?
yes

Compiling C structs

This is my code:
#include <stdio.h>
typedef struct {
const char *description;
float value;
int age;
} swag;
typedef struct {
swag *swag;
const char *sequence;
} combination;
typedef struct {
combination numbers;
const char *make;
} safe;
int main(void)
{
swag gold = { "GOLD!", 100000.0 };
combination numbers = { &gold, "6503" };
safe s = { numbers, "RAMCON" };
printf("Contents = %s\n", s.numbers.swag->description);
getchar();
return 0;
}
Whenever I compile it with the VS developer console, I get this error: error C2440: 'initializing' : cannot convert from 'combination' to 'swag *'.
However if I use gcc the console just prints: "GOLD!". Don't understand what's going on here.
What you stumbled upon is an implementation-specific variant of a popular non-standard compiler extension used in various C89/90 compilers.
The strict rules of classic C89/90 prohibited the use of non-constant objects in {} initializers. This immediately meant that it was impossible to specify an entire struct object between the {} in the initializer, since that would violate the above requirement. Under that rule you could only use scalar constants between the {}.
However, many C89/90 compilers ignored that standard requirement and allowed users to specify non-constant values when writing {} initializers for local objects. Unfortunately, this immediately created an ambiguity if user specified a complex struct object inside the {} initializer, as in your
safe s = { numbers, "RAMCON" };
The language standard did not allow this, for which reason it was not clear what this numbers initializer should apply to. There are two ways to interpret this:
The existing rules of the language said that the compiler must automatically enter each level of struct nesting and apply sequential initializers from the {} to all sequential scalar fields found in that way (actually, it is a bit more complicated, but that's the general idea).
This is exactly what your compiler did. It took the first initializer numbers, it found the first scalar field s.numbers.swag and attempted to apply the former to the latter. This expectedly produced the error you observed.
Other compiler took a more elaborate approach to that extension. When the compiler saw that the next initializer from the {} list had the same type as the target field on the left-hand side, it did not "open" the target field and did not enter the next level of nesting, but rather used the whole initializer value to initialize the whole target field.
This latter behavior is what you expected in your example (and, if I am not mistaken, this is the behavior required by C99), but your C89/90 compiler behaved in accordance with the first approach.
In other words, when you are writing C89/90 code, it is generally OK to use that non-standard extension when you specify non-constant objects in local {} initializers. But it is a good idea to avoid using struct objects in such initializers and stick to scalar initializers only.
Looks like an issue with the initializers. If you use the proper options with gcc, it will tell you this:
$ gcc -Wall -ansi -pedantic x.c
x.c: In function ‘main’:
x.c:21: warning: initializer element is not computable at load time
x.c:22: warning: initializer element is not computable at load time
which is propably the same issue VS is trying to tell you. You can make these go away if you declare gold and numbers static.

Casting a constant to a union

The following code:
#include <stdio.h>
typedef union {
int n;
char *s;
} val_t;
int main(void) {
val_t v1,v2;
v1 = (val_t)"Hello World";
v2 = (val_t)10;
printf("%s %d\n", v1.s, v2.n);
return(1);
}
compiles and executes correctly with gcc. If one tries to cast a constant for which there's not a suitable field in the union, an error message is produced.
Looking at the (C99) standard, though, I've not been able to locate the section where this behaviour is described. Hence, my question:
Does the C standard guarantee that I can cast a constant to a union type, provided that the union type has a field with a compatible type?
or, in other words:
Is ((val_t)10) a valid rvalue of type val_t?
It would also be interesting to know if this behaviour is supported by other compilers (or at least MS Visual C++). Does anybody know?
EDIT:
Casting to a union is a GCC extension, so it's not a good idea to use it.
Thanks to Maurits and Neil! I didn't think about using -pedantic to check!
In GNU C language extensions casting to a union is marked as an extension to the C standard. So most probably you won't find it in the C99 or any other C standard. The IBM C compiler supports this extension as well.
[neilb#GONERIL NeilB]$ gcc -Wall -pedantic sw.c
sw.c: In function 'main':
sw.c:11: warning: ISO C forbids casts to union type
sw.c:12: warning: ISO C forbids casts to union type

Resources