Initialize an Array Literal Without a Size - c

I'm curious about the following expression:
int ints[] = { 1, 2, 3 };
This seems to compile fine even in c89 land with clang. Is there documentation about this? I can't seem to figure out the correct terminology to use when searching for it (and I'd rather not go through and read the entire c89 spec again).
Is this an extension? Is the compiler simply inferring the size of the array?
EDIT: I just remembered you guys like chunks of code that actually compile so here it is:
/* clang tst.c -o tst -Wall -Wextra -Werror -std=c89 */
int main(int argc, const char *argv[]) {
int ints[] = { 1, 2, 3 };
(void)(ints); (void)(argc); (void)(argv);
return 0;
}

It's part of standard C since C89:
§3.5.7 Initialization
If an array of unknown size is initialized, its size is determined by the number of initializers provided for its members. At the end of its initializer list, the array no longer has incomplete type.
In fact, there is an almost exact example:
Example:
The declaration
int x[] = { 1, 3, 5 };
defines and initializes x as a one-dimensional array object that has three members, as no size was specified and there are three initializers.

Is this an extension?
No, this is standard in all versions of the C standard.
Up to the =, the array type is "incomplete"; it is then completed by the initializer.
Is the compiler simply inferring the size of the array?
Yes.

Related

C: Reading 8 bytes from a region of size 0 [-Wstringop-overread] [duplicate]

Just curious, what actually happens if I define a zero-length array int array[0]; in code? GCC doesn't complain at all.
Sample Program
#include <stdio.h>
int main() {
int arr[0];
return 0;
}
Clarification
I'm actually trying to figure out if zero-length arrays initialised this way, instead of being pointed at like the variable length in Darhazer's comments, are optimised out or not.
This is because I have to release some code out into the wild, so I'm trying to figure out if I have to handle cases where the SIZE is defined as 0, which happens in some code with a statically defined int array[SIZE];
I was actually surprised that GCC does not complain, which led to my question. From the answers I've received, I believe the lack of a warning is largely due to supporting old code which has not been updated with the new [] syntax.
Because I was mainly wondering about the error, I am tagging Lundin's answer as correct (Nawaz's was first, but it wasn't as complete) -- the others were pointing out its actual use for tail-padded structures, while relevant, isn't exactly what I was looking for.
An array cannot have zero size.
ISO 9899:2011 6.7.6.2:
If the expression is a constant expression, it shall have a value greater than zero.
The above text applies to a plain array (paragraph 1). For a VLA (variable length array), the behavior is undefined if the expression's value is less than or equal to zero (paragraph 5). This is normative text in the C standard. A compiler is not allowed to implement it differently.
gcc -std=c99 -pedantic gives a warning for the non-VLA case.
As per the standard, it is not allowed.
However it's been current practice in C compilers to treat those declarations as a flexible array member (FAM) declaration:
C99 6.7.2.1, §16: As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
The standard syntax of a FAM is:
struct Array {
size_t size;
int content[];
};
The idea is that you would then allocate it so:
void foo(size_t x) {
struct Array *array = malloc(sizeof(struct Array) + x * sizeof(int));
array->size = x;
for (size_t i = 0; i != x; ++i) {
array->content[i] = 0;
}
}
You might also use it statically (gcc extension):
struct Array a = { 3, { 1, 2, 3 } };
This is also known as tail-padded structures (this term predates the publication of the C99 Standard) or struct hack (thanks to Joe Wreschnig for pointing it out).
However, this syntax was standardized (and its effects guaranteed) only in C99. Before that, a constant size was necessary.
1 was the portable way to go, though it was rather strange.
0 was better at indicating intent, but not legal as far as the Standard was concerned and supported as an extension by some compilers (including gcc).
The tail padding practice, however, relies on the fact that storage is available (careful malloc) so is not suited to stack usage in general.
In Standard C and C++, a zero-size array is not allowed.
If you're using GCC, compile with the -pedantic option. It will give a warning saying:
zero.c:3:6: warning: ISO C forbids zero-size array 'a' [-pedantic]
In C++, it gives a similar warning.
It's totally illegal, and always has been, but a lot of compilers neglect to signal the error. I'm not sure why you want to do this. The one use I know of is to trigger a compile-time error from a boolean:
char someCondition[ condition ];
If condition is false, then I get a compile-time error. Because compilers do allow this, however, I've taken to using:
char someCondition[ 2 * condition - 1 ];
This gives a size of either 1 or -1, and I've never found a compiler which would accept a size of -1.
Another use of zero-length arrays is for making variable-length objects (pre-C99). Zero-length arrays are different from flexible array members, which are written with [] and no 0.
Quoted from gcc doc:
Zero-length arrays are allowed in GNU C. They are very useful as the last element of a structure that is really a header for a variable-length object:
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)
malloc (sizeof (struct line) + this_length);
thisline->length = this_length;
In ISO C99, you would use a flexible array member, which is slightly different in syntax and semantics:
Flexible array members are written as contents[] without the 0.
Flexible array members have incomplete type, and so the sizeof operator may not be applied.
A real-world example is zero-length arrays of struct kdbus_item in kdbus.h (a Linux kernel module).
I'll add that there is a whole page of gcc's online documentation on this topic.
Some quotes:
Zero-length arrays are allowed in GNU C.
In ISO C90, you would have to give contents a length of 1
and
GCC versions before 3.0 allowed zero-length arrays to be statically initialized, as if they were flexible arrays. In addition to those cases that were useful, it also allowed initializations in situations that would corrupt later data
so you could
int arr[0] = { 1 };
and boom :-)
Zero-size array declarations within structs would be useful if they were allowed, and if the semantics were such that (1) they would force alignment but otherwise not allocate any space, and (2) indexing the array would be considered defined behavior in the case where the resulting pointer would be within the same block of memory as the struct. Such behavior was never permitted by any C standard, but some older compilers allowed it before it became standard for compilers to allow incomplete array declarations with empty brackets.
The struct hack, as commonly implemented using an array of size 1, is dodgy and I don't think there's any requirement that compilers refrain from breaking it. For example, I would expect that if a compiler sees int a[1], it would be within its rights to regard a[i] as a[0]. If someone tries to work around the alignment issues of the struct hack via something like
typedef struct {
uint32_t size;
uint8_t data[4]; // Use four, to avoid having padding throw off the size of the struct
}
a compiler might get clever and assume the array size really is four:
/* As written */
foo = myStruct->data[i];
/* As interpreted (assuming little-endian hardware) */
foo = ((*(uint32_t*)myStruct->data) >> (i << 3)) & 0xFF;
Such an optimization might be reasonable, especially if myStruct->data could be loaded into a register in the same operation as myStruct->size. I know nothing in the standard that would forbid such optimization, though of course it would break any code which might expect to access stuff beyond the fourth element.
You definitely can't have zero-sized arrays according to the standard, but nearly every popular compiler lets you do it. So I will try to explain why it can be bad:
#include <cstdio>
int main() {
struct A {
A() {
printf("A()\n");
}
~A() {
printf("~A()\n");
}
int empty[0];
};
A vals[3];
}
As a human, I would expect this output:
A()
A()
A()
~A()
~A()
~A()
Clang prints this:
A()
~A()
GCC prints this:
A()
A()
A()
It is totally strange, so it is a good reason not to use empty arrays in C++ if you can.
There is also an extension in GNU C that lets you create a zero-length array in C, but as I understand it, there should be at least one member before it in the structure, or you will get very strange results like the above if you use C++.

What happens if we initialize all members of a union at the same time?

If we initialize the union with two values, I know that it will take the int, but I really want to know what happens behind the scenes.
#include <stdio.h>
typedef union x
{
int y;
char x[6];
};
int main(void)
{
union x first={4,"AAAAAA"};
printf("%d\n",first.y);
printf("%s\n",first.x);
return 0;
}
Did this compile? It should not. The allocated memory is overwritten every time you repurpose the union.
You can define a union with many members, but only one member can contain a value at any given time. Unions provide an efficient way of using the same memory location for multiple purposes.
The code in the question should not compile without at least a diagnostic about 'too many initializers' for the union variable. You might also get a warning about a useless storage class specifier in empty declaration because the typedef doesn't actually define an alias for union x.
Suppose you revised the code to use designated initializers, like this:
#include <stdio.h>
union x
{
int y;
char x[6];
};
int main(void)
{
union x first = { .y = 4, .x = "AAAAA" };
printf("%d\n", first.y);
printf("%s\n", first.x);
return 0;
}
This would compile and run, but with the compiler set to fussy, you might get warnings like warning: initialized field overwritten [-Woverride-init].
Note that there is one less A in the initializer for .x shown above than in the original. That ensures that the value is a (null-terminated) string, not just an array of bytes. In this context, the designated initializer for .x overrides the designated initializer for .y, and therefore the value in .x is fully valid. The output I got, for example, was:
1094795585
AAAAA
The decimal number corresponds to hex 0x41414141 as might be expected.
Note that I removed the pointless typedef. My default compilation rules wouldn't accept the code; I had to cancel the -Werror and -Wextra options to get it to compile. The original code compiled with warnings once -Werror was no longer converting the warnings into errors. Even adding -pedantic didn't trigger an error for the extra initializer (though the warning was always given, as required by the standard).

Initialize array with size specified by a variable

I was writing the following code
#include<stdio.h>
void fun(int n) {
int a[n] = {0};
}
void main() {
int a[4] = {0};
int i = 0;
fun(3);
}
and got this error
test.c: In function 'fun':
test.c:5:5: error: variable-sized object may not be initialized
while if I change the function fun to:-
void fun(int n) {
int a[n], i = 0;
for(i = 0; i < n; i++) {
a[i] = 0;
}
}
it works fine.
I know that the error occurs because it's not allowed by the compiler's specification, but what I want to know is why it can't be implemented.
Is it due to some compile-time or run-time evaluation issue?
I have seen the answers to other questions, but I need a more elaborate answer.
A variable length array cannot be initialized like this:
int a[n]={0};
From C Standards#6.7.9p3 Initialization [emphasis added]
The type of the entity to be initialized shall be an array of unknown size or a complete object type that is not a variable length array type.
Using a loop is one way to initialize a variable length array. You can also use memset like this:
memset(a, 0, sizeof a);
Additional:
C99 compilers are required to support variable length arrays, but they were made optional in C11.
An easy way is to pass the size of the array along with the other parameters.
Remember that the size parameter must come before the array parameter that uses it:
void fun(int n,int a[n]){
}
Although you have other alternatives like sizeof()
As an addition to H.S. answer:
From C Standards#6.7.9p3 Initialization [emphasis added]
The type of the entity to be initialized shall be an array of unknown size or a complete object type that is not a variable length array type.
This is probably because initializers have to be constant expressions. A constant expression has a definite value at compile time.
{0} is an incomplete initializer, and the compiler fills in the remaining values with 0.
With a VLA, the compiler does not know the length of the array and thus cannot generate the initializer for it.
This actually depends on your compiler.
In old C you couldn't have variable-size arrays. In function fun you use a as an array with variable size n, which is not allowed in old C. However, the C99 and C11 standards support variable-size arrays, so perhaps you have an old compiler (DevC?). If you want some kind of variable-size array with older C compilers, you have to use malloc and free.
Perhaps you wanted to write this code in C++? C++ doesn't support variable-size arrays either, but the gcc compiler accepts them as an extension.
Check this out:
Why aren't variable-length arrays part of the C++ standard?
If you are using DevC, I think that if you change your file from test.c to test.cpp this code will work.

Interpretation of an 'Empty' C Array (int a = {};)

I have a snippet of code that defines (what I believe to be) an empty array, i.e. an array containing no elements:
int a[] = {};
I compiled the snippet with gcc with no problem
A colleague attempting to get that same code to compile under MSVS made the modification:
int* a = NULL;
Now, he obviously thought that was an equivalent statement that would be acceptable to the MSVS compiler.
However, later in the code I retrieve the no. of elements in the array using the following macro:
#define sizearray(a) (sizeof(a) / sizeof((a)[0]))
when doing so:
sizearray({}) returns 0
this is as I would expect for what I believe to be a definition of an empty array
sizearray(NULL) returns 1
I'm thinking that sizeof(NULL)/sizeof((NULL)[0])) is actually 4/4 == 1
as NULL == (void*)0
My question is whether:
int a[] = {};
is a valid way of expressing an empty array, or whether it's poor programming practice.
Also, is it the case that you can't use such an expression with the MSVS compiler, i.e. is this some sort of C99 compatibility issue?
UPDATE:
Just compiled this:
#include <stdio.h>
#define sizearray(a) (sizeof(a) / sizeof((a)[0]))
int main()
{
int a[] = {};
int b[] = {0};
int c[] = {0,1};
printf("sizearray a = %zu\n", sizearray(a));
printf("sizearray b = %zu\n", sizearray(b));
printf("sizearray c = %zu\n", sizearray(c));
return 0;
}
using this Makefile:
array: array.c
gcc -g -o array array.c
My compiler is:
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)
compiles without any complaint, output looks like this:
bph#marvin:~/projects/scratch/c/array$ ./array
sizearray a = 0
sizearray b = 1
sizearray c = 2
Very curious. Could it secretly be a C++ compiler, not a C compiler?
Tried John Bode's suggestion of additional compiler flags and can confirm that the compilation does then fail:
gcc --std=c11 --pedantic -Wall -g -o array array.c
array.c: In function ‘main’:
array.c:7:15: warning: ISO C forbids empty initializer braces [-Wpedantic]
int a[] = {};
^
array.c:7:9: error: zero or negative size array ‘a’
int a[] = {};
^
Makefile:2: recipe for target 'array' failed
make: *** [array] Error 1
Empty initializers are invalid in C. So
int a = {};
is ill-formed. See 6.7.9 Initialization.
sizearray(NULL) is not valid either. Because the sizearray macro would expand to:
(sizeof(0) / sizeof((0)[0]))
if NULL is defined as 0. This is not valid because 0[0] isn't valid: there's no pointer or array involved (as required for pointer arithmetic; remember a[b] is equivalent to *(a + b)).
Or, it would expand to:
(sizeof(((void *)0)) / sizeof((((void *)0))[0]))
if NULL is defined as ((void *)0). This is not valid because pointer arithmetic is not allowed on void pointers (see 6.5.6, 2) and void is an incomplete type. A similar issue would be present for whatever the definition of NULL is in an implementation (the C standard is flexible about the definition of the null pointer constant, i.e., NULL; see 7.19, 3).
So in both cases, what you see is compiler specific behaviours for non-standard code.
This is not an array, but it's not a scalar either: it's a syntax error.
The C11 draft says, in §6.7.9.11 (Initialization Semantics):
The initializer for a scalar shall be a single expression, optionally
enclosed in braces. The initial value of the object is that of the
expression (after conversion); the same type constraints and conversions as
for simple assignment apply, taking the type of the scalar to be the
unqualified version of its declared type.
But there has to be something between the braces, it can't be empty.
So I'd argue that the question is missing something, and this was not the actual code.
It's not an array, it's the brace initialization syntax.
In short, you can write this:
int a = {1234};
It does not initialize a with an array, it just assigns 1234. If there are 2 or more values, that would be an error.
In C++11 and later, brace initialization rejects narrowing conversions, so:
char b = 258; // Valid, same as b = 2
char b = {258}; // Wrong, can't truncate value in braces
And in C++, empty braces are just zero-initializers, so int a = {} is equivalent to int a = {0}

Why does this code print addresses?

Why didn't I get a compile time error while accidentally printing only one dimension of a 2D array?
#include <stdio.h>
void main() {
int i;
int arr[2][3] = { 1, 2, 3, 4, 5, 6 }; //<- Declared a 2D array
for (i = 0; i < 6; i++) {
printf("%d\n", arr[i]); // <- Accidentally forgot a dimension
}
}
I should have received a compile time error but instead I got a group of addresses! Why? What did arr[0] mean in this context to the compiler?
An expression with array type evaluates to a pointer to the first array element in most contexts (a notable exception, among others, is the sizeof operator).
In your example, arr[i] has array type (int[3]). So it evaluates to a pointer of type int * (a pointer to the first element of row i). That's what's getting printed. Printing a pointer with %d is undefined behavior, because printf() will read the pointer as if it were an int.
Felix Palmen's answer explains the observed behavior.
Regarding your second question: the reason why you don't get a warning is you did not ask for them.
Compilers are notoriously lenient by default and will accept broken code including obvious undefined behavior. This particular one is not obvious because printf() accepts any number of extra arguments after the initial format string.
You can instruct your compiler to emit many useful warnings to avoid silly mistakes and detect non obvious programming errors.
gcc -Wall -Wextra -Werror
clang -Weverything -Werror
option /W3 or /W4 with Microsoft Visual Studio.
gcc and clang will complain about the sloppy initializer for array arr. It should read:
int arr[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };
The print loop is indeed surprising, did you really mean to print the array with a single loop?
Note also that the standard prototype for main without arguments is int main(void).
