I'm looking through the "Processor Modeling Guide" provided by a company named OVP (a product similar to qemu). In it, there's a little code snippet resembling the following:
static or1kDispatchTableC dispatchTable = {
// handle arithmetic instructions
[OR1K_IT_ADDI] = disDefault,
[OR1K_IT_ADDIC] = disDefault,
[OR1K_IT_ANDI] = disDefault,
[OR1K_IT_ORI] = disDefault,
[OR1K_IT_XORI] = disDefault,
[OR1K_IT_MULI] = disDefault
};
I've never seen syntax like this before. irrelevant stuff about C++ removed
At the moment I don't have the ability to download/look at their stuff to look at how anything is defined, hence my question. If you recognize this syntax, can you weigh in?
edit
or1kDispatchTableC is a typedef for a pointer of type or1kDispatchTableCP, but I still don't have anything on what or1kDispatchTableCP is.
Well, assuming your first line is a typo, or or1kDispatchTableC is an array type, so that this is actually an array declaration, this looks like a C11 explicitly initialized array. The line
[OR1K_IT_ADDI] = disDefault,
initializes element OR1K_IT_ADDI to disDefault. Both of those need to be constant expressions -- OR1K_IT_ADDI is probably a #define or an enum tag.
I'm pretty sure that C++11 does NOT support this syntax, though some compilers (that also support C11) might support it as an extension.
From the names, I would guess that this is actually an array of function pointers.
This is called designated initializers and is a C feature (supported since C99). It allows addressing array and structure/union elements directly, filling the gaps with default values.
struct foo { int a[10]; };
struct foo f[] = { [5].a[3] = 20 };
Now this results in 5 elements of struct foo, all initialized to zero followed by a sixth element of struct foo with the 4th element of a initialized to 20.
Like someone else suspected, this is not supported by C++.
Related
Just curious, what actually happens if I define a zero-length array int array[0]; in code? GCC doesn't complain at all.
Sample Program
#include <stdio.h>
int main() {
int arr[0];
return 0;
}
Clarification
I'm actually trying to figure out if zero-length arrays initialised this way, instead of being pointed at like the variable length in Darhazer's comments, are optimised out or not.
This is because I have to release some code out into the wild, so I'm trying to figure out if I have to handle cases where the SIZE is defined as 0, which happens in some code with a statically defined int array[SIZE];
I was actually surprised that GCC does not complain, which led to my question. From the answers I've received, I believe the lack of a warning is largely due to supporting old code which has not been updated with the new [] syntax.
Because I was mainly wondering about the error, I am tagging Lundin's answer as correct (Nawaz's was first, but it wasn't as complete) -- the others were pointing out its actual use for tail-padded structures, while relevant, isn't exactly what I was looking for.
An array cannot have zero size.
ISO 9899:2011 6.7.6.2:
If the expression is a constant expression, it shall have a value greater than zero.
The above text is true both for a plain array (paragraph 1). For a VLA (variable length array), the behavior is undefined if the expression's value is less than or equal to zero (paragraph 5). This is normative text in the C standard. A compiler is not allowed to implement it differently.
gcc -std=c99 -pedantic gives a warning for the non-VLA case.
As per the standard, it is not allowed.
However it's been current practice in C compilers to treat those declarations as a flexible array member (FAM) declaration:
C99 6.7.2.1, §16: As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
The standard syntax of a FAM is:
struct Array {
size_t size;
int content[];
};
The idea is that you would then allocate it so:
void foo(size_t x) {
Array* array = malloc(sizeof(size_t) + x * sizeof(int));
array->size = x;
for (size_t i = 0; i != x; ++i) {
array->content[i] = 0;
}
}
You might also use it statically (gcc extension):
Array a = { 3, { 1, 2, 3 } };
This is also known as tail-padded structures (this term predates the publication of the C99 Standard) or struct hack (thanks to Joe Wreschnig for pointing it out).
However this syntax was standardized (and the effects guaranteed) only lately in C99. Before a constant size was necessary.
1 was the portable way to go, though it was rather strange.
0 was better at indicating intent, but not legal as far as the Standard was concerned and supported as an extension by some compilers (including gcc).
The tail padding practice, however, relies on the fact that storage is available (careful malloc) so is not suited to stack usage in general.
In Standard C and C++, zero-size array is not allowed..
If you're using GCC, compile it with -pedantic option. It will give warning, saying:
zero.c:3:6: warning: ISO C forbids zero-size array 'a' [-pedantic]
In case of C++, it gives similar warning.
It's totally illegal, and always has been, but a lot of compilers
neglect to signal the error. I'm not sure why you want to do this.
The one use I know of is to trigger a compile time error from a boolean:
char someCondition[ condition ];
If condition is a false, then I get a compile time error. Because
compilers do allow this, however, I've taken to using:
char someCondition[ 2 * condition - 1 ];
This gives a size of either 1 or -1, and I've never found a compiler
which would accept a size of -1.
Another use of zero-length arrays is for making variable-length object (pre-C99). Zero-length arrays are different from flexible arrays which have [] without 0.
Quoted from gcc doc:
Zero-length arrays are allowed in GNU C. They are very useful as the last element of a structure that is really a header for a variable-length object:
struct line {
int length;
char contents[0];
};
struct line *thisline = (struct line *)
malloc (sizeof (struct line) + this_length);
thisline->length = this_length;
In ISO C99, you would use a flexible array member, which is slightly different in syntax and semantics:
Flexible array members are written as contents[] without the 0.
Flexible array members have incomplete type, and so the sizeof operator may not be applied.
A real-world example is zero-length arrays of struct kdbus_item in kdbus.h (a Linux kernel module).
I'll add that there is a whole page of the online documentation of gcc on this argument.
Some quotes:
Zero-length arrays are allowed in GNU C.
In ISO C90, you would have to give contents a length of 1
and
GCC versions before 3.0 allowed zero-length arrays to be statically initialized, as if they were flexible arrays. In addition to those cases that were useful, it also allowed initializations in situations that would corrupt later data
so you could
int arr[0] = { 1 };
and boom :-)
Zero-size array declarations within structs would be useful if they were allowed, and if the semantics were such that (1) they would force alignment but otherwise not allocate any space, and (2) indexing the array would be considered defined behavior in the case where the resulting pointer would be within the same block of memory as the struct. Such behavior was never permitted by any C standard, but some older compilers allowed it before it became standard for compilers to allow incomplete array declarations with empty brackets.
The struct hack, as commonly implemented using an array of size 1, is dodgy and I don't think there's any requirement that compilers refrain from breaking it. For example, I would expect that if a compiler sees int a[1], it would be within its rights to regard a[i] as a[0]. If someone tries to work around the alignment issues of the struct hack via something like
typedef struct {
uint32_t size;
uint8_t data[4]; // Use four, to avoid having padding throw off the size of the struct
}
a compiler might get clever and assume the array size really is four:
; As written
foo = myStruct->data[i];
; As interpreted (assuming little-endian hardware)
foo = ((*(uint32_t*)myStruct->data) >> (i << 3)) & 0xFF;
Such an optimization might be reasonable, especially if myStruct->data could be loaded into a register in the same operation as myStruct->size. I know nothing in the standard that would forbid such optimization, though of course it would break any code which might expect to access stuff beyond the fourth element.
Definitely you can't have zero sized arrays by standard, but actually every most popular compiler gives you to do that. So I will try to explain why it can be bad
#include <cstdio>
int main() {
struct A {
A() {
printf("A()\n");
}
~A() {
printf("~A()\n");
}
int empty[0];
};
A vals[3];
}
I am like a human would expect such output:
A()
A()
A()
~A()
~A()
~A()
Clang prints this:
A()
~A()
GCC prints this:
A()
A()
A()
It is totally strange, so it is a good reason not to use empty arrays in C++ if you can.
Also there is extension in GNU C, which gives you to create zero length array in C, but as I understand it right, there should be at least one member in structure prior, or you will get very strange examples as above if you use C++.
In C you can declare a variable that points to an array like this:
int int_arr[4] = {1,2,3,4};
int (*ptr_to_arr)[4] = &int_arr;
Although practically it is the same as just declaring a pointer to int:
int *ptr_to_arr2 = int_arr;
But syntactically it is something different.
Now, how would a function look like, that returns such a pointer to an array (of int e.g.) ?
A declaration of int is int foo;.
A declaration of an array of 4 int is int foo[4];.
A declaration of a pointer to an array of 4 int is int (*foo)[4];.
A declaration of a function returning a pointer to an array of 4 int is int (*foo())[4];. The () may be filled in with parameter declarations.
As already mentioned, the correct syntax is int (*foo(void))[4]; And as you can tell, it is very hard to read.
Questionable solutions:
Use the syntax as C would have you write it. This is in my opinion something you should avoid, since it's incredibly hard to read, to the point where it is completely useless. This should simply be outlawed in your coding standard, just like any sensible coding standard enforces function pointers to be used with a typedef.
Oh so we just typedef this just like when using function pointers? One might get tempted to hide all this goo behind a typedef indeed, but that's problematic as well. And this is since both arrays and pointers are fundamental "building blocks" in C, with a specific syntax that the programmer expects to see whenever dealing with them. And the absensce of that syntax suggests an object that can be addressed, "lvalue accessed" and copied like any other variable. Hiding them behind typedef might in the end create even more confusion than the original syntax.
Take this example:
typedef int(*arr)[4];
...
arr a = create(); // calls malloc etc
...
// somewhere later, lets make a hard copy! (or so we thought)
arr b = a;
...
cleanup(a);
...
print(b); // mysterious crash here
So this "hide behind typedef" system heavily relies on us naming types somethingptr to indicate that it is a pointer. Or lets say... LPWORD... and there it is, "Hungarian notation", the heavily criticized type system of the Windows API.
A slightly more sensible work-around is to return the array through one of the parameters. This isn't exactly pretty either, but at least somewhat easier to read since the strange syntax is centralized to one parameter:
void foo (int(**result)[4])
{
...
*result = &arr;
}
That is: a pointer to a pointer-to-array of int[4].
If one is prepared to throw type safety out the window, then of course void* foo (void) solves all of these problems... but creates new ones. Very easy to read, but now the problem is type safety and uncertainty regarding what the function actually returns. Not good either.
So what to do then, if these versions are all problematic? There are a few perfectly sensible approaches.
Good solutions:
Leave allocation to the caller. This is by far the best method, if you have the option. Your function would become void foo (int arr[4]); which is readable and type safe both.
Old school C. Just return a pointer to the first item in the array and pass the size along separately. This may or may not be acceptable from case to case.
Wrap it in a struct. For example this could be a sensible implementation of some generic array type:
typedef struct
{
size_t size;
int arr[];
} array_t;
array_t* alloc (size_t items)
{
array_t* result = malloc(sizeof *result + sizeof(int[items]));
return result;
}
The typedef keyword can make things a lot clearer/simpler in this case:
int int_arr[4] = { 1,2,3,4 };
typedef int(*arrptr)[4]; // Define a pointer to an array of 4 ints ...
arrptr func(void) // ... and use that for the function return type
{
return &int_arr;
}
Note: As pointed out in the comments and in Lundin's excellent answer, using a typedef to hide/bury a pointer is a practice that is frowned-upon by (most of) the professional C programming community – and for very good reasons. There is a good discussion about it here.
However, although, in your case, you aren't defining an actual function pointer (which is an exception to the 'rule' that most programmers will accept), you are defining a complicated (i.e. difficult to read) function return type. The discussion at the end of the linked post delves into the "too complicated" issue, which is what I would use to justify use of a typedef in a case like yours. But, if you should choose this road, then do so with caution.
First off, typechecking is not exactly the correct term I'm looking for, so I'll explain:
Say I want to use an anonymous union, I make the union declaration in the struct const, so after initialization the values will not change. This should allow for statically checking whether the uninitialized member of the union is being accessed. In the below example the instances are initialized for either a (int) or b (float). After this initialization I would like to not be able to access the other member:
struct Test{
const union{
const int a;
const float b;
};
};
int main(){
struct Test intContainer = { .a=5 };
struct Test floatContainer = { .b=3.0 };
int validInt = intContainer.a;
int validFloat = floatContainer.b;
// For these, it could be statically determined that these values are not in use (therefore invalid access)
int invalidInt = floatContainer.a;
float invalidFloat = intContainer.b;
return 0;
}
I'd hope to have the last two assignments to give an error (or at least a warning), but it gives none (using gcc 4.9.2). Is C designed to not check for this, or is it actually a shortcoming of the language/compiler? Or is it just plain stupid to want to use such a pattern?
In my eyes it looks like it has a lot of potential if this was a feature, so can someone explain to me why I can't use this as a way to differentiate between two "sub-types" of a same struct (one for each union value). (Potentially any suggestions how I can still do something like this?)
EDIT:
So apparently it is not in the language standard, and also compilers don't check it. Still I personally think it would be a good feature to have, since it's just eliminating manually checking for the union's contents using tagged unions. So I wonder, does anyone have an idea why it is not featured in the language (or it's compilers)?
I'd hope to have the last two assignments to give an error (or at least a warning), but it gives none (using gcc 4.9.2). Is C designed to not check for this, or is it actually a shortcoming of the language/compiler?
This is a correct behavior of the compiler.
float invalidInt = floatContainer.a;
float invalidFloat = intContainer.b;
In the first declaration you are initializing a float object with an int value and in the second you are initializing a float object with a float value. In C you can assign (or initialize) any arithmetic types to any arithmetic types without any cast required. So no diagnostic required.
In your specific case you are also reading union members that are not the same members as the union member last used to store its value. Assuming the union members are of the same size (e.g., float and int here), this is a specified behavior and no diagnostic is required. If the size of union members are different, the behavior is unspecified (but still, no diagnostic required).
I am working with C99. My compiler is IAR Embedded workbench but I assume this question will be valid for some other compilers too.
I have a typedef enum with a few items in it and I added an element to a struct of that new type
typedef enum
{
foo1,
foo2
} foo_t;
typedef struct
{
foo_t my_foo;
...
} bar_t;
Now I want to create an instance of bar_t and initialize all of its memory to 0.
bar_t bar = { 0u };
This generates a warning that I mixing an enumerated with another type. The IAR specific warning number is Pe188. It compiles and works just fine since an enum is an unsigned int at the end of the day. But I'd like to avoid a thousand naggy warnings. What's a clean way to initialize struct types that have enumerated types in them to 0?
for the sake of argument lets assume bar_t has a lot of members - I want to just set them all to 0. I'd rather not type out something like this:
bar_t bar = { foo1, 0u, some_symbol,... , 0u};
EDIT: Extra note: I am complying with MISRA. So if a workaround is going to violate MISRA it just moves the problem for me. I'll get nagged by the MISRA checker instead.
If you really, literally want to initialize all the memory of a struct to 0, then that's spelled
memset(&bar, 0, sizeof(bar_t));
That's actually fairly common, and it even tends to work, but technically it's incorrect for what most people actually want. It is not guaranteed to do the right thing for most element types, because C makes fewer guarantees than many people think about the representations of values of different types, and about the meaning of various bit patterns.
The correct way to initialize an aggregate object (such as a struct) as if assigning the value zero to every element is what you started with, though the canonical way to write it is simply
bar_t bar = { 0 };
(no need for the 'u' suffix). There is a remote possibility that that variant would avoid the compiler warning, though I wouldn't really expect so.
Ashalynd gives the answer I think is most correct (that is,
bar_t bar = { foo1 };
). If that somehow violates code conventions with which you must comply, then perhaps you could reorganize your struct so that the first element is of any type other than an enum type.
Not long ago I read about this elegant way of nested structure initialization in someone else's code, however I have never found any documentation about this particular kind of approach. In a nutshell, it can entirely do without { } braces, except for the ordinary brace after the = operator.
The specialty in here (at least for me) is that initializations start with a dot operator, which is a syntax that reminded me a lot of object literals declaration in JavaScript (cf. there)
However, I'm wondering: has this always been allowed in ISO-C? It looks entirely new to me, and that may as well be, since I still code a pre-1999 C style and have never occupied myself too much with the new C (not C++) features.
Nevertheless, I'd like to know about sources where this kind of initialization is documented and how it is called.
Enough of my talk, here is the code snippet (albeit stripped down a lot for simplicity's sake):
typedef struct plugin_s {
/* plugin version */
int16_t version_major;
int16_t version_minor;
} plugin_t;
typedef struct example_plugin_s {
plugin_t stdplugin;
int (*is_ongoing) (void);
int (*is_stalled) (void);
/* lots more stuff here, which is not relevant at this point */
} example_plugin_t;
/*
now the interesting bit is coming...
again: no braces used while initializing values
(as you would usually expect)!
*/
static example_plugin_t stdplugin = {
.stdplugin.version_major = 1,
.stdplugin.version_minor = 0,
.is_ongoing = funcref1,
.is_stalled = funcref2
};