Statically initialize array within structure - c

I can't speak for other compilers, but with GCC you can statically initialize an array using the following syntax:
struct some_struct {
    unsigned *some_array;
} some_var = {
    .some_array = (unsigned[]) { 1u, 2u, 3u, 4u, 5u, },
};
I first met this syntax while searching for an answer to a question of my own, which led me to this answer. However, I have not yet found any GNU reference that covers this kind of syntax.
I'd be very grateful if someone could share a link to documentation for this syntax. Thank you!

Well, if your question is about compound literal syntax, then one important detail here is that you are not initializing an array within a structure. You are initializing a pointer within a structure. The code that you have now is formally correct.
If you really had an array inside your structure, then such initialization with a compound literal would not work. You cannot initialize an array from another array. Arrays are not copyable (with the exception of char array initialization from string literal). However, in that case you'd be able to use an ordinary {}-enclosed initializer, not a compound literal.
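To make the distinction concrete, here is a minimal sketch. The type names `with_array` and `with_pointer` and the helper `sum_both` are made up for illustration; they are not from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, for illustration only. */
struct with_array   { unsigned some_array[5]; };  /* a real array member */
struct with_pointer { unsigned *some_array; };    /* a pointer member */

/* An array member takes an ordinary brace-enclosed initializer. */
static struct with_array a = { .some_array = { 1u, 2u, 3u, 4u, 5u } };

/* A pointer member can point at the first element of a compound literal. */
static struct with_pointer p = { .some_array = (unsigned[]){ 1u, 2u, 3u, 4u, 5u } };

/* Not valid in standard C (a compiler may accept it as an extension):
   static struct with_array bad = { .some_array = (unsigned[]){ 1u, 2u } }; */

unsigned sum_both(void)
{
    size_t count = sizeof a.some_array / sizeof a.some_array[0];
    unsigned total = 0;

    assert(count == 5);   /* the member really is an array of 5 elements */
    for (size_t i = 0; i < count; i++)
        total += a.some_array[i] + p.some_array[i];
    return total;
}
```

Note that `sizeof p.some_array` would give the size of a pointer, not of the five-element array it points to.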
Also keep in mind that the lifetime of the compound literal (unsigned[]) { 1u, 2u, 3u, 4u, 5u, } is determined by the scope in which it appears. At file scope it has static storage duration. If you do the above in local scope, the compound literal array will be destroyed at the end of the enclosing block, and the pointer value (if you somehow manage to take it outside that block) will become invalid.
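The lifetime rule can be sketched as follows. The function names `get_static_values` and `sum_local_values` are hypothetical, and the dangling case is only described in a comment because actually using such a pointer would be undefined behavior.

```c
#include <assert.h>
#include <stddef.h>

struct some_struct { unsigned *some_array; };

/* File scope: the compound literal has static storage duration,
   so this pointer stays valid for the whole program. */
static struct some_struct file_scope_var = {
    .some_array = (unsigned[]){ 1u, 2u, 3u, 4u, 5u },
};

/* Safe: hands out a pointer to storage that lives as long as the program. */
const unsigned *get_static_values(void)
{
    assert(file_scope_var.some_array != NULL);
    return file_scope_var.some_array;
}

/* Also safe: the block-scope literal is only used while its block is active. */
unsigned sum_local_values(void)
{
    struct some_struct local = {
        .some_array = (unsigned[]){ 10u, 20u, 30u },
    };
    unsigned total = 0;

    for (int i = 0; i < 3; i++)
        total += local.some_array[i];
    /* Returning local.some_array instead would hand the caller a dangling pointer. */
    return total;
}
```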

You are unlikely to find much GNU documentation on this because it is not a GCC extension: it is part of standard C syntax, called a compound literal. It is defined in the C11 standard, in sections 6.5.2.5 and 6.7.9 (the latter covers the part between the braces, which is the same for both compound literals and static initialisers, so the standard only describes it once).
You can use this syntax to describe dynamic object values as well, not just for static initialisations, even standing alone in an expression without having been assigned to any variable. A compound literal can appear essentially anywhere a variable name can appear: you can pass them to functions, create them just to access one element, take their address (you can even assign to them, although it's not obvious how that's useful).
The syntax is uniform across all C value types and can be used to create arrays (designate specific elements to set with [N]=), structs and unions (designate specific elements with .field=) and even numeric types (no elements, so don't designate, just put the value between the braces). The syntax is intended to be simple and consistent for macros and code generators to produce (in addition to being elegant to write by hand).
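A short sketch of those uses, assuming a C99 or later compiler. The type `point` and the helpers `sum`, `coord_sum` and `compound_literal_demo` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct point { int x, y; };   /* hypothetical type */

static int sum(size_t n, const int *v)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

static int coord_sum(struct point p) { return p.x + p.y; }

int compound_literal_demo(void)
{
    /* Array literal with designators: elements 0 and 3 are set, the rest are zero. */
    int a = sum(4, (int[4]){ [0] = 1, [3] = 4 });          /* 1 + 0 + 0 + 4 */

    /* Struct literal with field designators, passed by value. */
    int b = coord_sum((struct point){ .x = 2, .y = 3 });   /* 2 + 3 */

    /* Scalar literal: no designators; its address can be taken and it can be modified. */
    int *p = &(int){ 7 };
    *p += 1;
    assert(*p == 8);

    /* An array created just to read one element. */
    int c = (int[]){ 10, 20, 30 }[1];

    return a + b + *p + c;
}
```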

Related

Can't we initialize automatic array variables?

I was referring to the book "Theory and Problems of Programming with C" by Gottfried (Schaum's Outline series, 2nd Edition, 1996).
On page number 243 section 9.1 in chapter 9 on Arrays, it says:
Automatic arrays, unlike automatic variables, cannot be initialized. However, external and static array definitions can include the assignment of initial values if desired.
I did not understand the meaning of this highlighted statement. I tried initializing an array (with and without the auto keyword) inside a function and did not see any issue with it.
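#include <stdio.h>

void func1 (void)
{
    auto int array1[5] = {1, 0, 4, 1, 5};
    char charVar1 = 'M';   /* declaration added; the original snippet only assigned it */
    (void)charVar1;
    printf("%d", *(array1 + 4));
}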
Added the image of the page
To answer the first part
Automatic arrays, unlike automatic variables, cannot be initialized
assuming that "automatic arrays" actually means "arrays of automatic storage duration whose length is determined at run time".
Yes, what is referred to here is called a variable length array (VLA). It cannot be initialized, for the simple reason that its size is only determined at run time.
To quote the C11 standard, chapter §6.7.9, Initialization (emphasis mine)
The type of the entity to be initialized shall be an array of unknown size or a complete object type that is not a variable length array type.
Otherwise, a local variable without any storage class specifier defaults to auto, and an automatic array of non-VLA type can certainly be initialized.
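A minimal sketch of that rule, assuming a compiler with VLA support (they are mandatory in C99 and optional in C11). The function name `automatic_array_demo` is made up.

```c
#include <assert.h>
#include <string.h>

int automatic_array_demo(int n)
{
    /* Fixed-size automatic array: an initializer is allowed. */
    int fixed[5] = { 1, 0, 4, 1, 5 };

    assert(n > 0);

    /* Variable length array: the size is only known at run time, so no initializer.
       int vla[n] = { 0 };    <- rejected under the C99/C11 rule quoted above */
    int vla[n];
    memset(vla, 0, sizeof vla);   /* fill it by hand instead */
    vla[n - 1] = fixed[4];

    return vla[0] + vla[n - 1];
}
```

With `n == 3` the first element stays 0 and the last becomes 5, so the function returns 5.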
Trivial demonstration that the statement is actually wrong (as opposed to e.g. array-initialization being a common but non-standard extension):
#include <stddef.h>

void doSomethingWithArray(size_t sz, int arr[static sz]);

int main(void) {
    doSomethingWithArray(5,
        (int[]){ 1, 2, 3, 4, 5 }
    );
}
An anonymous array is created with automatic storage duration within the scope of main. Since it is anonymous, there is no way for code within main to refer to it to set element values. The only way to put values into this particular automatic array is via initialization. This feature - being able to initialize automatic arrays - is therefore legal, standard, and necessary.
QED.
Given that this book dates from 1996, VLAs and compound literals had not yet been added to the language when it was written; both arrived with C99.
"Automatic variable" is the formal term for a local variable. There's even the keyword auto for it, but it is one of the most superfluous keywords in the language, since all local variables are implicitly declared as auto. That is:
{
auto int x = 1;
}
and
{
int x = 1;
}
are 100% equivalent; the auto keyword adds nothing (so nobody uses the former style).
So by the term automatic arrays, the author perhaps simply refers to plain local arrays. As we can see from the example in the question, you can initialize such arrays just fine. The book is incorrect and/or uses the wrong terms.
The author of that book seems confused in general: "assignment of initial values" is the very C definition of initialization. The formal definition can be found in the C99 standard syntax at 6.7.8, where "assignment-expression" is one of the valid forms for an initializer.
I'd recommend finding another book, preferably one that covers the later C standards, C99 and C11.

Need help to understand the syntax in xv6 kernel

I am reading files of xv6 kernel and I cannot understand what the following means:
static int (*syscalls[])(void) = {
[SYS_fork] sys_fork,
[SYS_exit] sys_exit,
[SYS_wait] sys_wait,
[SYS_pipe] sys_pipe,
...
}
Can someone explain this to me? Especially what square brackets (e.g [SYS_fork]) mean.
Thank you
That code is making an array of function pointers, using an old alternative GNU syntax for designated initialization.
Designated initializers are a feature added in C99 that lets you specify which array index each value is assigned to, so the values need not appear in order. The same feature exists for struct initialization, where you can name the specific field to assign a given value to.
The C99 syntax for array designated initializers is [index] = value. This code in particular, though, is using an older alternative syntax from GCC, which as per this document has been obsolete since GCC 2.5, in which no equals sign is used.
In syscall.c the indices are specified using macros defined in syscall.h, the first of which (SYS_fork) is defined as 1, and so on.
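For comparison, the same table in the standard C99 form might look like the sketch below. The handlers and their return values are stand-ins invented for the example, not the real xv6 code, and `dispatch` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the xv6 syscall numbers and handlers (illustrative only). */
enum { SYS_fork = 1, SYS_exit = 2, SYS_wait = 3 };

static int sys_fork(void) { return 100; }
static int sys_exit(void) { return 200; }
static int sys_wait(void) { return 300; }

/* Standard C99 form: [index] = value. The array gets 4 elements,
   and the slot that is never designated (index 0) is a null pointer. */
static int (*syscalls[])(void) = {
    [SYS_fork] = sys_fork,
    [SYS_exit] = sys_exit,
    [SYS_wait] = sys_wait,
};

/* Dispatch by number; returns -1 for an unknown call. */
int dispatch(int num)
{
    size_t count = sizeof syscalls / sizeof syscalls[0];

    assert(syscalls[0] == NULL);   /* unlisted slot is zero-initialized */
    if (num > 0 && (size_t)num < count && syscalls[num] != NULL)
        return syscalls[num]();
    return -1;
}
```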
This is most likely a non-standard way of initializing an array of function pointers. The identifiers SYS_fork etc. are very likely macros or enum constants specifying the element index.
Another possibility is that this is not a C file, but is turned into a syntactically valid C file using some filtering tool prior to compilation.

Can't initialize static structure with function pointer from another translation unit?

The Python documentation claims that the following does not work on "some platforms or compilers":
int foo(int); // Defined in another translation unit.
struct X { int (*fptr)(int); } x = {&foo};
Specifically, the Python docs say:
We’d like to just assign this to the tp_new slot, but we can’t, for
portability sake, On some platforms or compilers, we can’t statically
initialize a structure member with a function defined in another C
module, so, instead, we’ll assign the tp_new slot in the module
initialization function just before calling PyType_Ready(). --http://docs.python.org/extending/newtypes.html
Is the above standard C89 and/or C99? What compilers specifically cannot handle the above?
That kind of initialization has been permitted since at least C90.
From C90 6.5.7 "Initialization"
All the expressions in an initializer for an object that has static storage duration or in an initializer list for an object that has aggregate or union type shall be constant expressions.
And 6.4 "Constant expressions":
An address constant is a pointer to an lvalue designating an object of static storage duration, or to a function designator; it shall be created explicitly, using the unary & operator...
But it's certainly possible that some implementations might have trouble with the construct - I'd guess that wouldn't be true for modern implementations.
According to n1570 6.6 paragraph 9, the address of a function is an address constant, and according to 6.7.9 this means that it can be used to initialize global variables. I am almost certain this is also valid C89.
However,
On sane platforms, the value of a function pointer (or any pointer, other than NULL) is only known at runtime. This means that the initialization of your structure can't take place until runtime. This doesn't always apply to executables but it almost always applies to shared objects such as Python extensions. I recommend reading Ulrich Drepper's essay on the subject (link).
I am not aware of which platforms this is broken on, but if the Python developers mention it, it's almost certainly because one of them got bitten by it. If you're really curious, try looking at an old Python extension and seeing if there's an appropriate message in the commit logs.
Edit: It looks like most Python modules just do the normal thing and initialize type structures statically, e.g., static type obj = { function_ptr ... };. For example, look at the mmap module, which is loaded dynamically.
The example definitely conforms to C99, and AFAIR also to C89.
If some particular (oldish) compiler has a problem with it, I don't think that the proposed solution is the way to go. Don't impose dynamic initialization on platforms that behave well. Instead, special-case the weirdos that need special treatment, and try to phase them out as quickly as you can.
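The two approaches discussed in this thread can be put side by side in a single-file sketch. Here `foo` is defined in the same file only so the example links on its own (in the question it lives in another translation unit), and `module_init` and `call_runtime` are hypothetical names standing in for a module initialization function and a later use of the slot.

```c
#include <assert.h>
#include <stddef.h>

int foo(int);   /* in the question, this is defined in another translation unit */

struct X { int (*fptr)(int); };

/* Static initialization: &foo is an address constant, so this is valid C90/C99. */
struct X x_static = { &foo };

/* The workaround the Python docs describe: leave the slot empty
   and fill it in at run time. */
struct X x_runtime = { NULL };

void module_init(void)   /* hypothetical init hook */
{
    x_runtime.fptr = &foo;
}

int call_runtime(int v)
{
    assert(x_runtime.fptr != NULL);   /* module_init() must have run first */
    return x_runtime.fptr(v);
}

/* Defined here only so that the sketch is self-contained. */
int foo(int v) { return v + 1; }
```

The cost of the run-time variant is that the slot is unusable until the init function has run, which is why the last answer above argues for keeping static initialization wherever it works.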

The const modifier in C

I'm quite often confused when coming back to C by the inability to create an array using the following initialisation pattern...
const int SOME_ARRAY_SIZE = 6;
const int myArray[SOME_ARRAY_SIZE];
My understanding of the problem is that the const operator does not guarantee const-ness but rather merely asserts that the value pointed to by SOME_ARRAY_SIZE will not change at runtime. But why can the compiler not assume that the value is constant at compile time? It says 6 right there in the source code...
I think I'm missing something core in my fundamental understanding of C. Somebody help me out here. :)
[UPDATE]After reading a bit more about C99 and variable length arrays I think I understand this a bit better. What I was trying to create was a variable length array: const does not create a compile-time constant but rather a run-time constant. Therefore I was declaring a variable length array, which is only valid in C99 at function/block scope. A variable length array at file scope is impossible, as the compiler cannot assign a fixed memory address to an unbounded array.[/UPDATE]
Well, in C++ the semantics are a bit different. In C++ your code would work fine. You must distinguish between two things: const and constant expression. const simply means, as you described, that the value is read-only. A constant expression, on the other hand, is a value that is known at compile time. The semantics of const in C are always of the first kind. In C, a const-qualified object is never a constant expression; constant expressions are built from literals, enumeration constants, sizeof and the like, which is why #define is used for this kind of thing.
In C++, however, a const object of integral type initialized with a constant expression can itself be used in constant expressions.
I don't know exactly WHY this is so in C; it's just the way it is.
The problem is that the language syntax demands an integer constant between the [ ]. SOME_ARRAY_SIZE is still a variable (even if you told the compiler nobody is allowed to vary it!)
The const keyword is basically a read-only indication. It does not, really, indicate the underlying value will not change, even though that is the case in your example.
When it comes to pointers, this is more clear:
void foo(int const * p)
{
if (*p == 100)
{
bar();
/* Here, the compiler can not assume that *p is 100 */
}
}
In this case, a compiler should not accept the code in your example, as it requires the array size to be constant. If it did accept it, the user could later run into trouble when porting the code to a stricter compiler.
You can do this in C99, and some compilers prior to C99 also had support for this as an extension to C89 (e.g. gcc). If you're stuck with an old compiler that doesn't have C99 support though (e.g. MSVC) then you'll have to do it the old skool way and use a #define for the array size.
Note that the above comments apply only to such declarations at local scope (i.e. automatic variables). C99 still doesn't allow such declarations at global scope.
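The file-scope versus block-scope difference can be sketched like this, assuming a compiler with VLA support. The names `ARRAY_SIZE_MACRO`, `ARRAY_SIZE_ENUM`, `fromMacro`, `fromEnum` and `block_scope_elements` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE_MACRO 6        /* expands to the constant 6 */
enum { ARRAY_SIZE_ENUM = 6 };     /* an integer constant expression */
const int SOME_ARRAY_SIZE = 6;    /* a read-only object, not a constant expression in C */

/* Fine at file scope: both sizes are constant expressions. */
const int fromMacro[ARRAY_SIZE_MACRO] = { 0 };
const int fromEnum[ARRAY_SIZE_ENUM]   = { 0 };

/* Not valid in standard C at file scope (it would be a variable length array):
   const int myArray[SOME_ARRAY_SIZE]; */

size_t block_scope_elements(void)
{
    /* At block scope, C99 treats this as a VLA; it cannot have an initializer. */
    int vla[SOME_ARRAY_SIZE];

    vla[0] = 0;
    assert(sizeof fromMacro == sizeof fromEnum);
    return sizeof vla / sizeof vla[0];   /* evaluated at run time for a VLA */
}
```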
I just did a very quick test with Xcode, using an Objective-C file I currently had open on my machine, and put this in the .m file:
const int arrs = 6;
const int arr[arrs];
This compiles without any issues.

Why do most C developers use define instead of const? [duplicate]

In many programs a #define serves the same purpose as a constant. For example.
#define FIELD_WIDTH 10
const int fieldWidth = 10;
I commonly see the first form preferred over the other, relying on the pre-processor to handle what is basically an application decision. Is there a reason for this tradition?
There is a very solid reason for this: const in C does not mean something is constant. It just means a variable is read-only.
In places where the compiler requires a true constant (such as for array sizes for non-VLA arrays), using a const variable, such as fieldWidth is just not possible.
They're different.
const is just a qualifier, which says that a variable cannot be changed at runtime. But all other features of the variable persist: it has allocated storage, and this storage may be addressed. So code does not just treat it as a literal, but refers to the variable by accessing the specified memory location (except if it is static const, then it can be optimized away), and loading its value at runtime. And as a const variable has allocated storage, if you add it to a header and include it in several C sources, you'll get a "multiple symbol definition" linkage error unless you mark it as extern. And in this case the compiler can't optimize code against its actual value (unless global optimization is on).
#define simply substitutes a name with its value. Furthermore, a #define'd constant may be used in the preprocessor: you can use it with #if to do conditional compilation based on its value (or with #ifdef to test whether it is defined at all), or use the stringizing operator # to get a string with its value. And as the compiler knows its value at compile time it may optimize code based on that value.
For example:
#define SCALE 1
...
scaled_x = x * SCALE;
When SCALE is defined as 1 the compiler can eliminate the multiplication as it knows that x * 1 == x, but if SCALE is an (extern) const, it will need to generate code to fetch the value and perform the multiplication because the value will not be known until the linking stage. (extern is needed to use the constant from several source files.)
A closer equivalent to using #define is using enumerations:
enum dummy_enum {
constant_value = 10010
};
But this is restricted to integer values and doesn't have advantages of #define, so it is not widely used.
const is useful when you need to import a constant value from some library where it was compiled in. Or if it is used with pointers. Or if it is an array of constant values accessed through a variable index value. Otherwise, const has no advantages over #define.
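The preprocessor-only uses of a macro mentioned in this answer can be sketched as follows. `FIELD_WIDTH` comes from the question; `STRINGIZE`, `WIDE_FIELDS` and the two functions are names made up for the example.

```c
#include <assert.h>
#include <string.h>

#define FIELD_WIDTH 10

/* Two-step stringizing, so that FIELD_WIDTH is expanded before # is applied. */
#define STRINGIZE_(x) #x
#define STRINGIZE(x)  STRINGIZE_(x)

/* Conditional compilation on the macro's value; a const int cannot be tested by #if. */
#if FIELD_WIDTH > 8
#define WIDE_FIELDS 1
#else
#define WIDE_FIELDS 0
#endif

const char *field_width_text(void)
{
    return STRINGIZE(FIELD_WIDTH);   /* the string "10" */
}

int wide_fields_enabled(void)
{
    assert(strlen(field_width_text()) == 2);
    return WIDE_FIELDS;
}
```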
The reason is that most of the time, you want a constant, not a const-qualified variable. The two are not remotely the same in the C language. For example, variables are not valid as part of initializers for static-storage-duration objects, or as non-VLA array dimensions (for example the size of an array in a structure, or any array pre-C99).
Expanding on R's answer a little bit: fieldWidth is not a constant expression; it's a const-qualified variable. Its value is not established until run-time, so it cannot be used where a compile-time constant expression is required (such as in an array declaration, or a case label in a switch statement, etc.).
Compare with the macro FIELD_WIDTH, which after preprocessing expands to the constant expression 10; this value is known at compile time, so it can be used for array dimensions, case labels, etc.
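A small sketch of the two places named above, array dimensions and case labels. The struct `record` and the functions are hypothetical, and the commented-out lines are the ones standard C does not require a compiler to accept.

```c
#include <assert.h>
#include <stddef.h>

#define FIELD_WIDTH 10              /* a constant expression after preprocessing */
static const int fieldWidth = 10;   /* a const-qualified variable */

struct record {
    char name[FIELD_WIDTH];         /* OK: a struct member's array size must be constant */
    /* char bad[fieldWidth]; */     /* not valid in standard C */
};

int classify(int width)
{
    switch (width) {
    case FIELD_WIDTH:               /* OK: case labels need constant expressions */
        return 1;
    /* case fieldWidth: */          /* not valid in standard C */
    default:
        return 0;
    }
}

/* Comparing against the const variable at run time is always fine. */
int matches_at_runtime(int width)
{
    return width == fieldWidth;
}

size_t name_capacity(void)
{
    struct record r = { "" };

    assert(sizeof r.name == FIELD_WIDTH);
    return sizeof r.name;
}
```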
To add to R.'s and Bart's answer: there is only one way to define symbolic compile time constants in C: enumeration type constants. The standard imposes that these are of type int. I personally would write your example as
enum { fieldWidth = 10 };
But I guess that taste differs much among C programmers about that.
Although a const int will not always be appropriate, an enum will usually work as a substitute for the #define if you are defining something to be an integral value. This is actually my preference in such a case.
enum { FIELD_WIDTH = 16384 };
char buf[FIELD_WIDTH];
In C++ this is a huge advantage as you can scope your enum in a class or namespace, whereas you cannot scope a #define.
In C you don't have namespaces and cannot scope an enum inside a struct, and I am not even sure you get the type-safety, so I cannot actually see any major advantage, although maybe some C programmer out there will point it out to me.
According to K&R (2nd edition, page 211) the "const and volatile properties are new with the ANSI standard". This may imply that really old (pre-ANSI) code did not have these keywords at all, and it really is just a matter of tradition.
Moreover, it says that a compiler should detect attempts to change const variables, but other than that it may ignore these qualifiers. I think this means that some compilers may not optimize code containing a const variable so that it is represented as an immediate value in machine code (as a #define is), and this might cost additional time for accessing far memory and affect performance.
Some C compilers will store all const variables in the binary, which if preparing a large list of coefficients can use up a tremendous amount of space in the embedded world.
Conversely: using const allows flashing over an existing program to alter specific parameters.
The best way to define numeric constants in C is using enum. Read the corresponding chapter of K&R's The C Programming Language, page 39.
