I was referring to the book "Theory and Problems of Programming with C" by Gottfried (Schaum's Outline series, 2nd Edition, 1996).
On page number 243 section 9.1 in chapter 9 on Arrays, it says:
Automatic arrays, unlike automatic variables, cannot be initialized. However, external and static array definitions can include the assignment of initial values if desired.
I did not understand the meaning of this highlighted statement. I tried initializing an array (with and without the auto keyword) inside a function and did not see any issue with it.
#include <stdio.h>

void func1 (void)
{
    auto int array1[5] = {1,0,4,1,5};
    char charVar1 = 'M';   /* the declaration was missing in the original snippet */
    printf("%d", *(array1+4));
}
Added the image of the page
To answer the first part
Automatic arrays, unlike automatic variables, cannot be initialized
assuming the "Automatic arrays" are actually "array data structure of automatic storage duration whose length is determined at run time"
Yes, what is referred to here is called a variable length array (VLA). It cannot be given an initializer list, for the simple reason that its size is determined only at run time.
To quote the C11 standard, chapter §6.7.9, Initialization (emphasis mine)
The type of the entity to be initialized shall be an array of unknown size or a complete object type that is not a variable length array type.
Otherwise, a local variable without any storage class specifier defaults to auto, and an automatic array of non-VLA type can certainly be initialized.
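To make the distinction concrete, here is a minimal sketch (my own illustration, not code from the book or the standard). The function names are hypothetical, and it assumes a compiler that supports VLAs, which C11 makes optional:

```c
#include <assert.h>
#include <stddef.h>

/* Fixed-size automatic array: an initializer is allowed. */
int sum_fixed(void)
{
    int array1[5] = {1, 0, 4, 1, 5};
    int sum = 0;
    for (size_t i = 0; i < 5; i++)
        sum += array1[i];
    return sum;
}

/* VLA: an initializer list is not allowed, so the elements are assigned. */
int sum_vla(size_t n)               /* n must be greater than 0 */
{
    int vla[n];                     /* int vla[n] = {1, 2}; would be rejected */
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        vla[i] = (int)i + 1;        /* values 1..n */
        sum += vla[i];
    }
    return sum;
}

/* Self-check; call it from main to run the sketch. */
void check_sums(void)
{
    assert(sum_fixed() == 11);
    assert(sum_vla(4) == 10);
}
```

Only the second function runs into the 6.7.9 constraint quoted above; the first is the situation from the question and is perfectly fine.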
Trivial demonstration that the statement is actually wrong (as opposed to e.g. array-initialization being a common but non-standard extension):
void doSomethingWithArray(size_t sz, int arr[static sz]);
int main(void) {
doSomethingWithArray(5,
(int[]){ 1, 2, 3, 4, 5 }
);
}
An anonymous array is created with automatic storage duration within the scope of main. Since it is anonymous, there is no way for code within main to refer to it to set element values. The only way to put values into this particular automatic array is via initialization. This feature - being able to initialize automatic arrays - is therefore legal, standard, and necessary.
QED.
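For anyone who wants to compile the idea, here is a sketch with a body supplied. The function sum_array and its return value are my own assumptions for illustration; the answer only declares doSomethingWithArray and does not say what it does:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for doSomethingWithArray: adds up the elements. */
int sum_array(size_t sz, const int arr[static sz])
{
    int total = 0;
    for (size_t i = 0; i < sz; i++)
        total += arr[i];
    return total;
}

int sum_of_literal(void)
{
    /* The unnamed array has automatic storage duration and receives
       its values only through the initializer list. */
    return sum_array(5, (int[]){ 1, 2, 3, 4, 5 });
}

/* Self-check; call it from main to run the sketch. */
void check_literal(void)
{
    assert(sum_of_literal() == 15);
}
```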
Given that this book is rather ancient, VLAs and compound literals had not yet been invented when it was written.
Automatic variables is the formal term for local variables. There's even the keyword auto for it, but it is one of the most superfluous keywords in the language, since all local variables are implicitly declared as auto. That is:
{
auto int x = 1;
}
and
{
int x = 1;
}
are 100% equivalent; the auto keyword adds nothing (so nobody uses the former style).
So by the term automatic arrays, the author perhaps simply refers to plain local arrays. As we can see from the example in the question, you can initialize such arrays just fine. The book is incorrect and/or uses the wrong terms.
The author of that book seems confused in general: "assignment of initial values" is the very C definition of initialization. The formal definition can be found in the C standard syntax at 6.7.8, where "assignment-expression" is one of the valid forms for initialization.
I'd recommend finding another book, preferably one that covers the more recent C standards, C99 and C11.
Related
This is mainly a followup to Should definition and declaration match?
Question
Is it legal in C to have (for example) int a[10]; in one compilation unit and extern int a[4]; in another one ?
(You can find a working example in my answer to ref'd question)
Disclaimers :
I know it is dangerous and would not do it in production code
I know that if you have both in the same compilation unit (typically through inclusion of a .h in the file containing the definition), compilers detect an error
I have already read Jonathan Leffler's excellent answer to How do I use extern to share variables between source files? but could not find the answer to this specific point there - even if Jonathan showed even worse usages ...
Even if different comments in referenced post spotted that as UB, I could not find any authoritative reference for it. So I would say that there is no UB here and that second compilation unit will have access to the beginning of the array, but I would really like a confirmation - or instead a reference about why it is UB
It is undefined behavior.
Section 6.2.7, paragraph 2 of C99 states:
All declarations that refer to the same object or function shall have
compatible type; otherwise, the behavior is undefined.
NOTE: As mentioned in the comments below, the important part here is [...] that refer to the same object [...], which is further defined in 6.2.2:
In the set of translation units and libraries that constitutes an
entire program, each declaration of a particular identifier with
external linkage denotes the same object or function.
As for the type compatibility rules for array types, section 6.7.5.2, paragraph 6 of C99 clarifies what it means for two array types to be compatible:
For two array types to be compatible, both shall have compatible
element types, and if both size specifiers are present, and are
integer constant expressions, then both size specifiers shall have the
same constant value. If the two array types are used in a context
which requires them to be compatible, it is undefined behavior if the
two size specifiers evaluate to unequal values.
(Emphasis mine)
In the real world, as long as you stick to 1D arrays, it is probably harmless, because there is no bounds checking and the address of the first element remains the same regardless of the size specifier, but note that the sizeof operator will return different values in each source file (opening a wonderful opportunity to write buggy code).
Things start to get really ugly if you decide to extrapolate on this example and declare multidimensional arrays with different dimension sizes, because the offset of each element in the array will not match with the real dimensions any more.
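Here is a small sketch of the arithmetic behind both remarks. It deliberately stays inside one translation unit and only computes sizes and offsets, so nothing in it is undefined. The dimensions (rows of 10 ints in the definition versus rows of 5 ints in the mismatched extern declaration) and the function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* sizeof as seen by the unit with "int a[10];" and by the one with "extern int a[4];" */
size_t size_in_definer(void) { return sizeof(int[10]); }
size_t size_in_user(void)    { return sizeof(int[4]); }

/* Byte offset of element [row][col] in an int array whose rows hold `cols` ints. */
size_t element_offset(size_t cols, size_t row, size_t col)
{
    return (row * cols + col) * sizeof(int);
}

/* Self-check; call it from main to run the sketch. */
void check_offsets(void)
{
    /* 1D case: same start address, but sizeof disagrees. */
    assert(size_in_definer() != size_in_user());

    /* 2D case: row 0 still lines up ... */
    assert(element_offset(10, 0, 3) == element_offset(5, 0, 3));

    /* ... but from row 1 on, the two units compute different addresses. */
    assert(element_offset(10, 1, 0) == 10 * sizeof(int));
    assert(element_offset(5, 1, 0)  ==  5 * sizeof(int));
}
```

So with mismatched inner dimensions, m[1][0] in one file lands on what the other file calls m[0][5].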
Yes, it is legal. The language allows it.
In your specific case there will be no undefined behavior as the extern declared array is smaller than the actually allocated array.
It can be used in a case where the declaring module uses the "unpublished" array elements for e.g. housekeeping of its algorithms (abstraction hiding).
Ain't gonna speak for other compilers, but in GNU GCC compiler you can statically initialize array with the following syntax:
struct some_struct {
unsigned *some_array;
} some_var = {
.some_array = (unsigned[]) { 1u, 2u, 3u, 4u, 5u, },
};
I first came across this syntax while searching for the answer to a question I was interested in, which led me to this answer. But I have not yet found any GNU reference that covers this kind of syntax.
I'd be very grateful if someone could share a link on this syntax. Thank you!
Well, if your question is about compound literal syntax, then one important detail here is that you are not initializing an array within a structure. You are initializing a pointer within a structure. The code that you have now is formally correct.
If you really had an array inside your structure, then such initialization with a compound literal would not work. You cannot initialize an array from another array. Arrays are not copyable (with the exception of char array initialization from string literal). However, in that case you'd be able to use an ordinary {}-enclosed initializer, not a compound literal.
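A side-by-side sketch of the two cases. The struct and function names are mine, and the array-member variant is hypothetical, since the question only has the pointer version:

```c
#include <assert.h>

struct with_pointer {
    unsigned *some_array;          /* pointer member, as in the question */
};

struct with_array {
    unsigned some_array[5];        /* hypothetical variant: a real array member */
};

/* At file scope the compound literal has static storage duration,
   so its address is a valid constant initializer for the pointer. */
static struct with_pointer p = {
    .some_array = (unsigned[]){ 1u, 2u, 3u, 4u, 5u },
};

/* An array member takes a plain brace-enclosed list, not a compound literal. */
static struct with_array a = {
    .some_array = { 1u, 2u, 3u, 4u, 5u },
};

unsigned third_via_pointer(void) { return p.some_array[2]; }
unsigned third_via_array(void)   { return a.some_array[2]; }

/* Self-check; call it from main to run the sketch. */
void check_members(void)
{
    assert(third_via_pointer() == 3u);
    assert(third_via_array() == 3u);
}
```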
Also keep in mind that the lifetime of the compound literal (unsigned[]) { 1u, 2u, 3u, 4u, 5u, } is determined by the scope in which it appears. If you do the above in local scope, the compound literal array will be destroyed at the end of the block. The pointer value (if you somehow manage to take it outside that block) will become invalid.
You won't likely find much GNU documentation on this because it is not a GCC extension - this is a part of standard C syntax called a compound literal. It is defined in the C standard, in sections 6.5.2.5 and 6.7.9 (the latter covers the part between the braces, which is the same for both compound literals and static initialisers, so the standard only describes it once).
You can use this syntax to describe dynamic object values as well, not just for static initialisations, even standing alone in an expression without having been assigned to any variable. A compound literal can appear essentially anywhere a variable name can appear: you can pass them to functions, create them just to access one element, take their address (you can even assign to them, although it's not obvious how that's useful).
The syntax is uniform across all C value types and can be used to create arrays (designate specific elements to set with [N]=), structs and unions (designate specific elements with .field=) and even numeric types (no elements, so don't designate, just put the value between the braces). The syntax is intended to be simple and consistent for macros and code generators to produce (in addition to being elegant to write by hand).
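A short tour of those forms, as a sketch. The struct point type and the helper names are invented for the example:

```c
#include <assert.h>

struct point { int x, y; };

static int point_sum(struct point p) { return p.x + p.y; }

int compound_literal_tour(void)
{
    /* Array with designated elements; the others are zero: {0, 0, 7, 0, 9} */
    int *arr = (int[5]){ [2] = 7, [4] = 9 };

    /* Struct passed straight to a function, fields designated by name */
    int s = point_sum((struct point){ .x = 3, .y = 4 });

    /* Numeric type: no designator, just the value between the braces */
    int n = (int){ 5 };

    /* Array created only to read one element */
    int picked = (int[]){ 10, 20, 30 }[1];

    /* A compound literal is an lvalue: take its address, then assign through it */
    int *q = &(int){ 1 };
    *q = 2;

    return arr[0] + arr[2] + arr[4] + s + n + picked + *q;
}

/* Self-check; call it from main to run the sketch. */
void check_tour(void)
{
    assert(compound_literal_tour() == 50);   /* 0 + 7 + 9 + 7 + 5 + 20 + 2 */
}
```

All the literals here live until the end of the function body, so the pointers arr and q stay valid for as long as they are used.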
I am a newbie to C. I was reading the book by Kernighan & Ritchie and found that external variables must be initialized only with constant expressions. Why is that so? Can you explain to me what happens internally? When are they initialized? Why can't we initialize an external variable using those defined before it?
According to C99 Standard: Section 6.7.8:
All the expressions in an initializer for an object that has static storage duration shall be constant expressions or string literals.
And external variables have static storage duration, so they must be initialized with constant expressions or string literals.
Here is a link that may give you a better explanation.
http://www.geeksforgeeks.org/understanding-extern-keyword-in-c/
They give an explanation, which I quote below:
extern int var = 0;
int main(void)
{
var = 10;
return 0;
}
Analysis: Guess this program will work? Well, here comes another
surprise from C standards. They say that..if a variable is only
declared and an initializer is also provided with that declaration,
then the memory for that variable will be allocated i.e. that variable
will be considered as defined. Therefore, as per the C standard, this
program will compile successfully and work.
Hope this could help.
Any object with static storage duration such as variables declared outside of a function or variables inside a function declared as static can only be initialized with constant values.
The basic reason for this is that executable statements can't be placed outside of a function.
If such objects are not explicitly initialized then they are initialized to zero for arithmetic types or the null pointer for pointer types.
The common implementation is for values assigned to objects with static storage duration to be written directly into the executable image as data and loaded with the program image.
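A small sketch of these rules. The variable and function names are made up, and the commented-out line is what a typical compiler refuses at file scope:

```c
#include <assert.h>
#include <stddef.h>

static int counter;                 /* no initializer: starts at 0 */
static double *cursor;              /* no initializer: starts as a null pointer */
static int limit = 2 * 21;          /* constant expression: allowed */
static int *limit_ptr = &limit;     /* address constant: allowed */
/* static int twice = limit * 2;       not a constant expression: rejected */

int next_id(void)
{
    static int id = 100;            /* set once, before the program starts running */
    return id++;
}

/* Self-check; call it from main to run the sketch. */
void check_static_init(void)
{
    assert(counter == 0);
    assert(cursor == NULL);
    assert(limit == 42 && *limit_ptr == 42);
}
```

Note that id is not reset on each call: the first call to next_id returns 100, the next one 101, which is what you would expect if the value 100 was simply sitting in the loaded data image.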
The Python documentation claims that the following does not work on "some platforms or compilers":
int foo(int); // Defined in another translation unit.
struct X { int (*fptr)(int); } x = {&foo};
Specifically, the Python docs say:
We’d like to just assign this to the tp_new slot, but we can’t, for
portability sake, On some platforms or compilers, we can’t statically
initialize a structure member with a function defined in another C
module, so, instead, we’ll assign the tp_new slot in the module
initialization function just before calling PyType_Ready(). --http://docs.python.org/extending/newtypes.html
Is the above standard C89 and/or C99? What compilers specifically cannot handle the above?
That kind of initialization has been permitted since at least C90.
From C90 6.5.7 "Initialization"
All the expressions in an initializer for an object that has static storage duration or in an initializer list for an object that has aggregate or union type shall be constant expressions.
And 6.4 "Constant expressions":
An address constant is a pointer to an lvalue designating an object of static storage duration, or to a function designator; it shall be created explicitly, using the unary & operator...
But it's certainly possible that some implementations might have trouble with the construct - I'd guess that wouldn't be true for modern implementations.
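For reference, a self-contained version of the construct. In the question foo lives in another translation unit; here I give it a hypothetical definition in the same file so the sketch links on its own, and call_through_x is likewise my own helper:

```c
#include <assert.h>

int foo(int);                       /* in the question: defined elsewhere */

/* &foo is an address constant, so it may initialize a static object. */
struct X { int (*fptr)(int); } x = { &foo };

/* Hypothetical definition, only so that this example is complete. */
int foo(int n) { return n + 1; }

int call_through_x(int n) { return x.fptr(n); }

/* Self-check; call it from main to run the sketch. */
void check_x(void)
{
    assert(x.fptr == &foo);
    assert(call_through_x(41) == 42);
}
```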
According to n1570 6.6 paragraph 9, the address of a function is an address constant, according to 6.7.9 this means that it can be used to initialize global variables. I am almost certain this is also valid C89.
However,
On sane platforms, the value of a function pointer (or any pointer, other than NULL) is only known at runtime. This means that the initialization of your structure can't take place until runtime. This doesn't always apply to executables but it almost always applies to shared objects such as Python extensions. I recommend reading Ulrich Drepper's essay on the subject (link).
I am not aware of which platforms this is broken on, but if the Python developers mention it, it's almost certainly because one of them got bitten by it. If you're really curious, try looking at an old Python extension and seeing if there's an appropriate message in the commit logs.
Edit: It looks like most Python modules just do the normal thing and initialize type structures statically, e.g., static type obj = { function_ptr ... };. For example, look at the mmap module, which is loaded dynamically.
The example definitely conforms to C99, and AFAIR also to C89.
If some particular (oldish) compiler has a problem with it, I don't think that the proposed solution is the way to go. Don't impose dynamic initialization on platforms that behave well. Instead, special-case the weirdos that need special treatment. And try to phase them out as quickly as you can.
In my college days I read about the auto keyword and in the course of time I actually forgot what it is. It is defined as:
defines a local variable as having a
local lifetime
I never found it is being used anywhere, is it really used and if so then where is it used and in which cases?
If you'd read the IAQ (Infrequently Asked Questions) list, you'd know that auto is useful primarily to define or declare a vehicle:
auto my_car;
A vehicle that's consistently parked outdoors:
extern auto my_car;
For those who lack any sense of humor and want "just the facts Ma'am": the short answer is that there's never any reason to use auto at all. The only time you're allowed to use auto is with a variable that already has auto storage class, so you're just specifying something that would happen anyway. Attempting to use auto on any variable that doesn't have the auto storage class already will result in the compiler rejecting your code. I suppose if you want to get technical, your implementation doesn't have to be a compiler (but it is) and it can theoretically continue to compile the code after issuing a diagnostic (but it won't).
Small addendum by kaz:
There is also:
static auto my_car;
which requires a diagnostic according to ISO C. This is correct, because it declares that the car is broken down. The diagnostic is free of charge, but turning off the dashboard light will cost you eighty dollars. (Twenty or less, if you purchase your own USB dongle for on-board diagnostics from eBay).
The aforementioned extern auto my_car also requires a diagnostic, and for that reason it is never run through the compiler, other than by city staff tasked with parking enforcement.
If you see a lot of extern static auto ... in any code base, you're in a bad neighborhood; look for a better job immediately, before the whole place turns to Rust.
auto is a modifier like static. It defines the storage class of a variable. However, since the default for local variables is auto, you don't normally need to manually specify it.
This page lists different storage classes in C.
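A tiny sketch of that point, compiled as C (not C++). The function name is made up:

```c
#include <assert.h>

int add_locals(void)
{
    auto int x = 1;     /* storage class spelled out */
    int y = 1;          /* same storage class, implied */
    return x + y;
}

/* auto int z = 1;         at file scope this is not allowed */

/* Self-check; call it from main to run the sketch. */
void check_auto(void)
{
    assert(add_locals() == 2);
}
```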
The auto keyword is useless in the C language. It is there because before the C language there existed a B language in which that keyword was necessary for declaring local variables. (B was developed into NB, which became C).
Here is the reference manual for B.
As you can see, the manual is rife with examples in which auto is used. This is so because there is no int keyword. Some kind of keyword is needed to say "this is a declaration of a variable", and that keyword also indicates whether it is a local or external (auto versus extrn). If you do not use one or the other, you have a syntax error. That is to say, x, y; is not a declaration by itself, but auto x, y; is.
Since code bases written in B had to be ported to NB and to C as the language was developed, the newer versions of the language carried some baggage for improved backward compatibility that translated to less work. In the case of auto, the programmers did not have to hunt down every occurrence of auto and remove it.
It's obvious from the manual that the now obsolescent "implicit int" cruft in C (being able to write main() { ... } without any int in front) also comes from B. That's another backward compatibility feature to support B code. Functions do not have a return type specified in B because there are no types. Everything is a word, like in many assembly languages.
Note how a function can just be declared extrn putchar, and then the only thing that makes it a function is that identifier's use: it is used in a function call expression like putchar(x), and that's what tells the compiler to treat that typeless word as a function pointer.
In C auto is a keyword that indicates a variable is local to a block. Since that's the default for block-scoped variables, it's unnecessary and very rarely used (I don't think I've ever seen it used outside of examples in texts that discuss the keyword). I'd be interested if someone could point out a case where the use of auto was required to get a correct parse or behavior.
However, in the C++11 standard the auto keyword has been 'hijacked' to support type inference, where the type of a variable can be taken from the type of its initializer:
auto someVariable = 1.5; // someVariable will have type double
Type inference is being added mainly to support declaring variables in templates or returned from template functions where types based on a template parameter (or deduced by the compiler when a template is instantiated) can often be quite painful to declare manually.
With the old Aztec C compiler, it was possible to turn all automatic variables to static variables (for increased addressing speed) using a command-line switch.
But variables explicitly declared with auto were left as-is in that case. (A must for recursive functions which would otherwise not work properly!)
The auto keyword is similar to the inclusion of semicolons in Python: it was required by a previous language (B), but developers realized it was redundant because most things were auto.
I suspect it was left in to help with the transition from B to C. In short, one use is for B language compatibility.
For example in B and 80s C:
/* The following function will print a non-negative number, n, to
the base b, where 2<=b<=10. This routine uses the fact that
in the ASCII character set, the digits 0 to 9 have sequential
code values. */
printn(n, b) {
extern putchar;
auto a;
if (a = n / b) /* assignment, not test for equality */
printn(a, b); /* recursive */
putchar(n % b + '0');
}
auto can only be used for block-scoped variables. extern auto int is rubbish because the compiler can't determine whether this uses an external definition or whether to override the extern with an auto definition (also auto and extern are entirely different storage durations, like static auto int, which is also rubbish obviously). It could always choose to interpret it one way but instead chooses to treat it as an error.
There is one feature that auto does provide and that's enabling the 'everything is an int' rule inside a function. Unlike outside of a function, where a=3 is interpreted as a definition int a =3 because assignments don't exist at file scope, a=3 is an error inside a function because apparently the compiler always interprets it as an assignment to an external variable rather than a definition (even if there are no extern int a forward declarations in the function or in the file scope), but a specifier like static, const, volatile or auto would imply that it is a definition and the compiler takes it as a definition, except auto doesn't have the side effects of the other specifiers. auto a=3 is therefore implicitly auto int a = 3. Admittedly, signed a = 3 has the same effect and unsigned a = 3 is always an unsigned int.
Also note 'auto has no effect on whether an object will be allocated to a register (unless some particular compiler pays attention to it, but that seems unlikely)'
The auto keyword is one example of a storage class specifier (a storage class determines a variable's lifetime and where it is stored). A variable declared with it exists only within the curly braces that enclose it:
#include <stdio.h>

int main(void)
{
    auto int x = 8;
    printf("%d", x);     // here x is 8
    {
        auto int x = 3;  // shadows the outer x inside this block
        printf("%d", x); // here x is 3
    }
    printf("%d", x);     // here x is 8 again
    return 0;
}
I am sure you are familiar with storage class specifiers in C which are "extern", "static", "register" and "auto".
The definition of "auto" is pretty much given in the other answers, but here is a possible use of the "auto" keyword that I am not sure about and that I think is compiler dependent.
You see, with respect to storage class specifiers, there is a rule. We cannot use multiple storage class specifiers for a variable. That is why static global variables cannot be externed. Therefore, they are known only to their file.
In your compiler settings, you can enable an optimization flag for speed. One of the ways the compiler optimizes is that it looks for variables without storage class specifiers and then assesses, based on the availability of cache memory and some other factors, whether it should treat a variable as if it had the register specifier. Now, what if we want to optimize our code for speed while knowing that a specific variable in our program is not very important, and we don't want the compiler to even consider it for a register? I thought that by writing auto, the compiler would be unable to add the register specifier to the variable, since typing "register auto int a;" or "auto register int a;" raises an error about using multiple storage class specifiers.
To sum it up, I thought auto can prohibit compiler from treating a variable as register through optimization.
This theory did not hold for the GCC compiler; I have not tried other compilers.