What is the defined behavior for something like the following?
#include <stdio.h>
typedef enum {
ENUM_VAL_1 = 1,
ENUM_VAL_2 = 2
} TEST_ENUM;
int main() {
TEST_ENUM testVar1 = ENUM_VAL_1;
TEST_ENUM ENUM_VAL_1 = ENUM_VAL_1;
TEST_ENUM testVar2 = ENUM_VAL_1;
printf("ENUM_VAL_1 = %u\n",ENUM_VAL_1);
printf("testVar1 = %u\n",testVar1);
printf("testVar2 = %u\n",testVar2);
return 0;
}
From my testing with both GCC and MSVC compilers, testVar1 is set equal to the enumeration value "ENUM_VAL_1", or 1. However, the next statement tries to set the variable ENUM_VAL_1 equal to its own value, which is of course currently uninitialized and thus garbage, instead of setting the variable ENUM_VAL_1 equal to the enumeration value ENUM_VAL_1. Then, of course, testVar2 also gets the same garbage value as the variable ENUM_VAL_1.
What is the defined behavior of this according to the C standards, or is this undefined behavior? Whether or not it is defined, I'm guessing this type of example is bad practice at very least due to the ambiguity.
Thanks!
According to the C Standard (6.2.1 Scopes of identifiers)
... If an identifier designates two different entities in the same name space, the scopes might overlap. If so, the scope of one entity
(the inner scope) will end strictly before the scope of the other
entity (the outer scope). Within the inner scope, the identifier
designates the entity declared in the inner scope; the entity declared
in the outer scope is hidden (and not visible) within the inner
scope.
And
7 Structure, union, and enumeration tags have scope that begins just
after the appearance of the tag in a type specifier that declares the
tag. Each enumeration constant has scope that begins just after the
appearance of its defining enumerator in an enumerator list. Any
other identifier has scope that begins just after the completion of
its declarator
So in this declaration
TEST_ENUM ENUM_VAL_1 = ENUM_VAL_1;
declarator ENUM_VAL_1 is considered completed before the sign =. So it hides the enumerator.
In fact it is initialized by itself and has an indeterminate value.
The same is valid for C++ (3.3.2 Point of declaration)
1 The point of declaration for a name is immediately after its
complete declarator (Clause 8) and before its initializer (if any),
except as noted below. [ Example:
int x = 12;
{ int x = x; }
Here the second x is initialized with its own (indeterminate) value.
—end example ]
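To see that the new name is already in scope inside its own initializer without reading an indeterminate value, here is a minimal sketch. The function names are made up for illustration; both functions only use the address or the size of the object being declared, which is well defined.

```c
#include <stddef.h>

typedef enum { ENUM_VAL_1 = 1, ENUM_VAL_2 = 2 } TEST_ENUM;

/* Returns 1 if the name used in the initializer refers to the object
   that is being declared. */
int initializer_sees_new_object(void)
{
    /* The declarator `self` is complete before `=`, so `&self` is the
       address of the very object being initialized. */
    void *self = &self;
    return self == (void *)&self;
}

/* Returns the size that the initializer sees for the name ENUM_VAL_1. */
size_t size_seen_in_initializer(void)
{
    /* Inside the initializer, ENUM_VAL_1 already names the local array,
       not the enumerator. sizeof does not evaluate its operand, so no
       indeterminate value is read. */
    char ENUM_VAL_1[16] = { sizeof ENUM_VAL_1 };
    return (size_t)ENUM_VAL_1[0];
}
```

The second function returns 16, the size of the local array. Had the name still referred to the enumerator, the result would have been sizeof(int) instead, since enumeration constants have type int.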
I expected the TEST_ENUM ENUM_VAL_1 = ENUM_VAL_1; line to fail to compile, but it compiles. I changed the initializer to ENUM_VAL_2, and the program then prints ENUM_VAL_1 = 2, testVar1 = 1 and testVar2 = 2, so ENUM_VAL_1 is a local variable.
It is actually a routine scoping issue; it means that the variable declaration in main() shadows the declaration outside — and if the typedef were within main(), the code would not compile. Add -Wshadow to your compilation options to see the shadowing. After setting testVar1, ENUM_VAL_1 means the local variable, not the enumeration constant. Initializing a variable with itself doesn't really initialize the variable; it copies undefined garbage into the value.
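As a rough sketch of that experiment (the function name and the variable saved are hypothetical), the enumerator can be captured before the local declaration hides it:

```c
typedef enum { ENUM_VAL_1 = 1, ENUM_VAL_2 = 2 } TEST_ENUM;

/* Packs both values into one int: tens digit is the enumerator's value,
   ones digit is the local variable's value. */
int shadowed_value(void)
{
    const TEST_ENUM saved = ENUM_VAL_1;  /* still the enumerator: 1 */
    TEST_ENUM ENUM_VAL_1 = ENUM_VAL_2;   /* local variable hides the enumerator */
    return (int)saved * 10 + (int)ENUM_VAL_1;  /* 1 * 10 + 2 */
}
```

Calling shadowed_value() gives 12: the enumerator was 1 before the declaration, and the local variable named ENUM_VAL_1 holds 2 afterwards.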
What name space is a typedef name in? Consider this code:
#include <stdio.h>
typedef struct x { // 'x' in tag name space
int x; // 'x' in member name space
int y;
} x; // ??
int main() {
x foo = { 1, 2 };
int x = 3; // 'x' in ordinary identifier name space
printf("%d %d %d\n", foo.x, foo.y, x);
}
This prints out '1 2 3' with gcc 4.4.7 (and g++ 4.4.7), so type names are separate from tag, member, and ordinary identifier names.
This code also compiles and runs on gcc/g++ 4.4.7, producing '1, 2':
#include <stdio.h>
typedef struct x { // 'x' in tag namespace
int x; // 'x' in member namespace
int y;
} x;
int main() {
x x = { 1, 2 };
printf("%d %d\n", x.x, x.y);
}
How are the x identifiers disambiguated in this case?
EDIT
A clarification, I hope. Consider these two lines from above:
x foo = { 1, 2 };
int x = 3; // 'x' in ordinary identifier name space
When the second line is executed, identifier x is in scope, and should logically be in the 'ordinary identifier' namespace. There doesn't appear to be a new scope at this point, because there is no opening brace between lines 1 and 2. So the second x can't hide the first x, and the second x is in error. What's the flaw in this argument, and how does this apply to the x x case? My assumption was that the flaw was that type names somehow had a different non-obvious name space, hence the title of this question.
It works not due to namespaces (the new type name and the variable identifier are in the same ordinary namespace), but due to scoping.
6.2.1 Scopes of identifiers
2 For each different entity that an identifier designates, the
identifier is visible (i.e., can be used) only within a region of
program text called its scope. Different entities designated by the
same identifier either have different scopes, or are in different name
spaces. There are four kinds of scopes: function, file, block, and
function prototype. (A function prototype is a declaration of a
function that declares the types of its parameters.)
4 Every other identifier has scope determined by the placement of
its declaration (in a declarator or type specifier). If the declarator
or type specifier that declares the identifier appears outside of any
block or list of parameters, the identifier has file scope, which
terminates at the end of the translation unit. If the declarator or
type specifier that declares the identifier appears inside a block or
within the list of parameter declarations in a function definition,
the identifier has block scope, which terminates at the end of the
associated block. If the declarator or type specifier that declares
the identifier appears within the list of parameter declarations in a
function prototype (not part of a function definition), the identifier
has function prototype scope, which terminates at the end of the
function declarator. If an identifier designates two different
entities in the same name space, the scopes might overlap. If so, the
scope of one entity (the inner scope) will end strictly before the
scope of the other entity (the outer scope). Within the inner scope,
the identifier designates the entity declared in the inner scope; the
entity declared in the outer scope is hidden (and not visible) within
the inner scope.
The variable named x is in an inner scope, so it hides the entity named x in the outer scope. Inside the scope of main, following the declaration of x x, it's a variable name.
The interesting bit is that in x x = { 1, 2 }; the meaning of x is changed mid declaration. In the beginning it denotes the type name, but once the declarator introduces an identifier, x starts denoting the variable.
Regarding your edit "What's the flaw in this argument?" Note that scopes may overlap (as the preceding paragraph mentions). The definition of the type alias is actually at file scope. The block scope of main is a new inner scope that overlaps the outer scope. That's why it can be used to hide the previous meaning of x. Had you tried to do this at file scope:
typedef struct x { /* ... */ } x;
int x = 1; // immediately at file scope
It would be ill-formed. Because now indeed the declarations appear at the exact same scope.
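A small sketch of the x x case, assuming a made-up function name: once the variable x hides the typedef name, the struct tag x is still usable, because tags live in a different name space.

```c
typedef struct x {   /* 'x' in the tag name space */
    int x;           /* 'x' in the member name space of struct x */
    int y;
} x;                 /* 'x' as a typedef name: ordinary identifier, file scope */

int sum_after_hiding(void)
{
    x x = { 1, 2 };      /* first x: typedef name; second x: new variable */
    struct x copy = x;   /* the typedef name is hidden now, the tag is not */
    return copy.x + copy.y;
}
```

Writing x copy = x; on the second line instead would not compile, since at that point x names the variable and no longer the type.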
I have the code below, and I see that two variables have been assigned the same address. The variables have completely different types. Is there any way I can avoid this? And under what circumstances does the same memory get allocated to both variables?
static int Sw_Type [];
static BOOL Sw_Update;
void main()
{
int i;
int bytes = 3;
if (Sw_Update!= TRUE)
{
for(i = 0; i< bytes ;i++)
{
Sw_Type [i] = *Ver_Value;
Ver_Value++;
}
Sw_Update= TRUE;
}
}
This is a snippet of my code and "Ver_Value" is a structure which gets assigned in different function.
So the problem I am seeing is that when Sw_Update gets updated, Sw_Type[1] also gets updated, and I see that these two have the same memory address.
static int Sw_Type []; constitutes a tentative definition, per C 2018 6.9.2 2:
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
Since your program provides no non-tentative definition, it is as if it ended with static int Sw_Type [] = { 0 };. (In case it is not clear from the text quoted above that the result is indeed an array of one element, it is made clear by Example 2 in paragraph 5 of the same clause.)
Thus, Sw_Type is an array of one int. It contains only the element Sw_Type[0]. The behavior of accessing Sw_Type[1] is not defined by the C standard. From the observations you report, it appears as if Sw_Update follows Sw_Type in memory, and accessing Sw_Type[1] results in modifying Sw_Update. This behavior is of course not reliable.
To make Sw_Type larger, you must declare a size for it, as with static int Sw_Type[4];.
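A minimal sketch of the fix, under some assumptions: the size 3 is taken from the loop bound in the question, bool from <stdbool.h> stands in for the question's BOOL, and copy_version is a hypothetical helper that replaces the loop in main and takes the source values as a parameter.

```c
#include <stdbool.h>
#include <stddef.h>

#define SW_TYPE_LEN 3   /* assumed size; the original loop copies 3 values */

static int  Sw_Type[SW_TYPE_LEN];  /* explicit size: elements 0..2 exist */
static bool Sw_Update;             /* stands in for the question's BOOL */

/* Hypothetical helper: copies at most SW_TYPE_LEN values, once.
   Returns the number of elements copied (0 if already updated). */
size_t copy_version(const int *ver_value, size_t count)
{
    if (Sw_Update)
        return 0;

    size_t n = count < SW_TYPE_LEN ? count : SW_TYPE_LEN;
    for (size_t i = 0; i < n; i++)
        Sw_Type[i] = ver_value[i];   /* i < SW_TYPE_LEN, so always in bounds */

    Sw_Update = true;
    return n;
}
```

With the array sized explicitly and the copy clamped to that size, writes to Sw_Type can no longer land on Sw_Update.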
Note: 6.9.2 3 says “If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.” While this might be read as applying to the declared type in each declaration that is a tentative definition, I think it might be intended to apply to the declared type of the object once its composite type is fully resolved at the end of the translation unit. Experimentally, Clang is okay with accepting an incomplete type at first and completing it later.
So the problem I am seeing is that when Sw_Update gets updated, Sw_Type[1] also gets updated, and I see that these two have the same memory address.
There is no Sw_Type [1]. Only an array with two or more elements has a second entry, and Sw_Type is not an array with two or more elements. Accessing an array out of bounds can certainly stomp on other objects.
I compiled the following program with GCC using the command gcc prog.c -Wall -Wextra -std=gnu11 -pedantic. To my surprise, it compiles and runs without any warnings or errors.
#include <stdio.h>
int main(void)
{
for (int i = 0; i == 0; i++)
{
printf("%d\n", i);
long int i = 1; // Why doesn't redeclaration error?
printf("%ld\n", i);
}
}
Why doesn't the compiler generate a redeclaration error for variable i?
From the standard, §6.8.5 paragraph 5 (N1570):
An iteration statement is a block whose scope is a strict subset of
the scope of its enclosing block. The loop body is also a block whose
scope is a strict subset of the scope of the iteration statement.
Emphasis added
In C language, the scope of statement is nested within the scope of for loop init-statement.
According to Cppreference :
While in C++, the scope of the init-statement and the scope of
statement are one and the same, in C the scope of statement is nested
within the scope of init-statement.
According to the C++ standard ([stmt.for]):
The for statement
for ( for-init-statement condition(opt) ; expression(opt) ) statement
is equivalent to
{
for-init-statement
while ( condition ) {
statement
expression ;
}
}
except that names declared in the for-init-statement are in the same declarative-region as those declared in the condition,
and except that a continue in statement (not enclosed in another
iteration statement) will execute expression before re-evaluating
condition.
You have to set -Wshadow to get warnings on shadowed variables. Variable shadowing is allowed in C.
But this is an edge case. A variable declared in the head of a for construct is not outside the braces, because it is no longer in scope after the construct.
This is not equivalent:
int i;
for( i = 0; …)
{ … }
// i is still in scope here, but would not be if it were declared in the head of the for
But it is not inside the braces either.
for( i = 0; …)
{
int i; // this would be strange, because i is used before it is declared.
…
}
The best approximate replacement for the code is this:
{
int i;
for( i = 0; …)
{
…
}
} // i loses scope
So it is no redeclaration, but a shadowing declaration inside the loop's body.
Why doesn't the compiler generate a redeclaration error for variable i?
From C Standards#6.2.1p4 Scopes of identifiers
Every other identifier has scope determined by the placement of its declaration (in a declarator or type specifier). If the declarator or type specifier that declares the identifier appears outside of any block or list of parameters, the identifier has file scope, which terminates at the end of the translation unit. If the declarator or type specifier that declares the identifier appears inside a block or within the list of parameter declarations in a function definition, the identifier has block scope, which terminates at the end of the associated block. If the declarator or type specifier that declares the identifier appears within the list of parameter declarations in a function prototype (not part of a function definition), the identifier has function prototype scope, which terminates at the end of the function declarator. If an identifier designates two different entities in the same name space, the scopes might overlap. If so, the scope of one entity (the inner scope) will end strictly before the scope of the other entity (the outer scope). Within the inner scope, the identifier designates the entity declared in the inner scope; the entity declared in the outer scope is hidden (and not visible) within the inner scope.
From C standards#6.8.5p5 Iteration statements
An iteration statement is a block whose scope is a strict subset of the scope of its enclosing block. The loop body is also a block whose scope is a strict subset of the scope of the iteration statement.
So, in this code:
for (int i = 0; i == 0; i++)
{
printf("%d\n", i);
long int i = 1; // Why doesn't redeclaration error?
printf("%ld\n", i);
}
the scopes of the two identifiers named i overlap in the same name space: the i declared in for (int i = 0; i == 0; i++) has the outer scope, and the one declared within the loop body, long int i = 1;, has the inner scope.
Within the loop body, after this statement:
long int i = 1;
the i declared in the outer scope is not visible, and printf() prints the value of the i visible in the inner scope, which is 1.
This behavior is known as variable shadowing, which occurs when a variable declared within a certain scope has the same name as a variable declared in an outer scope.
C allows variable shadowing, which is why the compiler does not report an error here. However, with GCC's -Wshadow option you will get a warning that the declaration shadows a previous local variable.
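The loop from the question can be turned into a small sketch that reports which i each statement sees. The function name and the outer_seen parameter are made up for illustration.

```c
/* Stores the value of the loop's int i in *outer_seen and returns the
   value of the shadowing long i declared in the loop body. */
long for_scope_demo(int *outer_seen)
{
    long inner_seen = 0;

    for (int i = 0; i == 0; i++) {  /* i == 0 and i++ refer to the loop's int i */
        *outer_seen = i;            /* the loop's int i */
        long int i = 1;             /* body is a nested block: hides the loop's i */
        inner_seen = i;             /* the body's long i */
    }
    return inner_seen;
}
```

After the call, *outer_seen is 0 and the return value is 1, matching the two lines printed by the original program. The loop still terminates after one pass because the i++ in the head increments the loop's own i, not the one declared in the body.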
For further verification I checked this code in Visual Studio 2008 as prog.c. The compiler gives an error on the line for (int i = 0; i == 0; i++) because it expects the declaration of i at the beginning of the block. This is expected for a C file there, since that compiler implements C89/C90, which does not allow a declaration in the head of a for statement. If the declaration is moved to the beginning of the block, there are no errors, as expected, and all scope-related issues are resolved.
If I try this code as a prog.cpp file, the compiler does give a redeclaration error. This is also expected behavior.
So I conclude that this has to do with the gcc compiler. Do any of the flags or parameters used for compiling and building the executable result in this behavior for gcc?
Can rsp post the makefile details for further verification?
After reading many definitions of name spaces and scopes, I still cannot understand the exact difference between the two terms.
For example:
If an identifier designates two different entities in the same name
space, the scopes might overlap.
This really confuses me. Can someone explain it as simply as possible, highlighting the difference?
You left off the second part of that statement:
If so, the scope of one entity (the inner scope) will end
strictly before the scope of the other entity (the outer scope). Within the inner scope, the
identifier designates the entity declared in the inner scope; the entity declared in the outer
scope is hidden (and not visible) within the inner scope.
Online draft of the C2011 standard, §6.2.1, para 4.
Example:
void foo( void )
{
int x = 42;
printf( "x = %d\n", x ); // will print 42
do
{
double x = 3.14;
printf( "x = %f\n", x ); // will print 3.14
} while ( 0 );
printf( "x = %d\n", x ); // will print 42 again
}
You have used the same identifier x to refer to two different objects in the same namespace [1]. The scope of the outer x overlaps the scope of the inner x. The scope of the inner x ends strictly before the scope of the outer x.
The inner x "shadows" or "hides" the outer x.
[1] C defines four namespaces: label names (disambiguated by the `goto` keyword and `:`), struct/union/enum tag names (disambiguated by the `struct`, `union`, and `enum` keywords), struct and union member names (disambiguated by the `.` and `->` operators), and everything else (variable names, function names, typedef names, enumeration constants, etc.).
"Namespace" in C has to do with the kinds of things that are named. Section 6.2.3 of the standard distinguishes four kinds of namespaces
one for labels
one each for the tags of structures, unions, and enumerations
one for the members of each structure or union
one for all other identifiers
Thus, for example, a structure tag never collides with a variable name or the name of any structure member:
struct foo {
int foo;
};
struct foo foo;
The usage of the same identifier, foo, for each of the entities it designates therein is permitted and usable because structure tags, structure members, and ordinary variable names all belong to different name spaces. The language supports this by disambiguating the uses of identifiers via language syntax.
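As a sketch that touches all four name spaces at once (the function name is hypothetical, and the label is added purely for illustration), the same identifier can also serve as a label:

```c
struct foo {        /* tag name space */
    int foo;        /* member name space of struct foo */
};

int name_space_demo(void)
{
    struct foo foo = { 0 };  /* ordinary identifier name space */

    goto foo;                /* label name space: selected by goto */
    foo.foo = -1;            /* skipped by the goto */
foo:
    foo.foo += 42;           /* variable foo, then member foo */
    return foo.foo;
}
```

The function returns 42. None of the four uses of foo conflict, because the syntax around each use tells the compiler which name space to look in.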
"Scope", on the other hand, has to do with where -- in terms of the program source -- a given identifier is usable, and what entity it designates at that point. As section 6.2.1 of the standard puts it:
For each different entity that an identifier designates, the identifier is visible (i.e., can be used) only within a region of program text called its scope. Different entities designated by the same identifier either have different scopes, or are in different name spaces.
Identifiers belonging to the same namespace may have overlapping scope (subject to other requirements). The quotation in your question is an excerpt from paragraph 4 of section 6.2.1; it might help you to read it in context. As an example, however, consider
int bar;
void baz() {
int bar;
// ...
}
The bar outside the function has file scope, starting immediately after its declaration and continuing to the end of the enclosing translation unit. The one inside the function designates a separate, local variable; it has block scope, starting at the end of its declaration and ending at the closing brace of the innermost enclosing block. Within that block, the identifier bar designates the local variable, not the file-scoped one; the file-scoped bar is not directly accessible there.
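Here is a runnable variant of that example. It is only a sketch: baz is changed to return int, the initial values are arbitrary, and read_file_scope_bar is a hypothetical helper that shows one indirect way to reach the hidden variable.

```c
int bar = 10;                     /* file scope */

static int read_file_scope_bar(void)
{
    return bar;                   /* no local bar here: this is the file-scope one */
}

int baz(void)
{
    int bar = 20;                 /* block scope: hides the file-scope bar */
    return bar * 100 + read_file_scope_bar();  /* 20 * 100 + 10 */
}
```

baz() returns 2010: inside baz the plain name bar is the local variable, while the helper, which has no local bar of its own, still sees the file-scope one.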
Is it allowed to jump to a label that's inside an inner scope or a sibling scope? If so, is it allowed to use variables declared in that scope?
Consider this code:
int cond(void);
void use(int);
void foo()
{
{
int y = 2;
label:
use(y);
}
{
int z = 3;
use(z);
/* jump to sibling scope: */ if(cond()) goto label;
}
/* jump to inner scope: */ if(cond()) goto label;
}
Are these gotos legal?
If so, is y guaranteed to exist when I jump to label and to hold the last value assigned to it (2)?
Or is the compiler allowed to assume y won't be used after it goes out of scope, which means a single memory location may be used for both y and z?
If this code's behavior is undefined, how can I get GCC to emit a warning about it?
From the C99 standard (emphasis mine):
6.2.4 Storage durations of objects
[6] For such an object that does have a variable length array type, its lifetime extends from the declaration of the object until execution of the program leaves the scope of the declaration. ... If the scope is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate.
6.8.6.1 The goto statement
[1] The identifier in a goto statement shall name a label located somewhere in the enclosing function. A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.
[4] ... A goto statement is not allowed to jump past any declarations of objects with variably modified types.
Conclusion
y does not have a variably modified type, so, according to the standard, the jumps are legal.
y is guaranteed to exist; however, the jumps skip the initialization (y = 2), so the value of y is indeterminate.
You can use -Wjump-misses-init to get GCC to emit a warning like the following:
warning: jump skips variable initialization [-Wjump-misses-init]
In C++, the jumps are not legal: C++ does not allow a jump to skip the initialization of y.
The jumps are legal (in C, in C++ they aren't).
is y guaranteed to exist when I jump to label
Yes.
and to hold the last value assigned to it (2)?
No.
From the C11 Standard (draft) 6.2.4/6:
For such an object [without the storage-class
specifier static] that does not have a variable length array type, its lifetime extends
from entry into the block with which it is associated until execution of that block ends in
any way. [...] The initial value of the object is indeterminate. If an
initialization is specified for the object, it is performed each time the declaration [...] is reached in the execution of the block; otherwise, the value becomes
indeterminate each time the declaration is reached.
From the above one would conclude that the 2nd and 3rd time use(y) gets called, the value of y is "indeterminate", as the initialisation of y is not "reached".
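If a jump into the block is really wanted, one way to stay within defined behavior in C is to assign y after the label before reading it. This is a minimal sketch with a made-up function name and arbitrary values; it is valid C only, since C++ rejects the jump.

```c
/* Returns y's value on both paths: the normal path runs the initializer,
   the goto path assigns y explicitly after the label. */
int use_after_jump(int skip_init)
{
    if (skip_init)
        goto label;   /* jumps into the block, past the initialization of y */
    {
        int y = 2;    /* performed only when the declaration is reached */
        return y;
label:
        y = 7;        /* y exists here but is indeterminate, so assign first */
        return y;
    }
}
```

use_after_jump(0) returns 2 and use_after_jump(1) returns 7. GCC's -Wjump-misses-init would still flag the goto, which is a useful reminder that the initializer did not run on that path.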