The C standard states (emphasis mine):
If an identifier designates two different entities in the same name space, the scopes might overlap. [...]
(section 6.2.1.4 from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
When can an identifier refer to two different entities but their scopes do not overlap?
Or, put differently, why is there the word "might" in the quote?
These scopes for name overlap:
int f(void) {
int name = 4;
{
int name = 6;
}
}
These ones do not overlap:
int f(void) {
{
int name = 4;
}
{
int name = 6;
}
}
Read it as “The scopes might overlap, if an identifier designates two different entities in the same name space.” That is, the sentence is saying the scopes might overlap, and it is explaining the condition for which that occurs. English is unfortunately imprecise. This sentence is not meant to express the logic statement that if an identifier designates two entities in the same name space, there exist programs in which they overlap and there exist programs in which they do not. It expresses the fact that scopes might overlap and the fact that this occurs when an identifier designates two different entities in the same name space.
I think the word might refer to possibilty of the general case rather than probability of hapening (that means is allowed to happen and when it happens there would be an overlap).
And the following lines indactes that by telling what happens in this case (the inner scope would be a strict subscope of the outer one and in this scope we will be using the intety defined inside this inner scope
Related
I have the below code and I see that the two variables have been assigned with same address. Both the variables are completely different type . Is there anyway I can void this ? And under what circumstances does the same memory gets allocated to both the variables .
static int Sw_Type [];
static BOOL Sw_Update;
void main()
{
int i;
int bytes = 3;
if (Sw_Update!= TRUE)
{
for(i = 0; i< bytes ;i++)
{
Sw_Type [i] = *Ver_Value;
Ver_Value++;
}
Sw_Update= TRUE;
}
}
This is a snippet of my code and "Ver_Value" is a structure which gets assigned in different function.
So the problem I am seeing is , when Sw_Update gets updated, Sw_Type [1] is getting updated and I see these two have same memory address.
static int Sw_Type []; constitutes a tentative definition, per C 2018 6.9.2 2:
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
Since your program provides no non-tentative definition, it is as if it ended with static int Sw_Type [] = { 0 };. (In case it is not clear from the text quoted above that the result is indeed an array of one element, it is made clear by Example 2 in paragraph 5 of the same clause.)
Thus, Sw_Type is an array of one int. It contains only the element Sw_Type[0]. The behavior of accessing Sw_Type[1] is not defined by the C standard. From the observations you report, it appears as if Sw_Update follows Sw_Type in memory, and accessing Sw_Type[1] results in modifying Sw_Update. This behavior is of course not reliable.
To make Sw_Type larger, you must declare a size for it, as with static int Sw_Type[4];.
Note: 6.9.2 3 says “If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.” While this might be read as applying to the declared type in each declaration that is a tentative definition, I think it might be intended to apply to the declared type of the object once its composite type is fully resolved at the end of the translation unit. Experimentally, Clang is okay with accepting an incomplete type at first and completing it later.
So the problem I am seeing is , when Sw_Update gets updated, Sw_Type [1] is getting updated and I see these two have same memory address.
There is no Sw_Type [1]. Only an array with two or more elements has a second entry, and Sw_Type is not an array with two or more elements. Accessing an array out of bounds can certainly stomp on other objects.
I'm reading the N1570 Standard and have a problem to understand the wording of the name space definition. Here is it:
1 If more than one declaration of a particular identifier is visible
at any point in a translation unit, the syntactic context
disambiguates uses that refer to different entities. Thus, there are
separate name spaces for various categories of identifiers, as
follows:
— label names (disambiguated by the syntax of the label
declaration and use);
— the tags of structures, unions, and
enumerations (disambiguated by following any32) of the keywords
struct, union, or enum);
— the members of structures or unions; each
structure or union has a separate name space for its members
(disambiguated by the type of the expression used to access the member
via the . or -> operator);
— all other identifiers, called ordinary
identifiers (declared in ordinary declarators or as enumeration
constants).
32) There is only one name space for tags even though three are possible.
Here they are talking about in case of more than 1 declaration of particular identifiers is visible. Now words something like "To access an identifier one shall specify its namespace" or "To access an identifier in a specific namespace...".
Let me show an example first (this is strictly for understanding purpose, dont write code like this, ever)
#include <stdio.h>
int main(void)
{
int here = 0; //.......................ordinary identifier
struct here { //.......................structure tag
int here; //.......................member of a structure
} there;
here: //......... a label name
here++;
printf("Inside here\n");
there.here = here; //...........no conflict, both are in separate namespace
if (here > 2) {
return 0;
}
else
goto here; //......... a label name
printf("Hello, world!\n"); // control does not reach here..intentionally :)
return 0;
}
You see usage of identifier here. They belong to separate namespace(s) according to the rule, hence this program is fine.
However, say, for example, you change the structure variable name, from there to here, and you'll see a conflict, as then, there would be two separate declaration of same identifier (ordinary identifier) in the same namespace.
Reading a lot of definition of namespace and scopes cannot understand exactly the difference between the two terms.
For example:
If an identifier designates two different entities in the same name
space, the scopes might overlap.
It is really confusing me. Can someone clarify it as simple as possible underlining the difference.
You left off the second part of that statement:
If so, the scope of one entity (the inner scope) will end
strictly before the scope of the other entity (the outer scope). Within the inner scope, the
identifier designates the entity declared in the inner scope; the entity declared in the outer
scope is hidden (and not visible) within the inner scope.
Online draft of the C2011 standard, §6.2.1, para 4.
Example:
void foo( void )
{
int x = 42;
printf( "x = %d\n", x ); // will print 42
do
{
double x = 3.14;
printf( "x = %f\n", x ); // will print 3.14
} while ( 0 );
printf( "x = %d\n", x ); // will print 42 again
}
You have used the same identifier x to refer to two different objects in the same namespace1. The scope of the outer x overlaps the scope of the inner x. The scope of the inner x ends strictly before the scope of the outer x.
The inner x "shadows" or "hides" the outer x.
C defines four namespaces - label names (disambiguated by the `goto` keyword and `:`), struct/union/enum tag names (disambiguated by the `struct`, `union`, and `enum` keywords), struct and union member names (disambiguated by the `.` and `->` operators), and everything else (variable names, function names, typedef names enumeration constants, etc.).
"Namespace" in C has to do with the kinds of things that are named. Section 6.2.3 of the standard distinguishes four kinds of namespaces
one for labels
one each for the tags of structures, unions, and enumerations
one for the members of each structure or union
one for all other identifiers
Thus, for example, a structure tag never collides with a variable name or the name of any structure member:
struct foo {
int foo;
};
struct foo foo;
The usage of the same identifier, foo, for each of the entities it designates therein is permitted and usable because structure tags, structure members, and ordinary variable names all belong to different name spaces. The language supports this by disambiguating the uses of identifiers via language syntax.
"Scope", on the other hand, has to do with where -- in terms of the program source -- a given identifier is usable, and what entity it designates at that point. As section 6.2.1 of the standard puts it:
For each different entity that an identifier designates, the identifier is visible (i.e., can be used) only within a region of program text called its scope. Different entities designated by the same identifier either have different scopes, or are in different name spaces.
Identifiers belonging to the same namespace may have overlapping scope (subject to other requirements). The quotation in your question is an excerpt from paragraph 4 of section 6.2.1; it might help you to read it in context. As an example, however, consider
int bar;
void baz() {
int bar;
// ...
}
The bar outside the function has file scope, starting immediately after its declaration and continuing to the end of the enclosing translation unit. The one inside the function designates a separate, local variable; it has block scope, starting at the end of its declaration and ending at the closing brace of the innermost enclosing block. Within that block, the identifier bar designates the local variable, not the file-scoped one; the file-scoped bar is not directly accessible there.
In trying to answer a recent question (Passing restrict qualified pointers to functions?), I could not find how the C11 standard is consistent with practice.
I'm not trying to call out the standard or anything, most things that look inconsistent I just am not understanding right, but my question is best posed as an argument against the definition used in the standard, so here it is.
It seems to be commonly accepted that a function can take a restrict qualified pointer and both work on it and have its own function calls work on it. For example,
// set C to componentwise sum and D to componentwise difference
void dif_sum(float* restrict C, float* restrict D, size_t n)
{
size_t i = 0;
while(i<n) C[i] = C[i] - D[i],
D[i] += C[i] + D[i],
++i;
}
// set A to componentwise sum of squares of A and B
// set B to componentwise product of A and B
void prod_squdif(float* restrict A, float* restrict B, size_t n)
{
size_t i = 0;
float x;
dif_sum(A,B,n);
while(i<n) x = ( (A[i]*=A[i]) - B[i]*B[i] )/2,
A[i] -= x,
B[i++] = x/2;
}
What seems to be the common understanding is that restrict pointers need to reference independent space within their declaring block. So, prod_sqdif is valid because nothing lexically within its defining block accesses the arrays identified by A or B other than those pointers.
To demonstrate my concern with the standard, here is the standard formal definition of restrict (according to the committee draft, if you have the final version and it is different, let me know!):
6.7.3.1 Formal definition of restrict
1 Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T.
2 If D appears inside a block and does not have storage class extern, let B denote the block. If D appears in the list of parameter declarations of a function definition, let B denote the associated block. Otherwise, let B denote the block of main (or the block of whatever function is called at program startup in a freestanding environment).
3 In what follows, a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E. Note that ‘‘based’’ is defined only for expressions with pointer types.
4 During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. Every other lvalue used to access the value of X shall also have its address based on P. Every access that modifies X shall be considered also to modify P, for the purposes of this subclause. If P is assigned the value of a pointer expression E that is based on another restricted pointer object P2, associated with block B2, then either the execution of B2 shall begin before the execution of B, or the execution of B2 shall end prior to the assignment. If these
requirements are not met, then the behavior is undefined.
5 Here an execution of B means that portion of the execution of the program that would correspond to the lifetime of an object with scalar type and automatic storage duration associated with B.
6 A translator is free to ignore any or all aliasing implications of uses of restrict.
[Examples not included because they are not formally significant.]
Identifying execution of B with expressions lexically contained therein might be seen as supported by the following excerpt from 6.2.4, item 6:
"...Entering an enclosed block or calling a function suspends, but does not end,
execution of the current block..."
However, part 5 of the formal definition of restrict explicitly defines the block B to correspond to the lifetime of an object with automatic storage declared in B (in my example, B is the body of prod_squdif). This clearly overrides any definition of the execution of a block found elsewhere in the standard. The following excerpt from the standard defines lifetime of an object.
6.2.4 Storage durations of objects, item 2
The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. 34) If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
Then the execution of dif_sum is clearly included in the execution of B. I don't think there is any question there. But then the lvalues in dif_sum that read and modify elements of A and B (via C and D) are clearly not based on A and B (they follow sequence points where A and B could have been repointed to copies of their content without changing the locations identified by the lvalues). This is undefined behavior. Note that what lvalues or sequence points are discussed in item 4 is not restricted; as it is stated, there is no reason to restrict lvalues and sequence points to those lexically corresponding to the block B, and so lvalues and sequence points within a function call play just like they do in the body of the calling function.
On the other hand, the generally accepted use of restrict seems implied by the fact that the formal definition explicitly allows C and D to be assigned the values of A and B. This suggests that some meaningful access to A and B through C and D is allowed. However, as argued above, such access is undefined for any element modified through either the outer or inner function call, and at least read by the inner call. This seems contrary to the apparent intent of allowing the assignment in the first place.
Of course, intent has no formal place in the standard, but it does seem suggestive that the common interpretation of restrict, rather than what seems actually defined, is what is intended.
In summary, interpreting the execution of B as the execution of each statement during the lifetime of B's automatic storage, then function calls can't work with the contents of restrict pointers passed to them.
It seems unavoidable to me that there should be some exception stating reads and writes within functions or sub blocks are not considered, but that at most one assignment within such a sub block (and other sub blocks, recursively) may be based on any particular restrict pointer in the outer block.
I have really gone over the standard, both today and yesterday. I really can't see how the formal definition of restrict could possibly be consistent with the way it seems to be understood and implemented.
EDIT: As has been pointed out, violating the restrict contract results in undefined behavior. My question is not about what happens when the contract is violated. My question can be restated as follows:
How can the formal definition of restrict be consistent with access to array elements through function calls? Does such access, within a calling function, not constitute access not based on the restrict pointer passed to the function?
I am looking for an answer based in the standard, as I agree that restrict pointers should be able to be passed through function calls. It just seems that this is not the consequence of the formal definition in the standard.
EDIT
I think the main problem with communicating my question is related to the definition of "based on". I will try to present my question a little differently here.
The following is an informal tracking of a particular call to prod_squdif. This is not intended as C code, it is just an informal description of the execution of the function's block.
Note that this execution includes the execution of the called function, per item 5 of the formal definition of restrict: "Here an execution of B means that portion of the execution of the program that would correspond to the lifetime of an object with scalar type and automatic storage duration associated with B."
// 1. prod_squdif is called
prod_squdif( (float[1]){2}, (float[1]){1}, 1 )
// 2. dif_sum is called
dif_sum(A,B,n) // assigns C=A and D=B
// 3. while condition is evaluated
0<1 // true
// 4. 1st assignment expression
C[0] = C[0] - D[0] // C[0] == 0
// 5. 2nd assignment expression
D[0] += C[0] + D[0] // D[0] == 1
// 6. increment
++i // i == 1
// 7. test
1<1 // false
// return to calling function
// 8. test
0<1 // true
// 9. 1st assignment expression
x = ( (A[0]*=A[0]) - B[1]*B[1] )/2 // x == -.5
// 10. 2nd assignment expression
A[0] -= -.5 // A[0] == .5
// 11. 3rd assignment expression
B[i++/*evaluates to 0*/] = -.5/2 // B[0] == -.25
// 12. test
1<1 // false
// prod_squdif returns
So, the test for the restrict contract is given by item 4 in the formal definition of restrict: "During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: ... Every other lvalue used to access the value of X shall also have its address based on P..."
Let L be the lvalue on the left of the portion of the execution marked '4' above (C[0]). Is &L based on A? I.e., is C based on A?
See item 3 of the formal definition of restrict: "...a pointer expression E is said to be based on object P if (at some sequence point in the execution of B prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E...".
Take as a sequence point the end of item 3 above. (At this sequence point) modifying A to point to a coppy of the array object into which it formerly pointed would NOT change the value of C.
Thus C is not based on A. So A[0] is modified by an lvalue not based on A. Since it is also read by an lvalue that is based on A (item 10), this is undefined behavior.
My question is: Is it correct to therefore conclude that my example invokes undefined behavior and thus the formal definition of restrict is not consistent with common implementation?
Suppose we have a function with nested blocks like this:
void foo()
{
T *restrict A = getTptr();
{
T *restrict B = A;
{
#if hypothetical
A = copyT(A);
#endif
useTptr(B + 1);
}
}
}
It would seem that, at the point where useTptr(B + 1) is called, the hypothetical change to A would no longer affect the value of B + 1. However, a different sequence point can be found, such that a change to A does affect the value of B + 1:
void foo()
{
T *restrict A = getTptr();
#if hypothetical
A = copyT(A);
#endif
{
T *restrict B = A;
{
useTptr(B + 1);
}
}
}
and C11 draft standard n1570 6.7.3.1 Formal definition of restrict only demands that there be some such sequence point, not that all sequence points exhibit this behavior.
I'm really not sure exactly what your question is.
It sounds like you're asking:
Q: Gee, will "restrict" still apply if I violate the "restrict"
contract? Like in the "remove_zeroes()" example?
The answer, of course, is "No - it won't".
Here are two links that might clarify the discussion. Please update your post with (a) more explicit question(s):
Realistic usage of the C99 'restrict' keyword?
Is it legal to assign a restricted pointer to another pointer, and use the second pointer to modify the value?
https://en.wikipedia.org/wiki/Restrict
Example code is as follows, both 'a' are file scope:
1 ...
2 int a;
3 int a;
4 ... // which 'a' is visible?
I know that the two declarations for 'a' are for the same object. But every identifier has a scope, the scopes of the two 'a' should overlap at line 4, which one is visible? If the second 'a' is visible only, does that mean this situation is like the following:
{
int a;
{
int a; // the scope of the first 'a' is hidden
}
}
Thanks
At file-scope, something like
int a;
Is a declaration in the first place. It is also a tentative definition. For the same name you can have as many tentative definitions for the same name as you want iff they are technically (see the link for details) identical.
However, if you add an initialiser:
int a = 0;
You have a (regular) definition. Of these you can only have one for the same name. It must also be identical to all tentative definitions.
The second example is about scope. You can use the same name in different scopes. However, the innermost name will be used if you reference the object. There is no way to access an object with the same name in an outer scope. This is called shadowing and a properly configured compiler (i.e. enable warnings) should warn about it, but allow this.
In general this is bad coding style, because you have to check the scope when you read the code to see the declaration. (that's why the compiler should warn). Note that the inner definition need not even have the same type.
To paraphrase this answer:
In a global scope, what you have written will work fine (it will throw an error in the function scope). That is, say you write the following:
int a;
int a;
int main(...){...}
This will compile fine, but only because neither a is initialized. If you declare something with an assignment multiple times, things will break. Here is an example of this:
int a;
int a; // No problem yet
int a = 10; // a is now initialized
int a = 12; // Error: Redefinition of a
The first two definitions (and the kind you gave) are what are known as tentative definitions. The C spec says that multiple ones of these are treated as redundant. The important thing is that each one has either zero or one definitions with initializations.
Lastly, to answer your question:
...
int a;
int a;
... // which 'a' is visible?
My understanding is that the answer is both, since they are tentative definitions for the same variable.