From the GNU (GCC) documentation on volatile:
The minimum requirement is that at a sequence point all previous
accesses to volatile objects have stabilized and no subsequent
accesses have occurred
Ok, so we know what sequence points are, and we now know how volatile behaves with respect to them in gcc.
So, naively I would look at the following program:
volatile int x = 0;
int y = 0;
x = 1; /* sequence point at the end of the assignment */
y = 1; /* sequence point at the end of the assignment */
x = 2; /* sequence point at the end of the assignment */
And I would apply the GNU requirement in the following way:
At a sequence point (the end of y = 1), the access to volatile x = 1 has stabilized and the subsequent access x = 2 has not occurred.
But that is just wrong, because the non-volatile store y = 1 can be reordered across sequence points. For example, y = 1 can actually be performed before both x = 1 and x = 2, and it can even be optimised away entirely (without violating the as-if rule).
So I am very eager to know how to apply the GNU requirement properly. Is there a problem with my understanding? Is the requirement worded incorrectly?
Maybe the requirement should be written as something like:
The minimum
requirement is that at a sequence point WHICH HAS A SIDE EFFECT all previous accesses to volatile objects have stabilized
and no subsequent accesses have occurred
Or as pmg elegantly suggested in the comment:
The minimum requirement is that at a sequence point all UNSEQUENCED previous accesses to volatile objects have
stabilized and no subsequent accesses have occurred
so that we could only apply it at the sequence points at the end of x = 1; and the end of x = 2;, where it is definitely true that previous accesses to volatile objects have stabilized and no subsequent accesses have occurred?
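As a rough sketch of the distinction the question is circling (this is an illustration, not something taken from the GCC manual): the two volatile stores below are observable accesses that the compiler has to perform in program order, while the plain store to y only has to be visible wherever the program can actually observe it. The function name store_sequence is made up for this example.

```c
volatile int x = 0;
int y = 0;

/* Hypothetical helper: performs the three stores from the question. */
int store_sequence(void)
{
    x = 1;      /* volatile store: must be emitted, and before x = 2 */
    y = 1;      /* plain store: may be moved, merged, or dropped if unused */
    x = 2;      /* volatile store: must be emitted, and after x = 1 */
    return x;   /* volatile read: must really read x */
}
```

Looking at the generated assembly for a function like this is usually the quickest way to see where a particular compiler places the store to y relative to the two volatile stores.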
I'm studying C in school and I had this question, and I'm having trouble solving it.
What does the following code do?
#define N (100)
int main(void)
{
unsigned short i = 0;
unsigned long arr[2*N + 1];
unsigned long a = 0;
for (i = 0 ; i < N ; ++i) {
a ^= arr[i];
}
printf("%lu", a);
return 0;
}
It would be really helpful if you could explain it to me!
Thanks!
It's usually a good idea to explain what you understand, so we don't have to treat you as though you know nothing. Important note: This code behaves erratically. I'll discuss that later.
The exclusive or operator (^) produces its result by applying the following pattern to the binary representation of the numbers in question:
Where both operands (sides of the operator) contain different bits, the result will contain a 1 bit. For example, if the left hand side contains a right-most bit of 0 and the right hand side contains a right-most bit of 1, then the result will contain a right-most bit of 1.
Where both operands (sides of the operator) contain the same bit, the result will contain a 0.
So as an example, the operands of 15 ^ 1 have the following binary notation:
1111 ^
0001
... and the result of this exclusive or operation will be:
1110
Converted back to decimal, that's 14. Xor it with 1 again and you'll end up back at 15 (which is the property the silly xor swap takes advantage of).
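The 15 ^ 1 example above can be checked with a tiny helper. The name toggle_bits is invented for this sketch.

```c
/* Hypothetical helper: flips every bit of value that is set in mask. */
unsigned toggle_bits(unsigned value, unsigned mask)
{
    return value ^ mask;
}
```

Calling toggle_bits(15u, 1u) gives 14, and applying the same mask to that result gives 15 again, which is the self-inverse property mentioned above.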
The array[index] operator obtains the element within array at the position indicated by index.
The ^= operator is a compound operator. It combines the exclusive or and assignment operators. a ^= arr[i]; is roughly equivalent to a = a ^ arr[i];. That means: Calculate the exclusive or of a and arr[i], and assign it back into a.
for (i = 0 ; i < N ; ++i) /*
* XXX: Insert statement or block of code
*/
This denotes a loop, which will start by assigning the value 0 to i, will repeatedly execute the statement or block of code while i is less than N (100), incrementing i each time.
In summary, this code produces the exclusive or of the first 100 elements of the array arr. This is a form of crude checksum algorithm; the idea is to take a group of values and reduce them to a single value so that you can perform some form of integrity check on them later on, perhaps after they've been transmitted via the internet or stored on an unreliable filesystem.
However, this code invokes undefined behaviour because it reads uninitialised (indeterminate) values. In order to avoid erratic behaviour such as unpredictable values or segfaults (or worse yet, situations like the heartbleed OpenSSL vulnerability) you need to make sure you give your variables values before you try to use those values.
The following declaration would explicitly initialise the first element to 42, and implicitly initialise all of the others to 0:
unsigned long arr[2*N + 1] = { 42 };
It is important to realise that the initialisation part of the declaration = { ... } is necessary if you want any elements not explicitly initialised to be zeroed.
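Putting the pieces together, here is a minimal sketch of the same checksum written as a function that only ever reads values the caller has initialised. The name xor_checksum and its parameters are assumptions made for this example, not part of the original program.

```c
#include <stddef.h>

/* Hypothetical helper: XORs together the first count elements of values. */
unsigned long xor_checksum(const unsigned long *values, size_t count)
{
    unsigned long result = 0;
    for (size_t i = 0; i < count; ++i) {
        result ^= values[i];
    }
    return result;
}
```

With the declaration unsigned long arr[2*N + 1] = { 42 }; from above, xor_checksum(arr, N) yields 42, because every other element is zero and x ^ 0 is x.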
This program will print an unpredictable value.
Because of unsigned long arr[2*N + 1];, arr is not initialized, so it will contain whatever data happens to be in that memory.
a ^= arr[i]; is equivalent to a = a ^ arr[i];, so the loop does this repeatedly and then the result is printed.
If I run the following code, graph[0][0] gets 1 while graph[0][1] gets 4.
In other words, the line graph[0][++graph[0][0]] = 4; puts 1 into graph[0][0] and 4 into graph[0][1].
I would really appreciate it if anyone could offer a reasonable explanation.
I observed this with Visual C++ 2015 as well as an Android C compiler (CppDroid).
static int graph[10][10];
void main(void)
{
graph[0][++graph[0][0]] = 4;
}
Let's break it down:
++graph[0][0]
This pre-increments the value at graph[0][0], which means that now graph[0][0] = 1, and then the value of the expression is 1 (because that is the final value of graph[0][0]).
Then,
graph[0][/*previous expression = 1*/] = 4;
So basically, graph[0][1] = 4;
That's it! Now graph[0][0] = 1 and graph[0][1] = 4.
First, let's see what the unary (prefix) increment operator does.
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation.
So, in case of
graph[0][++graph[0][0]] = 4;
first, the value of graph[0][0] is incremented by 1, and then the value is used in indexing.
Now, graph being a static global variable, due to implicit initialization, all the members in the array are initialized to 0 by default. So, ++graph[0][0] increments the value of graph[0][0] to 1 and returns the value of 1.
Then, the simplified version of the instruction looks like
graph[0][1] = 4;
Thus, you get
graph[0][0] as 1
graph[0][1] as 4.
Also, FWIW, the recommended signature of main() is int main(void).
You are adding one to graph[0][0], by doing ++graph[0][0]. And then setting graph[0][1] to 4. Maybe you want to do graph[0][graph[0][0]+1] = 4
First, your variable graph[10][10] is static, so it is initialized to 0.
Then, in the line graph[0][++graph[0][0]] = 4;, graph[0][0] is 0 and the expression increments it to 1, so you are effectively assigning graph[0][1] = 4; yourself.
Note that you used the pre-increment operator (++x), so the value is incremented first and the new value is used as the index. If you had used the post-increment operator (x++), the index would have been the old value 0, so the assignment would target graph[0][0] itself, the same element that is being incremented; two unsequenced modifications of one object are undefined behaviour.
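The difference between the value of ++x and the value of x++ can be seen in isolation with two small helpers. Both names are invented for this sketch.

```c
/* Hypothetical helpers: each increments *p and returns the value of the
 * increment expression itself. */
int pre_increment_value(int *p)
{
    return ++*p;    /* value is the NEW contents of *p */
}

int post_increment_value(int *p)
{
    return (*p)++;  /* value is the OLD contents of *p */
}
```

Starting from 0, pre_increment_value returns 1 and post_increment_value returns 0; in both cases the variable ends up holding 1.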
Let's line up the facts about this expression
graph[0][++graph[0][0]] = 4;
Per 6.5.1, the computation of the array index ++graph[0][0] is sequenced before the computation of array element graph[0][++graph[0][0]], which in turn is sequenced before the computation of the entire assignment operator.
The value of ++graph[0][0] is required to be 1. Note that this does not mean that the whole pre-increment together with its side-effects has to "happen first". It simply means that the result of that pre-increment (which is 1) has to be computed first. The actual modification of graph[0][0] (i.e. changing of graph[0][0] from 0 to 1) might happen much much later. Nobody knows when it will happen exactly (sometime before the end of the statement).
This means that the element being modified by the assignment operator is graph[0][1]. This is where that 4 should go to. Assignment of 4 to graph[0][1] is also a side-effect of = operator, which will happen sometime before the end of the statement.
Note, that in this case we could conclusively establish that ++ modifies graph[0][0], while = modifies graph[0][1]. We have two unsequenced side-effects (which is dangerous), but they act on two different objects (which makes them safe). This is exactly what saves us from undefined behavior in this case.
However, this is dependent on the initial value of graph array. If you try this
graph[0][0] = -1;
graph[0][++graph[0][0]] = 4;
the behavior will immediately become undefined, even though the expression itself looks the same. In this case the side-effect of ++ and the side-effect of = are applied to the same array element graph[0][0]. The side-effects are not sequenced with relation to each other, which means that the behavior is undefined.
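One way to stop depending on the initial contents of the array is to split the statement in two, so that the increment is complete before the store happens. This is only a sketch; the function name store_after_increment is made up, and it assumes the caller guarantees that the incremented value is a valid index into the row.

```c
/* Hypothetical helper: increments row[0], then stores value at the index
 * given by the new row[0]. The two side effects are in separate statements,
 * so they are sequenced even when they hit the same element. */
void store_after_increment(int row[10], int value)
{
    int index = ++row[0];   /* increment finishes at the end of this statement */
    row[index] = value;     /* assumes 0 <= index < 10 */
}
```

With row[0] initially 0 this leaves row[0] == 1 and row[1] == value. With row[0] initially -1 the index is 0 and row[0] simply ends up holding value, with no undefined behaviour.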
For instance, assume I have a variable that cannot be accessed by the underlying processor in one instruction (e.g. 64 bit integer on a 32 bit architecture).
// let x, y, z of the same integral type of size > architecture
#pragma omp parallel shared(x), private(y,z)
y = ...;
z = ...;
if (x == y)
x = z;
While there could be races between the if statement and the actual assignment, could half of x be read before a context switch, and the other half afterwards? Or is it guaranteed that read and write access to shared variables always happens atomically? I cannot find any statements regarding this in the standard.
No and no. This code has a race condition on x.
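A minimal sketch of one way to remove the race, assuming the compare and the assignment are meant to happen as a single indivisible step: wrap both in an OpenMP critical construct. The names shared_x, shared_x_update and replace_if_equal are hypothetical, and long long stands in for "a type wider than the architecture's word".

```c
static long long shared_x = 0;   /* hypothetical shared variable */

/* Hypothetical helper: sets shared_x to replacement if it currently equals
 * expected. Returns 1 if the store happened, 0 otherwise. */
int replace_if_equal(long long expected, long long replacement)
{
    int replaced = 0;
    /* Only one thread at a time may execute this block, so no other thread
     * using this helper can see or produce a half-written shared_x. */
    #pragma omp critical(shared_x_update)
    {
        if (shared_x == expected) {
            shared_x = replacement;
            replaced = 1;
        }
    }
    return replaced;
}
```

This only protects accesses that go through the same named critical section; any other code touching shared_x directly would reintroduce the race. When compiled without OpenMP support the pragma is ignored and the function behaves as ordinary serial code.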
I've done a search and I’ve found nothing relevant to my query.
I am currently debugging a C optimizer and the code in question looks like this:
while( x-- )
array[x] = NULL;
What should happen in this instance? And should the result of this logic be consistent across all compilers?
Let's say that the initial value of x in this case is 5.
The problem is that the program crashes. My understanding is that this is caused by a negative array element reference.
Any help would be appreciated.
This cycle will end with x equal to -1 (assuming x is signed), but its body will not produce access to array[-1] at the last step. The last array access is to array[0]. The behavior is consistent across all implementations.
In other words, there's no problem with negative index array access in the code you quoted. But if you attempt to access array[x] immediately after the cycle, then you'll indeed access array[-1].
The code you quoted is a variation of a fairly well-known implementational pattern used when one needs to iterate backwards over an array using an unsigned variable as an index. For example
unsigned x;
int a[5];
for (x = 5; x-- > 0; )
a[x] = 0;
Sometimes less-experienced programmers have trouble using unsigned indices when iterating backwards over an array. (Since unsigned variables never have negative values, a naive implementation of the cycle termination condition as x >= 0 does not work.) This approach - i.e. post-decrement in the cycle termination condition - is what works in such cases. (Of course, it works with signed indices as well).
If the initial value of x is 5, it will execute:
array[4] = NULL;
array[3] = NULL;
array[2] = NULL;
array[1] = NULL;
array[0] = NULL;
If x is a signed type, then the final value of x will be -1; otherwise, it will be the maximum value of the type.
Make sure x is non-negative before entering the while loop (precondition).
Also, the value of x will be -1 when the while loop exits (postcondition). Therefore, after leaving the while loop, you should not access the array using x as the index.
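The points made in these answers can be sketched with an unsigned index. The function name clear_backwards is an assumption for this example; it returns the final value of the index so the post-loop state is visible.

```c
#include <stddef.h>

/* Hypothetical helper: sets array[count-1] down to array[0] to NULL and
 * returns the index value left over after the loop. */
size_t clear_backwards(void **array, size_t count)
{
    size_t x = count;
    while (x-- > 0) {       /* tests the old value, then decrements */
        array[x] = NULL;    /* body sees count-1, ..., 1, 0 */
    }
    return x;               /* wrapped around to the maximum size_t value */
}
```

For count equal to 5 the body touches indices 4 down to 0 and never goes out of bounds, but the returned index is the maximum value of size_t and must not be used to access the array.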
When the evaluation of l-value precedes the evaluation of r-value and the assignment also returns a value, which of the following is evaluated first?
int i = 2;
int x[] = {1, 2, 3};
int y[] = {4, 5, 6};
int z[] = {7, 8, 9};
x[--i] = y[++i] = z[i++]; // Out of bound exception or not?
NOTE: generic C-like language with l-value evaluation coming first. From my textbook:
In some languages, for example C,
assignment is considered to be an
operator whose evaluation, in addition
to producing a side effect, also
returns the r-value thus computed.
Thus, if we write in C:
x = 2;
the evaluation of such a command, in
addition to assigning the value 2 to x,
returns the value 2. Therefore, in C,
we can also write:
y = x = 2;
which should be interpreted as:
(y = (x = 2));
I'm quite certain that the behaviour in this case is undefined, because you are modifying and reading the value of the variable i multiple times between consecutive sequence points.
Also, in C, arrays are declared by placing the [] after the variable name, not after the type:
int x[] = {1, 2, 3};
Edit:
Remove the arrays from your example, because they are [for the most part] irrelevant. Consider now the following code:
int main(void)
{
int i = 2;
int x = --i + ++i + i++;
return x;
}
This code demonstrates the operations that are performed on the variable i in your original code but without the arrays. You can see more clearly that the variable i is being modified more than once in this statement. When you rely on the state of a variable that is modified between consecutive sequence points, the behaviour is undefined. Different compilers will (and do, GCC returns 6, Clang returns 5) give different results, and the same compiler can give different results with different optimization options, or for no apparent reason at all.
If this statement has no defined behaviour because i is modified several times between consecutive sequence points, then the same can be said for your original code. The assignment operator does not introduce a new sequence point.
General
In C, the order of operations between two sequence points should not be depended on. I do not remember the exact wording from the standard, but it is for this reason that
i = i++;
is undefined behaviour. The standard defines a list of things that makes up sequence points, from memory this is
the semicolon after a statement
the comma operator
evaluation of all function arguments before the call to the function
the && and || operators
Looking up the page on Wikipedia, the list there is more complete and described in more detail. Sequence points are an extremely important concept in C, and if you do not already know what they mean, do learn it immediately.
Specific
No matter how well defined the order of evaluation and assignment of the x, y and z variables are, for
x[--i] = y[++i] = z[i++];
this statement cannot be anything but undefined behaviour because of the --i, ++i and i++.
On the other hand
x[i] = y[i] = z[i];
is well defined, but I am not sure what the order of evaluation is here. If the order is important, however, I would prefer this to be split into two statements along with a comment "It is important that ... is assigned/initialized before ... because ...".
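Following that advice, here is a sketch of the original statement split into separate, fully sequenced steps. It assumes the left-to-right, l-value-first reading from the question's textbook language, which is NOT something C guarantees; the function name assign_left_to_right is invented for the example.

```c
/* Hypothetical helper: performs x[--i] = y[++i] = z[i++] under an ASSUMED
 * left-to-right evaluation of the index expressions. Returns the final i. */
int assign_left_to_right(int x[3], int y[3], const int z[3], int i)
{
    int xi = --i;       /* index used for x */
    int yi = ++i;       /* index used for y */
    int zi = i++;       /* index used for z */
    y[yi] = z[zi];      /* inner assignment first ... */
    x[xi] = y[yi];      /* ... then its value is assigned to x */
    return i;
}
```

With i starting at 2 and the arrays from the question, the indices are 1, 2 and 2, so this performs y[2] = z[2] and x[1] = y[2] with no out-of-bounds access, and i ends at 3. A language that evaluates in a different order would pick different indices.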
I think it's the same as
x[3] = y[4] = z[2];
i = 3;