Understand complex expression in C [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Closed 6 years ago.
Please help me understand the following expression
(seen in a book):
*((int *)marks + i++) = i+1
The number of increments and dereference symbols is confusing!

I hope the book had this as a bad example, because the behavior of that is undefined.
(int *)marks interprets marks (whatever that may be) as a pointer to int, then we have the result of i++ added to that. This pointer is dereferenced and i+1 is assigned to the corresponding object.
This expression has no defined behavior because it reads and modifies i at two different subexpressions that are not sequenced one before the other.

Burn the book.
The behaviour of the statement is undefined because there is no sequence point between the modification of i and the separate read of i. A far simpler case to understand is i = i++, which is also undefined.

Note: As others have said, the expression has undefined behavior. One ordering a compiler such as g++ might choose (purely as an illustration; it is free to do something else) is:
// Pointer to some data.
void* marks = ???;
// Typecast to an integer pointer
int* marksIntPointer = (int*)marks;
// Move the position in memory. I am assuming that 'marks' is an array.
int* marksIntPointerOffset = marksIntPointer + i;
// Undefined behaviour, could take place here or before, depending on the compiler (as others have said).
i++;
// Set the value of the desired memory.
*marksIntPointerOffset = i+1;

The mentioned expression invokes undefined behavior.
C99 6.5 §2 states:
1) Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression.
2) Furthermore, the prior value shall be read only to determine the value to be stored.
1) Assigning to a variable and also incrementing it within a single expression is the straightforward case:
i = ++i;
i = i++;
as well as modifying the value of the same variable multiple times within a single expression:
j = ++i + ++i;
2) Reading only to determine the value to be stored might be a bit tricky, though. It means that even the following (just like the previous examples) invoke undefined behavior:
j = (i + 1) + i++;
a[i++] = i;
*(ptr + i++) = i;
as well as:
*((int *)marks + i++) = i+1
You might look at: Undefined behavior and sequence points as well :)

As others have mentioned, the behavior is undefined; by getting rid of the ++ operator, the following code would be well-defined (but still ugly as sin):
*((int *)marks + i) = i+1
Here's how it breaks down:
marks -- take the expression marks
(int *)marks -- cast it as a pointer to int
(int *)marks + i -- offset i integer elements from that address
*((int *)marks + i) -- dereference the result to get an array element
*((int *)marks + i) = i+1 -- assign the result of i+1 to that element
This is essentially treating marks as an array of int, and assigning the result of i+1 to the i'th element:
int *mp = (int *) marks;
mp[i] = i+1
The original expression was trying to do
mp[i++] = i+1
which invokes undefined behavior. The order in which i++ and i+1 are evaluated is not specified, so the compiler is free to evaluate them in any order it likes. Since i++ has a side effect (updating the value of i), you can get different results depending on the platform, optimization settings, or even the surrounding code. The language standard explicitly leaves the behavior undefined so that the compiler isn't required to handle this code in any particular way. You will usually get some result, but it's not guaranteed to be consistent from compiler to compiler (or even from run to run).

(int *)marks
cast marks to an int pointer
+ i++
add i to marks (changing the address pointed to) and then increment i by 1
*(...) = i+1
set the VALUE of the cell pointed to by our pointer to i+1 (note that if i has already been incremented by this point, the stored value is 2 greater than i was at the start of the statement; the standard does not guarantee that order, which is why the behavior is undefined).
I hope this is an example of how you should NOT write code :)

Let's see.
*((int *)marks + i++) = i+1
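-marks is cast to a pointer to int
-i++ yields the old value of i; i itself is incremented by one at some point before the end of the statement
-the pointer (int *)marks is advanced by that old value of i using pointer arithmetic
-this pointer is now dereferenced, so...
-... the memory location to which it points is written with the value of i+1, where it is not defined whether i has already been incremented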

Related

Unexpected output after running this little bit strange c code. Can anyone explain how this happened?

I am trying to understand how this code works,
int main () {
int m, k;
m = (k=5)+(k=8)+(k=9)+(k=7);
printf("m=%d\n",m);
printf("k=%d\n",k);
}
The output:
m=32
k=7
I have no idea how the value of m becomes 32.
I hope someone can help me understand how this code works and why the output ends up like this.
Simplified explanation:
When you use k=... multiple times in the same expression, all assignments to that same variable are so-called "unsequenced side-effects". Simply put, it means that C doesn't specify which operand of + to evaluate/execute first nor does it specify the order in which the assignments will get carried out.
So there is no single correct order for the compiler to follow, and it may pick any. This is so-called "undefined behavior": anything can happen.
You have to solve this by splitting the expression up into several, each terminated by a semicolon, which acts as a "sequence point", meaning all prior evaluations need to be done at the point where the ; is encountered. Example:
k=5;
k+=8;
k+=9;
m = k + 7;
Detailed explanation with standard references here: Why can't we mix increment operators like i++ with other operators?
This is undefined behavior. Your compiler may warn about this, for example:
warning: multiple unsequenced modifications to 'k' [-Wunsequenced]
You can learn more about this here:
A Guide to Undefined Behavior
What Every C Programmer Should Know About Undefined Behavior
What are “sequence points” and how do they affect undefined behavior?
The behaviour of the program is undefined.
There are multiple unsequenced writes on k in the expression
(k = 5) + (k = 8) + (k = 9) + (k = 7)

Increment and decrement operators in one statement in C [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Undefined, unspecified and implementation-defined behavior
(9 answers)
Closed 3 years ago.
I know it is theoretically undefined behavior and of course bad style. This is an example from a school (I am not the pupil).
But why do I get 7 (example a) and 1 (example b) out of this operation:
Online example:
https://onlinegdb.com/B172lj8k8
a:
#include <stdio.h>
int main()
{
int i = 2;
printf("%d; i=%d", ++i + i++, i);
return 0;
}
b:
#include <stdio.h>
int main()
{
int i = 2;
printf("%d; i=%d", ++i - i++, i);
return 0;
}
In my opinion the output should be 6 and 2.
Execute i++, yield 2, increment i
Execute ++i, yield 4
Addition 2 + 4
The other example should be 4 - 2.
Executing the increment in a statement seems to yield the result of the increment immediately, no matter whether it is postfix or prefix, which is odd. Or have I got it totally wrong?
The order in which the arguments passed to a function are evaluated is not specified, and the order of evaluating the operands of + is unspecified, too.
So in printf("%d %d", i+1, i-1), for example, you cannot rely on the order of evaluation of the arguments; i+1 might actually be evaluated after i-1. You will not notice, since the evaluation of the one does not affect the result of the other.
In conjunction with "side effects" like the post-increment i++, however, the effect of incrementing i at a specific point in time might influence the result of other evaluations based on i. Therefore, it is "undefined behaviour" in C if a variable is used more than once in an expression and a side effect changes its value (formally, to be precise, if there is no sequence point in between).

Multiple unsequenced modifications to 'a',a = a++ [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Closed 1 year ago.
Why can't I do the following in Objective-C?
a = (a < 10) ? (a++) : a;
or
a = (a++)
The ++ in a++ is a post-increment. In short, a = a++ does the following:
int tmp = a; // get the value of a
a = a + 1; // post-increment a
a = tmp; // do the assignment
As noted by others, in C this is actually undefined behavior and different compilers can order the operations differently.
Try to avoid using ++ and -- operators when you can. The operators introduce side effects which are hard to read. Why not simply write:
if (a < 10) {
a += 1;
}
To extend Sulthan's answer, there are several problems with your expressions, at least with the simple assignment (case 2).
A. There is no sense in doing so. Even though a++ has a value (it is a non-void expression) that can be assigned, it already assigns the incremented result to a itself. So the best you can expect is something equivalent to
a++;
The assignment cannot improve on the increment at all. But this is not what the error message is about.
B. Sulthan's replacement of the statement is the best case. The reality is worse: the ++ operator has the value of a (at the beginning of the expression) and the effect of incrementing a at some point in the future. The increment can be delayed up to the next sequence point.
The side effect of updating the stored value of the operand shall occur between the previous and the next sequence point.
(ISO/IEC 9899:TC3, 6.5.2.4, 2)
But the assignment operator = is not a sequence point (Annex C).
The following are the sequence points described in 5.1.2.3:
[Neither assignment operator nor ) is included in the list]
Therefore the expression can be replaced with what Sulthan said:
int tmp = a; // get the value of a
a = a + 1; // post-increment a
a = tmp; // do the assignment
with the result, that a still contains the old value.
Or the expression can be replaced with this code …:
int tmp = a; // get the value of a
a = tmp; // assignment
a = a + 1; // increment
… with a different result (a is incremented). This is what the error message says: there is no defined sequence (order in which the operations have to be applied).
You can insert a sequence point using the comma operator , (which is its primary use), …
a++, a=a; // first increment, then assign
… but this shows the whole lack of meaning in what you are trying to do.
It is the same with your first example. Though ? is a sequence point …:
The following are the sequence points described in 5.1.2.3:
… The end of the first operand of the following operators: […] conditional ? (6.5.15);[…].
… both the increment (a++) and the assignment (a=) are after the ? operator is evaluated and therefore unsequenced ("in random order") again.
To make the comment "Be careful" more concrete: Don't use an incremented object in an expression twice. (Unless there is a clear sequence point).
int a = 1;
… = a++ * a;
… evaluates to what? 2? 1? Undefined, because the increment can take place after reading a "the second time".
BTW: The Q is not related to Objective-C, but to pure C. There is no Objective-C influence to C in that point. I changed the tagging.

Strange behavior from a simple C program

If I run the following code, graph[0][0] gets 1 while graph[0][1] gets 4.
In other words, the line graph[0][++graph[0][0]] = 4; puts 1 into graph[0][0] and 4 into graph[0][1].
I would really appreciate if anyone can offer reasonable explanation.
I observed this from Visual C++ 2015 as well as an Android C compiler (CppDriod).
static int graph[10][10];
void main(void)
{
graph[0][++graph[0][0]] = 4;
}
Let's break it down:
++graph[0][0]
This pre-increments the value at graph[0][0], which means that now graph[0][0] = 1, and then the value of the expression is 1 (because that is the final value of graph[0][0]).
Then,
graph[0][/*previous expression = 1*/] = 4;
So basically, graph[0][1] = 4;
That's it! Now graph[0][0] = 1 and graph[0][1] = 4.
First, let's see what the unary (prefix) increment operator does.
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation.
So, in case of
graph[0][++graph[0][0]] = 4;
first, the value of graph[0][0] is incremented by 1, and then the value is used in indexing.
Now, graph being a static global variable, due to implicit initialization, all the members in the array are initialized to 0 by default. So, ++graph[0][0] increments the value of graph[0][0] to 1 and returns the value of 1.
Then, the simplified version of the instruction looks like
graph[0][1] = 4;
Thus, you get
graph[0][0] as 1
graph[0][1] as 4.
Also, FWIW, the recommended signature of main() is int main(void).
You are adding one to graph[0][0], by doing ++graph[0][0]. And then setting graph[0][1] to 4. Maybe you want to do graph[0][graph[0][0]+1] = 4
First, your variable graph[10][10] is static, so it is initialized with the value 0.
Then, in the line graph[0][++graph[0][0]] = 4; graph[0][0] is 0 beforehand and the expression increments it to 1, so you are effectively assigning graph[0][1] = 4; yourself.
Note that you have used the pre-increment operator (++x), so the value is incremented first and the new value is used as the index. Had you used the post-increment operator (x++), the index would be 0, so both the increment and the assignment would modify graph[0][0] without being sequenced, which is undefined behavior.
Let's line up the facts about this expression
graph[0][++graph[0][0]] = 4;
Per 6.5.1, the computation of the array index ++graph[0][0] is sequenced before the computation of array element graph[0][++graph[0][0]], which in turn is sequenced before the computation of the entire assignment operator.
The value of ++graph[0][0] is required to be 1. Note that this does not mean that the whole pre-increment together with its side-effects has to "happen first". It simply means that the result of that pre-increment (which is 1) has to be computed first. The actual modification of graph[0][0] (i.e. changing of graph[0][0] from 0 to 1) might happen much much later. Nobody knows when it will happen exactly (sometime before the end of the statement).
This means that the element being modified by the assignment operator is graph[0][1]. This is where that 4 should go to. Assignment of 4 to graph[0][1] is also a side-effect of = operator, which will happen sometime before the end of the statement.
Note, that in this case we could conclusively establish that ++ modifies graph[0][0], while = modifies graph[0][1]. We have two unsequenced side-effects (which is dangerous), but they act on two different objects (which makes them safe). This is exactly what saves us from undefined behavior in this case.
However, this is dependent on the initial value of graph array. If you try this
graph[0][0] = -1;
graph[0][++graph[0][0]] = 4;
the behavior will immediately become undefined, even though the expression itself looks the same. In this case the side-effect of ++ and the side-effect of = are applied to the same array element graph[0][0]. The side-effects are not sequenced with relation to each other, which means that the behavior is undefined.

Garbage value with post increment

Question 1
int x;
if (x++)
{
printf ("\nASCII value of X is smaller than that of x");
}
Is x assigned a garbage value here?
Question 2:
main ()
{
int i;
for (i = 0; i++ < 10;)
{
printf ("%d\n", i);
}
}
Can anyone explain how i++ < 10 works? I mean, it should end at 9, so why 10?
The value of x is indeterminate, and is possibly a trap representation, in which case the behavior of x++ is undefined.
The expression i++ evaluates to the current value of i; as a side effect, the value in i is incremented. So if i == 1, the expression i++ will evaluate to 1, and as a side effect i will be set to 2.
Chapter and verse:
6.5.2.4 Postfix increment and decrement operators
...
2 The result of the postfix ++ operator is the value of the operand. After the result is
obtained, the value of the operand is incremented. (That is, the value 1 of the appropriate
type is added to it.) See the discussions of additive operators and compound assignment
for information on constraints, types, and conversions and the effects of operations on
pointers. The side effect of updating the stored value of the operand shall occur between
the previous and the next sequence point.
Emphasis mine.
In the first question, you declare x
int x;
but you do not assign it, this reserves some memory to hold the value of x, but doesn't initialize it to a known value. That's a really bad thing. Then you read it, increment it, and possibly do something.
if ( x++ ) {
printf ( "\nascii value of X is smaller than that of x" ) ;
}
Since you don't know what its value was before you read it, it is impossible to make an educated guess as to whether your if statement will print anything.
In your second question (please one question per question), you read the value of i, then increment it, and then do the comparison on the read value. Post increment basically means, "increment the value after I read it" and so the new value will be stored, then the comparison made on the old value, and the printf statement below will print the "current, new" value.
Question 1: Yes
Question 2: i++ < 10 compares the old value of i with 10 and then increments i, so the loop body prints 1 through 10.
