Is this undefined C behaviour?

Our class was asked this question by the C programming prof:
You are given the code:
int x=1;
printf("%d",++x,x+1);
What output will it always produce ?
Most students said undefined behavior. Can anyone help me understand why it is so?
Thanks for the edit and the answers but I'm still confused.

The output is likely to be 2 in every reasonable case. In reality, what you have is undefined behavior though.
Specifically, the standard says:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
There is a sequence point before evaluating the arguments to a function, and a sequence point after all the arguments have been evaluated (but the function not yet called). Between those two (i.e., while the arguments are being evaluated) there is not a sequence point (unless an argument is an expression that includes one internally, such as one using the &&, || or , operator).
That means the call to printf is reading the prior value both to determine the value being stored (i.e., the ++x) and to determine the value of the second argument (i.e., the x+1). This clearly violates the requirement quoted above, resulting in undefined behavior.
The fact that you've provided an extra argument for which no conversion specifier is given does not result in undefined behavior. If you supply fewer arguments than conversion specifiers, or if the (promoted) type of the argument disagrees with that of the conversion specifier, you get undefined behavior -- but passing an extra parameter does not.

Any time the behavior of a program is undefined, anything can happen — the classical phrase is that "demons may fly out of your nose" — although most implementations don't go that far.
The arguments of a function are conceptually evaluated in parallel (the technical term is that there is no sequence point between their evaluation). That means the expressions ++x and x+1 may be evaluated in this order, in the opposite order, or in some interleaved way. When you modify a variable and try to access its value in parallel, the behavior is undefined.
With many implementations, the arguments are evaluated in sequence (though not always from left to right). So you're unlikely to see anything but 2 in the real world.
However, a compiler could generate code like this:
1. Load x into register r1.
2. Calculate x+1 by adding 1 to r1.
3. Calculate ++x by adding 1 to r1. That's ok because x has been loaded into r1. Given how the compiler was designed, step 2 cannot have modified r1, because that could only happen if x was read as well as written between two sequence points. Which is forbidden by the C standard.
4. Store r1 into x.
And on this (hypothetical, but correct) compiler, the program would print 3.
(EDIT: passing an extra argument to printf is correct (§7.19.6.1-2 in N1256); thanks to Prasoon Saurav for pointing this out. Also: added an example.)

The correct answer is: the code produces undefined behavior.
The reason the behavior is undefined is that the two expressions ++x and x + 1 are modifying x and reading x for an unrelated (to modification) reason and these two actions are not separated by a sequence point. This results in undefined behavior in C (and C++). The requirement is given in 6.5/2 of C language standard.
Note, that the undefined behavior in this case has absolutely nothing to do with the fact that printf function is given only one format specifier and two actual arguments. To give more arguments to printf than there are format specifiers in the format string is perfectly legal in C. Again, the problem is rooted in the violation of expression evaluation requirements of C language.
Also note, that some participants of this discussion fail to grasp the concept of undefined behavior, and insist on mixing it with the concept of unspecified behavior. To better illustrate the difference let's consider the following simple example
int inc_x(int *x) { return ++*x; }
int x_plus_1(int x) { return x + 1; }
int x = 1;
printf("%d", inc_x(&x), x_plus_1(x));
The above code is "equivalent" to the original one, except that the operations that involve our x are wrapped into functions. What is going to happen in this latest example?
There's no undefined behavior in this code. But since the order of evaluation of printf arguments is unspecified, this code produces unspecified behavior, i.e. it is possible that printf will be called as printf("%d", 2, 2) or as printf("%d", 2, 3). In both cases the output will indeed be 2. However, the important difference of this variant is that all accesses to x are wrapped into sequence points present at the beginning and at the end of each function, so this variant does not produce undefined behavior.
This is exactly the reasoning some other posters are trying to force onto the original example. But it cannot be done. The original example produces undefined behavior, which is a completely different beast. They are apparently trying to insist that in practice undefined behavior is always equivalent to unspecified behavior. This is a totally bogus claim that only indicates the lack of expertise in those who make it. The original code produces undefined behavior, period.
To continue with the example, let's modify the previous code sample to
printf("%d %d", inc_x(&x), x_plus_1(x));
the output of the code will become generally unpredictable. It can print 2 2 or it can print 2 3. However, note that even though the behavior is unpredictable, it still is not undefined behavior. The behavior is unspecified, but not undefined. Unspecified behavior is restricted to two possibilities: either 2 2 or 2 3. Undefined behavior is not restricted to anything. It can format your hard drive instead of printing something. Feel the difference.

Most students said undefined behavior. Can anyone help me understand why it is so?
Because the order in which function parameters are calculated is not specified.

What output will it always produce ?
It will produce 2 in all environments I can think of. Strict interpretation of the C99 standard however renders the behaviour undefined because the accesses to x do not meet the requirements that exist between sequence points.
Most students said undefined behavior. Can anyone help me understand why it is so?
I will now address the second question which I understand as "Why do most of the students of my class say that the shown code constitutes undefined behaviour?" and I think no other poster has answered so far. One part of the students will have remembered examples of undefined value of expressions like
f(++i,i)
The code you give fits this pattern but the students erroneously think that the behaviour is defined anyway because printf ignores the last parameter. This nuance confuses many students. Another part of the students will be as well versed in the standard as David Thornley and say "undefined behaviour" for the correct reasons explained above.

The points made about undefined behavior are correct, but there is one additional wrinkle: printf may fail. It's doing file IO; there are any number of reasons it could fail, and it's impossible to eliminate them without knowing the complete program and the context in which it will be executed.

Echoing codaddict the answer is 2.
printf will be called with argument 2 and it will print it.
If this code is put in a context like:
void do_something()
{
int x=1;
printf("%d",++x,x+1);
}
Then the behaviour of that function is completely and unambiguously defined. I'm not of course arguing that this is good or correct or that the value of x is determinable afterwards.

The output will always be (for 99.98% of the most important standard-compliant compilers and systems) 2.
According to the standard, this seems to be, by definition, "undefined behaviour", a definition/answer that is self-justifying and that says nothing about what actually can happen, and especially why.
The utility splint (which is not a std compliance checking tool), and so splint's programmers, consider this as "unspecified behaviour". This means, basically, that the evaluation of (x+1) can give 1+1 or 2+1, depending on when the update of x is actually done. Since however the expression is discarded (printf format reads 1 argument), the output is unaffected, and we can still say it is 2.
undefined.c:7:20: Argument 2 modifies x, used by argument 3 (order of
evaluation of actual parameters is undefined): printf("%d\n", ++x, x + 1)
Code has unspecified behavior. Order of evaluation of function parameters or
subexpressions is not defined, so if a value is used and modified in
different places not separated by a sequence point constraining evaluation
order, then the result of the expression is unspecified.
As said before, the unspecified behaviour affects just the evaluation of (x+1), not the whole statement or other expressions of it. So in the case of "unspecified behaviour" we can say that the output is 2, and nobody could object.
But this is not unspecified behaviour, it seems to be "undefined behaviour". And the "undefined behaviour" seems to have to be something that affects the whole statement instead of the single expression. This is due to the mystery around where the "undefined behaviour" actually occurs (i.e. what exactly it affects).
If there would be motivations to attach the "undefined behaviour" just to the (x+1) expression, as in the "unspecified behaviour" case, then we still could say that the output is always (100%) 2. Attaching the "undefined behaviour" just to (x+1) means that we are not able to say if it is 1+1 or 2+1; it is just "anything". But again, that "anything" is dropped because of the printf, and this means that the answer would be "always (100%) 2".
Instead, because of mysterious asymmetries, the "undefined behaviour" can't be attached just to the x+1, but indeed it must affect at least the ++x (which by the way is responsible for the undefined behaviour), if not the whole statement. If it infects just the ++x expression, the output is an "undefined value", i.e. any integer, e.g. -5847834 or 9032. If it infects the whole statement, then you could see garbage in your console output, and you might have to stop the program with ctrl-c, possibly before it starts to choke your cpu.
According to an urban legend, the "undefined behaviour" infects not only the whole program, but also your computer and the laws of physics, so that mysterious creatures can be created by your program and fly away or eat you.
No answers explain anything competently about the topic. They are just "oh see, the standard says this" (and it is just an interpretation, as usual!). So at least you have learned that "standards exist", and that they make educational questions arid (since of course, don't forget that your code is wrong, regardless of undefined/unspecified behaviourism and other standard facts), logical arguments useless, and deep investigation and understanding aimless.

Related

Why does this code print 1 2 2 and not the expected 3 3 1? [duplicate]

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
Notice: this is a self-Q/A, meant as a more visible target for the erroneous information promoted by the book "Let us C". Also, please let's keep C++ out of the discussion; this question is about C.
I am reading the book "Let us C" by Yashwant Kanetkar.
In the book there is the following example:
#include <stdio.h>
int main(void) {
int a = 1;
printf("%d %d %d", a, ++a, a++);
}
The author claims that this code should output 3 3 1:
Surprisingly, it outputs 3 3 1. This is
because C’s calling convention is from right to left. That is, firstly
1 is passed through the expression a++ and then a is incremented
to 2. Then result of ++a is passed. That is, a is incremented to 3
and then passed. Finally, latest value of a, i.e. 3, is passed. Thus in
right to left order 1, 3, 3 get passed. Once printf( ) collects them it
prints them in the order in which we have asked it to get them
printed (and not the order in which they were passed). Thus 3 3 1
gets printed.
However when I compile the code and run it with clang, the result is 1 2 2, not 3 3 1; why is that?
The author is wrong. Not only is the order of evaluation of function arguments unspecified in C, the evaluations are unsequenced with regards to each other. Adding to the injury, reading and modifying the same object without an intervening sequence point in independent expressions (here the value of a is evaluated in 3 independent expressions and modified in 2) has undefined behaviour, so the compiler has the liberty of producing any kind of code that it sees fit.
For details, see Why are these constructs using pre and post-increment undefined behavior?
C’s calling convention
This has nothing to do with calling convention! And C does not even specify a certain calling convention - "cdecl" etc are x86 PC inventions (and have nothing to do with this). The correct and formal C language term is order of evaluation.
The order of evaluation is unspecified behavior (formally defined term), meaning that we can't know if it is left to right or right to left. The compiler need not document it and need not have a consistent order from case to case basis.
But there is a more severe problem yet here: the so-called unsequenced side-effects. C17 6.5/2 states:
If a side effect on a scalar object is unsequenced relative to either a different side effect
on the same scalar object or a value computation using the value of the same scalar
object, the behavior is undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an unsequenced side
effect occurs in any of the orderings.
This text is quite hard to digest for normal humans. A rough, simplified translation from language-lawyer nerd language to plain English:
In case the binary operators (1) used in the expression don't explicitly state the order that the operands are executed (2) and,
a side-effect, such as changing the value, happens to a variable in the expression, and,
that same variable is used elsewhere in the same expression,
then the program is broken and might do anything.
(1) Operators with 2 operands.
(2) Most operators don't do this, only a few exceptions like || && , operators do so.
The author is wrong and the book has multiple instances of incorrect statements like this one.
In C, the behavior of printf("%d %d %d", a, ++a, a++); is undefined, both because the order of evaluation of function arguments is unspecified and because modifying the same object multiple times between two sequence points has undefined behavior, to name just these two reasons.
Note that the book is referenced as do not use in The Definitive C Book Guide and List for providing incorrect advice with this precise example.
Note also that other languages may have a different take on this kind of statement, notably Java, where the behavior is fully defined.
Many years ago at university I read (in C Step by Step by Mitchell Waite) that some C compilers use the stack for printf: they push the arguments from right to left (towards the format specifier) and then pop them one by one and print them.
I wrote this code and the output is 3 3 1: Online Demo.
Based on the book, in the stack we have something like this
But after some discussion with experts (in the comments) I found that this sequence may hold for some compilers but not for all.
#Lundin provided this code and the output is 1 2 2: Online Demo
and #Bob__ provided another example whose output is totally different: Online Demo
It totally depends on compiler implementation and has undefined behaviour.

Order of evaluation of array indices (versus the expression) in C

Looking at this code:
static int global_var = 0;
int update_three(int val)
{
global_var = val;
return 3;
}
int main()
{
int arr[5];
arr[global_var] = update_three(2);
}
Which array entry gets updated? 0 or 2?
Is there a part in the specification of C that indicates the precedence of operation in this particular case?
Order of Left and Right Operands
To perform the assignment in arr[global_var] = update_three(2), the C implementation must evaluate the operands and, as a side effect, update the stored value of the left operand. C 2018 6.5.16 (which is about assignments) paragraph 3 tells us there is no sequencing in the left and right operands:
The evaluations of the operands are unsequenced.
This means the C implementation is free to compute the lvalue arr[global_var] first (by “computing the lvalue,” we mean figuring out what this expression refers to), then to evaluate update_three(2), and finally to assign the value of the latter to the former; or to evaluate update_three(2) first, then compute the lvalue, then assign the former to the latter; or to evaluate the lvalue and update_three(2) in some intermixed fashion and then assign the right value to the left lvalue.
In all cases, the assignment of the value to the lvalue must come last, because 6.5.16 3 also says:
… The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands…
Sequencing Violation
Some might ponder about undefined behavior due to both using global_var and separately updating it in violation of 6.5 2, which says:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined…
It is quite familiar to many C practitioners that the behavior of expressions such as x + x++ is not defined by the C standard because they both use the value of x and separately modify it in the same expression without sequencing. However, in this case, we have a function call, which provides some sequencing. global_var is used in arr[global_var] and is updated in the function call update_three(2).
6.5.2.2 10 tells us there is a sequence point before the function is called:
There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call…
Inside the function, global_var = val; is a full expression, and so is the 3 in return 3;, per 6.8 4:
A full expression is an expression that is not part of another expression, nor part of a declarator or abstract declarator…
Then there is a sequence point between these two expressions, again per 6.8 4:
… There is a sequence point between the evaluation of a full expression and the evaluation of the next full expression to be evaluated.
Thus, the C implementation may evaluate arr[global_var] first and then do the function call, in which case there is a sequence point between them because there is one before the function call, or it may evaluate global_var = val; in the function call and then arr[global_var], in which case there is a sequence point between them because there is one after the full expression. So the behavior is unspecified—either of those two things may be evaluated first—but it is not undefined.
The result here is unspecified.
While the order of operations in an expression, which dictates how subexpressions are grouped, is well defined, the order of evaluation is not specified. In this case it means that either global_var could be read first or the call to update_three could happen first, but there’s no way to know which.
There is not undefined behavior here because a function call introduces a sequence point, as does every statement in the function including the one that modifies global_var.
To clarify, the C standard defines undefined behavior in section 3.4.3 as:
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements
and defines unspecified behavior in section 3.4.4 as:
unspecified behavior
use of an unspecified value, or other behavior where this
International Standard provides two or more possibilities and imposes
no further requirements on which is chosen in any instance
The standard leaves the order in which the operands of the assignment are evaluated unspecified, which in this case means that either arr[0] gets set to 3 or arr[2] gets set to 3.
I tried and I got the entry 0 updated.
However according to this question: will right hand side of an expression always evaluated first
The order of evaluation is unspecified and unsequenced.
So I think code like this should be avoided.
As it makes little sense to emit code for an assignment before you have a value to assign, most C compilers will first emit code that calls the function and save the result somewhere (register, stack, etc.), then they will emit code that writes this value to its final destination and therefore they will read the global variable after it has been changed. Let us call this the "natural order", not defined by any standard but by pure logic.
Yet in the process of optimization, compilers will try to eliminate the intermediate step of temporarily storing the value somewhere and try to write the function result as directly as possible to the final destination and in that case, they often will have to read the index first, e.g. to a register, to be able to directly move the function result to the array. This may cause the global variable to be read before it was changed.
So this is basically unspecified behavior with the very bad property that it's quite likely that the result will differ depending on whether optimization is performed and how aggressive that optimization is. It's your task as a developer to resolve that issue by either coding:
int idx = global_var;
arr[idx] = update_three(2);
or coding:
int temp = update_three(2);
arr[global_var] = temp;
As a good rule of thumb: Unless global variables are const (or they are not but you know that no code will ever change them as a side effect), you should never use them directly in code, as in a multi-threaded environment, even this can be undefined:
int result = global_var + (2 * global_var);
// Is not guaranteed to be equal to `3 * global_var`!
Since the compiler may read it twice and another thread can change the value in between the two reads. Yet, again, optimization would definitely cause the code to only read it once, so you may again have different results that now also depend on the timing of another thread. Thus you will have a lot less headache if you store global variables to a temporary stack variable before usage. Keep in mind if the compiler thinks this is safe, it will most likely optimize even that away and instead use the global variable directly, so in the end, it may make no difference in performance or memory use.
(Just in case someone asks why anyone would write x + 2 * x instead of 3 * x: on some CPUs addition is ultra-fast, and so is multiplication by a power of two, as the compiler will turn it into a bit shift (2 * x == x << 1), yet multiplication by arbitrary numbers can be very slow. Thus, instead of multiplying by 3, you get much faster code by bit shifting x by 1 and adding x to the result. Even that trick is performed by modern compilers if you multiply by 3 and turn on aggressive optimization, unless the target is a modern CPU where multiplication is as fast as addition, since then the trick would slow down the calculation.)
Global edit: sorry guys, I got all fired up and wrote a lot of nonsense. Just an old geezer ranting.
I wanted to believe C had been spared, but alas since C11 it has been brought on par with C++. Apparently, knowing what the compiler will do with side effects in expressions requires now to solve a little maths riddle involving a partial ordering of code sequences based on a "is located before the synchronization point of".
I happen to have designed and implemented a few critical real-time embedded systems back in the K&R days (including the controller of an electric car that could send people crashing into the nearest wall if the engine was not kept in check, a 10 tons industrial robot that could squash people to a pulp if not properly commanded, and a system layer that, though harmless, would have a few dozen processors suck their data bus dry with less than 1% system overhead).
I might be too senile or stupid to get the difference between undefined and unspecified, but I think I still have a pretty good idea of what concurrent execution and data access mean. In my arguably informed opinion, this obsession of the C++ and now C guys with their pet languages taking over synchronization issues is a costly pipe dream. Either you know what concurrent execution is, and you don't need any of these gizmos, or you don't, and you would do the world at large a favour not trying to mess with it.
All this truckload of eye-watering memory barrier abstractions is simply due to a temporary set of limitations of the multi-CPU cache systems, all of which can be safely encapsulated in common OS synchronization objects like, for instance, the mutexes and condition variables C++ offers.
The cost of this encapsulation is but a minute drop in performance compared with what a use of fine-grained specific CPU instructions could achieve in some cases.
The volatile keyword (or a #pragma dont-mess-with-that-variable for all I, as a system programmer, care) would have been quite enough to tell the compiler to stop reordering memory accesses.
Optimal code can easily be produced with direct asm directives to sprinkle low level driver and OS code with ad hoc CPU specific instructions. Without an intimate knowledge of how the underlying hardware (cache system or bus interface) works, you're bound to write useless, inefficient or faulty code anyway.
A minute adjustment of the volatile keyword and Bob would have been everybody but the most hardboiled low level programmers' uncle.
Instead of that, the usual gang of C++ maths freaks had a field day designing yet another incomprehensible abstraction, yielding to their typical tendency to design solutions looking for non existent problems and mistaking the definition of a programming language with the specs of a compiler.
Only this time the change required to deface a fundamental aspect of C too, since these "barriers" had to be generated even in low level C code to work properly. That, among other things, wrought havoc in the definition of expressions, with no explanation or justification whatsoever.
As a conclusion, the fact that a compiler could produce a consistent machine code from this absurd piece of C is only a distant consequence of the way C++ guys coped with potential inconsistencies of the cache systems of the late 2000s.
It made a terrible mess of one fundamental aspect of C (expression definition), so that the vast majority of C programmers - who don't give a damn about cache systems, and rightly so - is now forced to rely on gurus to explain the difference between a = b() + c() and a = b + c.
Trying to guess what will become of this unfortunate array is a net loss of time and efforts anyway. Regardless of what the compiler will make of it, this code is pathologically wrong. The only responsible thing to do with it is send it to the bin.
Conceptually, side effects can always be moved out of expressions, with the trivial effort of explicitly letting the modification occur before or after the evaluation, in a separate statement.
This kind of shitty code might have been justified in the 80's, when you could not expect a compiler to optimize anything. But now that compilers have long become more clever than most programmers, all that remains is a piece of shitty code.
I also fail to understand the importance of this undefined / unspecified debate. Either you can rely on the compiler to generate code with a consistent behaviour or you can't. Whether you call that undefined or unspecified seems like a moot point.
In my arguably informed opinion, C is already dangerous enough in its K&R state. A useful evolution would be to add common sense safety measures. For instance, making use of this advanced code analysis tool the specs force the compiler to implement to at least generate warnings about bonkers code, instead of silently generating a code potentially unreliable to the extreme.
But instead the guys decided, for instance, to define a fixed evaluation order in C++17. Now every software imbecile is actively incited to put side effects in his/her code on purpose, basking in the certainty that the new compilers will eagerly handle the obfuscation in a deterministic way.
K&R was one of the true marvels of the computing world. For twenty bucks you got a comprehensive specification of the language (I've seen single individuals write complete compilers just using this book), an excellent reference manual (the table of contents would usually point you within a couple of pages of the answer to your question), and a textbook that would teach you to use the language in a sensible way. Complete with rationales, examples and wise words of warning about the numerous ways you could abuse the language to do very, very stupid things.
Destroying that heritage for so little gain seems like a cruel waste to me. But again I might very well fail to see the point completely.
Maybe some kind soul could point me in the direction of an example of new C code that takes a significant advantage of these side effects?

Logic program: increment and arithmetic operator of C [duplicate]

This question already has answers here:
sequence points in c
Why are these constructs using pre and post-increment undefined behavior?
Can anyone help me solve this question
void main()
{
int num, a=5;
num = -a-- + + ++a;
printf("%d %d\n", num, a);
}
Answer: 0 5
But how? Can anyone explain the logic behind?
Once upon a time I watched two aggressive Boston drivers trying to park in the same parking space, such that they actually crashed their cars into each other. Something very much like that happens here.
There are other things wrong with this code, but let's just focus on what gets stored into a. If we imagine that the expression a-- + ++a gets evaluated from left to right, the first thing that happens is the a-- part. a started out as 5, so a-- has the value of 5, with a note to store 4 into a. Imagine that this note is given to the driver of the first car for delivery.
Next we get to ++a. Assume for the moment that the message to store 4 into a hasn't gotten through yet. So ++a has the value 6, with a note to store 6 into a. Imagine that this note is given to the driver of the second car for delivery.
One big question is, what is the final value of num, but I'm not worried about that for now. Let's just ask, what is the final value of a? We've got one car driver trying to set it to 4, and one trying to set to 6. But there's only one parking space labeled a. Maybe the first driver gets there first and sets it to 4, or maybe the second driver gets there and sets it to 6. Or maybe they get there at the exact same time, and crash into each other, and a gets set to a twisted piece of metal from the second car's front bumper. (Perhaps that twisted piece of metal looks a little like the number 5, so printing 5 was the best that printf could do.)
As I said, there are other things wrong with this code. For one thing, we shouldn't have imagined that it gets evaluated from left to right; there's no guarantee about that at all. We also don't know if the ++a part should operate on the original value of a, or the value affected by a--. In fact, we have no idea what this expression should do, and the perhaps surprising fact is, the compiler doesn't know, either! Different compilers will interpret an expression like this differently, giving wildly different answers. None of the answers are right, and none are wrong, because this expression is undefined.
The expression is undefined because there are two different attempts inside it to assign a new value to a. (You'll find more formal definitions of undefined behavior in the linked answers.) So a simple rule for avoiding undefined behavior is, "don't try to set a variable's value twice in one expression".
You might be thinking that this expression ought to have a well-defined interpretation. You might be thinking that it should obviously be evaluated from left to right. You might be thinking that a-- should store the new value into a immediately, such that it would be guaranteed to be the value seen by "later" parts of the expression. You might think those things, and they might be true about some other programming language, but not C. C does not guarantee that expressions are evaluated strictly left to right, and it does not guarantee that a-- or ++a update a's value immediately. Trying to update a's value twice in one expression leads to undefined behavior, and undefined behavior means that anything can happen.
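If what you actually wanted was the left-to-right reading, with each update taking effect immediately, you can say so by splitting the work into separate statements that each modify a only once. Here is a minimal sketch of that one reading; the helper name left_to_right_sum is made up for illustration, and it is only one of several interpretations a reader might have had in mind, not what any compiler is required to do with the original expression.

```c
/* Hypothetical helper: spells out ONE possible reading of "a-- + ++a"
 * (left to right, every update applied immediately) as separate
 * statements, so that each step is well defined. */
int left_to_right_sum(int *a)
{
    int left = *a;     /* the value of a-- is the old value of a */
    *a = *a - 1;       /* the decrement takes effect here */
    *a = *a + 1;       /* ++a increments first ... */
    int right = *a;    /* ... and its value is the updated a */
    return left + right;
}
```

With a starting at 5, this returns 10 and leaves a at 5. The point is not that 10 is "the" answer, only that the rewritten code has exactly one meaning.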

Undefined expression with a definite answer?

I'm trying to understand the answer to an interview question.
The following code is presented:
int x = 5;
int y = x++ * ++x;
What is the value of y?
The answers are presented as a list of multiple choice answers, one of them being 35.
Writing and running this code on my machine results in y being equal to 35.
Therefore I would expect to mark the answer to this question as 35.
However, isn't this expression undefined because of the x++? That is, the side effect (the actual incrementing of x) could happen at any time, depending on how the compiler chooses to compile the code.
Therefore, I would not have thought that you could say for certain that the answer is 35, as different compilers might produce different results.
Another possible multiple choice answer is 30, which I would have thought was also viable, i.e. if the post-increment side effect takes place just before the next sequence point.
I don't have the answer key, so it's hard to determine what the best answer to give would be.
Is this just a poor question or is the answer more obvious?
You are correct; the Standard does not impose any requirements at all on what this code might do.
If this was a multiple-choice question that required you to choose a specific defined answer, then whoever wrote the question doesn't understand C as well as you do. Keep that in mind when you decide whether to accept an offer there. :-)
Undefined behavior means UNDEFINED. Sometimes it happens to match the EXPECTED behavior, but that doesn't mean it's DEFINED.
Perhaps they want you to answer what the expected behavior is, which is none since it's undefined behavior.
You are correct that this is undefined. However...
Often these questions are designed to make you think and, even though undefined by standard, the questioner may be looking to see if you:
Know what happens "in the real world" most of the time. (I'm not saying this is good, but I am saying that you will find constructs this bad and worse in codebases that you may be asked to take over and maintain.)
Know about operator precedence and order of operations.
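To see why both 35 and 30 look plausible, it can help to write out a few of the orderings people tend to imagine as separate, well-defined statements. This is only a sketch: the function names are hypothetical, and none of these is what the standard requires of x++ * ++x, since the standard requires nothing at all.

```c
/* Reading 1: left to right, each increment applied immediately. */
int reading_immediate(int *x)
{
    int left = *x;      /* x++ yields the old value */
    *x += 1;            /* post-increment applied right away */
    *x += 1;            /* ++x */
    int right = *x;
    return left * right;
}

/* Reading 2: the post-increment is deferred until after the multiply. */
int reading_deferred(int *x)
{
    int left = *x;      /* x++ yields the old value */
    *x += 1;            /* ++x */
    int right = *x;
    int y = left * right;
    *x += 1;            /* deferred side effect of x++ */
    return y;
}

/* Reading 3: the right operand is evaluated first. */
int reading_right_first(int *x)
{
    *x += 1;            /* ++x */
    int right = *x;
    int left = *x;      /* x++ now yields the already incremented value */
    *x += 1;
    return left * right;
}
```

Starting from x = 5, these give 35, 30 and 36 respectively, and each leaves x at 7. Any of them is a defensible guess at what was meant, which is exactly why the original one-liner can't be given a single answer.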

Clarification c./ change in wording of C99 standard

I realize that merely asking about undefined behavior leads to downvotes by some, but I have a question comparing C99 v. Sep 2007 (the only one I have access to, and which so matters to me), and the one from 2011. The relevant quotes are from 6.5 (2) in either version:
2007: "Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored." (highlight added to the second sentence)
2011: "If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the same scalar object, the behavior is undefined. (...)"
An example given to illustrate what contradicts this in the 2007 version is:
i = ++i + 1;
As C considers an assignment always an expression (no assignment statements), this expression is semantically delineated by 2 sequence points. It is fairly obvious that both versions declare the above to result in undefined behavior.
However, given the highlighted sentence of the 2007 version, it would be my understanding that even the following expression (lying again between two sequence points) would result in undefined behavior:
++i; // or i++; or a = ++i;
After all, the "value to be stored" is clearly not only read ('stored' is a bit ambiguous, but I would naturally take it to mean the value that is read): it is read, incremented, then stored back. The operation is sequenced, though, and so it is fine (as it probably should be) by the 2011 wording.
Was this adjustment to the wording made to address the above, in order to match intent to description?
Note: I realize that to an extent this is opinion-based, but (1) the best-case would be that someone actually involved in writing the standard sees this, and (2) while I believe my interpretation to be reasonable/"true", if someone argues convincingly against it, this would be useful too.
However, given the highlighted sentence of the 2007 version, it would be my understanding that even the following expression (lying again between two sequence points) would result in undefined behavior:
++i; // or i++; or a = ++i;
Your understanding is wrong. This is best explained in the C FAQ, question 3.8:
....And that's what the second sentence says: if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written. This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification. For example, the old standby i = i + 1 is allowed, because the access of i is used to determine i's final value. The example
a[i] = i++
is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored.
In the case of i++; or ++i;, the only access of i is the one used to compute the value that ends up being stored in i, so the second sentence is satisfied and the behavior is defined.
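For the disallowed a[i] = i++ example, the usual fix is to decide which meaning you want and write it as two statements. A rough sketch of both choices follows; the function names are made up for illustration, and the array is assumed to be large enough for the index used.

```c
/* Meaning A: store the old value of i at the OLD index, then advance.
 * Returns the updated index. */
int store_then_advance(int a[], int i)
{
    a[i] = i;       /* i is only read here */
    i = i + 1;      /* allowed: i is read only to compute its new value */
    return i;
}

/* Meaning B: advance first, then store the old value at the NEW index.
 * Returns the updated index. */
int advance_then_store(int a[], int i)
{
    int old = i;
    i = i + 1;
    a[i] = old;
    return i;
}
```

Each statement here modifies i at most once, and every read of i in a statement that modifies it feeds the value being stored, so both versions satisfy the C99 wording.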

Resources