C order of evaluation of assignment statement - c

I've been encountered on a case where cross-platform code was behaving differently on a basic assignment statement.
One compiler evaluated the Lvalue first, Rvalue second and then the assignment.
Another compiler did the Rvalue first, Lvalue second and then the assignment.
This may have impact in case Lvalue influence the value of Rvalue as shown in the following case:
struct MM {
int m;
}
int helper (struct MM** ppmm ) {
(*ppmm) = (struct MM *) malloc (sizeof (struct MM));
(*ppmm)->m = 1000;
return 100;
}
int main() {
struct MM mm = {500};
struct MM* pmm = &mm
pmm->m = helper(&pmm);
printf(" %d %d " , mm.m , pmm->m);
}
The example above, the line pmm->m = helper(&mm);, depend on the order of evaluation. if Lvalue evaluated first, than pmm->m is equivalent to mm.m, and if Rvalue calculated first than pmm->m is equivalent to the MM instance that allocated on heap.
My question is whether there's a C standard to determine the order of evaluation (didn't find any), or each compiler can choose what to do.
are there any other similar pitfalls I should be aware of ?

The semantics for evaluation of an = expression include that
The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.
(C2011, 6.5.16/3; emphasis added)
The emphasized provision explicitly permits your observed difference in the behavior of the program when compiled by different compilers. Moreover, unsequenced means, among other things, that it is permissible for the evaluations to occur in different order even in different runs of the very same build of the program. If the function in which the unsequenced evaluations appear were called more than once, then it would be permissible for the evaluations to occur in different order during different calls within the same execution of the program.
That already answers the question, but it's important to see the bigger picture. Modifying an object or calling a function that does so is a side effect (C2011, 5.1.2.3/2). This key provision therefore comes into play:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
(C2011, 6.5/2)
The called function has the side effect of modifying the value stored in main()'s variable pmm, evaluation of the left-hand operand of the assignment involves a value computation using the value of pmm, and these are unsequenced, therefore the behavior is undefined.
Undefined behavior is to be avoided at all costs. Because your program's behavior is undefined, is not limited to the two alternatives you observed (in case that wasn't bad enough). The C standard places no limitations whatever on what it may do. It might instead crash, zero out your hard drive's partition table, or, if you have suitable hardware, summon nasal demons. Or anything else. Most of these are unlikely, but the best viewpoint is that if your program has undefined behavior then your program is wrong.

When using the simple assignment operator: =, the order of evaluation of operands is unspecified. There is also no sequence point in between the evaluations.
For example if you have two functions:
*Get() = logf(2.0f);
It is not specified in which order they are called at any time, and yet this behavior is completely defined.
A function call will introduce a sequence point. It will happen after the evaluation of the arguments and before the actual call. The operator ; will also introduce a sequence point. This is important because an object must not be modified twice without an intervening sequence point, otherwise the behavior is undefined.
Your example is particularly complicated due to unspecified behavior, and may have different results, depending the left or right operand is evaluated first.
The left operand is evaluated first.
The left operand is evaluated and the pointer pmm will point to the struct mm. Then the function is called, and a sequence point occurs. it modifies the pointer pmm by pointing it to allocated memory, followed by a sequence point because of the operator ;. Then it stores the value 1000 to the member m, followed by another sequence point because of ;. The function returns 100 and assigns it to the left operand, but since the left operand was evaluated first, the value 100, it is assigned to the object mm, more specifically its member m.
mm->m has the value 100 and ppm->m has the value 1000. This is defined behavior, no object is modified twice in-between sequence points.
The right operand is evaluated first.
The function is called first, the sequence point occurs, it modifies the pointer ppm by pointing it to new allocated struct, followed by a sequence point. Then it stores the value 1000 to the member m, followed by a sequence point. Then the function returns. Then the left operand is evaluated, ppm->m will point to the new allocated struct, and its member m, is modified by assigning it the value 100.
mm->m will have the value 500 since it was never modified, and pmm->m will have the value 100. No object was modified twice in-between sequence points. The behavior is defined.

Related

Why is my program evaluating arguments right-to-left?

I am learning C so I tried the below code and am getting an output of 7,6 instead of 6,7. Why?
#include <stdio.h>
int f1(int);
void main()
{
int b = 5;
printf("%d,%d", f1(b), f1(b));
}
int f1(int b)
{
static int n = 5;
n++;
return n;
}
The order of the evaluation of the function arguments is unspecified in C. (Note there's no undefined behaviour here; the arguments are not allowed to be evaluated concurrently for example.)
Typically the evaluation of the arguments is either from right to left, or from left to right.
As a rule of thumb don't call the same function twice in a function parameter list if that function has side-effects (as it does in your case), or if you pass the same parameter twice which allows something in the calling site to be modified (e.g. passing a pointer).
https://en.cppreference.com/w/c/language/eval_order
Before C11, you must follow Rule (2)
There is a sequence point after evaluation of the first (left) operand and
before evaluation of the second (right) operand of the following binary
operators: && (logical AND), || (logical OR), and , (comma).
Because arguments are considered separated by comma operator before C11. This is not optimal because arguments are pushed right to left on some platform. Thus, C11 adds Rule (12) making it unspecified.
A function call that is not sequenced before or sequenced after another
function call is indeterminately sequenced (CPU instructions that
constitute different function calls cannot be interleaved, even if the
functions are inlined)
Even C99 designated initializers still go back to Rule (2), where earlier (left) initializers are resolved before later (right) initializers relative to the comma operator. That is, until C11 adds Rule (13) making it unspecified.
In initialization list expressions, all evaluations are indeterminately
sequenced
In other words, before Rule (12) and Rule (13), the comma operator from Rule (2) is the specified behavior. Rule (2) leads to inefficient code that cannot be optimized on some platform. There is not enough registers if the number of structure member or function parameter exceed some threshold. That is, "Register Pressure" becomes an issue.
Historically, aggregate type initializers and function arguments falls back to the comma operator. In C11, they specifically add the definition that commas in those aggregate type initializers and function arguments are not "comma operators" so that Rule (12) and Rule (13) makes sense, and that Rule (2) is not applied.

Using pointed to content in assignment of a pointer

It has always been my understanding that the lack of a sequence point after the reading of the right expression in an assignment makes an example like the following produce undefined behavior:
void f(void)
{
int *p;
/*...*/
p = (int [2]){*p};
/*...*/
}
// p is assigned the address of the first element of an array of two ints, the
// first having the value previously pointed to by p and the second, zero. The
// expressions in this compound literal need not be constant. The unnamed object
// has automatic storage duration.
However, this is EXAMPLE 2 under "6.5.2.5 Compound literals" in the committee draft for the C11 standard, the version identified as n1570, which I understand to be the final draft (I don't have access to the final version).
So, my question: Is there something in the standard that gives this defined and specified behavior?
EDIT
I would like to expound on exactly what I see as the problem, in response to some of the discussion that has come up.
We have two conditions under which an assignment is explicitly stated to have
undefined behavior, as per 6.5p2 of the standard quoted in the answer given by dbush:
1) A side effect on a scalar object is unsequenced relative to a different side
effect on the same scalar object.
2) A side effect on a scalar object is unsequenced relative to a value
computation using the value of the same scalar object.
An example of item 1 is "i = ++i + 1". In this case the side effect of
writing the value i+1 into i due to ++i is unsequenced relative to the side effect of assigning the RHS to the LHS. There is a sequence point between the value calculations of each side and the assignment of RHS to LHS, as described in 6.5.16.1 given in the answer by Jens Gustedt below. However, the modification of i due to ++i is not subject to that sequence point, otherwise the behavior would
be defined.
In the example I give above, we have a similar situation. There is a value computation, which involves the creation of an array and the conversion of that array to a pointer to its first element. There is also a side effect of writing a value to part of that array, *p to the first element.
So, I don't see what gaurantees we have in the standard that the modification
of the otherwise uninitialized first element of the array will be sequenced
before the writing of the array address to p. What about this modification (writing *p to the first element) is different from the modification of writing
i+1 to i?
To put it another way, suppose an implementation looked at the statement of interest in the example as three tasks: 1st, allocate space for the compound literal object; 2nd: assign a pointer to said space to p; 3rd: write *p to the first element in the newly allocated space. The value computation for both RHS and LHS would be sequenced before the assignment, as computing the value of the RHS only requires the address. In what way is this hypothetical implementation not standard compliant?
You need to look at the definition of the assignment operator in 6.5.16.1
The side effect of updating the stored value of the left operand is
sequenced after the value computations of the left and right operands.
The evaluations of the operands are unsequenced.
So here you clearly see that first it evaluates the expressions on both sides in any order or even concurrently, and then stores the value of the right into the object designated by the left.
Additionally, you should know that LHS and RHS of an assignment are evaluated differently. Citations are a bit too long, so here is a summary
For the LHS the evaluation leaves "lvalues", that is objects such as
p, untouched. In particular it doesn't look at the contents of the
object.
For the RHS there is "lvalue conversion", that is for any object that is found there (e.g *p) the contents of that object is loaded.
If the RHS contains an lvalue of array type, this array is converted to a pointer to its first element. This is what is happening to your compound literal.
Edit: You added another question
What about this modification (writing *p to the first element) is
different from the modification of writing i+1 to i?
The difference is simply that i in the LHS of the assignment and thus has to be updated. The array from the compound literal is not in the LHS and thus is of no concern for the update.
Section 6.5p2 of the C standard details why this is valid:
If a side effect on a scalar object is unsequenced relative to either
a different side effect on the same scalar object or a value
computation using the value of the same scalar object, the behavior is
undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an
unsequenced side effect occurs in any of the orderings. 84)
And footnote 84 states:
84) This paragraph renders undefined statement expressions such as
i = ++i + 1;
a[i++] = i;
while allowing
i = i + 1;
a[i] = i;
The posted snippet from 6.5.2.5 falls under the latter, as there is no side effect.
In (int [2]){*p}, *p provides an initial value for the compound literal. This is not an assignment, and it is not a side effect. The initial value is part of the object when the object is created. There is no moment when the array exists and it is not initialized.
In p = (int [2]){*p}, we know the side effect of updating p is sequenced after the computation of the right side because C 2011 [N1570] 6.5.16 3 says “The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands.”

Operator precedence in the given expressions

Expression 1: *p++; where p is a pointer to integer.
p will be incremented first and then the value to which it is pointing to is taken due to associativity(right to left). Is it right?
Expression 2: a=*p++; where p is a pointer to integer.
Value of p is taken first and then assigned to a first then p is incremented due to post increment. Is it right?
First of all, let me tell you that, neither associativity nor order of evaluation is actually relevant here. It is all about the operator precedence. Let's see the definitions first. (emphasis mine)
Precedence : In mathematics and computer programming, the order of operations (or operator precedence) is a collection of rules that reflect conventions about which procedures to perform first in order to evaluate a given mathematical expression.
Associativity: In programming languages, the associativity (or fixity) of an operator is a property that determines how operators of the same precedence are grouped in the absence of parentheses.
Order of evaluation : Order of evaluation of the operands of any C operator, including the order of evaluation of function arguments in a function-call expression, and the order of evaluation of the subexpressions within any expression is unspecified, except a few cases. There's mainly two types of evaluation: a) value computation b) side effect.
Post-increment has higher precedence, so it will be evaluated first.
Now, it so happens that the value increment is a side effect of the operation which is sequenced after the " value computation". So, the value computation result, will be the unchanged value of the operand p (which again, here, gets dereferenced due to use of * operator) and then, the increment takes place.
Quoting C11, chapter §6.5.2.4,
The result of the postfix ++ operator is the value of the operand. As a side effect, the
value of the operand object is incremented (that is, the value 1 of the appropriate type is
added to it). See the discussions of additive operators and compound assignment for
information on constraints, types, and conversions and the effects of operations on
pointers. The value computation of the result is sequenced before the side effect of
updating the stored value of the operand. [.....]
The order of evaluation in both the cases are same, the only difference is, in the first case, the final value is discarded.
If you use the first expression "as-is", your compiler should produce a warning about unused value.
Postfix operators have higher priorities than unary operators.
Thus this expression
*p++
is equivalent to the expression
*( p++ )
According to the C Standard (6.5.2.4 Postfix increment and decrement operators)
2 The result of the postfix ++ operator is the value of the
operand. As a side effect, the value of the operand object is
incremented (that is, the value 1 of the appropriate type is added to
it). See the discussions of additive operators and compound assignment
for information on constraints, types, and conversions and the effects
of operations on pointers. The value computation of the result is
sequenced before the side effect of updating the stored value of the
operand.
So p++ yields the original value of the pointer p as the result of the operation and has also a side effect of incrementing the operand itself.
As for the unary operator then (6.5.3.2 Address and indirection operators)
4 The unary * operator denotes indirection. If the operand points to a
function, the result is a function designator; if it points to an
object, the result is an lvalue designating the object. If the operand
has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an
invalid value has been assigned to the pointer, the behavior of the
unary * operator is undefined
So the final result of the expression
*( p++ )
is the value of the object pointed to by the pointer p that also is incremented due to the side effect. This value is assigned to the variable a in the statement
a=*p++;
For example if there are the following declarations
char s[] = "Hello";
char *p = s;
char a;
then after this statement
a = *p++;
the object a will have the character 'H' and the pointer p will point to the second character of the array s that is to the character 'e'.
Associativity is not relevant here. Associativity only matters when you have adjacent operators with the same precedence. But in this case, ++ has higher precedence than *, so only precedence matters. Because of precedence, the expression is equivalent to:
*(p++)
Since it uses post-increment, p++ increments the pointer, but the expression returns the value of the pointer before it was incremented. The indirection then uses that original pointer to fetch the value. It's effectively equivalent to:
int *temp = p;
p = p + 1;
*temp;
The second expression is the same, except it assigns the value to another variable, so that last statement becomes:
a = *temp;
The expression
*p++
is equivalent to
*(p++)
This is due to precedende (i.e.: the postfix increment operator has higher precedence than the indirection operator)
and the expression
a=*p++
is for the same reason equivalent to
a=*(p++)
In both cases, the expression p++ is evaluated to p.
v = i++;: i is returned to the equality operation and then assigned to v. Subsequently, i is incremented (EDIT: technically it's not necessarily executed in this order). Thus v has the old value of i. I remember it like this: ++ is written last and therefore happens last.
v = ++i;: i is incremented, and then returned to be assigned to v. v and i has the same value.
When you don't use the returned value, they do the same (although different implementations may yield different performance in some cases). E.g. in for loops, for(int i=0; i<n; i++) is the same as for(int i=0; i<n; ++i). The latter is sometimes automatically preferred because it tends to be faster for some objects.
* has lower precedence than ++ so *p++ is the same as *(p++). Thus in this case p is returned to * which dereferences it. Then the address in p is incremented by one element. *++p increments the adress of p first, then dereferences it.
v = (*p)++; sets v equal to the old value pointed to by p and then increments it, while v = ++(*p); increments the value pointed to by p and then sets v equal to it. The address in p is unchanged.
Example: If,
int a[] = {1,2};
then
int v = *a++;
and
int v = *++a;
will both leave a incremented, but in the first case v will be 1 and in the latter it'll be 2.
*p++; where p is a pointer to integer.
p will be incremented first and then the value to which it is pointing to is taken due to associativity (right to left). Is it right?
No. In a post-increment, the value is copied to a temporary (an rvalue), then the lvalue is incremented as a side effect.
a=*p++; where p is a pointer to integer.
Value of p is taken first and then assigned to a first then p is incremented due to post increment. Is it right?
No, that's not correct either. The increment of p might happen before the write to a. What's important is that the value being stored in a was loaded using the temporary copy of the prior value of p.
Whether that memory fetch occurs before the memory write with the new value of p isn't specified, and any code that relies on the order is undefined behavior.
Any of these sequences are allowed:
Copy p into temporary THEN increment p, THEN load value at address indicated in temporary THEN store loaded value to a
Copy p into temporary THEN load value at address indicated in temporary (this value itself is placed in a temporary) THEN increment p THEN store loaded value to a
Copy p into temporary THEN load value at address indicated in temporary THEN store loaded value to a THEN increment p
Here are two code examples that are undefined behavior because they rely on the order of side effects:
int a = 7;
int *p = &a;
a = (*p)++; // undefined behavior, do not do this!!
void *pv;
pv = &pv;
void *pv2;
pv2 = *(pv++); // undefined behavior, do not do this!!!
The parentheses do not create a sequence point (or sequenced before relationship, in the new wording). The version of the code with parentheses is just as undefined as the version without.

Dereferencing null pointer valid and working (not sizeof) [duplicate]

Code sample:
struct name
{
int a, b;
};
int main()
{
&(((struct name *)NULL)->b);
}
Does this cause undefined behaviour? We could debate whether it "dereferences null", however C11 doesn't define the term "dereference".
6.5.3.2/4 clearly says that using * on a null pointer causes undefined behaviour; however it doesn't say the same for -> and also it does not define a -> b as being (*a).b ; it has separate definitions for each operator.
The semantics of -> in 6.5.2.3/4 says:
A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.
However, NULL does not point to an object, so the second sentence seems underspecified.
Also relevant might be 6.5.3.2/1:
Constraints:
The operand of the unary & operator shall be either a function designator, the result of a
[] or unary * operator, or an lvalue that designates an object that is not a bit-field and is
not declared with the register storage-class specifier.
However I feel that the bolded text is defective and should read lvalue that potentially designates an object , as per 6.3.2.1/1 (definition of lvalue) -- C99 messed up the definition of lvalue, so C11 had to rewrite it and perhaps this section got missed.
6.3.2.1/1 does say:
An lvalue is an expression (with an object type other than void) that potentially
designates an object; if an lvalue does not designate an object when it is evaluated, the
behavior is undefined
however the & operator does evaluate its operand. (It doesn't access the stored value but that is different).
This long chain of reasoning seems to suggest that the code causes UB however it is fairly tenuous and it's not clear to me what the writers of the Standard intended. If in fact they intended anything, rather than leaving it up to us to debate :)
From a lawyer point of view, the expression &(((struct name *)NULL)->b); should lead to UB, since you could not find a path in which there would be no UB. IMHO the root cause is that at a moment you apply the -> operator on an expression that does not point to an object.
From a compiler point of view, assuming the compiler programmer was not overcomplicated, it is clear that the expression returns the same value as offsetof(name, b) would, and I'm pretty sure that provided it is compiled without error any existing compiler will give that result.
As written, we could not blame a compiler that would note that in the inner part you use operator -> on an expression than cannot point to an object (since it is null) and issue a warning or an error.
My conclusion is that until there is a special paragraph saying that provided it is only to take its address it is legal do dereference a null pointer, this expression is not legal C.
Yes, this use of -> has undefined behavior in the direct sense of the English term undefined.
The behavior is only defined if the first expression points to an object and not defined (=undefined) otherwise. In general you shouldn't search more in the term undefined, it means just that: the standard doesn't provide a meaning for your code. (Sometimes it points explicitly to such situations that it doesn't define, but this doesn't change the general meaning of the term.)
This is a slackness that is introduced to help compiler builders to deal with things. They may defined a behavior, even for the code that you are presenting. In particular, for a compiler implementation it is perfectly fine to use such code or similar for the offsetof macro. Making this code a constraint violation would block that path for compiler implementations.
Let's start with the indirection operator *:
6.5.3.2 p4:
The unary * operator denotes indirection. If the operand points to a function, the result is
a function designator; if it points to an object, the result is an lvalue designating the
object. If the operand has type "pointer to type", the result has type "type". If an
invalid value has been assigned to the pointer, the behavior of the unary * operator is
undefined. 102)
*E, where E is a null pointer, is undefined behavior.
There is a footnote that states:
102) Thus, &*E is equivalent to E (even if E is a null pointer), and &(E1[E2]) to ((E1)+(E2)). It is
always true that if E is a function designator or an lvalue that is a valid operand of the unary &
operator, *&E is a function designator or an lvalue equal to E. If *P is an lvalue and T is the name of
an object pointer type, *(T)P is an lvalue that has a type compatible with that to which T points.
Which means that &*E, where E is NULL, is defined, but the question is whether the same is true for &(*E).m, where E is a null pointer and its type is a struct that has a member m?
C Standard doesn't define that behavior.
If it were defined, new problems would arise, one of which is listed below. C Standard is correct to keep it undefined, and provides a macro offsetof that handles the problem internally.
6.3.2.3 Pointers
An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant. 66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
This means that an integer constant expression with the value 0 is converted to a null pointer constant.
But the value of a null pointer constant is not defined as 0. The value is implementation defined.
7.19 Common definitions
The macros are
NULL
which expands to an implementation-defined null pointer constant
This means C allows an implementation where the null pointer will have a value where all bits are set and using member access on that value will result in an overflow which is undefined behavior
Another problem is how do you evaluate &(*E).m? Do the brackets apply and is * evaluated first. Keeping it undefined solves this problem.
First, let's establish that we need a pointer to an object:
6.5.2.3 Structure and union members
4 A postfix expression followed by the -> operator and an identifier designates a member
of a structure or union object. The value is that of the named member of the object to
which the first expression points, and is an lvalue.96) If the first expression is a pointer to
a qualified type, the result has the so-qualified version of the type of the designated
member.
Unfortunately, no null pointer ever points to an object.
6.3.2.3 Pointers
3 An integer constant expression with the value 0, or such an expression cast to type
void *, is called a null pointer constant.66) If a null pointer constant is converted to a
pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
Result: Undefined Behavior.
As a side-note, some other things to chew over:
6.3.2.3 Pointers
4 Conversion of a null pointer to another pointer type yields a null pointer of that type.
Any two null pointers shall compare equal.
5 An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.67)
6 Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.
67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
So even if the UB should happen to be benign this time, it might still result in some totally unexpected number.
Nothing in the C standard would impose any requirements on what a system could do with the expression. It would, when the standard was written, have been perfectly reasonable for it to to cause the following sequence of events at runtime:
Code loads a null pointer into the addressing unit
Code asks the addressing unit to add the offset of field b.
The addressing unit trigger a trap when attempting to add an integer to a null pointer (which should for robustness be a run-time trap, even though many systems don't catch it)
The system starts executing essentially random code after being dispatched through a trap vector that was never set because code to set it would have wasted been a waste of memory, as addressing traps shouldn't occur.
The very essence of what Undefined Behavior meant at the time.
Note that most of the compilers that have appeared since the early days of C would regard the address of a member of an object located at a constant address as being a compile-time constant, but I don't think such behavior was mandated then, nor has anything been added to the standard which would mandate that compile-time address calculations involving null pointers be defined in cases where run-time calculations would not.
No. Let's take this apart:
&(((struct name *)NULL)->b);
is the same as:
struct name * ptr = NULL;
&(ptr->b);
The first line is obviously valid and well defined.
In the second line, we calculate the address of a field relative to the address 0x0 which is perfectly legal as well. The Amiga, for example, had the pointer to the kernel in the address 0x4. So you could use a method like this to call kernel functions.
In fact, the same approach is used on the C macro offsetof (wikipedia):
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
So the confusion here revolves around the fact that NULL pointers are scary. But from a compiler and standard point of view, the expression is legal in C (C++ is a different beast since you can overload the & operator).

Assignment and pointers, undefined behavior?

int func(int **a)
{
*a = NULL;
return 1234;
}
int main()
{
int x = 0, *ptr = &x;
*ptr = func(&ptr); // <-???
printf("%d\n", x); // print '1234'
printf("%p\n", ptr); // print 'nil'
return 0;
}
Is this an example of undefined behavior or has to do with sequence points?
why the line:
*ptr = func(&ptr);
doesn't behave like:
*NULL = 1234;
EDIT: I forgot to mention that I get the output '1234' and 'nil' with gcc 4.7.
Since there is no sequence point between evaluations of the left and right hand sides of the assignment operator, it is not specified whether *ptr or func(&ptr) is evaluated first. Thus it is not guaranteed that the evaluation of *ptr is allowed, and the program has undefined behaviour.
The language does not guarantee you that the right-hand side subexpression func(&ptr) in
*ptr = func(&ptr);
is evaluated first, and the left-hand side subexpression *ptr is evaluated later (which is apparently what you expected to happen). The left-hand side can legally be evaluated first, before call to func. And this is exactly what happened in your case: *ptr got evaluated before the call, when ptr was still pointing to x. After that the assignment destination became finalized (i.e. it became known that the code will assign to x). Once it happens, changing ptr no longer changes the assignment destination.
So, the immediate behavior of your code is unspecified due to unspecified order of evaluation. However, one possible evaluation schedule leads to undefined behavior by causing a null pointer dereference. This means that in general case the behavior is undefined.
If I had to model the behavior of this code in terms of C++ language, I'd say that the process of evaluation in this case can be split into these essential steps
1a. int &lhs = *ptr; // evaluate the left-hand side
1b. int rhs = func(&ptr); // evaluate the right-hand side
2. lhs = rhs; // perform the actual assignment
(Even though C language does not have references, internally it uses the same concept of "run-time bound lvalue" to store the result of evaluation of left-hand side of assignment.) The language specification allows enough freedom to make steps 1a and 1b to occur in any order. You expected 1b to occur first, while your compiler decided to start with 1a.
This is undefined behaviour, I believe. The standard does not stipulate when the LHS of the assignment is evaluated compared to the RHS. If *ptr is evaluated after the function is called, you will be dereferencing a null pointer; if it is evaluated before the function is called, then you get sane behaviour.
The code is thoroughly disreputable. Do not try using it, or anything similar, in real code.
Note that there is a sequence point immediately before a function is called, after its arguments have been evaluated; there is also a sequence point immediately before a function returns. Thus, there are sequence points related to the evaluation of the function arguments and its return value, but...and this is crucial in this context...it still does not tell you whether *ptr is evaluated before or after the function is called. Either is possible; both are correct; the code depends on which happens, which makes it rely on undefined behaviour.
The assignment operator is not a sequence point. So, there is no guarantee, as to which side will be evaluated first. So, it is unspecified behaviour.
In one of the cases (dereferencing a NULLPTR) it could exhibit undefined behavior.
Between consecutive "sequence points" an object's value can be
modified only once by an expression.
You can see a list of what are defined sequence points in C here.
While the call of a function is a sequence point, this is bound to the evaluation of parameters (before the call), not the functions side-effects (the call itself).

Resources