+= Operator Chaining (with a dash of UB)

I understand there is no sequence point here before the semicolon, but is there a plausible explanation for the dereferenced pointer to use the old value 2 in the expression?
Or can it be simply put down as undefined behaviour?
int i = 2;
int *x = &i;
*x += *x += i += 7;
Result:
i = 13

It is "simply" undefined behavior.
That said, the compiler probably emits code that reads the value of i once, performs all the arithmetic, and then stores the new value of i.
The obvious way to find out the real explanation would be to go look at the assembly generated by the compiler.

Formally the behaviour is undefined, but the observed result comes down to the way this particular compiler breaks the expression down and pushes the intermediate results onto the stack. The two *xs are evaluated first (both equal 2) and pushed onto the stack. Then i has 7 added to it and equals 9. Next the second *x, which still equals 2, is popped off the stack and added to the 9 to make 11. Finally the first *x is popped off the stack and added to the 11 to make 13.
Look up Reverse Polish Notation for hints on what is going on here.
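To see how that ordering produces 13, here is a sketch of one plausible sequencing written as well-defined C; the temporaries outer and inner are illustrative names, and nothing guarantees a compiler actually does this:

#include <stdio.h>

int main(void)
{
    int i = 2;
    int *x = &i;

    /* One plausible (NOT guaranteed) ordering: both *x reads
       happen before any store. */
    int outer = *x;   /* 2: left operand of the outer += */
    int inner = *x;   /* 2: left operand of the inner += */
    i += 7;           /* i = 9                           */
    i = inner + i;    /* inner += result: i = 11         */
    i = outer + i;    /* outer += result: i = 13         */

    printf("i = %d\n", i);   /* prints i = 13 */
    return 0;
}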

Related

Using + to check if multiple pointers are all NULL

Syntactically it makes sense (although it looks like some other language, which I don't particularly enjoy). It can save a lot of typing and code space, but how bad is it?
if (p1 + (unsigned)p2 + (unsigned)p3 == NULL)
{
    // all pointers are NULL, exit
}
Using pointer arithmetic with a pointer rvalue, I don't see how it could give a false result (i.e., the whole expression evaluating to NULL even though not all of the pointers are NULL), but I don't know exactly how much evil this potentially hides. So is it bad to use this uncommon way of checking whether several pointers are all NULL?
Regarding the original version of the question, which omitted the casts:
it can save a lot of typing and code space, but how bad is it?
Very, very bad. Its behavior is altogether undefined, and if your compiler fails to reject it then you should get yourself a better one. Subtraction of one pointer from another is defined under some circumstances (and yields an integer result), but it is never meaningful to add two pointers.
Inasmuch as it shouldn't even compile, every keystroke used to type it instead of something that works is wasted, so no, it doesn't save typing or code space.
I don't see how it could give a false result.
If the compiler actually accepts it, the result can be anything at all. It is undefined.
so is it bad to do this, not-common way of checking if plenty of pointers are all NULL?
Yes.
Regarding the modified question in which all but one of the pointers are cast to integer:
The casts do not rescue the code -- multiple problems remain.
If the remaining pointer does not point to a valid object, or if the sum of the integers is negative or greater than the number of elements in the array to which the pointer points then the result of the pointer addition is still undefined (where a pointer to a scalar is treated as a pointer to a one-element array). Of course, the integer sum can't be negative in this particular case, but that's of minimal advantage.
C does not guarantee that casting a null pointer to an integer yields the value 0. It is common for it to do so, but the language does not require it.
C does not guarantee that non-null pointers convert to nonzero integers, and with your particular code that's a genuine risk. The type unsigned is not necessarily large enough to afford a distinct value to every distinct pointer.
Even if all of the foregoing were not a problem for some particular implementation -- that is, if you could safely perform arithmetic on a NULL pointer, and NULL pointers reliably converted to integers as zero, and non-NULL pointers reliably converted to nonzero -- the test could still go wrong because two nonzero unsigned integers can sum to zero. That happens where the arithmetic sum of the two is equal to UINT_MAX + 1.
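For instance, assuming a 32-bit unsigned (a hedged illustration, since the width is implementation-defined):

#include <stdio.h>

int main(void)
{
    /* Two nonzero values whose mathematical sum is UINT_MAX + 1 = 2^32,
       so the unsigned addition wraps around to 0. */
    unsigned a = 0x80000000u;
    unsigned b = 0x80000000u;
    printf("%u\n", a + b);   /* prints 0 */
    return 0;
}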
There are multiple reasons why this is not a reliable method.
First, when you add an integer to a pointer, the C standard does not say what happens if the result is outside of the array into which the pointer points. (For these purposes, pointing just one past the last element, the end of the array, counts as inside, not outside. Also, a pointer to a single object counts as a pointer into an array of one object.) Note that the C standard does not merely leave the result of the addition unspecified; it says nothing about the behavior of the entire program. So, once you execute an addition that goes outside of an array, you cannot predict (from the C standard) what your program will do at all.
One likely result is that the compiler will see pointer + integer + integer and reason (or, more technically, apply transformations as if this reasoning were used) that pointer + integer is valid only if pointer is not NULL, and then the result is never NULL, so the expression pointer + integer is never NULL. Similarly, pointer + integer + integer is never NULL. Therefore pointer + integer + integer == NULL is always false, and we can optimize the program by removing this code completely. Thus, the code to handle the case when all pointers are NULL will be silently removed from your program.
Second, even if the C standard did guarantee a result of the addition, this expression could, hypothetically, evaluate to NULL even if none of the pointers were NULL. For example, consider a 16-bit address space where the first pointer were represented with the address 0x7000, the second were 0x6000, and the third were 0x3000. (I will also suppose these are char * pointers, so one element is one byte.) If we add these, the mathematical result is 0x10000. In 16-bit arithmetic, that wraps, so the computed result is 0x0000. Thus, the expression could evaluate to zero, which is likely used for NULL.
Third, unsigned may be narrower than pointers (for example, it may be 32 bits while pointers are 64), so the cast may lose information—there may be non-zero bits in the bits that were lost during the conversion, so the test will fail to detect them.
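To make the first point above concrete, here is a hedged sketch of code an optimizer is entitled to gut; this illustrates the licensed reasoning and is not a claim about any particular compiler:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p = malloc(sizeof *p);
    /* p + 1 has defined behavior only when p points to an object
       (one-past-the-end is allowed), so the compiler may assume p is
       valid and non-null here, conclude p + 1 can never compare equal
       to NULL, and silently delete the branch below. */
    if (p + 1 == NULL)
        puts("unreachable, by the compiler's reasoning");
    free(p);
    return 0;
}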
There are situations where we want to optimize pointer tests, and there are legitimate but non-standard ways to do it. On some processors, branching can be expensive, so doing some arithmetic with one test and one branch may be faster than doing three tests and three branches. C provides an integer type intended for working with pointer representations: uintptr_t, declared in <stdint.h>. With that, we can write this code:
if (((uintptr_t) p1 | (uintptr_t) p2 | (uintptr_t) p3) == 0) …
What this does is convert each pointer to an unsigned integer of a width suitable for working with pointer representations. The C standard does not say what the result of this conversion is, but it is intended to be unsurprising, and C implementations for flat address spaces may document that the result is the memory address. They may also document that NULL is the zero address. Once we have these integers, we OR them together instead of adding them. The result of an OR has a bit set if either of the corresponding bits in its operands was set. Thus, if any one of the addresses is not zero, then the result will not be zero either. So this code, if executed in a suitable C implementation, will perform the test you desire.
(I have used such tests in special high-performance code to test whether all pointers were aligned as desired, rather than to test for NULL. In that case, I had direct access to the compiler developers and could ensure the compiler would behave as desired. This is not standard C code.)
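A self-contained version of that test; the helper name all_null is mine, and the technique remains valid only on implementations where null pointers convert to 0 through uintptr_t:

#include <stdint.h>
#include <stdio.h>

/* One branch instead of three. Implementation-specific, not portable. */
static int all_null(const void *p1, const void *p2, const void *p3)
{
    return ((uintptr_t) p1 | (uintptr_t) p2 | (uintptr_t) p3) == 0;
}

int main(void)
{
    int x = 0;
    printf("%d\n", all_null(NULL, NULL, NULL));  /* 1 on typical platforms */
    printf("%d\n", all_null(&x, NULL, NULL));    /* 0 */
    return 0;
}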
Using any sort of pointer arithmetic on non-array pointers is undefined behavior in C.

Calling Convention Confusion [duplicate]

Possible Duplicate:
Could anyone explain these undefined behaviors (i = i++ + ++i , i = i++, etc…)
I'm not able to understand the output of this program (using gcc).
#include <stdio.h>
int main(void)
{
    int a = 10;
    printf("%d %d %d\n", ++a, a++, a);
}
Output:
12 10 12
Also, please explain the order of evaluation of arguments of printf().
The compiler will evaluate printf's arguments in whatever order it happens to feel like at the time. It could be an optimization thing, but there's no guarantee: the order they are evaluated isn't specified by the standard, nor is it implementation defined. There's no way of knowing.
But what is specified by the standard is that modifying the same variable twice without an intervening sequence point is undefined behavior; ISO C++03, 5[expr]/4:
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined.
printf("%d %d %d\n",++a, a++,a); could do a number of things; work how you expected it, or work in ways you could never understand.
You shouldn't write code like this.
AFAIK there is no defined order of evaluation for the arguments of a function call, and the results might vary with each compiler. In this instance I'd guess the middle argument was evaluated first, followed by the first, and then the third.
As haggai_e hinted, the parameters are evaluated in this order: middle, left, right.
To fully understand why these particular numbers are showing up, you have to understand how the increment works.
a++ means "do something with a, and then increment it afterwards".
++a means "increment a first, then do something with the new value".
In your particular example, the a++ argument is evaluated first: it yields 10 (which is what gets printed) and only then increments a to 11. Next, ++a is evaluated: it increments a to 12 first and yields the new value 12. The last argument is read as it is (12) and printed without any change. All of this happens before printf itself runs.
Although the arguments are evaluated in an unspecified order, they are displayed in the order you listed them. That's why you get 12 10 12 and not 10 12 12.
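If you want those exact values deterministically, sequence the side effects yourself. A minimal well-defined rewrite (the variable old is my own name for the saved value, not anything from the question):

#include <stdio.h>

int main(void)
{
    int a = 10;
    int old = a;                      /* the value a++ would have yielded */
    a += 2;                           /* one ++a and one a++ combined     */
    printf("%d %d %d\n", a, old, a);  /* always prints 12 10 12           */
    return 0;
}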

C program array Position

Why does the following C program print the difference between the 4 in (a+4) and the 1 in (a+1)?
#include <stdio.h>
int main()
{
    int a[3][2] = {1, 2,
                   5, 7,
                   6, 8};
    printf("\n%d", (a + 4) - (a + 1));
    return 0;
}
Also, when I replace the subtraction operator with addition, (a+4)+(a+1), it gives:
error: invalid operands to binary + (have ‘int (*)[2]’ and ‘int (*)[2]’)
Note that a is an array, and when used by itself it decays to a pointer (i.e. a memory address). This means that (a+4) and (a+1) are also memory addresses. Subtracting memory addresses makes sense, because you are calculating the distance between the two addresses. Adding memory addresses, however, is nonsense.
I am unsure what you want to do here, so I am not able to suggest a solution to fix the problem. Feel free to edit your question with more details so that we can help you further.
I ran your code and got 3 as the difference, which makes sense: a + 4 - (a + 1) = 3.
On the error, I believe that C doesn't let you add two memory addresses as a safeguard. It's totally nonsensical to do so as I pointed out in a previous comment. Subtracting one memory address from another, however, gives you something useful in certain cases (the offset between two locations in memory).
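A small example of that offset arithmetic. Note that after decay a has type int (*)[2], so the subtraction counts rows of two ints; I use offsets that stay in bounds, since a + 4 is strictly past the one-past-the-end pointer a + 3 that the standard permits:

#include <stdio.h>

int main(void)
{
    int a[3][2] = { {1, 2}, {5, 7}, {6, 8} };

    /* Well-defined: both pointers stay within [a, a + 3]. */
    printf("%td\n", (a + 2) - (a + 1));   /* prints 1 (one row apart) */
    printf("%td\n", (a + 3) - a);         /* prints 3 (whole array)   */
    return 0;
}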
You have a type error: you are using a as if it were a single integer value, but the type of a is an integer array.
An element-wise subtraction would be, e.g., a[1][0] - a[0][1], and the value here equals 3.
(For your info, the array actually looks like a = {{1,2},{5,7},{6,8}}: it has 3 rows and 2 columns.)

a++ vs a = a + 1 which is useful in efficient memory programming and how?

This was my interview question at HP. I answered that a++ takes fewer instructions than a = a + 1.
I want to know which is useful for efficient programming, and how the two differ from each other.
Hoping for a quick and positive response.
In C, there would be no difference if the compiler is smart.
In C++, it depends on what type a is, and whether the ++ operator is overloaded. To complicate matters even more, the = operator can be overloaded too, and a = a + 1 might not be the same as a++. For even more complication, the + operator can also be overloaded, so an innocent looking piece of code such as a = a + 1 might have drastic implications.
So, without some context, you simply cannot know.
First of all, in C++ this will depend on type of a. Clearly a can be of class type and have those operators overloaded and without knowing the details it's impossible to decide which is more efficient.
That said, both in C and C++ whatever looks cleaner is preferable. First write clear code, then profile it and see if it's intolerably slow.
I think I would answer in an implementation-independent way. a++ is easier for me to read because it directly shows what it does, whereas with a = a + 1 I first have to scan the whole addition. I prefer to go with the choice that's more foolproof.
The former, a++, evaluates to the prior value, so you can use it to express things in sometimes surprisingly simple ways. For instance:
// copy, until '\0' is hit.
while (*dest++ = *source++);
Apart from these considerations, I don't think either of them is more efficient, assuming you are dealing with basic integer types.
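As a runnable version of that idiom (my_strcpy is just an illustrative name for the classic K&R-style copy):

#include <stdio.h>

/* Copy bytes until the '\0' terminator; the terminator itself is
   copied and, being zero, ends the loop. */
static void my_strcpy(char *dest, const char *source)
{
    while ((*dest++ = *source++))
        ;
}

int main(void)
{
    char buf[16];
    my_strcpy(buf, "hello");
    printf("%s\n", buf);   /* prints hello */
    return 0;
}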
I am not an expert in microprocessor design, but I guess many processors have an INC or DEC instruction. If the data type is int, the increment can then be done in one instruction, whereas a = a + 1 requires more: first an add, then an assignment. So a++ should be faster, obviously assuming that a is not a complex data type.
However, a smart compiler should do this kind of optimization.
With an optimizing compiler they are identical. The interview question is moot.
As far as I know, there's no difference between a++ and a = a + 1.
HOWEVER, there is a difference between ++a and a = a + 1
Let's take the first case, a = a + 1.
a = a + 1 will have to take the value of a, add one to it, and then store the result back to a.
++a may compile to a single assembly instruction.
You can notice the difference with these two examples:
Example 1
int a = 1;
int x = a++; //x will be 1
Example 2
int a = 1;
int x = ++a; //x will be 2
BE AWARE! Most compilers optimize this today. If you have a++ somewhere in your code it will MOST likely be optimized to a single assembly instruction.
Even more efficient in many cases is ++a. When a is an int or a pointer, though, it is not going to make any difference.
The rationale for why these increments can be more efficient than a = a + 1 is that the increment is one instruction, whereas the instructions involved in adding 1 to a and then assigning it back are something like:
get the address of a
push its contents onto the stack
push 1 to the stack
add them
get the address of a (possibly already stored)
write (pop) from the stack into this address
Really, it all boils down to what your compiler optimizes.
Let's take the optimal case where a is an int. Then normally your compiler will make a++ and a = a + 1 exactly the same.
Now, what can be pointed out is that a = a + 1 increments the value by the fixed amount 1, whereas a++ increments by 1 in units of the variable's type. So if it is an int, float, etc. you'll get 1 -> 2 and 3.4 -> 4.4 in both cases.
And if a is a pointer into an array or list, both forms advance it to the next element: pointer arithmetic is scaled by the element size, so a++ and a = a + 1 behave identically there too.
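A quick demonstration that the two forms scale identically for pointers:

#include <stdio.h>

int main(void)
{
    int arr[3] = {10, 20, 30};
    int *p = arr;

    p++;                  /* advances by one int, not one byte */
    printf("%d\n", *p);   /* 20 */
    p = p + 1;            /* same scaling: also one element    */
    printf("%d\n", *p);   /* 30 */
    return 0;
}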
Long story short, I'd say a++ is better:
your code is clearer and shorter
you can manipulate a wider range of variable types
and it should be more efficient, since (I think, but I'm not sure) ++ is a basic operator on the same level as << etc.: it modifies the variable directly, while a = a + 1, if not optimized by your compiler, may require more operations to add 1 to a.
++a;
a+=1;
a=a+1;
Which notation should we use? Why?
We prefer the first version, ++a, because it more directly expresses the idea of incrementing. It says what we want to do (increment a) rather than how to do it (add 1 to a and then write the result back to a).
In general, a way of saying something in a program is better than another if it more directly expresses an idea.
The result is more concise and easier for a reader to understand. If we wrote a=a+1, a reader could easily wonder whether we really meant to increment by 1.
Maybe we just mistyped a=b+1, a=a+2, or even a=a-1.
With ++a there are far fewer opportunities for such doubts.
Note: this is a logical argument about readability and correctness, not an argument about efficiency. Contrary to popular belief, modern compilers tend to generate exactly the same code for a=a+1 as for ++a when a is one of the built-in types.
From http://www.parashift.com/c++-faq-lite/operator-overloading.html#faq-13.15:
++i is sometimes faster than, and is never slower than, i++.
A final caveat about floating point: there a++ holds no special advantage either. The standard defines a++ on a float to behave exactly like a = a + 1, including any rounding, so the two forms are equivalent there as well.
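A demonstration that the two forms round identically, assuming IEEE 754 single precision:

#include <stdio.h>

int main(void)
{
    /* At 2^25 the spacing between adjacent floats is 4, so adding 1
       rounds away in BOTH forms; neither is more exact. */
    float a = 33554432.0f;   /* 2^25 */
    float b = 33554432.0f;
    a++;
    b = b + 1;
    printf("%.1f %.1f %d\n", a, b, a == b);   /* same value, prints 1 */
    return 0;
}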

Question about C programming

int a, b;
a = 1;
a = a + a++;
a = 1;
b = a + a++;
printf("%d %d, a, b);
output : 3,2
What's the difference between line 3 and 5?
What you are doing is undefined.
You can't change the value of a variable you are about to assign to.
You also can't change the value of a variable with a side effect and then use that same variable elsewhere in the same expression (unless there is an intervening sequence point, but in this case there isn't). Moreover, the order of evaluation of the two operands of + is unspecified.
So if there is a difference between the two lines, it is that the first is undefined for two reasons, and line 5 is only undefined for one reason. But the point is both line 3 and line 5 are undefined and doing either is wrong.
What you're doing on line 3 is undefined. C has the concept of "sequence points" (the end of a full expression, at the semicolon, is one). If you modify an object more than once between sequence points, the behavior is undefined, as on line 3. As section 6.5 of C99 says:
(2) Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.
Line 5 is also undefined because of the second sentence: the prior value of a is read not only to determine the value to be stored by a++, but also as the other operand of the addition.
a++ is a post-fix operator, it gets the value of a then increments it.
So, for lines 2,3:
a = 1
a = 1 + 1, a is incremented.
a becomes 3 (note: the order in which these operations are performed may vary between compilers, and a could just as easily become 2)
for lines 4,5:
a = 1
b = 1 + 1, a is incremented.
b becomes 2, a becomes 2. (Due to undefined behaviour, b could also become 3 if a++ is processed before a.)
Note that, other than for understanding how postfix operators work, I really wouldn't recommend using this trick. It's undefined behavior and may give different results when compiled with different compilers.
As such, it is not only a needlessly confusing way to do things, but an unreliable, worst-practice way of doing it.
EDIT: As others have pointed out, this is actually undefined behavior.
Line 3 is undefined, line 5 is not.
EDIT:
As Prasoon correctly points out, both are UB.
The simple expression a + a++ is undefined because of the following:
The operator + is not a sequence point, so the side effects of each operand may happen in either order.
a is initially 1.
One of two possible [sensible] scenarios may occur:
1. The first operand, a, is evaluated first:
a) Its value, 1 will be stored in a register, R. No side effects occur.
b) The second operand a++ is evaluated. It evaluates to 1 also, and is added to the same register R. As a side effect, the stored value of a is set to 2.
c) The result of the addition, currently in R, is written back to a. The final value of a is 2.
2. The second operand, a++, is evaluated first:
a) It is evaluated to 1 and stored in register R. The stored value of a is incremented to 2.
b) The first operand a is read. It now contains the value 2, not 1! It is added to R.
c) R contains 3, and this result is written back to a. The result of the addition is now 3, not 2 as in our first case!
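Here is a well-defined transcription of both scenarios, with every read and write sequenced by its own statement (left and right are illustrative temporaries standing in for register R):

#include <stdio.h>

int main(void)
{
    /* Scenario 1: the left operand is read first. */
    int a = 1;
    int left  = a;      /* first operand: 1     */
    int right = a;      /* a++ yields 1 ...     */
    a = a + 1;          /* ... and bumps a to 2 */
    a = left + right;   /* final a = 2          */
    printf("%d\n", a);

    /* Scenario 2: a++ is evaluated first. */
    a = 1;
    right = a;          /* a++ yields 1           */
    a = a + 1;          /* a is now 2             */
    left = a;           /* first operand reads 2! */
    a = left + right;   /* final a = 3            */
    printf("%d\n", a);
    return 0;
}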
In short, you mustn't rely on such code to work at all.

Resources