how is an expression involving several ^= operators evaluated? - c

#include<stdio.h>
int main(){
int arr[ 5 ] = { 1, 2, 3, 4, 5 };
int *f = arr;
int *l = (4+arr);
while(f<l){
*f^=*l^=*f^=*l;
++f; --l;
}
printf("\n%d\t%d\t%d\n", *arr, *f, *l);
return 0;
}
My output is 1 3 3 on paper but the compiler is showing 0 3 3.
Please anyone explain it to me.
Thanks in advance.

*f^=*l^=*f^=*l;
The evaluation of the operands of ^= is not sequenced, and you use the same variables several times in the same expression, with no sequence point in between.
This means that the behavior of the program is undefined. Nobody can know how that expression will be evaluated and anything can happen. The program may crash or the output can be anything.
You have to fix this bug by changing the code into this:
*f ^= *l;
*l ^= *f;
*f ^= *l;
Then each semi-colon will introduce a sequence point and there are no order of evaluation issues.
Standard references.
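A minimal sketch of the sequenced version, assuming the loop in the question was meant to reverse the array in place. The helper names xor_swap and reverse_ints are illustrative and do not appear in the question.

```c
#include <stddef.h>

/* Swap two ints with XOR, one assignment per statement so every
   read and write is sequenced. The a == b guard matters: XOR-ing
   an object with itself would zero it. */
static void xor_swap(int *a, int *b)
{
    if (a == b)
        return;
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

/* Reverse arr[0..n-1] in place by walking two pointers inward. */
static void reverse_ints(int *arr, size_t n)
{
    if (n < 2)
        return;
    int *f = arr;
    int *l = arr + n - 1;
    while (f < l) {
        xor_swap(f, l);
        ++f;
        --l;
    }
}
```

Calling reverse_ints on { 1, 2, 3, 4, 5 } leaves the middle element untouched because the two pointers meet there.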

I actually don't care one bit how this is evaluated. If you have code and ask "what exactly does this code do", then the correct answer is "don't write that kind of code". (Except if you are writing a compiler, in which case the answer is "you shouldn't be writing compilers if you ask on stack overflow how some code should be executed").
In addition, the result is undefined behaviour in C, C++ before C++ 11, and Objective-C, so that's a good reason not to do it where it is defined. In addition, it has zero chance to pass any code review, and there is a rule "always assume that the next maintenance programmer reading your code is a violent sociopath who knows your home address".

The output is correct:
*arr = ((1^5)^1)^5;
which indeed is 0. Note that *arr is modified during the first iteration and that's all you worry about. In the second and later iterations you're not modifying *arr anymore. At that point f has been updated and doesn't point to arr anymore
A few lines explaining the maths:
1^5 = 4
4^1 = 5
5^5 = 0
EDIT:
I assumed that the language guarantees that the order of evaluation is from left to right. As pointed out by Lundin this is not the case. However, judging by the output of the executable that's most likely the way that the compiler has handled it.
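To make the two readings concrete, here is a rough sketch that models them with explicit temporaries. Neither model is something the C standard promises for the chained expression, since its behaviour is undefined; the function names are invented for this illustration.

```c
/* Model 1: every operand value is read before any store happens,
   so the outermost ^= still sees the old value of *f. */
static int f_after_stale_reads(int f, int l)
{
    int inner = f ^ l;        /* value of the innermost *f ^= *l */
    int middle = l ^ inner;   /* value of *l ^= inner            */
    return f ^ middle;        /* outermost ^= uses the old f     */
}

/* Model 2: each assignment completes before the next one starts,
   as in the three-statement version. */
static int f_after_sequenced_steps(int f, int l)
{
    f ^= l;
    l ^= f;
    f ^= l;
    return f;
}
```

With f = 1 and l = 5 the first model produces 0, which matches the compiler output in the question, while the second produces 5, which is what a swap would give.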

Related

Where can I find weird, specific C syntax rules?

I will take an exam and my teacher asks weird C syntax rules. Like:
int q=5;
for(q=-2;q=-5;q+=3) { //assignment in condition part??
printf("%d",q); //prints -5
break;
}
Or
int d[][3][2]={4,5,6,7,8,9,10,11,12,13,14,15,16};
int i=-1;
int j;
j=d[i++][++i][++i];
printf("%d",j); //prints 4?? why j=d[0][0][0] ?
Or
extern int a;
int main() {
do {
do {
printf("%o",a); //prints 12
} while(!1);
} while(0);
return 0;
}
int a=10;
I could not find it rules any site or book. Really absurd and uncommon. Where can I find?
To me it seems that your teacher is asking questions which involve undefined behavior.
If you tell him that this is incorrect, you're directly confronting him.
However, you could do the following:
Compile the code on different platforms
Compile the code with different compilers
Compile the code with different versions of the same compiler
Build a matrix with the results. You'll find out that they differ
Show the results to your teacher and ask him to explain why that happens
That way you do not say that he's wrong, you're just showing some facts and you're showing that you're willing to learn and work.
Do that long before the exam so that the teacher can look into it, think about his questions, and change the exam in time.
I could not find it rules any site or book. Where can I find?
See Where do I find the current C or C++ standard documents?. If you have a good library at university, they should own a copy.
Concerning for(q=-2;q=-5;q+=3) {, all you need to do is to break this down into its components. q=-2 is run first, then q=-5 is tested, and since that is not 0 (it is an expression with value -5), the loop body runs once. Then break forces a premature exit from an otherwise infinite loop. The expression q+=3 is never reached.
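The breakdown above can be checked with a small sketch. The function name run_question_loop and the body_runs counter are made up for this example; the loop itself is the one from the question, with extra parentheses around the assignment.

```c
/* Runs the loop from the question and reports what happened.
   Returns the number of times the body executed; *final_q receives
   the value of q after the loop. */
static int run_question_loop(int *final_q)
{
    int q = 5;
    int body_runs = 0;
    /* The extra parentheses signal that the assignment in the
       condition is intentional. */
    for (q = -2; (q = -5); q += 3) {
        ++body_runs;
        break;              /* leaves before q += 3 can run */
    }
    *final_q = q;
    return body_runs;
}
```

The body runs once and q ends up as -5, because the condition assigns -5 and break skips the q += 3 step.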
The behaviour of d[i++][++i][++i] is undefined. Tell your teacher that, tactfully.
The "%o" format denotes octal output. a is set to 10 in decimal which is 12 in octal. Your code would be clearer if you had written:
int a=012; // octal constant.
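As a quick check of the octal claim, this sketch formats a value with "%o" into a buffer instead of printing it. The helper name format_octal is an assumption for illustration; it takes an unsigned int because that is the type "%o" expects.

```c
#include <stdio.h>

/* Formats value with "%o" into buf; returns the snprintf result,
   i.e. the number of characters the full output needs. */
static int format_octal(char *buf, size_t size, unsigned int value)
{
    return snprintf(buf, size, "%o", value);
}
```

Formatting decimal 10 this way yields the text 12, and the octal constant 012 compares equal to decimal 10.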
The online version of the C language standard has what you need (and is what I will be referring to in this answer); just bear in mind it is a language definition and not a tutorial, and as such may not be easy to read for someone who doesn't have a lot of experience yet.
Having said that, your teacher is throwing you a few foul balls. For example:
j=d[i++][++i][++i];
This statement results in undefined behavior for several reasons. The first several paragraphs of section 6.5 of the document linked above explain the problem, but in a nutshell:
Except in a few situations, C does not guarantee left-to-right evaluation of expressions; neither does it guarantee that side effects are applied immediately after evaluation;
Attempting to modify the value of an object more than once between sequence points [1], or modifying and then trying to use the value of an object without an intervening sequence point, results in undefined behavior.
Basically, don't write anything of the form:
x = x++;
x++ * x++;
a[i] = i++;
a[i++] = i;
C does not guarantee that each ++i and i++ is evaluated from left to right, and it does not guarantee that the side effect of each evaluation is applied immediately. So the result of d[i++][++i][++i] is not well-defined, and the result will not be consistent over different programs, or even different builds of the same program [2].
AND, on top of that, i++ evaluates to the current value of i; so clearly, your teacher's intent was for d[i++][++i][++i] to evaluate to d[-1][1][2], which would also result in undefined behavior since you're attempting to index outside of the array bounds.
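One way to see the intended reading without undefined behavior is to evaluate each increment in its own statement and only then check the indices. This is a sketch: struct indices, sequenced_indices and in_bounds are hypothetical names, and the bounds assume the int[3][3][2] shape that the 13 initializers in the question produce.

```c
#include <stdbool.h>

struct indices { int first, second, third; };

/* Evaluate i++, ++i, ++i as three separate statements so each
   side effect is complete before the next read of i. */
static struct indices sequenced_indices(int i)
{
    struct indices r;
    r.first = i++;
    r.second = ++i;
    r.third = ++i;
    return r;
}

/* True when r is a valid index triple for an int[rows][3][2] array. */
static bool in_bounds(struct indices r, int rows)
{
    return r.first >= 0 && r.first < rows
        && r.second >= 0 && r.second < 3
        && r.third >= 0 && r.third < 2;
}
```

Starting from i = -1 the triple comes out as -1, 1, 2, which fails the bounds check, so even the sequenced version must not be used to index d.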
This is why I hate, hate, hate it when teachers throw this kind of code at their students - not only is it needlessly confusing, not only does it encourage bad practice, but more often than not it's just plain wrong.
As for the other questions:
for(q=-2;q=-5;q+=3) { //assignment in condition part??
See sections 6.5.16 and 6.8.5.3. In short, an assignment expression has a value (the value of the left operand after any type conversions), and it can appear as part of a controlling expression in a for loop. As long as the result of the assignment is non-zero (as in the case above), the loop will execute.
printf("%o",a); //prints 12
See section 7.21.6.1. The o conversion specifier tells printf to format the integer value as octal: 10 (base 10) == 12 (base 8)
[1] A sequence point is a point in a program's execution where an expression has been fully evaluated and any side effects have been applied. Sequence points occur at the ends of statements, between the evaluation of a function's parameters and the function call, after evaluating the left operand of the &&, ||, and ?: operators, and a few other places. See Annex C for the complete list.
[2] Or even different runs of the same build, although in practice you won't see values change from run to run unless you're doing something really hinky.

Pointers in c language program output

I have a question :
char *c[] = {"GeksQuiz", "MCQ", "TEST", "QUIZ"};
char **cp[] = {c+3, c+2, c+1, c};
char ***cpp = cp;
int main()
{
printf("%s ", *--*++cpp+3);
}
I am not able to understand the output = sQUIZ ,
my approach: first it will point to cpp+3 i.e c now ++c means pointing to "MCQ" , * of that would give the value "MCQ" ,can't understand what the -- before * would do here . or is my approach totally wrong ?
I will post it as an answer as was mentioned in comments. You should first read this: http://en.wikipedia.org/wiki/Sequence_point also look here and you can search for dozens of articles across the Internet about sequence points. This stuff is as BAD as undefined behaviour and unspecified behaviour. You can read this post, especially the part What is the relation between Undefined Behaviour and Sequence Points? in the accepted answer.
Probably this interview question was meant to test your knowledge of sequence points, in which case it is not as bad as I see it, but nevertheless NEVER EVER write such code even for your pet projects, and I don't even want to mention production code. This is silly.
If they look for experienced C++/C developer they shouldn't ask such questions at all.
EDIT
Just for the tip about sequence points, because I saw some misunderstandings in other posted answer and in the comments. This is *--*++cpp+3 not an unspecified behaviour or undefined behaviour (I mean it is a bad code in general), but this IS:
int i =1;
*--*++cpp+i+i++;
The code above is unsequenced, so its behaviour is undefined. Please read about the differences between undefined behaviour, unspecified behaviour, implementation-defined behaviour and sequence points e.g. here. I wrote all this in order to explain to you why you should avoid such terrible code altogether (whether it is legal from the point of view of the language standard or not). Yes, your code is legal, but unreadable, and, as you see in my edits, small changes make it illegal. Do not think we don't want to help you; I mean that code similar to yours is bad code in general, wherever it is asked. It would be better if they asked you to explain WHY such code is bad and fragile - then it would be a good interview question.
P.S. The actual output is an empty string, because you print a null terminator. See the excellent answer below - it explains the output in terms of C operator precedence (you should also learn it, then such questions will not bother you at all).
All variables in this expression are modified only once. Maybe I don't understand something about sequence points, but I have no idea why people call this expression undefined behavior.
char *c[] = {"GeksQuiz", "MCQ", "TEST", "QUIZ"};
char **cp[] = {c+3, c+2, c+1, c};
char ***cpp = cp;
/*1*/ cpp; // == &cp[0]
/*2*/ ++cpp; // == &cp[1] (`cpp` changed)
/*3*/ *++cpp; // == cp[1] == c+2
/*4*/ --*++cpp; // == c+2-1 == &c[1] (`cp[1]` changed)
/*5*/ *--*++cpp; // == "MCQ"
/*6*/ *--*++cpp+3; // == "MCQ"+3 - it's a pointer to '\0'
So it should not print anything.
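The step-by-step reading above can be checked with a small sketch. The wrapper name evaluate_puzzle is invented here, and the arrays are local copies declared with const so that repeated calls start from the same state.

```c
/* Evaluates the expression from the question on fresh local copies
   of the arrays, so repeated calls give the same answer. */
static const char *evaluate_puzzle(void)
{
    const char *c[] = {"GeksQuiz", "MCQ", "TEST", "QUIZ"};
    const char **cp[] = {c + 3, c + 2, c + 1, c};
    const char ***cpp = cp;

    /* ++cpp  -> &cp[1]
       *      -> cp[1], which is c + 2
       --     -> cp[1] becomes c + 1
       *      -> c[1], i.e. "MCQ"
       + 3    -> points at the terminating '\0' */
    return *--*++cpp + 3;
}
```

The returned pointer refers to the terminator of "MCQ", so printing it with "%s" shows nothing.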

Feel kind of confused by the book "Programming in C" (Stephen Kochan)

I've been teaching myself in C programming with the book recommended by a friend who is great in C. The book title is "Programming in C" by Stephen Kochan.
I have a background in Java, and I feel a little bit crazy with the way the codes were written in Stephen's book. For example, the following code, in which I commented my confusion. Maybe I'm missing something important here, so I'm looking to hear some inputs about the correct way of coding in C.
#include <stdio.h>
void test(int *int_pointer)
{
*int_pointer = 100;
}
int main(void)
{
void test(int *int_pointer); // why call the test() function here without any real argument? what's the point?
int i = 50, *p = &i;
printf("Before the call to test i = %i\n", i);
test(p);
printf("After the call to test i = %i\n", i);
int t;
for (t = 0; t < 5; ++t) // I'm more used to "t++" in a loop like this. As I know ++t is different than t++ in some cases. Writing ++t in a loop just drives me crazy
{
if (4 == t) // isn't it normal to write "t == 4" ?? this is driving me crazy again!
printf("skip the number %i\n", t);
else
printf("the value of t is now %i\n", t);
}
return 0;
}
// why call the test() function here without any real argument? what's the point?
It is not a call, it is a function declaration. It is completely unnecessary at this location, since the function is defined a few lines before. In the real world such declarations are not used often.
// I'm more used to "t++" in a loop like this. As I know ++t is different than t++ in some cases. Writing ++t in a loop just drives me crazy
In this case they are equivalent, but if you think of going to C++ it is better to switch completely to the ++t form, since in some cases there (e.g. with iterators) it makes a difference.
// isn't it normal to write "t == 4" ?? this is driving me crazy again!
Some people tend to use 4 == t to avoid a problem when t = 4 is used instead of t == 4 (both are valid in C as if condition). Since all normal compilers signal a warning for t = 4 anyway, 4 == t is rather unnecessary.
Please read about pointers then you will understand that a pointer to an int has been passed as an argument here...
void test(int *int_pointer);
You can see the difference between ++t and t++ nicely explained in this link . It doesn't make a difference in this code. Result will be the same.
if(4 == t) is the same as if(t == 4). They are just different styles of writing. 4 == t is mostly used to avoid typing = instead of ==. The compiler will complain if you write 4 = t but won't complain if you write t = 4
why call the test() function here without any real argument? what's the point?
Here test is declared as function (with void return type) which expects an argument of the type a pointer to int.
I'm more used to "t++" in a loop like this. As I know ++t is different than t++ in some cases. Writing ++t in a loop just drives me crazy
Note that, when incrementing or decrementing a variable in a statement by itself (t++; or ++t), the pre-increment and post-increment have same effect.
The difference can be seen when these expression appears in a large or complex expressions ( int x = t++ and int x = ++t have different results for the same value of t).
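A short sketch of that difference, using two hypothetical helpers that return both the value assigned to x and the final value of t:

```c
struct inc_result { int x; int t; };

static struct inc_result post_increment(int t)
{
    struct inc_result r;
    r.x = t++;   /* x receives the old value, then t grows by one */
    r.t = t;
    return r;
}

static struct inc_result pre_increment(int t)
{
    struct inc_result r;
    r.x = ++t;   /* t grows by one first, x receives the new value */
    r.t = t;
    return r;
}
```

Starting from t = 4, both leave t at 5, but x is 4 in the postfix case and 5 in the prefix case.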
isn't it normal to write "t == 4" ?? this is driving me crazy again!
4 == t is much safer than t == 4, although both have the same meaning. In the case of t == 4, if you accidentally type t = 4 then the compiler will not report an error and you may get an erroneous result. In the case of 4 == t, if you accidentally type 4 = t then the compiler will reject it with a message like:
lvalue is required as left operand of assignment operator.
void test(int *int_pointer); is a function prototype. It's not required in this particular instance since the function is defined above main() but you would need it (though not necessarily in the function body) if test was defined later in the file. (Some folk rely on implicit declaration but let's not get into that here.)
++t will never be slower than t++ since, conceptually, the latter has to store and return the previous value. (Most compilers will optimise the copy out, although I prefer not to rely on that: I always use ++t but plenty of experienced programmers don't.)
4 == t is often used in place of t == 4 in case you accidentally omit one of the =. It's easily done but once you've spent a day or two hunting down a bug caused by a single = in place of == you won't ever do it again! 4 = t will generate a compile error but t = 4 is actually an expression of value 4 which will compare true and assigns the value of 4 to t: a particularly dangerous side-effect. Personally though I find 4 == t obfuscating.

Prefix and postfix operators necessity

What is the necessity of both prefix and postfix increment operators? Is not one enough?
To the point, there is a similar necessity problem with while/do-while, yet there is not as much confusion (in understanding and usage) in having them both as there is with having both prefix and postfix (the priority of these operators, their associativity, usage, and workings).
And has anyone been in a situation where you said "Hey, I am going to use postfix increment. It's useful here."
POSTFIX and PREFIX are not the same. The postfix form yields the old value of the variable and the increment/decrement takes effect afterwards, whereas the prefix form increments/decrements first and yields the new value. For example, to run a loop n times,
while(n--)
{ }
works perfectly. But,
while(--n)
{
}
will run only n-1 times
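A small sketch that counts the iterations of both loops; the function names are illustrative, and the prefix version assumes n is at least 1, since starting from 0 the counter would go negative before the test.

```c
/* Number of body executions of while (n--) { } for n >= 0. */
static int count_postfix_loop(int n)
{
    int runs = 0;
    while (n--)
        ++runs;
    return runs;
}

/* Number of body executions of while (--n) { } for n >= 1. */
static int count_prefix_loop(int n)
{
    int runs = 0;
    while (--n)
        ++runs;
    return runs;
}
```

For n = 5 the postfix loop body runs 5 times and the prefix loop body runs 4 times.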
Or for example:
x = n--; is different from x = --n; (in the second form the values of x and n will be the same). Of course we can do the same thing with the binary operator - in multiple steps.
The point is that if there were only a postfix --, we would have to write x = --n in two steps.
There may be other, better reasons, but this is one benefit I see in keeping both prefix and postfix operators.
[edit to answer OP's first part]
Clearly i++ and ++i both affect i the same but return different values. The operations are different. Thus much code takes advantage of these differences.
The most obvious need to have both operators is the 40 year code base for C. Once a feature in a language is used extensively, very difficult to remove.
Certainly a new language could be defined with only one or none. But will it play in Peoria? We could get rid of the - operator too, just use a + -b, but I think it is a tough sell.
Need both?
The prefix operator is easy to mimic with alternate code for ++i is pretty much the same as i += 1. Other than operator precedence, which parens solves, I see no difference.
The postfix operator is cumbersome to mimic - as in this failed attempt if(i++) vs. if(i += 1).
If C of the future moved to deprecate one of these, I suspect it would be to deprecate the prefix operator, for its functionality, as discussed above, is easier to replace.
Forward looking thought: the >> and << operators were appropriated in C++ to do something quite different from integer bit shifting. Maybe the ++pre and post++ will generate expanded meaning in another language.
[Original follows]
Answer to the trailing OP question: has anyone been in a situation where you said "Hey, I am going to use postfix increment. It's useful here"?
Various array processing, like with char[], benefit. Array indexing, starting at 0, lends itself to a postfix increment. For after fetching/setting the array element, the only thing to do with the index before the next array access is to increment the index. Might as well do so immediately.
With prefix increment, one may need to have one type of fetch for the 0th element and another type of fetch for the rest.
size_t j = 0;
for (size_t i = 0; (ch = inbuffer[i]) != '\0'; i++) {
if (condition(ch)) {
outbuffer[j++] = ch; // prefer this over below
}
}
outbuffer[j] = '\0';
vs.
for (size_t i = 0; (ch = inbuffer[i]) != '\0'; ++i) {
if (condition(ch)) {
outbuffer[j] = ch;
++j;
}
}
outbuffer[j] = '\0';
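A self-contained sketch of the postfix variant, assuming for illustration that the unnamed condition() keeps alphabetic characters; the function name copy_letters and the parameter names are not from the answer.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy only the alphabetic characters of in to out. out must have
   room for strlen(in) + 1 bytes. Returns the number copied. */
static size_t copy_letters(const char *in, char *out)
{
    size_t j = 0;
    char ch;
    for (size_t i = 0; (ch = in[i]) != '\0'; i++) {
        if (isalpha((unsigned char)ch))
            out[j++] = ch;   /* store, then advance the write index */
    }
    out[j] = '\0';
    return j;
}
```

After the loop j is both the count of copied characters and the index where the terminator belongs, which is what makes the postfix form convenient here.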
I think the only fair answer to which one to keep would be to do away with them both.
If, for example, you were to do away with postfix operators, then where code was once compactly expressed using n++, you would now have to refer to (++n - 1), or you would have to rearrange other terms.
If you broke the increment or decrement out onto its own line before or after the expression which referred to n, above, then it's not really relevant which you use, but in that case you could just as easily use neither, and replace that line with n = n + 1;
So perhaps the real issue, here, is expressions with side effects. If you like compact code then you'll see that both pre and post are necessary for different situations. Otherwise there doesn't seem to be much point in keeping either of them.
Example usage of each:
char array[10];
char * const end = array + sizeof(array) / sizeof(*array);
char *p = end;
int i = 0;
/* set array to { 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 } */
while (p > array)
*--p = i++;
p = array;
i = 0;
/* set array to { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 } */
while (p < end)
*p++ = i++;
They are necessary because they are already used in lots of code, so if they were removed then lots of code would fail to compile.
As to why they ever existed in the first place, older compilers could generate more efficient code for ++i and i++ than they could for i+=1 and (i+=1)-1. For newer compilers this is generally not an issue.
The postfix version is something of an anomaly, as nowhere else in C is there an operator that modifies its operand but evaluates to the prior value of its operand.
One could certainly get by using only one or other of prefix or postfix increment operators. It would be a little more difficult to get by using only one or other of while or do while, as the difference between them is greater than the difference between prefix and postfix increment in my view.
And one could of course get by without using either prefix or postfix increment, or while or do while. But where do you draw the line between what's needless cruft and what's useful abstraction?
Here's a quickie example that uses both; an array-based stack, where the stack grows towards 0:
#define STACKSIZE ...
typedef ... T;
T stack[STACKSIZE];
size_t stackptr = STACKSIZE;
// push operation
if ( stackptr )
stack[ --stackptr ] = value;
// pop operation
if ( stackptr < STACKSIZE )
value = stack[ stackptr++ ];
Now we could accomplish the exact same thing without the ++ and -- operators, but it wouldn't scan as cleanly.
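To make the fragment above compilable, here is a sketch with assumed specifics: the element type is int, STACKSIZE is 4, and the push/pop wrappers with bool results are additions for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACKSIZE 4   /* assumed capacity for the example */

static int stack[STACKSIZE];
static size_t stackptr = STACKSIZE;   /* empty: index one past the end */

/* Push value; returns false when the stack is full. */
static bool push(int value)
{
    if (stackptr == 0)
        return false;
    stack[--stackptr] = value;   /* move down first, then store */
    return true;
}

/* Pop into *value; returns false when the stack is empty. */
static bool pop(int *value)
{
    if (stackptr >= STACKSIZE)
        return false;
    *value = stack[stackptr++];  /* read first, then move up */
    return true;
}
```

The prefix decrement in push and the postfix increment in pop mirror each other, so stackptr always indexes the most recently pushed element.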
As for any other obscure mechanism in the C language, there are various historical reasons for it. In ancient times when dinosaurs walked the earth, compilers would make more efficient code out of i++ than i+=1. In some cases, compilers would generate less efficient code for i++ than for ++i, because i++ needed to save away the value to increment later. Unless you have a dinosaur compiler, none of this matters the slightest in terms of efficiency.
As for any other obscure mechanism in the C language, if it exists, people will start to use it. I'll use the common expression *p++ as an example (it means: p is a pointer, take the contents of p, use that as the result of the expression, then increment the pointer). It must use postfix and never prefix, or it would mean something completely different.
Some dinosaur once started writing needlessly complex expressions such as *p++, and because they did, it has become common and today we regard such code as something trivial. Not because it is, but because we are so used to reading it.
But in modern programming, there is absolutely no reason to ever write *p++. For example, if we look at the implementation of the memcpy function, which has these prerequisites:
void* memcpy (void* restrict s1, const void* restrict s2, size_t n)
{
uint8_t* p1 = (uint8_t*)s1;
const uint8_t* p2 = (const uint8_t*)s2;
Then one popular way to implement the actual copying is:
while(n--)
{
*p1++ = *p2++;
}
Now some people will cheer, because we used so few lines of code. But few lines of code is not necessarily a measure of good code. Often it is the opposite: consider replacing it with a single line while(n--)*p1++=*p2++; and you see why this is true.
I don't think either case is very readable, you have to be a somewhat experienced C programmer to grasp it without scratching your head for five minutes. And you could write the same code like this:
while(n != 0)
{
*p1 = *p2;
p1++;
p2++;
n--;
}
Far clearer, and most importantly it yields exactly the same machine code as the first example.
And now see what happened: because we decided not to write obscure code with lots of operands in one expression, we might as well have used ++p1 and ++p2. It would give the same machine code. Prefix or postfix does not matter. But in the first example with obscure code, *++p1 = *++p2 would have completely changed the meaning.
To sum it up:
There exist prefix and postfix increment operators for historical reasons.
In modern programming, having two different such operators is completely superfluous, unless you write obscure code with several operators in the same expression.
If you write obscure code, you will find ways to motivate the use of both prefix and postfix. However, all such code can always be rewritten.
You can use this as a quality measure of your code: if you ever find yourself writing code where it matters whether you are using prefix or postfix, you are writing bad code. Stop it, rewrite the code.
The prefix operator first increments the value and then uses it in the expression. The postfix operator first uses the value in the expression and then increments it.
The basic use of the prefix/postfix operators is that the compiler can replace them with a single increment/decrement instruction. If we use arithmetic operators instead of increment or decrement operators, it may replace them with two or three instructions. That's why we use increment/decrement operators.
You don't need both.
It is useful for implementing a stack, so it exists in some machine languages. From there it has been inherited indirectly by C (in which this redundancy is still somewhat useful, and some C programmers seem to like the idea of combining two unrelated operations in a single expression), and from C by other C-like languages.

Arithmetic operations in IF loop

What does the below code do? I'm very confused with its working. Because I thought that the if loop runs till the range of int. But I'm confused when I try to print the value of i. Please help me out with this.
#include<stdio.h>
void main()
{
static int i;
for (;;)
if (i+++"Apple")
printf("Banana");
else
break;
}
It is interpreted as i++ + "Apple". Since i is static and does not have an initializer, i++ yields 0. So the whole expression is 0 + some address or equivalent to if ("Apple").
EDIT
As Jonathan Leffler correctly notes in the comments, what I said above only applies to the first iteration. After that it will keep incrementing i and will keep printing "Banana".
I think at some point, due to overflows (if it doesn't crash) "Apple" + i will yield 0 and the loop will break. Again, I don't really know what a well-meaning compiler should do when one adds a pointer and a large number.
As Eric Postpischil commented, you can only advance the pointer until it points one past the allocated space. In your example the literal occupies 6 bytes ("Apple\0"), so adding 6 advances the pointer to one past the allocated space. Adding more is undefined behavior and technically strange things can happen.
Use int main(void) instead of void main().
The expression i+++"Apple" is parsed as (i++) + "Apple"; the string literal "Apple" is converted from an expression of type "6-element array of char" to "pointer to char", and its value is the address of the first element of the array. The expression i++ evaluates to the current value of i, and as a side effect, the value in i is incremented by 1.
So, we're adding the result of the integer expression i++ to the pointer value resulting from the expression "Apple"; this gives us a new pointer value that's equal to or greater than the address of "Apple". So assuming the address of the string literal "Apple" is 0x80123450, then basically we're evaluating the values
0x80123450 + 0
0x80123450 + 1
0x80123450 + 2
...
all of which should evaluate to non-zero, which causes the printf statement to be executed. The question is what happens when i++ results in an integer overflow (the behavior of which is not well defined) or the value of i+++"Apple" results in an overflow for a pointer value. It's not clear that i+++"Apple" will ever result in a 0-valued expression.
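The parse can be checked while staying inside the range where the pointer arithmetic is defined, that is, offsets 0 through 6 for the 6-byte array. In this sketch the function name count_defined_steps and the named array apple (standing in for the string literal) are assumptions made for illustration.

```c
#include <stddef.h>

/* Counts how many times the expression i+++apple can be evaluated
   while the result stays inside the array or one past its end. */
static size_t count_defined_steps(void)
{
    static const char apple[] = "Apple";   /* 6 chars including '\0' */
    int i = 0;
    size_t steps = 0;
    while ((size_t)i <= sizeof apple) {
        int before = i;
        const char *p = i+++apple;         /* parsed as (i++) + apple */
        if (p != apple + before || i != before + 1)
            return 0;                      /* would mean a different parse */
        ++steps;
    }
    return steps;
}
```

Every step yields a non-null pointer equal to apple plus the old value of i, which is why the condition in the question stays true for as long as the behaviour is defined.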
This code SHOULD Have been written like this:
char *apple = "Apple";
for(i = 0; apple[i++];)
printf("Banana");
Not only is it clearer than the code posted in the original, it is also easier to see what it does. But I guess this came from "Look how bizarrely we can write things in C". There are lots of things that are possible in C that aren't a great idea.
It is also possible to learn to balance a plate of hot food on your head for the purpose of serving yourself dinner. It doesn't make it a particularly great idea - unless you don't have hands and feet, I suppose... ;)
Edit: Except this is wrong... The equivalent is:
char *apple = "Apple";
for(i = 0; apple+i++ != NULL;)
printf("Banana");
On a 64-bit machine, that will take a while. If it finishes in reasonable time (sending output to /dev/null), I will update. It takes approximately three minutes on my machine (AMD 3.4GHz Phenom II).

Resources