Is i+++j always treated as i++ + j? [duplicate] - c

This question already has answers here:
3 plus symbols between two variables (like a+++b) in C [duplicate]
(3 answers)
Closed 9 years ago.
In the printf statement i+++j, is it always treated as i++ +j?
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main() {
int i =5,j= 6, z;
z=i^j;
printf("%d",i+++j);
return 0;
}

i+++j is equivalent to i++ + j.
This has nothing to do with operator precedence. The +++ is resolved to ++ + by the compiler before expressions are parsed.
The C standard defines a sequence of translation phases, each using as its input the output of the previous one. +++ is resolved to ++ + in phase 3, which decomposes the source into preprocessor tokens. Operator precedence is not considered until phase 7, syntactic and semantic analysis. (The translation phases don't have to be implemented as distinct phases or passes, but the compiler must behave as if they are.)
The rule that says +++ is resolved to ++ + and not + ++ is what's informally called the "maximal munch rule". It's stated in section 6.4 paragraph 4:
If the input stream has been parsed into preprocessing tokens up to a
given character, the next preprocessing token is the longest sequence
of characters that could constitute a preprocessing token.
(Amusingly, the index refers to "maximal munch", but that term isn't mentioned anywhere else in the standard.)
This also implies that i+++++j, which could be tokenized as the valid expression i++ + ++j, is actually tokenized as i ++ ++ + j, which violates a constraint on the increment operators and therefore does not compile.
Of course the solution, for a programmer, is to add whitespace to make the division into tokens clear: i++ + j. (i+++j is perfectly clear to the compiler, but i++ + j is much clearer to a human reader.)
Reference: N1570, section 6.4, paragraph 4. N1570 is a draft of the 2011 ISO C standard. This rule is unchanged from earlier versions of the standard. Translation phases are discussed

Yes. It will be parsed as (i++) + (j).

The tokenizer first splits +++ into ++ followed by +. Since the postfix increment/decrement operator then has higher precedence than the addition operator, there is no doubt it will be treated as (i++) + j.
So it is not a compiler quirk; the result follows from the tokenization rule together with the operator precedence chart.
Also, I would suggest putting spaces between the operators; it will help when you reread your code later on. :)
Hope that helps!

Related

Increment madness [duplicate]

int main ()
{
int a = 5,b = 2;
printf("%d",a+++++b);
return 0;
}
This code gives the following error:
error: lvalue required as increment operand
But if I put spaces between a++ + and ++b, then it works fine.
int main ()
{
int a = 5,b = 2;
printf("%d",a++ + ++b);
return 0;
}
What does the error mean in the first example?
Compilers are written in stages. The first stage is called the lexer and turns characters into a symbolic structure. So "++" becomes something like an enum SYMBOL_PLUSPLUS. Later, the parser stage turns this into an abstract syntax tree, but it can't change the symbols. You can affect the lexer by inserting spaces (which end symbols unless they are in quotes).
Normal lexers are greedy (with some exceptions), so your code is being interpreted as
a++ ++ +b
The input to the parser is a stream of symbols, so your code would be something like:
[ SYMBOL_NAME(name = "a"),
SYMBOL_PLUS_PLUS,
SYMBOL_PLUS_PLUS,
SYMBOL_PLUS,
SYMBOL_NAME(name = "b")
]
Which the parser thinks is syntactically incorrect. (EDIT based on comments: Semantically incorrect because you cannot apply ++ to an r-value, which a++ results in)
a+++b
is
a++ +b
Which is ok. So are your other examples.
printf("%d",a+++++b); is interpreted as (a++)++ + b according to the maximal munch rule.
Postfix ++ doesn't evaluate to an lvalue, but it requires its operand to be an lvalue.
6.4/4 says:
"the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token"
The lexer uses what's generally called a "maximum munch" algorithm to create tokens. That means that as it reads characters in, it keeps reading until it encounters something that can't be part of the same token as what it already has. For example, if it has been reading digits, so that what it has is a number, and it encounters an A, it knows the A can't be part of the number, so it stops and leaves the A in the input buffer to use as the beginning of the next token. It then returns that token to the parser.
In this case, that means a+++++b gets lexed as a ++ ++ + b. Since the first post-increment yields an rvalue, the second can't be applied to it, and the compiler gives an error.
Just FWIW, in C++ you can overload operator++ to yield an lvalue, which allows this to work. For example:
struct bad_code {
bad_code &operator++(int) {
return *this;
}
int operator+(bad_code const &other) {
return 1;
}
};
int main() {
bad_code a, b;
int c = a+++++b;
return 0;
}
This compiles and runs (though it does nothing) with the C++ compilers I have handy (VC++, g++, Comeau).
This exact example is covered in the draft C99 standard (same details in C11), section 6.4 Lexical elements, paragraph 4, which says:
If the input stream has been parsed into preprocessing tokens up to a
given character, the next preprocessing token is the longest sequence
of characters that could constitute a preprocessing token. [...]
This is also known as the maximal munch rule. It is used in lexical analysis to avoid ambiguities and works by taking as many characters as it can to form a valid token.
The paragraph also has two examples; the second one is an exact match for your question and is as follows:
EXAMPLE 2 The program fragment x+++++y is parsed as x ++ ++ + y, which
violates a constraint on increment operators, even though the parse x
++ + ++ y might yield a correct expression.
which tells us that:
a+++++b
will be parsed as:
a ++ ++ + b
which violates the constraint on postfix increment, since the result of the first post-increment is an rvalue and post-increment requires an lvalue. This is covered in section 6.5.2.4, Postfix increment and decrement operators, which says (emphasis mine):
The operand of the postfix increment or decrement operator shall have
qualified or unqualified real or pointer type and shall be a
modifiable lvalue.
and
The result of the postfix ++ operator is the value of the operand.
The book C++ Gotchas also covers this case in Gotcha #17, Maximal Munch Problems. The problem is the same in C++, and the book gives some examples. It explains that when dealing with the following set of characters:
->*
the lexical analyzer can do one of three things:
Treat it as three tokens: -, > and *
Treat it as two tokens: -> and *
Treat it as one token: ->*
The maximal munch rule allows it to avoid these ambiguities. The author points out that it (In the C++ context):
solves many more problems than it causes, but in two common
situations, it’s an annoyance.
The first example would be templates whose template arguments are also templates (which was solved in C++11), for example:
list<vector<string>> lovos; // error!
^^
Here the closing angle brackets are tokenized as the right-shift operator >>, so (before C++11) a space is required to disambiguate:
list< vector<string> > lovos;
^
The second case involves default arguments for pointers, for example:
void process( const char *= 0 ); // error!
^^
Here *= would be interpreted as the *= assignment operator. The solution in this case is to name the parameter in the declaration.
Your compiler desperately tries to parse a+++++b, and interprets it as (a++)++ +b. Now, the result of the post-increment (a++) is not an lvalue, i.e. it can't be post-incremented again.
Please don't ever write such code in production quality programs. Think about the poor fellow coming after you who needs to interpret your code.
(a++)++ +b
a++ returns the previous value, an rvalue. You can't increment this.
Because it causes undefined behaviour.
Which one is it?
c = (a++)++ + b
c = (a) + ++(++b)
c = (a++) + (++b)
Yeah, neither you nor the compiler know it.
EDIT:
The real reason is the one as said by the others:
It gets interpreted as (a++)++ + b.
Post-increment requires an lvalue (roughly, an expression that designates an object, such as a named variable), but (a++) yields an rvalue, which cannot be incremented. That leads to the error message you get.
Thanks to the others for pointing this out.
I think the compiler sees it as
c = ((a++)++)+b
++ has to have as an operand a value that can be modified. a is a value that can be modified. a++ however is an 'rvalue', it cannot be modified.
By the way the error I see on GCC C is the same, but differently-worded: lvalue required as increment operand.
Follow this precedence order:
1. ++ (pre-increment)
2. + - (addition or subtraction)
3. "x" + "y": add both operands
int a = 5,b = 2;
printf("%d",a++ + ++b); // a++ yields 5 (post-increment), ++b yields 3 (pre-increment)
return 0; // prints 5+3=8

Why compiler treats i+++++i and i+++i differently [duplicate]

This question already has answers here:
Why doesn't a+++++b work?
(9 answers)
Closed 9 years ago.
int i=5;
printf("%d",i+++++i);
This gives an error, but:
printf("%d",i+++i);
gives the output 11. In this case, the compiler read it as:
printf("%d",i+ ++i);
Why is this not done in first expression? i.e :
printf("%d",i+++++i);
Because tokenization is greedy (the maximal munch rule), i+++++i is treated as (i++)++ + i. This gives a compiler error because (i++) is not an lvalue.
Modifying the same variable more than once between two sequence points is undefined behavior according to §6.5 of the language specification:
Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.(71)
71) This paragraph renders undefined statement expressions such as
i = ++i + 1;
a[i++] = i;
while allowing
i = i + 1;
a[i] = i;
i+++++i is parsed as i ++ ++ + i. It contains an invalid subexpression i ++ ++. Speaking formally, this expression contains a constraint violation, which is why it does not compile.
Meanwhile i+++i is parsed as i ++ + i (not as i + ++ i as you incorrectly believe). It does not contain any constraint violations. It produces undefined behavior, but is otherwise well-formed.
Also, it is rather naive to believe that printf("%d",i+++i) will print 11. The behavior of i+++i is undefined, meaning that there's no point in trying to predict the output.
In printf("%d",i+++++i);, the source text i+++++i is first processed according to this rule from C 2011 (N1570) 6.4 4:
If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token…
This causes the lexical analysis to proceed in this way:
i can be a token, but i+ cannot, so i is the next token. This leaves +++++i.
+ and ++ can each be a token, but +++ cannot. Since ++ is the longest sequence that could be a token, it is the next token. This leaves +++i.
For the same reason, ++ is the next token. This leaves +i.
+ can be a token, but +i cannot, so + is the next token. This leaves i.
i can be a token, but i) cannot, so i is the next token.
Thus, the expression is i ++ ++ + i.
Then the grammar rules structure this expression as ((i ++) ++) + i.
When i++ is evaluated, the result is just a value, not an lvalue. Since ++ cannot be applied to a value that is not an lvalue, (i ++) ++ is not allowed.
After the compiler recognizes that the expression is semantically incorrect, it cannot go back and change the lexical analysis. The C standard specifies that the rules must be followed as described above.
In i+++i, the code violates a separate rule. This is parsed as (i ++) + i. This expression both modifies i (in i ++) and separately accesses it (in the i of + i). This violates C 2011 (N1570) 6.5 2:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
This rule uses some technical terms: In i ++, the effect of changing i is a side effect of ++. (The main effect is to produce the value of i.) The use of i in + i is a value computation of the scalar object i. And these two things are unsequenced, because the C standard does not specify whether producing the value of i for + i comes before or after changing i in i ++.

How is its answer 36? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Could anyone explain these undefined behaviors (i = i++ + ++i , i = i++, etc…)
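#include <stdio.h>
int main(void)
{
    int a=5;
    a= a++ + ++a + ++a + a++ + a++; /* undefined behavior: a is modified several times with no sequence point in between */
    printf("%d",a);
    return 0;
}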
This is not defined.
You can find the Committee Draft from May 6, 2005 of the C-standard here (pdf)
See section 6.5 Expressions:
2 Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
and the example:
71) This paragraph renders undefined statement expressions such as
i = ++i + 1;
a[i++] = i;
The answer is actually undefined.
The answer is undefined because the expression modifies a several times with no sequence point in between, so the compiler is free to evaluate the operands and apply the side effects in any order.
Tokenization is not the issue here: a+++b is always read as a++ + b, never as a + ++b.
Whitespace matters only for separating tokens when lexing the source code. The result may depend on the implementation of the compiler (and other languages with the same ++ operators may define the evaluation order), so in general this is not safe.
For example, in Java your line of code gives 37 as the answer, because Java evaluates the operands strictly from left to right, but that is a choice made by that language, not something C guarantees.
