Why are there different associativities among operators in C?

Talking about the associativity of operators in C, I was wondering why operators that have the same precedence have different associativities. For example, postfix increment and postfix decrement have left associativity, while prefix increment and prefix decrement have right associativity. Wouldn't it be simpler to have just left or right associativity for all operators of the same precedence?
Are there any reasons behind that?

Isn't it simple to have just left or right associativity for all the
same precedence operators?
Yes, and that is the case in C. Maybe you assumed that prefix and postfix have the same precedence, which is wrong. Postfix has a higher precedence than prefix!
There is also another curious case to consider when asking why certain operators have a certain associativity. From Wiki,
For example, in C, the assignment a = b is an expression that returns
a value (namely, b converted to the type of a) with the side effect of
setting a to this value. An assignment can be performed in the middle
of an expression. (An expression can be made into a statement by
following it with a semicolon; i.e. a = b is an expression but a = b;
is a statement). The right-associativity of the = operator allows
expressions such as a = b = c to be interpreted as a = (b = c),
thereby setting both a and b to the value of c. The alternative (a =
b) = c does not make sense because a = b is not an lvalue.
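A minimal sketch of the chained assignment the quote describes. The function names are made up for illustration; the second one shows that the value passed along the chain is "b converted to the type of a", not the original right-hand value.

```c
/* Hypothetical helper: returns 1 if a = b = c left both a and b equal to c. */
int chain_assign(int c)
{
    int a = 0, b = 0;
    a = b = c;              /* parsed as a = (b = c): = is right-associative */
    return a == c && b == c;
}

/* Hypothetical helper: the inner assignment yields the converted value. */
double narrowing_chain(void)
{
    double a;
    int b;
    a = b = 2.5;            /* b = 2.5 stores 2 in b; that int value is then assigned to a */
    return a;               /* 2.0, not 2.5 */
}
```

Writing (a = b) = c instead is rejected by a C compiler, because the result of a = b is not an lvalue.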

Binary operators are all left-associative except the assignment operators, which are right-associative.
Postfix operators are sometimes (for example in K&R, 2nd edition) said to be right-associative, but this is only a way to express the idea that they have higher precedence than the prefix unary operators.

Related

Short circuit and operator precedence in C

I know that logical operators in C short-circuit, but my doubt is whether short circuiting and the operator precedence rules contradict each other. See the example below:
#include <stdio.h>

int main(void)
{
    int a;
    int b = 5;
    a=0 && --b;
    printf("%d %d", a, b);
    return 0;
}
According to the precedence rules, the prefix operator has the highest precedence. So --b should be evaluated first, then the &&, and at last the result should be assigned to a. So the expected output would be 0 4. But in this case the second operand of && is never actually executed and the result comes out as 0 5.
Why are the precedence rules not being applied here? Are logical operators exempt from the precedence rules? If yes, what other operators show such behavior? And what is the logic behind this behavior?
You're conflating two related but different topics: operator precedence and order of evaluation.
The operator precedence rules dictate how various operators are grouped together. In the case of this expression:
a=0 && --b;
The operators are grouped like this:
a = (0 && (--b));
This has no effect however on which order the operands are evaluated in. The && operator in particular dictates that the left operand is evaluated first, and if it evaluates to 0 the right operand is not evaluated.
So in this case the left side of &&, which is 0, is evaluated, and because it is 0 the right side, which is --b, is not evaluated, so b is not decremented.
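The grouping and the short circuit can be observed together in a small sketch. The helper name assign_and and its parameters are hypothetical; passing b by pointer is only a way to let the caller see whether the decrement happened.

```c
/*
 * Hypothetical helper: performs a = left && --(*b) and returns a.
 * When left is 0, the right operand of && is never evaluated,
 * so *b keeps its value.
 */
int assign_and(int left, int *b)
{
    int a;
    a = left && --*b;   /* groups as a = (left && (--(*b))) */
    return a;
}
```

Called with left equal to 0 and *b equal to 5, this returns 0 and leaves *b at 5. Called with left equal to 1, the decrement runs, *b becomes 4, and the result is 1 because 4 is nonzero.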
Here's another example of the difference between operator precedence and order of evaluation.
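#include <stdio.h>

int val(void)
{
    static int x = 2;   /* keeps its value between calls */
    x *= 2;
    return x;
}

int main(void)
{
    /* the two calls to val() may happen in either order */
    int result = val() + (5 * val());
    printf("%d\n", result);
    return 0;
}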
What will the above program print? As it turns out, there are two possibilities, and both are valid.
In this expression:
val() + (5 * val())
There are no operators that have any type of short circuit behavior. So the compiler is free to evaluate the individual operands of both + and * in any order.
If the first instance of val() is evaluated first, the result will be 4 + (5 * 8) == 44. If the second instance of val() is evaluated first, the result will be 8 + (5 * 4) == 28. Again, both are valid since the operands may be evaluated in any order.
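If a particular order is wanted, one way to get it is to store each call's result in a named temporary, since separate statements are evaluated one after another. The sketch below is an illustration, not the original program: it passes the state through a pointer (a hypothetical next_val helper) instead of a static variable, so that each function can be called repeatedly.

```c
/* Hypothetical stand-in for val(): doubles *x and returns the new value. */
static int next_val(int *x)
{
    *x *= 2;
    return *x;
}

/* Forces the left call to happen first: 4 + (5 * 8). */
int left_call_first(void)
{
    int x = 2;
    int first = next_val(&x);    /* 4 */
    int second = next_val(&x);   /* 8 */
    return first + (5 * second);
}

/* Forces the right call to happen first: 8 + (5 * 4). */
int right_call_first(void)
{
    int x = 2;
    int second = next_val(&x);   /* 4 */
    int first = next_val(&x);    /* 8 */
    return first + (5 * second);
}
```

The two functions return 44 and 28 respectively, matching the two outcomes described above.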
Precedence affects how ambiguous expressions are parsed. When there are multiple ways to interpret an expression with several operators, precedence tells us which interpretation is correct. Think of precedence as a mechanism to figure out where the implied parentheses are.
For example in the statement in question there are two valid ways to parse it. If = had higher precedence than && it could be read as:
(a = 0) && --b;
But since && has higher precedence, it's actually interpreted as:
a = (0 && --b);
(Note: Your code's formatting suggests it's the first. Be careful not to mislead!)
Evaluation order is different from precedence. They're related, but independent concepts. After precedence is used to determine the correct parsing of an expression, evaluation order tells us the order to evaluate the operands in. Is it left to right? Right to left? Simultaneous? Unspecified?
For the most part evaluation order is left unspecified. Operators like + and * and << have no defined evaluation order. The compiler is allowed to do whatever it likes, and the programmer must not write code that depends on any particular order. a + b could evaluate a then b, or b then a, or it could even interweave their evaluations.
&&, among others, is an exception: it is evaluated left to right with short circuiting. = is a partial exception: the store to the left operand happens only after the values of both operands have been computed, although C does not specify the order in which the two operands themselves are evaluated.
Here's how evaluation proceeds step-by-step for our statement:
a = (0 && --b), the right operand must be computed before the store to a can happen
0 && --b, && evaluated left to right with short circuiting
0, evaluates false which triggers short circuiting and cancels the next step
--b, not evaluated due to short circuiting
result of 0 && --b is 0
a, variable reference evaluated
a = 0, assignment occurs and overall result is 0
You said that there is no specific order for + and *, but this table shows the order to be left to right. Why so?
The last column of that table is associativity. Associativity breaks precedence ties when we use the same operator twice, or when we use operators with the same precedence.
For example, how should we read a / b / c. Is it:
(a / b) / c, or
a / (b / c)?
According to the table / has left-to-right associativity, so it's the first.
What about chained assignments like foo = bar = baz? Now, assignment has right-to-left associativity, so the correct parsing is foo = (bar = baz).
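A short sketch of the division case; the function names are made up for illustration. With integer operands the two groupings usually give different results, which makes the effect of associativity visible.

```c
/* a / b / c with no parentheses: parsed as (a / b) / c. */
int divide_chain(int a, int b, int c)
{
    return a / b / c;
}

/* The other grouping has to be requested explicitly. */
int divide_grouped_right(int a, int b, int c)
{
    return a / (b / c);
}
```

For a = 100, b = 10, c = 5 the first function computes (100 / 10) / 5 = 2, while the second computes 100 / (10 / 5) = 50.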
If this all gets confusing, focus on one simple rule of thumb:
"Precedence and associativity are independent from order of evaluation."
Operator precedence doesn't say which expression gets executed first. It only means that the expression is parsed so that the result of the higher-precedence operation is used as an operand of the lower-precedence operation, and not the other way around. The actual subexpressions only get evaluated if they need to be!
The && operator's order of evaluation is left to right.
= has lower precedence; in fact only the comma operator , has lower precedence than =.
So the expression reads as a = (0 && --b), with 0 being evaluated first given the order of evaluation mentioned above.
Since 0 evaluates to false, there is no need to evaluate the second part of the expression: once the first operand of && is false, the whole expression is false regardless of the second operand.
If you had used the || operator instead, the second part of the expression would have to be evaluated.
Operator precedence is not the only thing in play. There is also order of evaluation, and && mandates that its left operand, 0, is evaluated first (its evaluation order is left to right); the right part after the && is then not evaluated at all.
That is how C works.

Postfix/Prefix operator precedence and associativity

I'm confused about the precedence and associativity of postfix/prefix operators.
On one hand, as I'm reading K&R book, it states that:
(*ip)++
The parentheses are necessary in this last example; without them, the expression would increment ip instead of what it points to, because unary operators like * and ++ associate right to left.
No mention whatsoever of a difference of associativity between postfix/prefix operators. Both are treated equally. The book also states that * and ++ have the same precedence.
On the other hand, this page states that:
1) Precedence of prefix ++ and * is same. Associativity of both is right to left.
2) Precedence of postfix ++ is higher than both * and prefix ++. Associativity of postfix ++ is left to right.
Which one should I trust? Is it something that changed with the C revisions over the years?
TL;DR: the two descriptions are saying the same thing, using the same words and symbols with slightly different meaning.
On one hand, as I'm reading K&R book, it states that:
(*ip)++
The parentheses are necessary in this last example; without them, the expression would increment ip instead of what it points to,
because unary operators like * and ++ associate right to left.
No mention whatsoever of a difference of associativity between
postfix/prefix operators. Both are treated equally. The book also
states that * and ++ have the same precedence.
It's unclear which edition of K&R you're reading, but the first, at least, does treat the prefix and postfix versions of the increment and decrement operators as a single operator each, with effects depending on whether their operand precedes or follows them.
On the other hand, this page states that:
1) Precedence of prefix ++ and * is same. Associativity of both is
right to left.
2) Precedence of postfix ++ is higher than both * and prefix ++.
Associativity of postfix ++ is left to right.
The language standard and most modern treatments describe the prefix and postfix versions as different operators, disambiguated by their position relative to their operand. The rest of this answer explains how this is an alternative description of the same thing.
Observe that when only unary operators are involved, associativity questions arise only between one prefix and one postfix operator of the same precedence. Among a chain of only prefix or only postfix operations, there is no ambiguity with respect to how they associate. For example, given - - x, you cannot meaningfully group it as (- -) x. The only alternative is - (- x).
Next, observe that all the highest-precedence operators are postfix unary operators, and that in K&R, all the second-precedence operators are prefix unary operators except ambi-fix ++ and --. Applying right-to-left associativity to the second-precedence operators, then, disambiguates only expressions involving postfix ++ or -- and a prefix unary operator, and does so in favor of the postfix operator. This is equivalent to the modern approach of distinguishing the postfix and prefix versions of those operators and assigning higher precedence to the postfix versions.
To get the rest of the way to the modern description, consider the observations I already made that associativity questions arise for unary operators only when prefix and postfix operators are chained, and that all the highest-precedence operators are postfix unary operators. Having distinguished postfix ++ and -- as separate, higher-precedence operators than their prefix versions, one could put them in their own tier between the other postfix operators and all the prefix operators, but putting them instead in the same tier with all the other postfix operators changes nothing about how any expression is interpreted, and is simpler. That's how it is usually represented these days, including in your second resource.
As for left-to-right vs. right-to-left associativity, the question is, again, moot for a precedence tier containing only prefix or only postfix operators. However, describing postfix operators as associating left-to-right and prefix operators as associating right-to-left is consistent with their semantic order of operations.
You can refer to the C11 standard, although its section on precedence is a little hard to follow. See sec. 6.5.1. (Footnote 85 says "The syntax specifies the precedence of operators in the evaluation of an expression, which is the same as the order of the major subclauses of this subclause, highest precedence first.")
Basically, postfix operators are higher precedence than prefix because they come earlier in that section, 6.5.2.4 vs. 6.5.3.1. So K&R is correct (no surprise there!) that *ip++ means *(ip++), which is different from (*ip)++, however its point about it being due to associativity is a bit misleading I'd say. And the geeksforgeeks site's point #2 is also correct.
@GaryO's answer is spot on! Postfix operators have higher precedence because they come earlier in the standard.
Here's a small test you can run to convince yourself.
I made two integer arrays and a pointer to the start of each array, then ran (*p)++ and *p++ on the two pointers. I printed out the pointer and array state before and after for reference.
#include <stdio.h>

/* Print the contents of both arrays. */
#define PRINT_ARRS printf("a = {%d, %d, %d}\n", a[0], a[1], a[2]); \
                   printf("b = {%d, %d, %d}\n\n", b[0], b[1], b[2]);
/* Print where each pointer points; %td is the format for a pointer difference. */
#define PRINT_PTRS printf("*p1 = a[%td] = %d\n", p1 - a, *p1); \
                   printf("*p2 = b[%td] = %d\n\n", p2 - b, *p2);

int main(void)
{
    int a[3] = {1, 1, 1};
    int b[3] = {10, 10, 10};
    int *p1 = a;
    int *p2 = b;
    PRINT_ARRS
    PRINT_PTRS
    printf("(*p1)++: %d\n", (*p1)++);   /* increments a[0]; p1 is unchanged */
    printf("*p2++ : %d\n\n", *p2++);    /* reads b[0], then advances p2 */
    PRINT_ARRS
    PRINT_PTRS
    return 0;
}
Compiling with gcc and running on my machine produces:
a = {1, 1, 1}
b = {10, 10, 10}
*p1 = a[0] = 1
*p2 = b[0] = 10
(*p1)++: 1
*p2++ : 10
a = {2, 1, 1}
b = {10, 10, 10}
*p1 = a[0] = 2
*p2 = b[1] = 10
You can see that (*p1)++ increments the array value while *p2++ increments the pointer.

C operator order

Why is the postfix increment operator (++) executed after the assignment (=) operator in the following example? According to the precedence/priority lists for operators ++ has higher priority than = and should therefore be executed first.
int a,b;
b = 2;
a = b++;
printf("%d\n",a);
will output 2, i.e. a = 2.
PS: I know the difference between ++b and b++ in principle, but just looking at the operator priorities, these precedence lists tell us something different, namely that ++ should be executed before =.
++ is evaluated first. It is post-increment, meaning it evaluates to the value stored and then increments. Any operator on the right side of an assignment expression (except for the comma operator) is evaluated before the assignment itself.
It is. It's just that, conceptually at least, ++ happens after the entire expression a = b++ (which is an expression with value a) is evaluated.
Operator precedence and order of evaluation of operands are rather advanced topics in C, because there exist many operators that have their own special cases specified.
Postfix ++ is one such special case, specified by the standard in the following manner (6.5.2.4):
The value computation of the result is sequenced before the side
effect of updating the stored value of the operand.
It means that the compiler will translate the line a = b++; into something like this:
Read the value of b into a CPU register. ("value computation of the result")
Increase b. ("updating the stored value")
Store the CPU register value in a.
This is what makes postfix ++ different from prefix ++.
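The three steps above can be written out as ordinary functions. This is only a model of the behavior, with made-up names, not how a compiler is required to implement it.

```c
/* Models b++: remember the old value, update the object, yield the old value. */
int post_increment(int *b)
{
    int old = *b;     /* "value computation of the result" */
    *b = old + 1;     /* "updating the stored value" */
    return old;
}

/* Models ++b: update the object first, then yield the new value. */
int pre_increment(int *b)
{
    *b = *b + 1;
    return *b;
}
```

With b equal to 2, a = post_increment(&b) leaves a at 2 and b at 3, which matches the output of a = b++ in the question. pre_increment on the same starting value would yield 3.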
The increment operators do two things: add 1 to a number and return a value. The difference between post-increment and pre-increment is the order of these two steps. So the increment operator is actually executed first and the assignment later in either case.

Precedence of operators in RPN

While calculating a postfix expression in C, if our token is an operator we have to place it on the stack in such a way that it has the highest priority.
My question is among the operators *, /, %, which has the highest priority.
Do we need to consider associativity as well ? Since all these operators have LEFT-TO-RIGHT associativity, will / get higher preference over * ?
Precedence usually only applies to infix notations. Postfix (and Prefix) notations are usually considered to explicitly specify which operands are associated with which operator. Precedence only comes into play when there is ambiguity in the parsing, which is not the case in postfix notation.
The precedence question that arises in an infix expression
4 * 5 + 3 / 12
simply doesn't exist after conversion to an RPN form
4 5 * 3 12 / +
or a prefix form
(+ (* 4 5) (/ 3 12))
There is some possibility for confusion when considering something like the Shunting-Yard Algorithm which can be used to generate an RPN representation from -- or directly evaluate -- an infix expression. It deals with operator precedence by deferring operators onto a secondary stack until a lower precedence operator forces it to be popped and evaluated (or output).
Operators *, /, % have the same precedence and their associativity is left to right. So an expression like:
a * b / c /* both operators have same precedence */
is the same as:
(a * b) / c
Similarly an expression like:
a / b * c /* both operators have same precedence */
is the same as:
(a / b) * c
So even though the operators have the same precedence, when they appear together in an expression (without parentheses) the leftmost operator binds first because of left-to-right associativity.
Note: Conceptually we use parentheses in an expression to override the precedence of operators. So although the expression a / b * c is the same as (a / b) * c, we can force * to be applied first by writing the expression as a / (b * c). What I mean to say is: if you are confused about operator precedence while writing code, use parentheses.
EDIT:
In POSTFIX and PREFIX form, parentheses ( ) are not used. The order in which operators are applied is decided by the order of their appearance in the expression, so while evaluating an expression there is no need to search for the next operation to perform, and evaluation is fast.
In an INFIX expression, the precedence of operators can be overridden by brackets ( ). Hence brackets are present in infix expressions, and the evaluator needs to search for which operation to perform next, e.g. in a + b % d, so evaluating the expression is slower.
That is the reason these conversions are useful in computer science.
So a compiler can first translate an infix expression into an equivalent postfix form (using grammar rules) and then generate target code to evaluate the expression's value. That is the reason why we study postfix and prefix form.
And according to precedence and associativity rules the following expression:
a * b / c /* both operators have same precedence */
will be translated into:
a b * c /
And expression
a / b * c /* both operators have same precedence */
will be translated into
a b / c *
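The translations above can be reproduced with a compact shunting-yard style converter. This is a sketch under stated assumptions, not a production parser: operands are single letters or digits, only the left-associative binary operators + - * / % and parentheses are handled, there is no error checking, and the names prec and infix_to_rpn are made up for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Precedence level of a binary operator; 0 means "not an operator". */
static int prec(char op)
{
    switch (op) {
    case '*': case '/': case '%': return 2;
    case '+': case '-':           return 1;
    default:                      return 0;
    }
}

/*
 * Hypothetical helper: converts infix text to RPN, with tokens separated
 * by single spaces. Assumes well-formed input, at most 64 pending
 * operators, and an out buffer of at least 2 * strlen(in) + 1 bytes.
 */
void infix_to_rpn(const char *in, char *out)
{
    char ops[64];            /* operator stack */
    size_t top = 0, n = 0;

    for (; *in != '\0'; in++) {
        char c = *in;
        if (isalnum((unsigned char)c)) {
            out[n++] = c;    /* operands go straight to the output */
            out[n++] = ' ';
        } else if (c == '(') {
            ops[top++] = c;
        } else if (c == ')') {
            while (top > 0 && ops[top - 1] != '(') {
                out[n++] = ops[--top];
                out[n++] = ' ';
            }
            if (top > 0)
                top--;       /* discard the matching '(' */
        } else if (prec(c) > 0) {
            /* ">=" also pops operators of equal precedence, which is
               what makes the operators left-associative */
            while (top > 0 && prec(ops[top - 1]) >= prec(c)) {
                out[n++] = ops[--top];
                out[n++] = ' ';
            }
            ops[top++] = c;
        }
        /* any other character, such as a space, is skipped */
    }
    while (top > 0) {
        out[n++] = ops[--top];
        out[n++] = ' ';
    }
    if (n > 0)
        n--;                 /* drop the trailing space */
    out[n] = '\0';
}
```

Given "a * b / c" this produces "a b * c /", and given "a / b * c" it produces "a b / c *". Changing the ">=" comparison to ">" would make equal-precedence operators group from the right instead.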
My question is among the operators *, /, %, which has the highest priority.
They are equal, just as + and - (binary) are equal.
Do we need to consider associativity as well?
Yes, for example 1 + 2 + 3 needs to become (1 + 2) + 3, i.e. 1, 2, ADD, 3, ADD, as opposed to 1, 2, 3, ADD, ADD.
Since all these operators have LEFT-TO-RIGHT associativity, will / get higher preference over * ?
Associativity doesn't have anything to do with precedence. The question doesn't make sense.
But if you're just calculating an existing RPN expression, as your title says, I don't know why you're asking any of this. You just push the operands and evaluate the operators as they occur. Are you really asking about translation into RPN?
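For the evaluation case, "push the operands and evaluate the operators as they occur" might look like the sketch below. It assumes single-digit operands, well-formed input and no division by zero, and the name eval_rpn is hypothetical. Note that it contains no precedence or associativity table at all.

```c
#include <ctype.h>

/*
 * Hypothetical helper: evaluates an RPN string of single-digit operands
 * and the operators + - * / %. Spaces between tokens are optional.
 * Assumes well-formed input with at most 64 pending operands.
 */
int eval_rpn(const char *s)
{
    int stack[64];
    int top = 0;

    for (; *s != '\0'; s++) {
        char c = *s;
        if (isdigit((unsigned char)c)) {
            stack[top++] = c - '0';          /* push the operand */
        } else if (c == '+' || c == '-' || c == '*' || c == '/' || c == '%') {
            int rhs = stack[--top];          /* right operand is on top */
            int lhs = stack[--top];
            int result = 0;
            switch (c) {
            case '+': result = lhs + rhs; break;
            case '-': result = lhs - rhs; break;
            case '*': result = lhs * rhs; break;
            case '/': result = lhs / rhs; break;
            case '%': result = lhs % rhs; break;
            }
            stack[top++] = result;           /* push the result back */
        }
        /* any other character, such as a space, is skipped */
    }
    return stack[top - 1];
}
```

Here "1 2 + 3 +" evaluates to 6, "8 4 / 2 *" to 4 and "8 4 2 * /" to 1: the order of the tokens alone decides which operation is applied to which operands.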

About use of parentheses in C

int main(void)
{
    int a,b,c;
    c=(a,b);
}
This gives c=b while
c=a,b
gives c=a.
What is the reason for the above two?
In this line:
c=(a,b)
The parentheses mean, "evaluate the expression a,b first, then assign the value to c." In this case, b is assigned, because it's the right-hand-side expression of a,b. In C, comma expressions are evaluated left-to-right, with the overall value being that of the rightmost expression.
While in this line:
c=a,b
The assignment is evaluated as the entire left hand side first, which is c=a. This is because the equal = operator takes precedence over the comma , operator. Thus, b doesn't get assigned to c at all. It is equivalent to:
(c=a),b
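Both forms can be compared side by side in a small sketch. The function names are made up for illustration, and the (void) casts are an addition that is not in the question: they only mark the discarded operand as intentionally unused so that compilers do not warn, and they do not change how the expressions are grouped.

```c
/* c = (a, b): the parenthesized comma expression yields b. */
int comma_parenthesized(int a, int b)
{
    int c;
    c = ((void)a, b);
    return c;
}

/* c = a, b: parsed as (c = a), b, so c receives a and b is discarded. */
int comma_bare(int a, int b)
{
    int c;
    c = a, (void)b;
    return c;
}
```

With a = 1 and b = 2, the first function returns 2 and the second returns 1.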
In C, the comma operator evaluates the first operand, discards its value and then evaluates the right operand. So the outcome is the right operand. It also has the lowest precedence.
c = (a,b)
The parentheses group a,b into a single operand, so a,b is evaluated first. The result is b. So c = b.
But in c = a,b the assignment = has higher precedence than the comma. So c = a is evaluated first. Thus a is assigned to c.
Check this for further details.
