Precedence of operators in RPN - c

While evaluating a postfix expression in C, if the token is an operator, we have to place it on the stack in such a way that it has the highest priority.
My question is among the operators *, /, %, which has the highest priority.
Do we need to consider associativity as well ? Since all these operators have LEFT-TO-RIGHT associativity, will / get higher preference over * ?

Precedence usually only applies to infix notations. Postfix (and Prefix) notations are usually considered to explicitly specify which operands are associated with which operator. Precedence only comes into play when there is ambiguity in the parsing, which is not the case in postfix notation.
The precedence question that arises in an infix expression
4 * 5 + 3 / 12
simply doesn't exist after conversion to an RPN form
4 5 * 3 12 / +
or a prefix form
(+ (* 4 5) (/ 3 12))
There is some possibility for confusion when considering something like the Shunting-Yard Algorithm, which can be used to generate an RPN representation from an infix expression, or to evaluate the infix expression directly. It deals with operator precedence by deferring operators onto a secondary stack until an incoming operator of lower precedence (or of equal precedence, for left-associative operators) forces them to be popped and evaluated (or output).

The operators *, /, and % have the same precedence, and their associativity is left to right. So an expression like:
a * b / c /* both operators have same precedence */
is same as:
(a * b) / c
Similarly an expression like:
a / b * c /* both operators have same precedence */
is same as:
( a / b ) * c
So although these operators have the same precedence, when they appear together in an expression without parentheses, the leftmost operator is applied first because of left-to-right associativity.
Note: Conceptually, we use parentheses in an expression to override the precedence of operators. So although the expression a / b * c is the same as (a / b) * c, we can force * to be applied first by writing the expression as a / (b * c). What I mean to say is: if operator precedence confuses you while writing code, use parentheses.
EDIT:
Postfix and prefix forms don't use parentheses ( ). The order in which operators are applied is decided by the order of their appearance in the expression, so while evaluating an expression there is no need to search for the next operation to perform, and evaluation is fast.
In an infix expression, by contrast, the precedence of operators can be overridden by brackets ( ). The evaluator has to work out which operation to perform next, e.g. in a + b % d, so evaluating the expression directly is slower.
That is the reason such conversions are useful in computer science.
So a compiler can first translate an infix expression into an equivalent postfix form (using grammar rules) and then generate target code to evaluate the expression. That is why we study postfix and prefix forms.
And according to precedence and associativity rules the following expression:
a * b / c /* both operators have same precedence */
will be translated into:
a b * c /
And expression
a / b * c /* both operators have same precedence */
will be translated into
a b / c *

My question is among the operators *, /, %, which has the highest priority.
They are equal, just as + and - (binary) are equal.
Do we need to consider associativity as well?
Yes, for example 1 + 2 + 3 needs to become (1 + 2) + 3, i.e. 1, 2, ADD, 3, ADD, as opposed to 1, 2, 3, ADD, ADD.
Since all these operators have LEFT-TO-RIGHT associativity, will / get higher preference over * ?
Associativity doesn't have anything to do with precedence. The question doesn't make sense.
But if you're just calculating an existing RPN expression, as your title says, I don't know why you're asking any of this. You just push the operands and evaluate the operators as they occur. Are you really asking about translation into RPN?

Related

Short circuit and operator precedence in C

I know that logical operators in C follow short circuiting but my doubt is that are short circuiting and operator precedence rules not opposing each other. See the below example :
#include<stdio.h>
int main()
{
int a;
int b=5;
a=0 && --b;
printf("%d %d",a,b);
return 0;
}
According to the precedence rules, the highest precedence is of the prefix operator. So --b should be evaluated first and then the && and at last result will be assigned to a. So expected output should be 0 4. But in this case the second operand of && never actually executes and result comes out to be 0 5.
Why are the precedence rules not being applied here? Are logical operators exempt from precedence rules? If yes, what other operators show such behavior? And what is the logic behind this behavior?
You're conflating two related but different topics: operator precedence and order of evaluation.
The operator precedence rules dictate how various operators are grouped together. In the case of this expression:
a=0 && --b;
The operators are grouped like this:
a = (0 && (--b));
This has no effect however on which order the operands are evaluated in. The && operator in particular dictates that the left operand is evaluated first, and if it evaluates to 0 the right operand is not evaluated.
So in this case the left side of &&, which is 0, is evaluated, and because it is 0 the right side, which is --b, is not evaluated, so b is not decremented.
Here's another example of the difference between operator precedence and order of evaluation.
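#include <stdio.h>

/* Doubles a static counter on every call: returns 4, then 8, ... */
int val(void)
{
    static int x = 2;
    x *= 2;
    return x;
}

int main(void)
{
    /* Which of the two val() calls runs first is unspecified. */
    int result = val() + (5 * val());
    printf("%d\n", result);
    return 0;
}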
What will the above program print? As it turns out, there are two possibilities, and both are valid.
In this expression:
val() + (5 * val())
There are no operators that have any type of short circuit behavior. So the compiler is free to evaluate the individual operands of both + and * in any order.
If the first instance of val() is evaluated first, the result will be 4 + ( 5 * 8) == 44. If the second instance of val() is evaluated first, the result will be 8 + (5 * 4) == 28. Again, both are valid since the operands may be evaluated in any order.
Precedence affects how ambiguous expressions are parsed. When there are multiple ways to interpret an expression with several operators, precedence tells us which interpretation is correct. Think of precedence as a mechanism to figure out where the implied parentheses are.
For example in the statement in question there are two valid ways to parse it. If = had higher precedence than && it could be read as:
(a = 0) && --b;
But since && has higher precedence, it's actually interpreted as:
a = (0 && --b);
(Note: Your code's formatting suggests it's the first. Be careful not to mislead!)
Evaluation order is different from precedence. They're related, but independent concepts. After precedence is used to determine the correct parsing of an expression, evaluation order tells us the order to evaluate the operands in. Is it left to right? Right to left? Simultaneous? Unspecified?
For the most part evaluation order is left unspecified. Operators like + and * and << have no defined evaluation order. The compiler is allowed to do whatever it likes, and the programmer must not write code that depends on any particular order. a + b could evaluate a then b, or b then a, or it could even interweave their evaluations.
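&& is an exception: its operands are evaluated left to right with short circuiting. For =, the store into the left operand happens only after both operands have been evaluated, but C does not otherwise order the evaluation of the two operands.
Here's how evaluation proceeds step-by-step for our statement:
a = (0 && --b), the value of the right operand is needed before the store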
0 && --b, && evaluated left to right with short circuiting
0, evaluates false which triggers short circuiting and cancels the next step
--b, not evaluated due to short circuiting
result is 0
a, variable reference evaluated
a = 0, assignment occurs and overall result is 0
You said that there is no specific order for + and *, but this table shows the order to be left to right. Why so?
The last column of that table is associativity. Associativity breaks precedence ties when we use the same operator twice, or when we use operators with the same precedence.
For example, how should we read a / b / c. Is it:
(a / b) / c, or
a / (b / c)?
According to the table / has left-to-right associativity, so it's the first.
What about chained assignments like foo = bar = baz? Now, assignment has right-to-left associativity, so the correct parsing is foo = (bar = baz).
If this all gets confusing, focus on one simple rule of thumb:
"Precedence and associativity are independent from order of evaluation."
Operator precedence doesn't necessarily tell that an expression gets executed first, it just means that the expression is parsed such that the result of the higher-precedence operation is used in the lower-precedence operation and not the other way around. The actual expressions only get evaluated if they need to!
The && operator's order of evaluation is left to right.
= has lower precedence; in fact, only the comma operator has lower precedence than =.
So the expression is read as a = (0 && --b), with 0 evaluated first given the order of evaluation mentioned above.
Since 0 evaluates to false, there is no need to evaluate the second part of the expression: false && anything is false, so once the first part is false the whole expression is false.
If you had the || operator instead, the second part of the expression would have to be evaluated.
Operator precedence is not all that is in play. There is also order of evaluation. And that mandates that the left operand of &&, the constant 0, is evaluated first (for &&, evaluation order is left to right), and then the right part after the && is not evaluated at all.
That is how C works.

Beginner in need of a simple explanation of the difference between order of evaluation and precedence/associativity

I am reading the end of the 2nd chapter of K&R and I'm having some difficulty understanding two specific unrelated example lines of code (which follow) along with commentary of them in the book:
x = f() + g();
a[i] = i++;
FIRST LINE - I have no trouble understanding that the standard does not specify the order of evaluation for the + operator, and that therefore it is unspecified whether f() or g() evaluates first (and that is why I think the question isn't a duplicate). My confusion stems from the fact that if we look up the C operator precedence chart it cites function calls as of highest precedence with left-to-right associativity. Now doesn't that mean that f() has to be called/evaluated before g()? Obviously not, but I don't know what I am missing.
SECOND LINE - Again, a similar conundrum regarding whether the array is indexed with the initial value of i or the incremented value. However, again the operator precedence chart cites array subscripting as of highest precedence with left-to-right associativity. Therefore wouldn't array subscripting be the first thing to be evaluated, causing the array to be subscripted with the initial value of i and removing any ambiguity? Obviously not, and I'm missing something.
I do understand that compilers have the freedom to decide when side effects happen in an expression (between sequence points of course) and that that may cause undefined behaviour if the variable in question is used again in the same expression, however in the examples above it seems that any ambiguity is cleared by function calls and array subscripting having highest precedence and defined left-to-right associativity, so I fail to see the ambiguity.
I have a feeling that I have some fundamental misconception about the concepts of associativity, operator precedence and order of evaluation, but I can't point my finger on what it is, and similar questions/answers on this topic were out of my league to understand thoroughly at this point.
FIRST LINE
The left-to-right associativity means that an expression such as f()()() is evaluated as ((f())())(). The associativity of the function call operator () says nothing about its relationship with other operators such as +.
(Note that associativity only really makes sense for nestable infix operators such as binary +, %, or ,. For operators such as function call or the unary ones, associativity is rather pointless in general.)
SECOND LINE
Operator precedence affects parsing, not order of evaluation. The fact that [] has higher precedence than = means that the expression is parsed as (a[i]) = (i++). It says very little about evaluation order; a[i] and i++ must both be evaluated before the assignment, but nothing is said about their order with respect to each other.
To hopefully clear up confusion:
Associativity controls parsing and tells you whether a + b + c is parsed as (a + b) + c (left-to-right) or as a + (b + c) (right-to-left).
Precedence also controls parsing and tells you whether a + b * c is parsed as (a + b) * c (+ has higher precedence than *) or as a + (b * c) (* has higher precedence than +).
Order of evaluation controls which values need to be evaluated in which order. Parts of it can follow from associativity or precedence (an operand must be evaluated before it's used), but it's seldom fully defined by them.
It's not really meaningful to say that function calls have left-to-right associativity, and even if it were meaningful, this would only apply to exotic combinations where two function-call operators were being applied right next to each other. It wouldn't say anything about two separate function calls on either side of a + operator.
Precedence and associativity don't help us at all in the expression a[i] = i++. There simply is no rule that says precisely when within an expression i++ stores the new result back into i, meaning that there is no rule to tell us whether the a[i] part uses the old or the new value. That's why this expression is undefined.
Precedence tells you what happens when you have two different operators that might apply. In a + b * c, does the + or the * apply first? In *p++, does the * or the ++ apply first? Precedence answers these questions.
Associativity tells you what happens when you have two of the same operators that might apply (generally, a string of the same operators in a row). In a + b + c, which + applies first? That's what associativity answers.
But the answers to these questions (that is, the answers supplied by the precedence and associativity rules) apply rather narrowly. They tell you which of the two operators you were wondering about apply first, but they do not tell you much of anything about the bigger expression, or about the smaller subexpressions "underneath" the operators you were wondering about. (For example, if I wrote (a - b) + (c - d) * (e - f), there's no rule to say which of the subtractions happens first.)
The bottom line is that precedence and associativity do not fully determine order of evaluation. Let's say that again in a slightly different way: precedence and associativity partially determine the order of evaluation in certain expressions, but they do not fully determine the order of evaluation in all expressions.
In C, some aspects of the order of evaluation are unspecified, and some are undefined. (This is by contrast to, as I understand it, Java, where all aspects of evaluation order are defined.)
See also this answer which, although it's about a different question, explains the same points in more detail.
Precedence and associativity matter when an expression has more than one operator.
Associativity doesn't matter with addition, because as you may remember from grade school math, addition is commutative and associative -- there's no difference between (a + b) + c, a + (b + c), or (b + c) + a (but see the Note at the end of my answer).
But consider subtraction. If you write
100 - 50 - 5
it matters whether you treat this as
(100 - 50) - 5 = 45
or
100 - (50 - 5) = 55
Left associativity means that the first interpretation will be used.
Precedence comes into play when you have different operators, e.g.
10 * 20 + 5
Since * has higher precedence than +, this is treated like
(10 * 20) + 5 = 205
rather than
10 * (20 + 5) = 250
Finally, order of evaluation is only noticeable when there are side effects or other dependencies between the sub-expressions. If you write
x = f() - g() - h()
and these functions each print something, the language doesn't specify the order in which the output will occur. Associativity doesn't change this. Even though the results will be subtracted in left-to-right order, it could call them in a different order, save the results somewhere, and then subtract them in the correct order. E.g. it could act as if you'd written:
temp_h = h();
temp_f = f();
temp_g = g();
x = (temp_f - temp_g) - temp_h;
Any reordering of the first 3 lines would be allowed as an interpretation.
Note
Note that in some cases, computer arithmetic is not exactly like real arithmetic. Numbers in computers generally have limited range or precision, so there can be anomalous results (e.g. overflow if the result of addition is too large). This could cause different results depending on the order of operations even with operators that are theoretically associative, e.g. mathematically the following two expressions are equivalent:
x + y - z = (x + y) - z
y - z + x = (y - z) + x
But if x + y overflows, the results can be different. Use explicit parentheses to override the default associativity if necessary to avoid a problem like this.
Regarding your first question:
x = f() + g();
The left-to-right associativity relates to operators at the same level that are directly grouped together. For example:
x = a + b - c;
Here the + and - operators have the same precedence level, so the expression is grouped as (a + b) - c: a + b is computed first, and then c is subtracted from that result.
For an example more related to yours, imagine a function that returns a function pointer. You could then do something like this:
x()();
In this case, the function x must be called first, then the function pointer returned by x is called.
For the second:
a[i] = i++;
The side effect of the postincrement operator is not guaranteed to occur until the next sequence point. Because there are no sequence points in this expression, the i on the left side may be evaluated before or after the side effect of ++. This invokes undefined behavior due to both reading and writing a variable without a sequence point.
FIRST LINE - Associativity is not relevant here. Associativity only really comes into play when you have a sequence of operators with the same precedence. Let's take the expression x + y - z. The additive operators + and - are left-associative, so that sequence is parsed as (x + y) - z - IOW, the result of z is subtracted from the result of x + y.
THIS DOES NOT MEAN that any of x, y, or z have to be evaluated in any particular order. It does not mean that x + y must be evaluated before z. It only means that the result of x + y must be known before the result of z is subtracted from it.
Regarding x = f() + g(), all that matters is that the results of f() and g() are known before they can be added together - it does not mean that f() must be evaluated before g(). And again, associativity has no effect here.
SECOND LINE - This statement invokes undefined behavior precisely because the order of operations is unspecified (strictly speaking, the expressions a[i] and i++ are unsequenced with respect to each other). You cannot both update an object (i++) and use its value in a computation (a[i]) in the same expression without an intervening sequence point. The result will not be consistent or predictable from build to build (it doesn't even have to be consistent from run to run of the same build). Expressions like a[i] = i++ (or a[i++] = i) and x = x++ all have undefined behavior, and the result can be quite literally anything.
Note that the &&, ||, ?:, and comma operators do force left-to-right evaluation and introduce sequence points, so an expression like
i++ && a[i]
is well-defined - i++ will be evaluated first and its side effect will be applied before a[i] is evaluated.
Precedence and associativity fall out of the language grammar - for example, the grammar for the additive operators + and - is
additive-expression:
multiplicative-expression
additive-expression + multiplicative-expression
additive-expression - multiplicative-expression
IOW, an additive-expression can produce a single multiplicative-expression, or it can produce another additive-expression followed by an additive operator followed by a multiplicative-expression. Let's see how this plays out with x + y - z:
x -- additive-expression ---------+
                                  |
+                                 +-- additive-expression --+
                                  |                         |
y -- multiplicative-expression ---+                         |
                                                            +-- additive-expression
-                                                           |
                                                            |
z -- multiplicative-expression -----------------------------+
You can see that x + y is grouped together into an additive-expression first, and then that expression is grouped with z to form another additive-expression.

Postfix/Prefix operator precedence and associativity

I'm confused about the precedence and associativity of postfix/prefix operators.
On one hand, as I'm reading K&R book, it states that:
(*ip)++
The parentheses are necessary in this last example; without them, the expression would increment ip instead of what it points to, because unary operators like * and ++ associate right to left.
No mention whatsoever of a difference of associativity between postfix/prefix operators. Both are treated equally. The book also states that * and ++ have the same precedence.
On the other hand, this page states that:
1) Precedence of prefix ++ and * is same. Associativity of both is right to left.
2) Precedence of postfix ++ is higher than both * and prefix ++. Associativity of postfix ++ is left to right.
Which one should I trust? Is it something that changed with the C revisions over the years?
TL;DR: the two descriptions are saying the same thing, using the same words and symbols with slightly different meaning.
On one hand, as I'm reading K&R book, it states that:
(*ip)++
The parentheses are necessary in this last example; without them, the expression would increment ip instead of what it points to,
because unary operators like * and ++ associate right to left.
No mention whatsoever of a difference of associativity between
postfix/prefix operators. Both are treated equally. The book also
states that * and ++ have the same precedence.
It's unclear which edition of K&R you're reading, but the first, at least, does treat the prefix and postfix versions of the increment and decrement operators as a single operator each, with effects depending on whether their operand precedes or follows them.
On the other hand, this page states that:
1) Precedence of prefix ++ and * is same. Associativity of both is
right to left.
2) Precedence of postfix ++ is higher than both * and prefix ++.
Associativity of postfix ++ is left to right.
The language standard and most modern treatments describe the prefix and postfix versions as different operators, disambiguated by their position relative to their operand. The rest of this answer explains how this is an alternative description of the same thing.
Observe that when only unary operators are involved, associativity questions arise only between one prefix and one postfix operator of the same precedence. Among a chain of only prefix or only postfix operations, there is no ambiguity with respect to how they associate. For example, given - - x, you cannot meaningfully group it as (- -) x. The only alternative is - (- x).
Next, observe that all the highest-precedence operators are postfix unary operators, and that in K&R, all the second-precedence operators are prefix unary operators except ambi-fix ++ and --. Applying right-to-left associativity to the second-precedence operators, then, disambiguates only expressions involving postfix ++ or -- and a prefix unary operator, and does so in favor of the postfix operator. This is equivalent to the modern approach of distinguishing the postfix and prefix versions of those operators and assigning higher precedence to the postfix versions.
To get the rest of the way to the modern description, consider the observations I already made that associativity questions arise for unary operators only when prefix and postfix operators are chained, and that all the highest-precedence operators are postfix unary operators. Having distinguished postfix ++ and -- as separate, higher-precedence operators than their prefix versions, one could put them in their own tier between the other postfix operators and all the prefix operators, but putting them instead in the same tier with all the other postfix operators changes nothing about how any expression is interpreted, and is simpler. That's how it is usually represented these days, including in your second resource.
As for left-to-right vs. right-to-left associativity, the question is, again, moot for a precedence tier containing only prefix or only postfix operators. However, describing postfix operators as associating left-to-right and prefix operators as associating right-to-left is consistent with their semantic order of operations.
You can refer to the C11 standard, although its section on precedence is a little hard to follow. See sec. 6.5.1. (Footnote 85 says "The syntax specifies the precedence of operators in the evaluation of an expression, which is the same as the order of the major subclauses of this subclause, highest precedence first.")
Basically, postfix operators are higher precedence than prefix because they come earlier in that section, 6.5.2.4 vs. 6.5.3.1. So K&R is correct (no surprise there!) that *ip++ means *(ip++), which is different from (*ip)++, however its point about it being due to associativity is a bit misleading I'd say. And the geeksforgeeks site's point #2 is also correct.
@GaryO's answer is spot on! Postfix operators have higher precedence because they come earlier in the standard.
Here's a small test you can run to convince yourself.
I made two integer arrays and a pointer to the start of each array, then ran (*p)++ and *p++ on the two pointers. I printed out the pointer and array state before and after for reference.
#include <stdio.h>

/* Print both arrays; relies on a and b being in scope. */
#define PRINT_ARRS printf("a = {%d, %d, %d}\n", a[0], a[1], a[2]); \
                   printf("b = {%d, %d, %d}\n\n", b[0], b[1], b[2]);
/* Print where each pointer points; %td is the format for ptrdiff_t. */
#define PRINT_PTRS printf("*p1 = a[%td] = %d\n", p1 - a, *p1); \
                   printf("*p2 = b[%td] = %d\n\n", p2 - b, *p2);

int main(void)
{
    int a[3] = {1, 1, 1};
    int b[3] = {10, 10, 10};
    int *p1 = a;
    int *p2 = b;

    PRINT_ARRS
    PRINT_PTRS
    printf("(*p1)++: %d\n", (*p1)++);   /* increments a[0] */
    printf("*p2++ : %d\n\n", *p2++);    /* advances p2 */
    PRINT_ARRS
    PRINT_PTRS
    return 0;
}
Compiling with gcc and running on my machine produces:
a = {1, 1, 1}
b = {10, 10, 10}
*p1 = a[0] = 1
*p2 = b[0] = 10
(*p1)++: 1
*p2++ : 10
a = {2, 1, 1}
b = {10, 10, 10}
*p1 = a[0] = 2
*p2 = b[1] = 10
You can see that (*p1)++ increments the array value while *p2++ increments the pointer.

see about this infix to postfix conversion

If we convert an infix expression to a postfix expression with braces and without braces, will both of them produce the same result?
Example:
((2+8)*9)-(5*(5+2))
2+8*9-5*5+2
Will both of these examples produce the same results? If No, then why not?
In general, no: they will not produce the same result, because of operator precedence. For example, if you define * to have higher precedence than + and -, then without the braces each * binds its neighbouring operands before + or - does, which changes how the expression is grouped and therefore also changes the postfix representation.

Why are there different associativities among operators in C?

Talking about the associativity of operators in C, I was wondering why there are different associativities among operators that have the same precedence. For example, postfix increment and postfix decrement have left associativity, while prefix increment and prefix decrement have right associativity. Isn't it simple to have just left or right associativity for all the same precedence operators?
Are there any reasons behind that?
Isn't it simple to have just left or right associativity for all the
same precedence operators?
Yes, and that is the case in C. Maybe you assumed that prefix and postfix have the same precedence, which is wrong. Postfix has higher precedence than prefix!
Also there is another curious case to consider as to why certain operators have certain associativity. From Wiki,
For example, in C, the assignment a = b is an expression that returns
a value (namely, b converted to the type of a) with the side effect of
setting a to this value. An assignment can be performed in the middle
of an expression. (An expression can be made into a statement by
following it with a semicolon; i.e. a = b is an expression but a = b;
is a statement). The right-associativity of the = operator allows
expressions such as a = b = c to be interpreted as a = (b = c),
thereby setting both a and b to the value of c. The alternative (a =
b) = c does not make sense because a = b is not an lvalue.
Binary operators are all left-associative except the assignment operators, which are right-associative.
Postfix operators are sometimes (for example in K&R 2nd) said to be right-associative, but this is to express the idea that they have higher precedence than unary operators.
