How does associativity affect the order of evaluation of operands? - c

In compiler design, suppose I have a grammar defined as
E-->E+E/E-E/id
T-->id
This grammar is left-recursive, and we can say that both the + and - operators are left-associative. So when the parse tree is constructed for an input like id+id-id, id+id is executed first and then id is subtracted from the result of the addition.
For an input string like id+id+id, the execution order would be (id+id)+id.
I don't understand this concept, because I have studied that the associativity of operators does not define the order of evaluation. If that is true, what about parse tree generation? If we are asked to compare two parse trees and pick the one that works properly for an input string like id+id-id, we choose the parse tree in which the subtree rooted at + is executed first and the subtree rooted at - is executed afterwards. So please clarify the actual parameters that decide the order of evaluation in a C program.

Associativity defines whether a - b - c is equivalent to (a - b) - c or a - (b - c), that is, whether c is subtracted from the result of subtracting b from a, or whether the result of b - c is subtracted from a. Associativity thus also tells you what the AST of the expression looks like.
What associativity does not tell you is which one of a, b and c is evaluated first. That is if you write f() - g() - h(), you know that it's equivalent to (f() - g()) - h() because subtraction is left-associative. However you do not know whether f is executed before g and/or h and so on. That's what people mean when they say that associativity does not define evaluation order.
clarify me the actual parameters which decide the order of evaluation in the c program.
The order of evaluation of operands in an arithmetic expression in a C program is unspecified. That is, it is completely up to the compiler.

Related

For the expression a+=b|=c, how will this expression run?

I'm learning C programming in university, and for a quiz the question above came. I would like to understand how it will execute. Does it have something to do with the order of precedence?
Yes it does, but that's only half the story.
To solve this one, you need to know two things:
the operator precedence of += and |=
if these are the same, the associativity of these operators (left-to-right or right-to-left)
Fortunately, there is a table at cppreference.
This tells us that:
both += and |= have the same precedence
their associativity is right-to-left
The answer to the quiz (as shown in your screenshot!) is therefore a += (b |= c), that is to say
b |= c is evaluated first and the result is then added to a.
But, as bolov points out, any self-respecting programmer would, at minimum, put the brackets in for you, or (ideally) code this as two separate statements.
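When the operators in an expression have the same precedence, associativity decides how they are grouped. For the assignment operators it is right-to-left, so the expression is resolved from the right side.
In other words, the result is the same as the following statements.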
b=b|c;
a=a+b;

Are innermost parentheses evaluated first in C?

Consider the below question in regard to this expression: (a+b)+ (c +(d+e)) +(f+g)
In C we use precedence and associativity to decide which operator should be evaluated first and if there is more than one operator with the same precedence, then in what order they should be associated.
My question is:
Is (d+e) evaluated first by the compiler as it is the innermost nested parenthesis in the expression?
If so, do associativity and precedence somehow recommend innermost parenthesis evaluation should be done first?
I strongly feel that it doesn't, and if it doesn't then why would the compiler even decide to evaluate the innermost parenthesis first? Because going from left to right and evaluating parentheses at the same level seems much more logical to me.
You are confusing operator precedence with order of evaluation. These are different though related terms. See What is the difference between operator precedence and order of evaluation?
Specifically:
Is (d+e) evaluated first by the compiler as it is the innermost nested parenthesis in the expression?
This depends on the order of evaluation of the + (additive) operators, nothing else. The order is unspecified for + (as it is for most operators) and we can't know it, nor should we rely on it. Compilers are allowed to do as they please and don't need to tell you (document) how and when they pick a certain order.
If so, do associativity and precedence somehow explain that innermost parenthesis evaluation should be done first?
No, this has nothing to do with operator precedence.
I strongly feel that it doesn't, and if it doesn't, then why would we decide to evaluate the innermost parenthesis first? Going from left to right and evaluating parentheses at the same level seems much more logical to me.
See 1)
Consider instead
(a() + b()) + (c() + (d() + e())) + (f() + g())
where each of the functions prints its letter and, of course, returns a value
int a(void) { putchar('a'); return 42; }
When that expression is evaluated, some ordering of the letters (for example "abcdefg") will be printed. Which ordering you get is determined by the order of evaluation, not by operator precedence or by the parentheses.
The order in which the operands of + are evaluated is not something the Standard has rules for. Each compiler implementation can do the evaluation as it likes best, even changing the order between compilations or runs.

During less than or equal to comparison what comparison is evaluated first?

When we have a simple condition (a<=b), what is actually happening? Will it first compare a<b and, if that is false, compare a==b (a<b || a==b)?
a <= b evaluates to true (1) if and only if a is less than or equal to b. In typical C implementations, this determination is performed via a single machine instruction. If, for some reason, multiple instructions are needed, the C standard does not specify any ordering for them, just that the result is correct.
If a and b are expressions beyond simple identifiers, the C standard does not specify any ordering for evaluation of them, their parts, or their side effects due to the <= operator, although there may be ordering constraints within the expressions.

C operator precedence

For my compiler class, we are gradually creating a pseudo-PASCAL compiler. It does, however, follow the same precedence as C. That being said, in the section where we create prefix and postfix operators, I get 0 for
int a = 1;
int b = 2;
++a - b++ - --b + a--
when C returns a 1. What I don't understand is how you can even get a 1. By doing straight prefix first, the answer should be 2. And by doing postfix first, the answer should be -2. By doing everything left to right, I get zero.
My question is, what should the precedence of my operators be to return a 1?
Operator precedence tells you, for example, whether ++a - b means (++a) - b or ++(a - b). Clearly it should be the former, since the latter isn't even valid. In your implementation it's clearly the former (or you wouldn't be getting a result at all), so you implemented operator precedence correctly.
Operator precedence has nothing to do with the order in which subexpressions are evaluated. In fact, the order in which the operands of + and - are evaluated is unspecified in C, and any code that modifies the same variable twice without a sequence point in between invokes undefined behavior. So whichever order you choose is fine, and 0 is as valid a result as any other value.
It is illegal to change variables several times in a row like that (roughly, between assignments; the standard talks about sequence points). Technically, this is what the C standard calls undefined behaviour. The compiler has no obligation to detect that you are writing nonsense, and can assume you never will. Anything whatsoever can happen when you run the program (or even while compiling). Also check nasal demons in the Jargon File.
The ++ increment and -- decrement operators can be placed before or after a value, with different effect. If placed before the operand (prefix), its value is immediately changed; if placed after the operand (postfix), its value is noted first, then the value is changed.
McGrath, Mike. (2006). C programming in easy steps, 2nd Edition. United Kingdom : Computer Step.

Lazy arithmetic in C

As far as I know, C uses lazy evaluation for logical expressions, e.g. in the expression
f(x) && g(x)
g(x) will not be called if f(x) is false.
But what about arithmetic expressions like
f(x)*g(x)
Will g(x) be called if f(x) is zero?
Yes, arithmetic operations are eager, not lazy.
So in f(x)*g(x) both f and g are always called (pedantically the compiler is transforming that into some A-normal form and could even avoid some calls if that is not observable), but there is no guarantee about the order of calling f before or after g. And evaluating x*1/x or y*1/x is undefined behavior when x is 0.
This is not true in Haskell AFAIU
Yes, g(x) will still be called.
Generally, it would be quite slow to conditionally elide the evaluation of the right-hand side just because the left-hand side is zero. Perhaps not in the case where the right-hand side is an expensive function call, but the compiler wouldn't presume to know that.
It's called "Short Circuit" instead of lazy. And, at least as far as the standard cares, yes -- i.e., it doesn't specify short-circuit evaluation for *.
A compiler might be able to do short-circuit evaluation if it can be certain g() has no side effects, but only under the as-if rule (i.e., it can do so only by finding that there's no externally observable difference, not because the standard gives it any direct permission to do so).
In the case of the logical operators && and ||, evaluation is bound to take place from left to right, and short-circuiting takes place.
There is a sequence point between the evaluation of the left and right operands of && (logical AND) and || (logical OR), as part of short-circuit evaluation. For example, in the expression *p++ != 0 && *q++ != 0, all side effects of the sub-expression *p++ != 0 are completed before any attempt to access q. This is not the case for arithmetic operators.
While that optimization would be possible, there are a few arguments against it:
You might pay more for the optimization than you get back from it: Unlike with logical operators, the optimization is likely to be beneficial in only a small percentage of all cases with arithmetic operators, but at the same time requires an additional check for 0 for every operation.
Because boolean truth values only have two possible values, there is a theoretical 50 % chance (1 ÷ 2) with short-circuiting boolean expressions that the second operand will not have to be evaluated. (This assumes uniform distribution, which is perhaps not realistic, but bear with me.) That is, you are likely to profit from the optimization in a relatively large percentage of cases.
Contrast this with integral numbers, where 0 is only one out of millions of possible values. The probability that the first operand is 0 is much lower: 1 ÷ 2^32 (for 32-bit integers, again assuming uniform distribution). Even if 0 were in fact somewhat more probable to occur than that (i.e. with a non-uniform distribution), it's still unlikely that we're dealing with the same order of magnitude as with truth values.
Floating point math further aggravates that issue. Here you need to deal with the possibility of rounding errors and denormalization. The probability that some calculation yields exactly 0 is likely to be even lower than with integral numbers.
Therefore the optimization is relatively unlikely to result in the remaining operand not being evaluated. But it will result in an added check for zero, 100 % of the time!
If you want evaluation rules to remain reasonably consistent, you would have to redefine the short-circuit evaluation order of && and ||: Division has one important corner case, namely division by 0. Even if the first operand is 0, the quotient is not necessarily 0. Division by 0 is to be treated as an error (except perhaps in IEEE floating-point math); therefore, you always have to evaluate the second operand in order to determine whether the calculation is valid.
There is one alternative optimization for /: division by 1. In that case, you wouldn't have to divide at all, but simply return the first operand. / would therefore be better optimised by starting with the second operand (divisor).
Now, unless you want &&, ||, and * to start evaluation with the first operand, but / to start with the second (which might seem unintuitive), you would have to generally re-define short-circuiting behavior such that the second operand always gets evaluated first, which would be a departure from the status quo.
This is not per se a problem, but might break a lot of existing code if the C language were thus changed.
The optimization might break "compatibility" with C++ code where operators can be overloaded. Would the optimizations still apply to overloaded * and / operators? Or would there have to be two different forms of these operators, one short-circuiting, and one with eager evaluation?
Again, this is not a deficiency inherent in short-circuit arithmetic operators, but an issue that would arise if such short-circuiting were introduced into the C (and C++) language as a breaking change.

Resources