For the expression a+=b|=c, how will this expression run? - c

I'm learning C programming in university, and for a quiz the question above came. I would like to understand how it will execute. Does it have something to do with the order of precedence?

Yes it does, but that's only half the story.
To solve this one, you need to know two things:
the operator precedence of += and |=
if these are the same, the associativity of these operators (left-to-right or right-to-left)
Fortunately, there is a table at cppreference.
This tells us that:
both += and |= have the same precedence
their associativity is right-to-left
The answer to the quiz (as shown in your screenshot!) is therefore a += (b |= c), that is to say
b |= c is evaluated first and the result is then added to a.
But, as bolov points out, any self-respecting programmer would, at minimum, put the brackets in for you, or (ideally) code this as two separate statements.

When the calculation formulas have the same priority.
It will be resolved from the right side.
In other words, the result is the same as the following formula.
b=b|c;
a=a+b;

Related

Are innermost parentheses evaluated first in C?

Consider the below question in regard to this expression: (a+b)+ (c +(d+e)) +(f+g)
In C we use precedence and associativity to decide which operator should be evaluated first and if there is more than one operator with the same precedence, then in what order they should be associated.
My question is:
Is (d+e) evaluated first by the compiler as it is the innermost nested parenthesis in the expression?
If so, do associativity and precedence somehow recommend innermost parenthesis evaluation should be done first?
I strongly feel that it doesn't, and if it doesn't then why would the compiler even decide to evaluate the innermost parenthesis first? Because going from left to right and evaluating parentheses at the same level seems much more logical to me.
You are confusing operator precedence with order of evaluation. These are different though related terms. See What is the difference between operator precedence and order of evaluation?
Specifically:
Do (d+e) is evaluated first by compiler as it is innermost nested parenthesis in the expression?
This depends on the order of evaluation of the + (additive) operators, nothing else. The order is unspecified for + (as it is for most operators) and we can't know it, nor should we rely on it. Compilers are allowed to do as they please and don't need to tell you (document) how and when they pick a certain order.
If so does associativity and precedence somehow explains innermost parenthesis evaluation should be done first?
No, this has nothing to do with operator precedence.
I strongly feel that it doesn't and if it don't then why we decided to evaluate innermost parenthesis first, because going from left to right and evaluating parenthesis at same level seems much logical to me
See 1)
Consider instead
(a() + b()) + (c() + (d() + e())) + (f() + g())
where each of the functions prints it's letter and, of course, returns a value
int a(void) { putchar('a'); return 42; }
When that expression is evaluated, something like "abcdefg" will be printed according to "order of evaluation", regardless of the operator precedence.
The order of evaluation is not something the Standard has rules for. Each compiler implementation can do the evaluation as it likes best ... even changing the order between compilations or runs.

Which operator(s) in C have wrong precedence?

In the "Introduction" section of K&R C (2E) there is this paragraph:
C, like any other language, has its blemishes. Some of the operators have the wrong precedence; ...
Which operators are these? How are their precedence wrong?
Is this one of these cases?
Yes, the situation discussed in the message you link to is the primary gripe with the precedence of operators in C.
Historically, C developed without &&. To perform a logical AND operation, people would use the bitwise AND, so a==b AND c==d would be expressed with a==b & c==d. To facilitate this, == had higher precedence than &. Although && was added to the language later, & was stuck with its precedence below ==.
In general, people might like to write expressions such as (x&y) == 1 much more often than x & (y==1). So it would be nicer if & had higher precedence than ==. Hence people are dissatisfied with this aspect of C operator precedence.
This applies generally to &, ^, and | having lower precedence than ==, !=, <, >, <=, and >=.
There is a clear rule of precedence that is incontrovertible.
The rule is so clear that for a strongly typed system (think Pascal) the wrong precedence would give clear unambiguous syntax errors at compile time. The problem with C is that since its type system is laissez faire the errors turn out to be more logical errors resulting in bugs rather than errors catch-able at compile time.
The Rule
Let ○ □ be two operators with type
○ : α × α → β
□ : β × β → γ
and α and γ are distinct types.
Then
x ○ y □ z can only mean (x ○ y) □ z, with type assignment
x: α, y : α, z : β
whereas x ○ (y □ z) would be a type error because ○ can only take an α whereas the right sub-expression can only produce a γ which is not α
Now lets
Apply this to C
For the most part C gets it right
(==) : number × number → boolean
(&&) : boolean × boolean → boolean
so && should be below == and it is so
Likewise
(+) : number × number → number
(==) : number × number → boolean
and so (+) must be above (==) which is once again correct
However in the case of bitwise operators
the &/| of two bit-patterns aka numbers produce a number
ie
(&), (|) : number × number → number
(==) : number × number → boolean
And so a typical mask query eg. x & 0x777 == 0x777
can only make sense if (&) is treated as an arithmetic operator ie above (==)
C puts it below which in light of the above type rules is wrong
Of course Ive expressed the above in terms of math/type-inference
In more pragmatic C terms x & 0x777 == 0x777 naturally groups as
x & (0x777 == 0x777) (in the absence of explicit parenthesis)
When can such a grouping have a legitimate use?
I (personally) dont believe there is any
IOW Dennis Ritchie's informal statement that these precedences are wrong can be given a more formal justification
Wrong may sound a bit too harsh. Normal people generally only care about the basic operators like +-*/^ and if those don't work like how they write in math, that may be called wrong. Fortunately those are "in order" in C (except power operator which doesn't exist)
However there are some other operators that might not work as many people expect. For example the bitwise operators have lower precedence than comparison operators, which was already mentioned by Eric Postpischil. That's less convenient but still not quite "wrong" because there wasn't any defined standard for them before. They've just been invented in the last century during the advent of computers
Another example is the shift operators << >> which have lower precedence than +-. Shifting is thought as multiplication and division, so people may expect that it should be at a higher level than +-. Writing x << a + b may make many people think that it's x*2a + b until they look at the precedence table. Besides (x << 2) + (x << 4) + (y << 6) is also less convenient than simple additions without parentheses. Golang is one of the languages that fixed this by putting <</>> at a higher precedence than + and -
In other languages there are many real examples of "wrong" precedence
One example is T-SQL where -100/-100*10 = 0
PHP with the wrong associativity of ternary operators
Excel with wrong precedence (lower than unary minus) and associativity (left-to-right instead of right-to-left) of ^:
According to Excel, 4^3^2 = (4^3)^2. Is this really the standard mathematical convention for the order of exponentiation?
Why does =-x^2+x for x=3 in Excel result in 12 instead of -6?
Why is it that Microsoft Excel says that 8^(-1^(-8^7))) = 8 instead of 1/8?
It depends which precedence convention is considered "correct". There's no law of physics (or of the land) requiring precedence to be a certain way; it's evolved through practice over time.
In mathematics, operator precedence is usually taken as "BODMAS" (Brackets, Order, Division, Multiplication, Addition, Subtraction). Brackets come first and Subtraction comes last.Ordering Mathematical Operations | BODMAS Order of operations
Operator precedence in programming requires more rules as there are more operators, but you can distil out how it compares to BODMAS.
The ANSI C precedence scheme is pictured here:
As you can see, Unary Addition and Subtraction are at level 2 - ABOVE Multiplication and Division in level 3. This can be confusing to a mathematician on a superficial reading, as can precedence around suffix/postfix increment and decrement.
To that extent, it is ALWAYS worth considering adding brackets in your mathematical code - even where syntactically unnecessary - to make sure to a HUMAN reader that your intention is clear. You lose nothing by doing it (although you might get flamed a bit by an uptight code reviewer, in which you can flame back about coding risk management). You might lose readability, but intention is always more important when debugging.
And yes, the link you provide is a good example. Countless expensive production errors have resulted from this.

How does associativity affects the order of evaluation of operands?

In compiler design ,If I have a grammar defined as
E-->E+E/E-E/id
T-->id
Now since this grammar is left-recursive and also we can say that both the + and - operators are left-associative so then when the parse tree would be constructed ,so if I have an input like id+id-id ,so then first id+id would be executed and then the result of addition would subtract id .
And if I have an input string like id+id+id ,then in that case execution order would be (id+id)+id .
I am not getting this concept as I have studied that Associativity of operators do not define the order of evaluation ,if that is so true then what about the parse tree generation because if we are asked to compare two parse trees and find which one would work properly if say I have an input string like id+id-id,then we would chose the parse tree wherein we have the order of evaluation such that the subtree which is rooted at node + would be executed first and then the subtree rooted at - would be executed first ,so please clarify me the actual parameters which decide the order of evaluation in the c program.
The associativity defines whether a - b - c is equivalent to (a - b) - c or a - (b - c), that is whether c is added to the result of adding b to a or whether the result if b + c is added to a. Associativity thus also tells you what the AST of the expression looks like.
What associativity does not tell you is which one of a, b and c is evaluated first. That is if you write f() - g() - h(), you know that it's equivalent to (f() - g()) - h() because subtraction is left-associative. However you do not know whether f is executed before g and/or h and so on. That's what people mean when they say that associativity does not define evaluation order.
clarify me the actual parameters which decide the order of evaluation in the c program.
The order of evaluation of operands in an arithmetic expression in a C program is undefined. That is it is completely up to the compiler.

Evaluating expressions with operators

First, I know I know. This question has kind of been asked some times before, but most of the answers got on other topics only partly answer my question.
I'm doing something which can parse C like expressions.
That includes expressions for example like (some examples)
1) struct1.struct2.structarray[283].shd->_var
2) *((*array_dptr)[2][1] + 5)
3) struct1.struct2.struct3.var + b * c / 3 % 5
Problem is... I need to be fast on this. The fastest possible, even if it makes the code ugly - well, obviously, the speed improvement must be tangible. The reason is that it is interpreted. It needs to be fast...
I have many questions, and I will probably ask some more depending on your answers. But anyways...
First, I'm aware of "operator priorities". For example algorithms implemented in C compilers will assign to operators a priority number and evaluate the expression based on that.
I've consulted this table : http://en.wikipedia.org/wiki/Operators_in_C_and_C++#Operator_precedence
Now, this is cool but... I wonder a few things.
My principal question is... how would you implement this to be the fastest possible?
I have thought about for example... (please note the program I'm speaking about actually parses a file containing these expressions, and not all C operators will be supported)
1) Stocking the expression string into an array, storing each operator position inside an array, and then starting to parse all this crap, starting from the highest priority operator. For example if I had str = "2*1+3", then after checking all the operators present, I would check for the position at str[1], and the check at right and left, do the operation (here multiply) and then substitude the expression with the result and evaluate again.
The problem I see there is... say two operators in the expr are the same priority
for example : var1 * var2 / var3 / var4
since * and / have both the same precedence, how to know on which position to start the parsing? Of course this example is rather intuitive, but I can the problem growing on enormous expressions.
2) Is this even possible to do it non recursive? Usually recursive means slower due to multiple function call setting their own stack frames, re-initializing stuff etc etc.
3) How to distinguish unary operators from non unaries?
For example : 2 + *a + b * c
There is the dereferencing op and the multiplication one. Intuitively I have an idea on how to do it, but I ain't sure. I'd rather have your advices on this (i think : check if one of the right or left members are operators, if so, then it's unary?)
4) I don't get expressions being evaluated right-to-left. Seems so unnatural to be. More that I don't unterstand what does it means. Would you show an example? Why do it that way?!?
5) Do you have better algorithms in head? Better ideas of achieving it?
For now, that sums pretty much what I'm thinking about.
This ain't an homework by the way. It's practical stuff.
Thanks!

Is *++*p acceptable syntax?

In K&R Section 5.10, in their sample implementation of a grep-like function, there are these lines:
while (--argc > 0 && (*++argv)[0] == '-')
while (c = *++argv[0])
Understanding the syntax there was one of the most challenging things for me, and even now a couple weeks after viewing it for the first time, I still have to think very slowly through the syntax to make sense of it. I compiled the program with this alternate syntax, but I'm not sure that the second line is allowable. I've just never seen *'s and ++'s interleaved like this, but it makes sense to me, it compiles, and it runs. It also requires no parentheses or brackets, which is maybe part of why it seems more clear to me. I just read the operators in one direction only (right to left) rather than bouncing back and forth to either side of the variable name.
while (--argc > 0 && **++argv == '-')
while (c = *++*argv)
Well for one, that's one way to make anyone reading your code to go huh?!?!?!
So, from a readability standpoint, no, you probably shouldn't write code like that.
Nevertheless, it's valid code and breaks down as this:
*(++(*p))
First, p is dereferenced. Then it is incremented. Then it is dereferenced again.
To make thing worse, this line:
while (c = *++*argv)
has an assignment in the loop-condition. So now you have two side-effects to make your reader's head spin. YAY!!!
Seems valid to me. Of course, you should not read it left to right, that's not how C compiler parses the source, and that's not how C language grammatics work. As a rule of thumb, you should first locate the object that's subject to operating upon (in this case - argv), and then analyze the operators, often, like in this case, from inside (the object) to outside. The actual parsing (and reading) rules are of course more complicated.
P. S. And personally, I think this line of code is really not hard to understand (and I'm not a C programming guru), so I don't think you should surround it with parentheses as Mysticial suggests. That would only make the code look big, if you know what I mean...
There's no ambiguity, even without knowledge of the precedence rules.
Both ++ and * are prefix unary operators; they can only apply to an operand that follows them. The second * can only apply to argv, the ++ to *argv, and the first * to ++*argv. So it's equivalent to *(++(*argv)). There's no possible relationship between the precedences of ++ and * that could make it mean anything else.
This is unlike something like *argv++, which could conceivably be either (*argv)++ or *(argv++), and you have to apply precedence rules to determine which (it's *(argv++)` because postfix operators bind more tightly than prefix unary operators).
There's a constraint that ++ can only be applied to an lvalue; since *argv is an lvalue, that's not a problem.
Is this code valid? Yes, but that's not what you asked.
Is this code acceptable? That depends (acceptable to who?).
I wouldn't consider it acceptable - I'd consider it "harder to read than necessary" for a few different reasons.
First; lots of programmers have to work with several different languages, potentially with different operator precedence rules. If your code looks like it relies on a specific language's operator precedence rules (even if it doesn't) then people have to stop and try to remember which rules apply to which language.
Second; different programmers have different skill levels. If you're ever working in a large team of developers you'll find that the best programmers write code that everyone can understand, and the worst programmers write code that contains subtle bugs that half of the team can't spot. Most C programmers should understand "*++*argv", but a good programmer knows that a small number of "not-so-good" programmers either won't understand it or will take a while to figure it out.
Third; out of all the different ways of writing something, you should choose the variation that expresses your intent the best. For this code you're working with an array, and therefore it should look like you intend to be working with an array (and not a pointer). Note: For the same reason, "uint32_t foo = 0x00000002;" is better than "uint32_t foo = 0x02;".

Resources