Lazy arithmetic in C

As far as I know, C uses lazy calculation for logical expressions, e.g. in the expression
f(x) && g(x)
g(x) will not be called if f(x) is false.
But what about arithmetic expressions like
f(x)*g(x)
Will g(x) be called if f(x) is zero?

Yes, arithmetic operations are eager, not lazy.
So in f(x)*g(x) both f and g are always called (pedantically, the compiler transforms that into some A-normal form and could even avoid some calls if the difference is not observable), but there is no guarantee about whether f is called before or after g. And evaluating x*1/x or y*1/x is undefined behavior when x is 0 (integer division by zero).
This is not true in Haskell AFAIU
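A quick way to see this is to give both functions a visible side effect; this is only a minimal sketch, with f and g as placeholder functions:

#include <stdio.h>

static int f(int x) { printf("f(%d) called\n", x); return 0; }
static int g(int x) { printf("g(%d) called\n", x); return x + 1; }

int main(void)
{
    int x = 5;
    int a = f(x) && g(x);  /* prints only "f(5) called": g is short-circuited   */
    int b = f(x) * g(x);   /* prints both messages, but in an unspecified order */
    printf("a=%d b=%d\n", a, b);
    return 0;
}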

Yes, g(x) will still be called.
Generally, it would be quite slow to conditionally elide the evaluation of the right-hand side just because the left-hand side is zero. Perhaps not in the case where the right-hand side is an expensive function call, but the compiler wouldn't presume to know that.

It's called "short-circuit" evaluation rather than lazy. And, at least as far as the standard cares, yes, g(x) will be called -- i.e., the standard doesn't specify short-circuit evaluation for *.
A compiler might be able to do short-circuit evaluation if it can be certain g() has no side effects, but only under the as-if rule (i.e., it can do so only by finding that there's no externally observable difference, not because the standard gives it any direct permission to do so).

In the case of the logical operators && and ||, evaluation is guaranteed to proceed from left to right, and short-circuiting takes place.
There is a sequence point between the evaluation of the left and right operands of && (logical AND) and || (logical OR), as part of short-circuit evaluation. For example, in the expression *p++ != 0 && *q++ != 0, all side effects of the sub-expression *p++ != 0 are completed before any attempt to access q. This is not the case for arithmetic operators.
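For illustration, a small sketch of why that sequence point matters: the guaranteed left-to-right order of && is what makes the usual null-pointer guard safe.

#include <stdio.h>

int main(void)
{
    const char *p = NULL;
    /* *p is evaluated only if p != NULL; && guarantees the left operand
       is fully evaluated (and found false) before the right one is touched. */
    if (p != NULL && *p == 'x')
        puts("found 'x'");
    else
        puts("the dereference never happened");
    return 0;
}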

While that optimization would be possible, there are a few arguments against it:
You might pay more for the optimization than you get back from it: Unlike with logical operators, the optimization is likely to be beneficial in only a small percentage of all cases with arithmetic operators, but at the same time requires an additional check for 0 for every operation.
Because boolean truth values only have two possible values, there is a theoretical 50 % chance (1 ÷ 2) with short-circuiting boolean expressions that the second operand will not have to be evaluated. (This assumes uniform distribution, which is perhaps not realistic, but bear with me.) That is, you are likely to profit from the optimization in a relatively large percentage of cases.
Contrast this with integral numbers, where 0 is only one out of millions of possible values. The probability that the first operand is 0 is much lower: 1 ÷ 2^32 (for 32-bit integers, again assuming uniform distribution). Even if 0 were in fact somewhat more probable to occur than that (i.e. with a non-uniform distribution), it's still unlikely that we're dealing with the same order of magnitude as with truth values.
Floating point math further aggravates that issue. Here you need to deal with the possibility of rounding errors and denormalization. The probability that some calculation yields exactly 0 is likely to be even lower than with integral numbers.
Therefore the optimization is relatively unlikely to result in the remaining operand not being evaluated. But it will result in an added check for zero, 100 % of the time!
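To make that cost concrete, here is a hedged sketch of what a hypothetical "short-circuit multiply" would have to look like if written by hand; f, g and sc_mul are made-up names for illustration:

#include <stdio.h>

static int f(int x) { return x % 2; }                         /* placeholder */
static int g(int x) { printf("g called\n"); return x + 1; }   /* placeholder */

/* Hypothetical short-circuiting multiplication: skip g() when f() yields 0.
   The comparison against 0 is paid on every multiplication, whether or not
   the right-hand side ends up being skipped. */
static int sc_mul(int x)
{
    int lhs = f(x);
    if (lhs == 0)
        return 0;
    return lhs * g(x);
}

int main(void)
{
    printf("%d\n", sc_mul(4));   /* f(4) == 0, so g is skipped */
    printf("%d\n", sc_mul(5));   /* f(5) == 1, so g is called  */
    return 0;
}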
If you want evaluation rules to remain reasonably consistent, introducing short-circuiting for arithmetic would force you to rethink evaluation order: division has one important corner case, namely division by 0. Even if the first operand is 0, the quotient is not necessarily 0, because division by 0 is to be treated as an error (except perhaps in IEEE floating-point math); therefore, you always have to evaluate the second operand in order to determine whether the calculation is valid.
There is one alternative optimization for /: division by 1. In that case, you wouldn't have to divide at all, but simply return the first operand. / would therefore be better optimised by starting with the second operand (divisor).
Now, unless you want &&, ||, and * to start evaluation with the first operand, but / to start with the second (which might seem unintuitive), you would have to generally re-define short-circuiting behavior such that the second operand always gets evaluated first, which would be a departure from the status quo.
This is not per se a problem, but might break a lot of existing code if the C language were thus changed.
The optimization might break "compatibility" with C++ code where operators can be overloaded. Would the optimizations still apply to overloaded * and / operators? Or would there have to be two different forms of these operators, one short-circuiting, and one with eager evaluation?
Again, this is not a deficiency inherent in short-circuit arithmetic operators, but an issue that would arise if such short-circuiting were introduced into the C (and C++) language as a breaking change.

Related

During less than or equal to comparison what comparison is evaluated first?

When we have a simple condition (a<=b), what actually happens? Will it first compare a<b and, if that is false, then compare a==b (i.e. a<b || a==b)?
a <= b evaluates to true (1) if and only if a is less than or equal to b. In typical C implementations, this determination is performed via a single machine instruction. If, for some reason, multiple instructions are needed, the C standard does not specify any ordering for them, just that the result is correct.
If a and b are expressions beyond simple identifiers, the C standard does not specify any ordering for evaluation of them, their parts, or their side effects due to the <= operator, although there may be ordering constraints within the expressions.
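A small sketch of that last point, with made-up functions whose side effects reveal the evaluation order the implementation happened to choose:

#include <stdio.h>

static int left(void)  { puts("left operand evaluated");  return 1; }
static int right(void) { puts("right operand evaluated"); return 2; }

int main(void)
{
    /* Both operands are evaluated, but the standard does not say which call
       happens first; only the final result (1, since 1 <= 2) is fixed. */
    printf("%d\n", left() <= right());
    return 0;
}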

Which operator(s) in C have wrong precedence?

In the "Introduction" section of K&R C (2E) there is this paragraph:
C, like any other language, has its blemishes. Some of the operators have the wrong precedence; ...
Which operators are these? How is their precedence wrong?
Is this one of these cases?
Yes, the situation discussed in the message you link to is the primary gripe with the precedence of operators in C.
Historically, C developed without &&. To perform a logical AND operation, people would use the bitwise AND, so a==b AND c==d would be expressed with a==b & c==d. To facilitate this, == had higher precedence than &. Although && was added to the language later, & was stuck with its precedence below ==.
In general, people might like to write expressions such as (x&y) == 1 much more often than x & (y==1). So it would be nicer if & had higher precedence than ==. Hence people are dissatisfied with this aspect of C operator precedence.
This applies generally to &, ^, and | having lower precedence than ==, !=, <, >, <=, and >=.
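A minimal sketch of the grouping that the historical code relied on (values chosen arbitrarily; a compiler may warn and suggest parentheses here):

#include <stdio.h>

int main(void)
{
    int a = 2, b = 2, c = 5, d = 5;

    /* == binds tighter than &, so this groups as (a == b) & (c == d),
       which is exactly what pre-&& code wanted. */
    if (a == b & c == d)
        puts("both comparisons are true");
    return 0;
}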
There is a clear rule of precedence that is incontrovertible.
The rule is so clear that, for a strongly typed system (think Pascal), the wrong precedence would give clear, unambiguous syntax errors at compile time. The problem with C is that, since its type system is laissez-faire, the errors turn out to be logical errors resulting in bugs rather than errors catchable at compile time.
The Rule
Let ○ and □ be two operators with types
○ : α × α → β
□ : β × β → γ
where α and γ are distinct types.
Then
x ○ y □ z can only mean (x ○ y) □ z, with type assignment
x : α, y : α, z : β
whereas x ○ (y □ z) would be a type error, because ○ can only take an α, whereas the right sub-expression can only produce a γ, which is not an α.
Now let's apply this to C.
For the most part C gets it right
(==) : number × number → boolean
(&&) : boolean × boolean → boolean
so && should be below == and it is so
Likewise
(+) : number × number → number
(==) : number × number → boolean
and so (+) must be above (==) which is once again correct
However, in the case of the bitwise operators,
the & or | of two bit-patterns (aka numbers) produces a number,
i.e.
(&), (|) : number × number → number
(==) : number × number → boolean
And so a typical mask query, e.g. x & 0x777 == 0x777,
can only make sense if & is treated as an arithmetic operator, i.e. above ==.
C puts it below, which in light of the above type rules is wrong.
Of course, I've expressed the above in terms of math/type-inference.
In more pragmatic C terms x & 0x777 == 0x777 naturally groups as
x & (0x777 == 0x777) (in the absence of explicit parenthesis)
When can such a grouping have a legitimate use?
I (personally) don't believe there is any.
In other words, Dennis Ritchie's informal statement that these precedences are wrong can be given a more formal justification.
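A small sketch of the mask query discussed above; it is accepted by the compiler (at best with a parentheses warning) rather than rejected as a type error, which is precisely the "logic bug instead of compile-time error" problem:

#include <stdio.h>

int main(void)
{
    unsigned x = 0x123;

    /* Intended: test whether all mask bits are set in x. Because == binds
       tighter than &, this is x & (0x777 == 0x777), i.e. x & 1. */
    if (x & 0x777 == 0x777)
        puts("without parentheses: condition is true");
    else
        puts("without parentheses: condition is false");

    if ((x & 0x777) == 0x777)
        puts("with parentheses: condition is true");
    else
        puts("with parentheses: condition is false");
    return 0;
}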
"Wrong" may sound a bit too harsh. Normal people generally only care about the basic operators like + - * / ^, and if those don't work the way they are written in math, that may be called wrong. Fortunately those are "in order" in C (except the power operator, which doesn't exist).
However, there are some other operators that might not work as many people expect. For example, the bitwise operators have lower precedence than the comparison operators, which was already mentioned by Eric Postpischil. That's less convenient, but still not quite "wrong", because there wasn't any established convention for them before; they were only invented in the last century, during the advent of computers.
Another example is the shift operators << and >>, which have lower precedence than + and -. Shifting is thought of as multiplication and division, so people may expect it to sit at a higher level than + and -. Writing x << a + b may make many people think it means x * 2^a + b until they look at the precedence table. Besides, (x << 2) + (x << 4) + (y << 6) is also less convenient than simple additions without parentheses. Golang is one of the languages that fixed this by putting << and >> at a higher precedence than + and -.
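A small sketch of the shift pitfall (arbitrary values):

#include <stdio.h>

int main(void)
{
    unsigned x = 1, a = 2, b = 3;

    /* + binds tighter than <<, so x << a + b is x << (a + b) = 1 << 5 = 32,
       not (x << a) + b = 4 + 3 = 7, which is what "x * 2^a + b" would suggest. */
    printf("%u\n", x << a + b);     /* prints 32 */
    printf("%u\n", (x << a) + b);   /* prints 7  */
    return 0;
}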
In other languages there are many real examples of "wrong" precedence
One example is T-SQL where -100/-100*10 = 0
PHP with the wrong associativity of ternary operators
Excel with wrong precedence (lower than unary minus) and associativity (left-to-right instead of right-to-left) of ^:
According to Excel, 4^3^2 = (4^3)^2. Is this really the standard mathematical convention for the order of exponentiation?
Why does =-x^2+x for x=3 in Excel result in 12 instead of -6?
Why is it that Microsoft Excel says that 8^(-1^(-8^7)) = 8 instead of 1/8?
It depends which precedence convention is considered "correct". There's no law of physics (or of the land) requiring precedence to be a certain way; it's evolved through practice over time.
In mathematics, operator precedence is usually taken as "BODMAS" (Brackets, Order, Division, Multiplication, Addition, Subtraction). Brackets come first and Subtraction comes last.
Operator precedence in programming requires more rules as there are more operators, but you can distil out how it compares to BODMAS.
In the ANSI C precedence scheme, unary plus and minus sit at level 2, above multiplication and division at level 3. This can be confusing to a mathematician on a superficial reading, as can the precedence around suffix/postfix increment and decrement.
To that extent, it is ALWAYS worth adding brackets in your mathematical code, even where syntactically unnecessary, to make sure your intention is clear to a HUMAN reader. You lose nothing by doing it (although you might get flamed a bit by an uptight code reviewer, in which case you can flame back about coding risk management). You might lose a little readability, but intention is always more important when debugging.
And yes, the link you provide is a good example. Countless expensive production errors have resulted from this.
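As a small illustration of the kind of superficial-reading trap mentioned above (and of how brackets settle it):

#include <stdio.h>

int main(void)
{
    int a = 3;

    /* Postfix ++ binds tighter than unary minus, so -a++ is -(a++), not (-a)++
       (which would not even be a valid lvalue). Brackets make the intent plain. */
    int r = -a++;
    printf("r=%d a=%d\n", r, a);   /* prints r=-3 a=4 */
    return 0;
}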

Does AND(&&) inside an If-Statement get checked in the order I type it? [duplicate]

This question already has answers here:
Is short-circuiting logical operators mandated? And evaluation order?
Let's say I have this code:
if(number_a==1 && number_b==2) {
    doStuff();
}
Will my code run any faster if I split it up into:
if(number_a==1){
    if(number_b==2){
        doStuff();
    }
}
Or does the computer run the code in the exact order I gave it, checking a first and instantly moving on if a isn't 1?
Or is there a chance the computer checks b first, or checks both (even if a isn't 1)?
For C this is well-defined: number_a == 1 is evaluated first, and number_b == 2 is evaluated only if the first condition was true (this is called short-circuiting and is important if one of the conditions has a side effect of evaluation).
Cppreference.com confirms this with a more formal restatement:
The logical AND expression has the form lhs && rhs
...
There is a sequence point after the evaluation of lhs. If the result of lhs compares equal to zero, then rhs is not evaluated at all (so-called short-circuit evaluation)
That said, you should not worry about optimization at this level and trust your compiler instead. The difference between these two computations is microscopic and might even be the opposite of what you expect due to microarchitectural effects such as pipelining and other low-level bits.
Instead, focus on writing code that is clean, concise, and readable, while writing algorithms that are efficient in their design. Microoptimization can then be performed if it is absolutely necessary, as a later step.
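If you want to see the short-circuiting for yourself, here is a minimal sketch; check_b is a made-up helper with a visible side effect, and the variable names follow the question:

#include <stdio.h>

static int number_b = 2;

/* Visible side effect so we can tell whether the second condition ran at all. */
static int check_b(void)
{
    puts("checking b");
    return number_b == 2;
}

int main(void)
{
    int number_a = 0;

    if (number_a == 1 && check_b())   /* "checking b" is never printed:      */
        puts("doStuff()");            /* number_a != 1 short-circuits the && */

    if (number_a == 1) {              /* the nested form behaves identically */
        if (check_b())
            puts("doStuff()");
    }
    return 0;
}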

Can we compare benchmark/performance of binary operators?

My question is about the performance (execution time / benchmark) of binary operators: can we say, for example, that performing a + b is faster than a % b?
My question is not limited to only those operators (+ and %) but also:
Additive operators (+ and -)
Multiplicative operators (*, /, %...)
Comparative operators (<, >, <=...)
BITWISE and shift operators (<<, <<<...)
...
A couple of additions to FUZxxl's answer:
on modern Intels and AMDs both + and * have roughly the same (very fast) throughput, but * usually has higher latency. Throughput is how often you can issue a command, and latency is how long you'd have to wait before the results are ready (while the CPU executes something else out of order)
some RISC CPUs have pretty expensive shifts (namely, the ones used on Xbox360 and PS3)
they "fixed" the division some time ago, and it's no longer as horribly slow as it used to be. I think FP division is about 16 clocks now (integer might actually be slower)
while comparisons are all fast per se, conditional jumps can be very slow if they are mispredicted (since the CPU will have to dump everything that it would have predictively executed ahead). Whether the CPU manages to predict the results of a comparison depends on how random they are (when the same check is executed many times). However, even if they tend to follow a pattern, each jump uses up a branch prediction slot, so it may evict another jump from it, and that other branch would suffer the misprediction penalty instead. In other words, comparisons can be pretty expensive.
The performance of these operators depends on the platform. If an operation expressed with a "slow" operator can be implemented with a "fast" operator, you can generally expect the compiler to pick this up and emit fast code. Do not use "faster" operators just because someone told you they are faster, without benchmarking.
Generally though, operators can be classified in speed roughly according to the following scale:
Zero cycles: Addition immediately preceding dereferencing such as in an array expression a[b] is usually free. Unary + is free, too.
One cycle: for integer operands: binary +, -, <<, >>, &, |, ^; unary -, ~; casts between integer types or pointers; and, if the result is not used numerically: !, <, >, <=, >=, !=, &&, ||
Three to four cycles: binary * on integer operands, on floating point operands: binary +, -
20 cycles (?): integer binary /, %
50 cycles (?): floating point /, fmod
Your mileage may vary, do not rely on this table, benchmark when in doubt.
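As one example of the compiler rewrites mentioned above (a hedged sketch, valid for unsigned operands and a power-of-two divisor only):

#include <stdio.h>

int main(void)
{
    unsigned n = 1234567u;

    /* For unsigned n, n % 8 and n & 7 are the same value, and an optimizing
       compiler will normally emit the cheap AND for either spelling. */
    unsigned r1 = n % 8;
    unsigned r2 = n & 7;

    printf("%u %u\n", r1, r2);   /* both print 7 */
    return 0;
}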

What is the definition of "arithmetic operation" in C99?

In C99, the term arithmetic operation appears 16 times, but I don't see a definition for it.
The term arithmetic operator only appears twice in the text (again without definition) but it does appear in the Index:
arithmetic operators
additive, 6.5.6, G.5.2
bitwise, 6.5.10, 6.5.11, 6.5.12
increment and decrement, 6.5.2.4, 6.5.3.1
multiplicative 6.5.5, G.5.1
shift, 6.5.7
unary, 6.5.3.3
Then we have + - | &(binary) ++ -- *(binary) / % << >> ~ as arithmetic operators, if the Index is considered normative!
Perhaps we should identify arithmetic operation as being the use of an arithmetic operator. But F.9.4.5 says that the sqrt() function is also an arithmetic operation, and refers to IEC 60559 (aka IEEE 754) for details. So there must be arithmetic operations that are not just the use of arithmetic operators.
Since we don't have a formal definition, let's see if we can piece together a rational interpretation of what an arithmetic operation should be. This will be speculative, but I cannot find any obvious defect reports or open issues that cover this.
I guess I would start with what are considered arithmetic types, which is covered in section 6.2.5 Types; paragraph 18 says (emphasis mine going forward):
Integer and floating types are collectively called arithmetic types.
Each arithmetic type belongs to one type domain: the real type domain
comprises the real types, the complex type domain comprises the
complex types.
OK, so we know that an arithmetic operation has to operate on either an integer or a floating-point type. So what is an operation? It seems we can take a good stab at defining that from section 5.1.2.3 Program execution paragraph 2, which says:
Accessing a volatile object, modifying an object, modifying a file, or
calling a function that does any of those operations are all side
effects,11) which are changes in the state of the execution
environment. [...]
So modifying an object, or calling a function that does so, is an operation. What is an object? Section 3.14 says:
region of data storage in the execution environment, the contents of
which can represent values
Although the standard seems to use the term operation more loosely to mean an evaluation, for example in section 7.12.1 Treatment of error conditions it says:
The behavior of each of the functions in <math.h> is specified for all
representable values of its input arguments, except where stated
otherwise. Each function shall execute as if it were a single
operation without generating any externally visible exceptional
conditions.
and in section 6.5 Expressions paragraph 8 which says:
A floating expression may be contracted, that is, evaluated as though
it were an atomic operation [...]
So this would seem to imply that an evaluation is an operation.
So it would seem from these sections that pretty much all the arithmetic operators and any math function would fall under a common sense definition of arithmetic operation.
The most convincing bit I could find to be an implicit definition lies in 7.14 Signal Handling, paragraph 3, in the definition of the SIGFPE signal:
SIGFPE - an erroneous arithmetic operation, such as a zero divide or an operation resulting in overflow
One might then draw a conclusion that any operation that may cause SIGFPE to be raised can be considered an arithmetic operation; only arithmetic operations can result in the SIGFPE signal being raised.
That covers pretty much anything in <math.h> and the arithmetic operators, and <complex.h> if implemented. While a signal may not be raised for integral types, signed overflow and other "exceptional" conditions are allowed to generate trap representations, which means no other operations may be carried out reliably until a valid value is obtained — something that can only be done via assignment. In other words, the definition can apply equally to operations on an integral value.
As a result, pretty much any operation other than getting the size of an object/type, dereferencing a pointer, and taking the address of an object may be considered an arithmetic operation. Note that a[n] is *((a) + (n)), so even using an array can be considered an arithmetic operation.
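A tiny illustration of that last point:

#include <stdio.h>

int main(void)
{
    int a[] = { 10, 20, 30 };
    int n = 2;

    /* a[n] is defined as *((a) + (n)), so subscripting is itself pointer
       arithmetic; the odd-looking n[a] is legal for exactly the same reason. */
    printf("%d %d %d\n", a[n], *(a + n), n[a]);   /* all three print 30 */
    return 0;
}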
An arithmetic operation involves the manipulation of numbers. sqrt also manipulates numbers, and that could be the reason the standard says it is an arithmetic operation.
