Is checking a complex expression in a loop optimized by the compiler? - c

(If this is a duplicate please point me to an answer)
I have two scenarios where a loop is checking a complex expression over and over again (a complex expression would consist of math operations and retrieving data):
for (int i = 0; i < expression; i++) {
// stuff
}
for (int i = 0; i < someNumber; i++) {
if (i == expression) break;
}
I'm wondering if it's more efficient to pre-calculate the expression and check against a known value like so
int known = expression;
for (int i = 0; i < known; i++) {
// stuff
}
for (int i = 0; i < someNumber; i++) {
if (i == known) break;
}
or if it's done by the compiler automatically.
For reference, I'm running the loop ~700 000 000 times and the expression is something like structure->arr[j] % n or sqrt(a * n + b)
Is it even worth it?

If the compiler is able to detect that calculating expression will give the same result every time, it will only do the calculation once.
The tricky part is: "If the compiler is able to ...."
Compilers are very smart and will probably be successful in most cases. But why take the chance?
Just write that extra line to do the calculation before the loop as you did in your second example.
By doing that you send a clear message to the compiler that expression is constant within the loops. It may also help your co-workers understand the code more easily.
That said... you yourself must be sure that expression is in fact the same every time. Let's look at your example:
the expression is something like structure->arr[i] % n or sqrt(a * n + b)
Now the first one, i.e. structure->arr[i] % n, depends on the loop variable i, so it would be a big mistake to move the code outside the loop.
The second (i.e. sqrt(a * n + b)) looks better, provided that a, n and b don't change inside the loop.
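As a minimal sketch of the hoisted form, assume the index j and the divisor n are not modified inside the loop (unlike the arr[i] case above). The struct data type and the function name count_until are hypothetical, introduced only for illustration:

```c
#include <stddef.h>

/* Hypothetical stand-in for the asker's structure. */
struct data {
    const int *arr;
};

/* Counts the iterations that run before i reaches arr[j] % n.
   j and n are not changed by the loop, so the expression is
   loop-invariant. Assumes n != 0. */
long count_until(const struct data *structure, size_t j, int n, long some_number)
{
    const int known = structure->arr[j] % n;   /* evaluated exactly once */
    long count = 0;

    for (long i = 0; i < some_number; i++) {
        if (i == known)
            break;
        count++;   /* stands in for the loop body */
    }
    return count;
}
```

Marking known as const also documents the intent: nothing inside the loop is supposed to change it.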

Related

Repetitive calculations with sizeof, strlen, etc

Let's say I have a large array and I'm doing something like the following:
for (int i = 0; i < sizeof(arr)/sizeof(arr[0]); ++i)
printf("arr[%d]=%d\n", i, arr[i]);
Of course, the sizeof stuff shouldn't be calculated every time so it should be like this instead:
size_t len = sizeof(arr)/sizeof(arr[0]);
for (int i = 0; i < len; ++i)
printf("arr[%d]=%d\n", i, arr[i]);
My question is whether it can be assumed that any compiler will automatically do the above optimization, so that it doesn't matter which approach I use. Or should I assume that that's not the case and that the second approach is the only correct one?
There is no reason to hoist the division out of the loop condition. It isn't even a question of how many times the division is evaluated (as is the usual concern in these cases, and may require the compiler to use "escape analysis" to determine whether the length value can change between iterations). The simple fact is that, as long as arr is an ordinary array (not a pointer or a variable-length array), sizeof(arr)/sizeof(arr[0]) is a constant expression which will be evaluated at compile time, and will be no different from hard-coding a number there.
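One way to convince yourself of this is to use the expression somewhere the language demands an integer constant expression, such as an enumerator. The sketch below assumes a fixed-size array; the names arr, ARR_LEN and sum_arr are illustrative only:

```c
#include <stddef.h>

static const int arr[] = { 3, 1, 4, 1, 5, 9 };

/* An enumerator must be an integer constant expression, so this only
   compiles because the division is resolved at compile time. */
enum { ARR_LEN = sizeof(arr) / sizeof(arr[0]) };

int sum_arr(void)
{
    int sum = 0;

    /* Nothing is divided at run time here; the bound is a constant. */
    for (size_t i = 0; i < sizeof(arr) / sizeof(arr[0]); ++i)
        sum += arr[i];
    return sum;
}
```

This does not carry over to a pointer parameter, where sizeof yields the size of the pointer rather than of the array.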

For versus while versus do loop statements

Is there a difference between for and while statements? Is it just syntax?
#include <stdio.h>
int main(void) {
int cent_temp = 0;
int fah_temp;
while (cent_temp <= 20) {
fah_temp = (9 * cent_temp) / 5 + 32;
printf("%d degrees C = %d degrees F\n", cent_temp, fah_temp);
cent_temp++;
}
}
This means to me....
While the value of cent_temp is less than or equal to 20, calculate fah_temp. Then increase the value of cent_temp by 1 and check that it is still no more than 20; then go round the loop again.
Regarding the syntax:
printf("%d degrees C = %d degrees F\n", cent_temp, fah_temp);
This means: print to the screen, replacing the first %d with the decimal value of cent_temp and the second %d with the decimal value of fah_temp.
#include <stdio.h>
int main(void) {
int cent_temp;
int fah_temp;
for (cent_temp = 0; cent_temp <= 20; cent_temp++) {
fah_temp = (9 * cent_temp) / 5 + 32;
printf("%2d degrees C = %2d degrees F\n", cent_temp, fah_temp);
}
}
My interpretation of the above is:
for cent_temp = 0, repeat while cent_temp is less than or equal to 20, and execute cent_temp + 1 at the end of each pass. So cent_temp 0 goes into the loop, fah_temp is calculated and both are printed to the screen. Then cent_temp goes up by one and the loop goes round again. Here I've used %2d instead of %d to give each decimal number a width of 2 characters (so they line up when printed). Neither loop body will execute if cent_temp > 20.
Similarly, rearranging the statement into a do while loop has a similar effect and doesn't really have an impact on the result.
Does each type of loop have a different application?
Please correct me if I'm wrong!
Is there a difference between 'for' and 'while' statements? Is it just
syntax?
To me, it is just syntax.
From K&R section 3.5 Loops -- While and For, I quote:
The for statement
for (expr1; expr2; expr3)
statement
is equivalent to
expr1;
while (expr2) {
statement
expr3;
}
except for the behavior of continue.
Grammatically, the three components of a for loop are expressions.
Most commonly, expr1 and expr3 are assignments or function calls
and expr2 is a relational expression.
Notes
As user @chqrlie has mentioned in the comments, control statements like break and continue make the situation slightly murkier.
There are some situations where the modify statement is necessary in the loop body. For example, see "Erase-remove idiom with std::set failing with constness-related error" (in C++, though).
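To make the continue caveat concrete, here is a small sketch (function names are made up for illustration). In the for version, continue jumps to the update expression; in the while version, the update has to be repeated by hand before continue, otherwise the loop never advances:

```c
/* Sums the odd numbers in [0, limit) with a for loop. */
int sum_odd_for(int limit)
{
    int sum = 0;

    for (int i = 0; i < limit; i++) {
        if (i % 2 == 0)
            continue;      /* control goes to i++, then to the test */
        sum += i;
    }
    return sum;
}

/* Same computation with a while loop. */
int sum_odd_while(int limit)
{
    int sum = 0;
    int i = 0;

    while (i < limit) {
        if (i % 2 == 0) {
            i++;           /* without this, continue would spin forever */
            continue;
        }
        sum += i;
        i++;
    }
    return sum;
}
```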
Example
As an example, let us write a loop to print all the odd numbers between 1 and 100.
int i = 1;
while (i <= 100) {
printf("%d\n", i);
i += 2;
}
for (int i = 1; i <= 100; i += 2) {
printf("%d\n", i);
}
Opinion
I am not a language expert, but in most situations in practice I find them transformable.
I personally prefer using for syntax because:
loop control structure is in one single place (the for header) making it easy to read, and
the loop variable (e.g. i) is not exposed to the outer scope.
for(cent_temp = 0; cent_temp <= 20; cent_temp++)
{ /* code */ }
is equivalent (apart from the behavior of continue inside the body) to
cent_temp = 0;
while(cent_temp <= 20)
{
/* code */
cent_temp++;
}
But a do-while is different since it puts the condition check at the end.
As for when to use which loop, it is a matter of style and therefore a bit subjective. The industry de facto standard style, used by the majority of all C programmers, goes like this:
for loops should always be used when performing a known number of iterations. It is then considered the most readable form.
while loops should be used when the number of iterations is unknown in advance, or when the loop is turning complex for some reason. For example if you need to alter the loop iterator variable inside the loop body, then you should use a while loop instead of a for loop.
do while loops should be used for special cases where you need to skip the condition check the first lap of the loop, for example do { result = send(); } while(result == ok);.
I looked at my Code Complete by Steve McConnell (the bible).
Here is what you can read in chapter 16:
A for loop is a good choice when you need a loop that executes a specified number of times. [...]
Use for loops for simple activities that don't require internal loops controls. Use them when the loop involves simple increments or simple decrements, such as iterating through the elements in a container. The point of a for loop is that you set it up at the top of the loop and then forget about it. You don't have to do anything inside the loop to control it. If you have a condition under which execution has to jump out of a loop, use a while loop instead.
Likewise, don't explicitly change the index value of a for loop to force it to terminate. Use a while loop instead. The for loop is for simple uses. Most complicated looping tasks are better handled by a while loop.
In general, you would use a for loop to iterate over a finite set of values, whereas you'd use a while or do-while loop to iterate while a specific condition or set of conditions is true. In most of C's contemporaries (Basic, Pascal, Fortran, etc.), a for loop can only iterate over a scalar index:
Fortran:
DO 10 i=1,10
statements
10 CONTINUE
Pascal:
for i := 1 to 10 do
begin
statements
end;
Both of these snippets loop exactly 10 times. The index i is initialized and updated by the loop automagically. I'd have to go back and check, but I'm pretty sure you cannot write to i in the loop body.
C actually blurred the lines between a for and while loop by adding the control expression:
for ( init-expr ; control-expr ; update-expr )
statement
In C, a for loop can iterate over a scalar just like Fortran or Pascal:
for( i = 0; i < 10; i++ )
{
do_something_with( i );
}
Or it can iterate over multiple scalars:
for ( i = 0, j = 0; i < 10 && j < 10; i++, j++ )
{
do_something_with( i, j );
}
Or it can iterate over the contents of a file:
for( c = fgetc( in ); c != EOF; c = fgetc( in ) )
{
do_something_with( c );
}
Or it can iterate over a linked list:
for( cur = head; cur != NULL; cur = cur->next )
{
do_something_with( cur );
}
In Fortran and Pascal, those last three loops would have to be expressed as while loops (which I'm not going to do, because I've pretty much exhausted my Fortran and Pascal knowledge already).
The other big difference between a C for loop and those of Fortran or Pascal is that you can write to the loop index (i, j, c, or cur) in the loop body; it's not specially protected in any way.
A while or do-while loop is used to iterate as long as a specific condition or set of conditions is true:
while( control-expr )
statement
do
statement
while( control-expr );
In both a for and while loop, the condition is tested before the loop body executes; in a do-while loop, the condition is tested after the loop body executes, so a do-while loop will always execute at least once.
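A case where "at least once" is exactly what you want is counting decimal digits: the value 0 still has one digit. The function name below is an assumption made for this sketch:

```c
/* Returns the number of decimal digits needed to print value. */
int count_digits(unsigned int value)
{
    int digits = 0;

    do {
        digits++;          /* the body runs once even when value is 0 */
        value /= 10;
    } while (value != 0);  /* the test comes after the body */

    return digits;
}
```

Written as a plain while loop, the same code would return 0 for an input of 0 unless that case were handled separately.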
In C, you can use either a for loop or a while loop in many circumstances:
while ( ( c = fgetc( in ) ) != EOF )
do_something_with( c );
for ( c = fgetc( in ); c != EOF; c = fgetc( in ) )
do_something_with( c );
Both loops do exactly the same thing; it's just a matter of which one you think more clearly expresses your intent, or which you think would be easier for other people to understand.
From an algorithmic point of view, for and while are not the same. In short, for should be used when the bounds are known, and while when you don't know whether or when the condition will be met. For repeats something n times (n known), which is exactly the case in your example computation, so a for loop should be used (don't you think what the loop does is stated more clearly in the for version?). If you want an example of a loop that must be a while loop, look at something like the Collatz sequence. From a computability point of view, for loops can always be transformed into while loops, but not the converse.
From the point of view of programming languages, it is now common to fuse the two; in C, for example, the difference is purely syntactic. But remember that in some other languages it can be very different; in Pascal, for example, for loops are very limited.
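The Collatz case can be sketched as follows. The function name is illustrative, and the sketch assumes n is at least 1 and that the intermediate values fit in an unsigned long:

```c
/* Counts the Collatz steps needed for n to reach 1.
   Assumes n >= 1; there is no overflow check on 3 * n + 1. */
unsigned int collatz_steps(unsigned long n)
{
    unsigned int steps = 0;

    /* There is no iteration count to put in a for header:
       we only know the condition that ends the loop. */
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}
```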
Source code is written not only to be compiled and executed by computers but also to be read and understood by humans.
A computer doesn't really mind whether a for loop, a while loop or a goto is used. On the other hand, a human expects different meanings for different structures.
computing values over a known range of inputs is best shown with a for loop;
reading a file up to its end is best shown with a while loop.
Choosing which structure to use is similar to choosing a variable name.

C string length - is this valid code? [closed]

Is this type of expression in C valid (on all compilers)?
If it is, is this good C?
char cloneString (char clone[], char string[], int Length)
{
if(!Length)
Length=64
;
int i = 0
;
while((clone[i++] = string[i]) != '\0', --Length)
;
clone[i] = '\0';
printf("cloneString = %s\n", clone);
};
Would this be better, worse, indifferent?
char *cloneString (char clone[], char string[], int Length)
{
if(!Length)
Length=STRING_LENGTH
;
char *r = clone
;
while
( //(clone[i++] = string[i]) != '\0'
*clone++ = *string++
, --Length
);
*clone = '\0';
return clone = r
;
printf("cloneString = %s\n", clone);
};
Stackoverflow wants me to add more text to this question!
Okay! I'm concerned about
a.) expressions such as c==(a=b)
b.) performance between indexing vs pointer
Any comments?
Thanks so much.
Yes, it's syntactically valid on all compilers (though semantically valid on none), and no, it isn't considered good C. Most developers will agree that the comma operator is a bad thing, and most developers will generally agree that a single line of code should do only one specific thing. The while loop does a whole four things and has undefined behavior:
it increments i;
it assigns string[i] to clone[i++]; (undefined behavior: you should use i only once in a statement that increments/decrements it)
it checks that string[i] isn't 0 (but discards the result of the comparison);
it decrements Length, and terminates the loop if Length == 0 after being decremented.
Not to mention that assuming that Length is 64 if it wasn't provided is a terrible idea and leaves plenty of room for more undefined behavior that can easily be exploited to crash or hack the program.
I see that you wrote it yourself and that you're concerned about performance, and this is apparently the reason you're sticking everything together. Don't. Code made short by squeezing statements together isn't faster than longer code whose statements haven't been squeezed together. It still does the same number of things. In your case, you're introducing bugs by squeezing things together.
The code has Undefined Behavior:
The expression
(clone[i++] = string[i])
both modifies and accesses the object i from two different subexpressions in an unsequenced way, which is not allowed. A compiler might use the old value of i in string[i], or might use the new value of i, or might do something entirely different and unexpected.
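For comparison, here is one possible well-defined version, offered as a sketch and not as the only correct fix. It assumes the caller passes the real capacity of the destination buffer instead of falling back to 64; the name clone_string and its signature are hypothetical:

```c
#include <stddef.h>

/* Copies string into clone, writing at most capacity bytes including
   the terminating '\0'. Returns the number of characters copied. */
size_t clone_string(char *clone, size_t capacity, const char *string)
{
    size_t i = 0;

    if (capacity == 0)
        return 0;                    /* no room even for the terminator */

    while (i + 1 < capacity && string[i] != '\0') {
        clone[i] = string[i];        /* i is only read in this statement */
        i++;                         /* and modified in its own statement */
    }
    clone[i] = '\0';
    return i;
}
```

Because i is incremented in a separate statement, there is no unsequenced read and write of the same object.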
Simple answer: no.
Why return char and the function has no return statement?
Why 64?
I assume that the two arrays are of length Length - Add documentation to say this.
Why the ; on a new line and not after the statement?
...
Ok so I decided to evolve my comments into an actual answer. Although this doesn’t address the specific piece of code in your question, it answers the underlying issue and I think you will find it illuminating as you can use this — let’s call it guide — on your general programming.
What I advocate, especially if you are just learning programming, is to focus on readability instead of small gimmicks that you think, or were told, improve speed / performance.
Let’s take a simple example. The idiomatic way to iterate through a vector in C (not in C++) is using indexing:
int i;
for (i = 0; i < size; ++i) {
v[i] = /* code */;
}
I was told when I started programming that v[i] is actually computed as *(v + i) so in generated assembler this is broken down (please note that this discussion is simplified):
multiply i with sizeof(int)
add that result to the address of v
access the element at this computed address
So basically you have 3 operations.
Let’s compare this with accessing via pointers:
int *p;
for (p = v; p != v + size; ++p) {
*p = /*..*/;
}
This has the advantage that *p actually expands to just one instruction:
access the element at the address p.
2 extra instructions don’t seem like much, but if your program spends most of its time in this loop (either an extremely large size or multiple calls to the functions containing this loop) you might conclude that the second version makes your program almost 3 times faster. That is a lot. So if you are like me when I started, you will choose the second variant. Don’t!
So the first version has readability (you explicitly describe that you access the i-th element of vector v), the second one uses a gimmick in detriment of readability (you say that you access a memory location). Now this might not be the best example for unreadable code, but the principle is valid.
So why do I tell you to use the first version: until you have a firm grasp on concepts like cache, branching, induction variables (and a lot more) and how they apply to real-world compilers and program performance, you should steer clear of these gimmicks and rely on the compiler to do the optimizations. Compilers are very smart and will typically generate the same code for both variants (with optimization enabled, of course). So the second variant actually differs just in readability and is identical performance-wise to the first.
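The two variants can be put side by side as complete functions, which makes it easy to compare the generated assembly yourself (for example with your compiler's option for emitting assembly). The function names are made up for this sketch:

```c
#include <stddef.h>

/* Variant 1: indexing. States plainly that element i of v is written. */
void fill_indexed(int *v, size_t size, int value)
{
    for (size_t i = 0; i < size; ++i)
        v[i] = value;
}

/* Variant 2: pointer walking. Same effect, less obvious intent. */
void fill_pointer(int *v, size_t size, int value)
{
    for (int *p = v; p != v + size; ++p)
        *p = value;
}
```

Both functions leave the array in the same state; whether the machine code is identical depends on the compiler and its optimization settings.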
Another example:
const char * str = "Some string";
int i;
// variant 1:
for (i = 0; i < strlen(str); ++i) {
// code
}
// variant 2:
int l = strlen(str);
for (i = 0; i < l; ++i) {
// code
}
The natural way would be to write the first variant. You might think that the second improves performance because the first calls the function strlen on each iteration of the loop. And you know that getting the length of a string means iterating through the whole string until you reach the end. So basically a call to strlen means adding an inner loop. Ouch, that has to slow the program down. Not necessarily: when the compiler can prove that the string is not modified inside the loop, it can optimize the call out because it always produces the same result. Actually you can do harm, as you introduce a new variable which will have to be assigned a different register from a very limited register pool (a slightly extreme example, but it makes the point).
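If you do decide to hoist the call, for instance because the loop body writes through a pointer and the compiler may not be able to prove the string is unchanged, a sketch could look like this. The function name is hypothetical, and size_t is used instead of int because that is the type strlen returns:

```c
#include <stddef.h>
#include <string.h>

/* Counts how many times wanted occurs in str. */
size_t count_char(const char *str, char wanted)
{
    const size_t len = strlen(str);   /* measured once, before the loop */
    size_t count = 0;

    for (size_t i = 0; i < len; ++i) {
        if (str[i] == wanted)
            count++;
    }
    return count;
}
```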
Don’t spend your energy on things like this until much later.
Let me show you something else that further illustrates that any assumptions you make about performance will most likely be false and misleading (I am not trying to tell you that you are a bad programmer, far from it, just that as you learn, you should invest your energy in something other than performance):
Let’s multiply two matrices:
for (k = 0; k < n; ++k) {
for (i = 0; i < n; ++i) {
for (j = 0; j < n; ++j) {
r[i][j] += a[i][k] * b[k][j];
}
}
}
versus
for (k = 0; k < n; ++k) {
for (j = 0; j < n; ++j) {
for (i = 0; i < n; ++i) {
r[i][j] += a[i][k] * b[k][j];
}
}
}
The only difference between the two is the order in which the operations get executed. They are the exact same operations (number, kind and operands), just in a different order. The result is equivalent (addition is commutative), so on paper they should take exactly the same amount of time to execute. In practice, even with optimizations enabled (some very smart compilers can, however, reorder the loops), the second example can be up to 2-3 times slower than the first. And even the first variant is still a long, long way from being optimal (with regard to speed).
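Here are the two orderings as self-contained functions, so you can time them on your own machine. This is only a sketch: the size N, the int element type and all names are assumptions, and N is kept small so the check runs instantly (the speed gap only shows up once the matrices outgrow the cache).

```c
#define N 32   /* hypothetical size; increase it to measure timings */

/* j innermost: r[i][j] and b[k][j] are walked contiguously in memory. */
void multiply_kij(int r[N][N], int a[N][N], int b[N][N])
{
    for (int k = 0; k < N; ++k)
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                r[i][j] += a[i][k] * b[k][j];
}

/* i innermost: r[i][j] and a[i][k] jump a whole row on every step. */
void multiply_kji(int r[N][N], int a[N][N], int b[N][N])
{
    for (int k = 0; k < N; ++k)
        for (int j = 0; j < N; ++j)
            for (int i = 0; i < N; ++i)
                r[i][j] += a[i][k] * b[k][j];
}

/* Runs both orderings on the same inputs; returns 1 if results match. */
int orders_agree(void)
{
    static int a[N][N], b[N][N], r1[N][N], r2[N][N];

    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            a[i][j] = i + j;
            b[i][j] = i - j;
            r1[i][j] = 0;
            r2[i][j] = 0;
        }
    }

    multiply_kij(r1, a, b);
    multiply_kji(r2, a, b);

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            if (r1[i][j] != r2[i][j])
                return 0;
    return 1;
}
```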
So basic point: worry about UB as the other answers show you, don’t worry about performance at this stage.
The second block of code is better.
The line
printf("cloneString = %s\n", clone);
there will never be executed since there is a return statement before it.
To make your code a bit more readable, change
while
(
*clone++ = *string++
, --Length
);
to
while ( Length > 0 )
{
*clone++ = *string++;
--Length;
}
This is probably a better approach to your problem:
#include <stdio.h>
#include <string.h>
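void cloneString(char *clone, const char *string)
{
    size_t len = strlen(string);        /* measure once, not on every iteration */
    for (size_t i = 0; i <= len; i++)   /* <= so the terminating '\0' is copied too */
        clone[i] = string[i];
    printf("Clone string: %s\n", clone);
}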
That being said, there's already a standard function to do that:
char *strncpy(char *dest, const char *source, size_t n)
dest is the destination buffer, and source is the string to be copied. This function copies at most n characters; if source is n characters or longer, dest is not null-terminated.
So, your code will be:
#include <stdio.h>
#include <string.h>
void cloneString(char *clone, char *string)
{
strncpy(clone, string, strlen(string) + 1); /* + 1 copies the terminating '\0' as well */
printf("Clone string: %s\n", clone);
}

Using a 'for' loop iterator after the loop exits in C

For years, I've been in the habit of not using the value of a for loop iterator after the loop exits. I could have sworn that I did this because it used to produce a compiler warning, but after I was challenged in a recent code review, I was proven wrong.
For example, I always did this (NOTE: our code standards prohibit the use of the "break" keyword):
int i, result;
bool done = false;
for (i=0; i<10 && !done; i++) {
if (some_condition) {
result = i;
done = true;
}
}
// Value of i may be undefined here
Now, obviously the result variable could be removed, if I can rely on the value of i. I thought that because of compiler optimization, you could not rely on the value of the loop iterator. Am I just remembering a phantom teaching? Or is this the standard (specifically regarding GNU C)?
There is nothing wrong in C89, C99, or C11 with accessing the iteration variable after the for statement.
int i;
for (i = 0; i < 10; i++) {
/* Some code */
}
printf("%d\n", i); // No magic, the value is 10
From C99, you can use also a declaration as the first clause of the for statement, and in that case of course the declared variable cannot be used after the for statement.
Different languages have different rules. In Pascal, the compiler is allowed to optimize away storing the loop index after the final increment, so it might be the first loop-terminating value or it might be the last valid value.
There are plenty of usage cases where the for loop is used for nothing else but advancing the iterator. This can be seen in some implementations of strlen (though admittedly there are other ways to do strlen), and other sorts of functions whose goal it is to find a certain limit:
/*find the index of the first element which is odd*/
for (ii = 0; ii < nelem && arry[ii] % 2 == 0; ii++);
As mentioned, the point of confusion may come from constructs where the iterator itself is defined within the for statement.
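Wrapped in a function, the search above might look like the sketch below. The function name and the choice to return nelem when no odd element exists are assumptions made for illustration:

```c
#include <stddef.h>

/* Returns the index of the first odd element, or nelem if every
   element is even. */
size_t first_odd_index(const int *arry, size_t nelem)
{
    size_t ii;

    /* The loop body is empty: the header does all the work. */
    for (ii = 0; ii < nelem && arry[ii] % 2 == 0; ii++)
        ;

    return ii;   /* ii was declared outside the loop, so it is still valid */
}
```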
In general for statements are very very powerful, and it's unfortunate that they're usually never utilized to their full potential.
For example, a different version of the same loop can be written as follows (though it wouldn't demonstrate the safety of using the iterator):
#include <stdio.h>
int main(void)
{
int cur, ii = 0, nelem, arry [] = { 1, 2, 4, 6, 8, 8, 3, 42, 45, 67 };
int sum = 0;
nelem = sizeof(arry) / sizeof(int);
/* Look mom! no curly braces! */
for (
ii = 0;
ii < nelem && ((cur = arry[ii]) %2 == 0 ||
((printf("Found odd number: %d\n", cur)||1)));
ii++, sum += cur
);
printf("Sum of all numbers is %d\n", sum);
return 0;
}
In this particular case, it seems like a lot of work for this specific problem, but it can be very handy for some things.
Even though the value of that for loop's control variable is well defined, you might have been told to avoid using the for loop's control variable after the for loop because of the way scoping of that variable is handled, and especially because the handling has changed over the history of C++ (I know this question is tagged "C", but I think the rationale for avoiding the for loop control variable after the loop may have its origins in this C++ history).
For example, consider the following code:
int more_work_to_do(void)
{
return 1;
}
int some_condition(void)
{
return 1;
}
int foo()
{
int i = 100;
while (more_work_to_do()) {
int done = 0;
for (int i = 0; i < 10 && !done; i++) {
if (some_condition()) {
done = 1;
}
}
if (done) return i; // which `i`?
}
return 1;
}
Under some old rules of scoping for the i declared in the for loop, the value returned on the statement marked with the comment "which i" would be determined by the for loop (VC++ 6 uses these rules). Under the newer, standard rules for scoping that variable, the value returned will be the i declared at the start of the function.
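Stripped down to the scoping question alone, the standard behavior (C99 and later, and standard C++) can be sketched like this; the function name is made up:

```c
/* Shows which i is visible once the for statement has finished. */
int outer_i_after_loop(void)
{
    int i = 100;

    for (int i = 0; i < 10; i++) {
        /* this i shadows the outer one and goes away with the loop */
    }

    return i;   /* refers to the outer i, which was never modified */
}
```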
While I can't possibly know how your habit came to be, I can tell you how my habit to do the same did. It was by seeing code like this:
for (i=0u; (i<someLimit) && (found != TRUE); i++)
{
if (someCondition) found = TRUE;
}
foundIndex = i-1;
Basically, code like this is written when the break keyword is disallowed by some coding rules, e.g. based on MISRA. If you don't break out of the loop though, the loop will usually leave you with an "i" which is off by one from what you care for.
Sometimes, you can even find this:
for (i=0u; (i<someLimit) && (found != TRUE); i++)
{
if (someCondition) found = TRUE;
}
foundIndex = i;
This is just semantically wrong and can be found when the "forbid break keyword rule" is introduced into an existing code base which is not sufficiently covered by unit tests. May sound surprising, but it's all out there...
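One way to sidestep the off-by-one entirely, while still avoiding break, is to advance the index only when nothing was found. This is a sketch under assumed names (find_first_negative and its "negative value" condition are placeholders for whatever someCondition is in real code):

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns the index of the first negative value, or count if none. */
size_t find_first_negative(const int *values, size_t count)
{
    size_t i = 0;
    bool found = false;

    while (i < count && !found) {
        if (values[i] < 0)
            found = true;   /* leave i pointing at the match */
        else
            i++;            /* advance only when nothing was found */
    }
    return i;               /* no i - 1 correction needed */
}
```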

The ternary (conditional) operator in C

What is the need for the conditional operator? Functionally it is redundant, since it implements an if-else construct. If the conditional operator is more efficient than the equivalent if-else assignment, why can't if-else be interpreted more efficiently by the compiler?
In C, the real utility of it is that it's an expression instead of a statement; that is, you can have it on the right-hand side (RHS) of a statement. So you can write certain things more concisely.
Some of the other answers given are great. But I am surprised that no one mentioned that it can be used to help enforce const correctness in a compact way.
Something like this:
const int n = (x != 0) ? 10 : 20;
so basically n is a const whose initial value is dependent on a condition statement. The easiest alternative is to make n not a const, this would allow an ordinary if to initialize it. But if you want it to be const, it cannot be done with an ordinary if. The best substitute you could make would be to use a helper function like this:
int f(int x) {
if(x != 0) { return 10; } else { return 20; }
}
const int n = f(x);
but the ternary if version is far more compact and arguably more readable.
The ternary operator is a syntactic and readability convenience, not a performance shortcut. People are split on the merits of it for conditionals of varying complexity, but for short conditions, it can be useful to have a one-line expression.
Moreover, since it's an expression, as Charlie Martin wrote, that means it can appear on the right-hand side of a statement in C. This is valuable for being concise.
It's crucial for code obfuscation, like this:
Look-> See?!
No
:(
Oh, well
);
Compactness and the ability to inline an if-then-else construct into an expression.
There are a lot of things in C that aren't technically needed because they can be more or less easily implemented in terms of other things. Here is an incomplete list:
while
for
functions
structs
Imagine what your code would look like without these and you may find your answer. The ternary operator is a form of "syntactic sugar" that if used with care and skill makes writing and understanding code easier.
Sometimes the ternary operator is the best way to get the job done. In particular when you want the result of the ternary to be an l-value (possible in C++; in C the result of ?: is never an l-value).
This is not a good example, but I'm drawing a blank on something better. One thing is certain: it is not often that you really need to use the ternary, although I still use it quite a bit.
const char* appTitle = amDebugging ? "DEBUG App 1.0" : "App v 1.0";
One thing I would warn against though is stringing ternaries together. They become a real problem at maintenance time:
int myVal = aIsTrue ? aVal : bIsTrue ? bVal : cIsTrue ? cVal : dVal;
EDIT: Here's a potentially better example. You can use the ternary operator to assign references & const values where you would otherwise need to write a function to handle it:
int getMyValue()
{
if( myCondition )
return 42;
else
return 314;
}
const int myValue = getMyValue();
...could become:
const int myValue = myCondition ? 42 : 314;
Which is better is a debatable question that I will choose not to debate.
Since no one has mentioned this yet, about the only way to get smart printf statements is to use the ternary operator:
printf("%d item%s", count, count > 1 ? "s\n" : "\n");
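A variation on the same idea, written against a buffer so the result can be inspected. This sketch assumes that zero should read "0 items", so it tests count == 1 instead of count > 1; the function name is hypothetical:

```c
#include <stddef.h>
#include <stdio.h>

/* Writes e.g. "1 item" or "3 items" into out.
   Returns what snprintf returns: the length of the full text. */
int format_item_count(char *out, size_t size, int count)
{
    /* The conditional operator is an expression, so it can be passed
       directly as an argument. */
    return snprintf(out, size, "%d item%s", count, count == 1 ? "" : "s");
}
```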
Caveat: There are some differences in operator precedence when you move from C to C++, and you may be surprised by the subtle bug(s) that arise from them.
The fact that the ternary operator is an expression, not a statement, allows it to be used in macro expansions for function-like macros that are used as part of an expression. Const may not have been part of original C, but the macro pre-processor goes way back.
One place where I've seen it used is in an array package that used macros for bound-checked array accesses. The syntax for a checked reference was something like aref(arrayname, type, index), where arrayname was actually a pointer to a struct that included the array bounds and an unsigned char array for the data, type was the actual type of the data, and index was the index. The expansion of this was quite hairy (and I'm not going to do it from memory), but it used some ternary operators to do the bound checking.
You can't do this as a function call in C because of the need for polymorphism of the returned object. So a macro was needed to do the type casting in the expression.
In C++ you could do this as a templated overloaded function call (probably for operator[]), but C doesn't have such features.
Edit: Here's the example I was talking about, from the Berkeley CAD array package (glu 1.4 edition). The documentation of the array_fetch usage is:
type
array_fetch(type, array, position)
typeof type;
array_t *array;
int position;
Fetch an element from an array. A
runtime error occurs on an attempt to
reference outside the bounds of the
array. There is no type-checking
that the value at the given position
is actually of the type used when
dereferencing the array.
and here is the macro definition of array_fetch (note the use of the ternary operator and the comma sequencing operator to execute all the subexpressions with the right values in the right order as part of a single expression):
#define array_fetch(type, a, i) \
(array_global_index = (i), \
(array_global_index >= (a)->num) ? array_abort((a),1) : 0,\
*((type *) ((a)->space + array_global_index * (a)->obj_size)))
The expansion for array_insert (which grows the array if necessary, like a C++ vector) is even hairier, involving multiple nested ternary operators.
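To show the same pattern in a form that compiles on its own, here is a simplified sketch for plain C arrays. It is not the Berkeley package's code: CHECKED_FETCH, checked_index and bounds_abort are hypothetical names invented for this illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Scratch variable so the index expression is evaluated only once. */
static size_t checked_index;

/* Reports the error and terminates; the int return type only exists
   so the call can sit inside a conditional expression. */
static int bounds_abort(void)
{
    fprintf(stderr, "array index out of bounds\n");
    abort();
    return 0;   /* not reached */
}

/* Comma operator: store the index, check it, then yield the element.
   The whole macro is a single expression, so it can be used anywhere
   a value is expected. */
#define CHECKED_FETCH(arr, len, i)                                  \
    (checked_index = (size_t)(i),                                   \
     (void)((checked_index >= (size_t)(len)) ? bounds_abort() : 0), \
     (arr)[checked_index])
```

A call such as CHECKED_FETCH(data, 3, k++) increments k exactly once, because the index expression appears only in the first sub-expression. Like the original, the shared scratch variable makes this unsuitable for multithreaded use.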
It's syntactic sugar and a handy shorthand for brief if/else blocks that only contain one statement. Functionally, both constructs should perform identically.
As dwn said, performance was one of its benefits during the rise of complex processors. The MSDN blog post "Non-classical processor behavior: How doing something can be faster than not doing it" gives an example that clearly shows the difference between the ternary (conditional) operator and the if/else statement.
Given the following code:
#include <windows.h>
#include <stdlib.h>
#include <stdio.h>
int array[10000];
int countthem(int boundary)
{
int count = 0;
for (int i = 0; i < 10000; i++) {
if (array[i] < boundary) count++;
}
return count;
}
int __cdecl wmain(int, wchar_t **)
{
for (int i = 0; i < 10000; i++) array[i] = rand() % 10;
for (int boundary = 0; boundary <= 10; boundary++) {
LARGE_INTEGER liStart, liEnd;
QueryPerformanceCounter(&liStart);
int count = 0;
for (int iterations = 0; iterations < 100; iterations++) {
count += countthem(boundary);
}
QueryPerformanceCounter(&liEnd);
printf("count=%7d, time = %I64d\n",
count, liEnd.QuadPart - liStart.QuadPart);
}
return 0;
}
The cost for different boundary values varies a lot and in a weird way (see the original material). However, if you change:
if (array[i] < boundary) count++;
to
count += (array[i] < boundary) ? 1 : 0;
The execution time is now independent of the boundary value, since:
the optimizer was able to remove the branch from the ternary expression.
But on my desktop (Intel i5 CPU, Windows 10, VS2015), my test results are quite different from the MSDN blog's.
when using debug mode, if/else cost:
count= 0, time = 6434
count= 100000, time = 7652
count= 200800, time = 10124
count= 300200, time = 12820
count= 403100, time = 15566
count= 497400, time = 16911
count= 602900, time = 15999
count= 700700, time = 12997
count= 797500, time = 11465
count= 902500, time = 7619
count=1000000, time = 6429
and ternary operator cost:
count= 0, time = 7045
count= 100000, time = 10194
count= 200800, time = 12080
count= 300200, time = 15007
count= 403100, time = 18519
count= 497400, time = 20957
count= 602900, time = 17851
count= 700700, time = 14593
count= 797500, time = 12390
count= 902500, time = 9283
count=1000000, time = 7020
when using release mode, if/else cost:
count= 0, time = 7
count= 100000, time = 9
count= 200800, time = 9
count= 300200, time = 9
count= 403100, time = 9
count= 497400, time = 8
count= 602900, time = 7
count= 700700, time = 7
count= 797500, time = 10
count= 902500, time = 7
count=1000000, time = 7
and ternary operator cost:
count= 0, time = 16
count= 100000, time = 17
count= 200800, time = 18
count= 300200, time = 16
count= 403100, time = 22
count= 497400, time = 16
count= 602900, time = 16
count= 700700, time = 15
count= 797500, time = 15
count= 902500, time = 16
count=1000000, time = 16
The ternary operator is slower than the if/else statement on my machine!
So depending on the compiler's optimization techniques, the ternary operator and if/else may behave quite differently.
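A portable way to repeat the comparison, without the Windows timing calls, is to isolate the two forms in separate functions and inspect or time them with your own toolchain. This is a sketch with made-up names; it claims nothing about which form is faster:

```c
#include <stddef.h>

/* Counts elements below boundary using an if statement. */
size_t count_below_if(const int *values, size_t n, int boundary)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (values[i] < boundary)
            count++;
    }
    return count;
}

/* Same count, expressed with the conditional operator. */
size_t count_below_ternary(const int *values, size_t n, int boundary)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        count += (values[i] < boundary) ? 1 : 0;
    return count;
}
```

Both functions always return the same count; only the generated code, and therefore the timing, may differ between compilers and optimization levels.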
Some of the more obscure operators in C exist solely because they allow implementation of various function-like macros as a single expression that returns a result. I would say that this is the main purpose why the ?: and , operators are allowed to exist, even though their functionality is otherwise redundant.
Let's say we wish to implement a function-like macro that returns the largest of two parameters. It would then be called as for example:
int x = LARGEST(1,2);
The only way to implement this as a function-like macro would be
#define LARGEST(x,y) ((x) > (y) ? (x) : (y))
It wouldn't be possible with an if ... else statement, since it does not return a result value. Note)
The other purpose of ?: is that it in some cases actually increases readability. Most often if...else is more readable, but not always. Take for example long, repetitive switch statements:
switch(something)
{
case A:
if(x == A)
{
array[i] = x;
}
else
{
array[i] = y;
}
break;
case B:
if(x == B)
{
array[i] = x;
}
else
{
array[i] = y;
}
break;
...
}
This can be replaced with the far more readable
switch(something)
{
case A: array[i] = (x == A) ? x : y; break;
case B: array[i] = (x == B) ? x : y; break;
...
}
Please note that ?: never results in faster code than if-else. That's some strange myth created by confused beginners. In case of optimized code, ?: gives identical performance as if-else in the vast majority of the cases.
If anything, ?: can be slower than if-else, because it comes with mandatory implicit type promotions, even of the operand which is not going to be used. But ?: can never be faster than if-else.
Note) Now of course someone will argue and wonder why not use a function. Indeed if you can use a function, it is always preferable over a function-like macro. But sometimes you can't use functions. Suppose for example that x in the example above is declared at file scope. The initializer must then be a constant expression, so it cannot contain a function call. Other practical examples of where you have to use function-like macros involve type safe programming with _Generic or "X macros".
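The file-scope case from the note can be sketched as follows. The names buffer_size and scratch and the numeric values are assumptions chosen for illustration:

```c
#define LARGEST(x, y) ((x) > (y) ? (x) : (y))

/* At file scope an initializer must be a constant expression.
   A function call is not allowed here, but the macro expands to a
   conditional expression over constants, which is. */
static const int buffer_size = LARGEST(64, 128);

/* The same holds for an array size at file scope. */
static char scratch[LARGEST(16, 32)];
```

Keep in mind that the macro evaluates one of its arguments twice, so arguments with side effects (such as i++) should be avoided.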
ternary = simple form of if-else. It is available mostly for readability.
It is the same kind of choice as between
if(0)
do_something();
and
if(0)
{
do_something();
}

Resources