How does the compiler interpret null statements in C, in terms of execution time? (An empty ";", i.e. a statement without any expression.)
And will it optimize the code by removing null statements when it encounters them?
Compilers only care about observable behaviour. Whether you compile
int main() {
;;;;;;;;;;;;;;;;;;
return 0;
}
or
int main() {
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
return 0;
}
does not make any difference regarding the resulting executable. The observable behaviour of both examples is the same.
If you want to convince yourself, look at the compiler's output (this is a great tool: https://godbolt.org/z/bnbxiP) or try to profile the above examples (but don't expect to get meaningful numbers ;).
My suggestion is to not think about code as a way to talk to your CPU. When you write code you are not expressing instructions for your CPU. Code is rather a recipe for the compiler, and your compiler knows much better how to instruct the CPU than any human. It is a small difference, but I think it helps.
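A minimal sketch of the same point, using two functions whose names (sum_plain and sum_with_null_statements) are made up for illustration: the null statements change nothing about what is computed, so an optimizing compiler would be expected to emit equivalent code for both.

```c
#include <stddef.h>

/* Hypothetical helper: sums n ints, no null statements. */
int sum_plain(const int *v, size_t n)
{
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Same logic with null statements (lone ';') sprinkled in.
   A null statement performs no operation, so the result is identical. */
int sum_with_null_statements(const int *v, size_t n)
{
    int total = 0;
    size_t i;
    ;;;
    for (i = 0; i < n; i++) {
        ;
        total += v[i];
        ;;
    }
    ;
    return total;
}
```

Comparing the assembly of the two functions (for instance in Compiler Explorer) is the easiest way to check this for a given compiler and optimization level.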
Say you have (for reasons that are not important here) the following code:
int k = 0;
... /* no change to k can happen here */
if (k) {
do_something();
}
Using the -O2 flag, GCC will not generate any code for it, recognizing that the if test is always false.
I'm wondering whether this is common behaviour across compilers or something I should not rely on.
Does anybody know?
Dead code elimination in this case is trivial for any modern optimizing compiler. I would definitely rely on it, provided that optimizations are turned on and you are absolutely sure that the compiler can prove the value is zero at the moment of the check.
However, you should be aware that sometimes your code has more potential side effects than you think.
The first source of problems is calling non-inlined functions. Whenever you call a function which is not inlined (e.g. because its definition is located in another translation unit), the compiler assumes that all global variables and the whole contents of the heap may change inside this call. Local variables are the lucky exception, because the compiler knows that they cannot be modified indirectly... unless you save the address of a local variable somewhere. For instance, in this case the dead code won't be eliminated:
int function_with_unpredictable_side_effects(const int &x);
void doit() {
int k = 0;
function_with_unpredictable_side_effects(k);
if (k)
printf("Never reached\n");
}
So the compiler has to do some work and may fail even for local variables. By the way, I believe the problem being solved in this case is called escape analysis.
The second source of problems is pointer aliasing: the compiler has to take into account that all sorts of pointers and references in your code may refer to the same object, so changing something via one pointer may change the contents seen through another. Here is one example:
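A minimal C sketch of why the compiler cannot assume k is still zero once its address has escaped. The names touch and escaped_local are hypothetical, and a const pointer stands in for the C++ const reference above. In the scenario described, the body of touch would live in another translation unit; it is shown here only so the example is self-contained.

```c
/* Hypothetical callee. The caller's compiler would normally not see this body. */
void touch(const int *p)
{
    /* Casting away const is permitted here because the object p points to
       (k below) was not itself defined as const. */
    *(int *)p = 1;
}

int escaped_local(void)
{
    int k = 0;
    touch(&k);   /* the address of k escapes into the callee */
    return k;    /* the callee changed k, so this returns 1 */
}
```

Because a callee is allowed to do this, a compiler that cannot see the callee has to keep the `if (k)` test.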
struct MyArray {
int num;
int arr[100];
};
void doit(int idx) {
MyArray x;
x.num = 0;
x.arr[idx] = 7;
if (x.num)
printf("Never reached\n");
}
The Visual C++ compiler does not eliminate the dead code, because it thinks that you may access x.num as x.arr[-1]. That may sound like an awful thing to do, but this compiler has been used in the gamedev area for years, and such hacks are not uncommon there, so the compiler stays on the safe side. On the other hand, GCC removes the dead code. Maybe this is related to its exploitation of the strict aliasing rule.
P.S. The const keyword on a reference or pointer parameter does not help the optimizer here, because it does not guarantee that the referenced object stays unchanged; in C/C++ it is mostly there for the programmer's convenience.
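The pointer-aliasing concern described above can be seen in a small sketch. The function add_twice is a hypothetical helper, not taken from any library. Its result depends on whether the two pointers refer to the same object, so the compiler has to re-read *src after the first store unless it can prove the pointers differ.

```c
/* Hypothetical helper: adds *src to *dst twice. */
void add_twice(int *dst, const int *src)
{
    *dst += *src;
    *dst += *src;   /* *src must be read again: dst and src may alias */
}
```

With distinct objects holding 1, *dst ends up as 3. If both arguments point to the same int holding 1, the first addition makes it 2 and the second doubles it to 4.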
There is no behaviour you can assume is common across compilers. But there is a way to explore how different compilers act on a specific piece of code.
Compiler Explorer will help you answer almost any question about code generation, but of course you must be familiar with assembly language.
I don't understand exactly the following:
When using debugging and optimization together, the internal rearrangements carried out by the optimizer can make it difficult to see what is going on when examining an optimized program in the debugger. For example, the ordering of statements may be changed.
What I understand is that when I build a program with the -g option, the executable will contain a symbol table which holds variable and function names, references to them, and their line numbers.
And when I build with an optimization option, the ordering of instructions, for example, may change depending on the optimization.
What I don't understand is why debugging becomes more difficult.
I would like to see an example and an easy-to-understand explanation.
An example that might happen:
int calc(int a, int b)
{
return a << b + 7;
}
int main()
{
int x = 5;
int y = 7;
int val = calc(x, y);
return val;
}
Optimized this might be the same as
int main()
{
return 81920; /* 5 << (7 + 7), since + binds tighter than << */
}
A contrived example, but trying to debug that kind of optimization in actual code isn’t simple. Some debuggers may show all lines of code marked when stepping through, some might skip them all, some may be confused. And the developer at least is.
simple example:
int a = 4;
int b = a;
int c = b;
printf("%d", c);
can be optimized as:
printf("%d", 4);
In fact in optimized compiles, the compiler might well do exactly this (in machine code of course)
When debugging, the debugger will allow us to inspect the memory associated with a, b and c, but when the top version gets optimized into the bottom version, a, b and c no longer exist in RAM. This makes it a lot harder to figure out what is going on by inspecting RAM.
When you compile with an optimization flag, you are assured that the program's output will conform to the code you wrote, but the generated code itself will differ from the code you actually wrote.
As you pointed out, the code will be rearranged and some calls will be performed differently. Other possible optimizations include loop unrolling, optimizations based on branch prediction, and simplification of function calls. These optimizations will also vary with the architecture you are running on.
For all these reasons (and others) your code may become very difficult to debug, since it is not transparent to you what exactly the compiler does, which means the code you are debugging may not look like the code you wrote.
The code below displays different results when compiled and run on Code::Blocks.
void sum(int a,int b){
printf("a=%d b=%d\n",a,b);
}
int main(){
int i=1;
sum(i=5,++i);
printf("i=%d\n\n",i);
/***********************/
i=2;
sum(i=5,i++);
printf("i=%d\n\n",i);
/**********************/
i=3;
sum(i=5,i);
printf("i=%d\n\n",i);
return 0;
}
Output:
a=5 b=5
i=5
a=5 b=2
i=5
a=5 b=5
i=5
I think the answer to this question is related to sequence points, and here the sequence points are related to the ++ operator. GCC must be passing the values to the stack in a fixed order, but because of ++ the answers are different. I think it is not very common for a beginner to write a function call like this, but the lesson about operators is general, so one can try.
My questions are: what should be the exact answer to this and to questions like it? During which phase of compilation are these things decided (made clear or unclear)? Which particular algorithm(s) (either for optimization or in general) are involved? Can the same compiler produce different results for such expressions or statements? And the last one is: how can a beginner understand and figure out these problems? It is sometimes very surprising.
The order of operations is decided during multiple phases of compilation, which is what causes the odd results you see. During the optimization phase in particular the compiler can reorder code in ways that aren't always obvious, and in this case it's affecting the result (which is fine, because you're doing something undefined and the compiler's explicitly allowed to do anything it wants to with that code). There isn't any specific algorithm involved, it's an interaction between several different algorithms applied at different points and the algorithm applied at each point can vary depending on what the compiler's decided is the best way to handle a particular bit of code.
When the documentation speaks of undefined behavior, it's not the behavior of a specific compiler that's undefined but the specification of what the compiler must or is allowed to do. The compiler's behavior is completely defined, but it's defined by detailed decisions buried deep in the design of its parser, code generator and optimizer modules, and it's complicated enough that not even the developers who wrote the compiler can tell you what it'll do without spending a lot of time analyzing how a given bit of code flows through the entire process.
A beginner won't be able to figure out the outcome. Even an expert developer may not be able to. That's why "undefined" is such an unwelcome word to developers, and why they try to avoid undefined behavior like the plague. To quote from a discussion of the language spec in question, "In short, you can't use sizeof() on a structure whose elements haven't been defined, and if you do, demons may fly out of your nose.".
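The practical way out is to give each side effect its own statement, so the order no longer depends on the compiler. The sketch below assumes the intended reading of sum(i=5, ++i) was left to right. The struct call_args and the function explicit_left_to_right are hypothetical names used only for illustration.

```c
/* Hypothetical record of what would be passed to sum() and what i ends up as. */
struct call_args {
    int a;
    int b;
    int i_after;
};

/* A left-to-right reading of sum(i=5, ++i), with every side effect
   sequenced in its own statement. */
struct call_args explicit_left_to_right(void)
{
    struct call_args r;
    int i = 1;

    i = 5;          /* side effect of the first argument */
    r.a = i;        /* value of the first argument: 5 */
    ++i;            /* side effect of the second argument */
    r.b = i;        /* value of the second argument: 6 */
    r.i_after = i;  /* i is 6 afterwards */
    return r;
}
```

Written this way the result is the same on every conforming compiler and at every optimization level, which cannot be said of the original call.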
Suppose I have the following C code:
int i = 5;
int j = 10;
int result = i + j;
If I'm looping over this many times, would it be faster to use int result = 5 + 10? I often create temporary variables to make my code more readable, for example, if the two variables were obtained from some array using some long expression to calculate the indices. Is this bad performance-wise in C? What about other languages?
A modern optimizing compiler should optimize those variables away, for example if we use the following example in godbolt with gcc using the -std=c99 -O3 flags (see it live):
#include <stdio.h>
void func()
{
int i = 5;
int j = 10;
int result = i + j;
printf( "%d\n", result ) ;
}
it will result in the following assembly:
movl $15, %esi
for the calculation of i + j; this is a form of constant propagation.
Note, I added the printf so that we have a side effect, otherwise func would have been optimized away to:
func:
rep ret
These optimizations are allowed under the as-if rule, which only requires the compiler to emulate the observable behavior of a program. This is covered in the draft C99 standard section 5.1.2.3 Program execution which says:
In the abstract machine, all expressions are evaluated as specified by
the semantics. An actual implementation need not evaluate part of an
expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
Also see: Optimizing C++ Code : Constant-Folding
This is an easy task to optimize for an optimizing compiler. It will delete all variables and replace result with 15.
Constant folding in SSA form is pretty much the most basic optimization there is.
The example you gave is easy for a compiler to optimize. Using local variables to cache values pulled out of global structures and arrays can actually speed up execution of your code. If for instance you are fetching something from a complex structure inside a for loop where the compiler can't optimize and you know the value isn't changing, the local variables can save quite a bit of time.
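A minimal sketch of the caching idea, with made-up names (struct config, scale_reload, scale_cached). In the first version the compiler may have to re-read cfg->factor on every iteration, because a store to data[i] could in principle modify it. The second version reads it once into a local. Whether this makes a measurable difference depends on the compiler and the surrounding code.

```c
#include <stddef.h>

/* Hypothetical configuration structure. */
struct config {
    int factor;
};

/* Reads cfg->factor inside the loop. */
void scale_reload(int *data, size_t n, const struct config *cfg)
{
    size_t i;
    for (i = 0; i < n; i++)
        data[i] *= cfg->factor;
}

/* Caches the value in a local before the loop. */
void scale_cached(int *data, size_t n, const struct config *cfg)
{
    const int factor = cfg->factor;  /* read once */
    size_t i;
    for (i = 0; i < n; i++)
        data[i] *= factor;
}
```

The two functions give the same results only when data does not overlap *cfg, which is exactly the knowledge the programmer has and the compiler may lack.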
You can use GCC (other compilers too) to generate the intermediate assembly code and see what the compiler is actually doing.
There is discussion of how to turn on the assembly listings here: Using GCC to produce readable assembly?
It can be instructive to examine the generated code and see what a compiler is actually doing.
While all sorts of trivial differences in the code can perturb the compiler's behavior in ways that mildly improve or worsen performance, in principle it should not make any performance difference whether you use temp variables like this, as long as the meaning of the program is not changed. A good compiler should generate the same, or comparable, code either way, unless you're intentionally building with optimization off in order to get machine code that's as close as possible to the source (e.g. for debugging purposes).
You're suffering the same problem I do when I'm trying to learn what a compiler does--you make a trivial program to demonstrate the problem, and examine the assembly output of the compiler, only to realize that the compiler has optimized everything you tried to get it to do away. You may find even a rather complex operation in main() reduced to essentially:
push "%i"
push 42
call printf
ret
Your original question is not "what happens with int i = 5; int j = 10...?" but "do temporary variables generally incur a run-time penalty?"
The answer is probably not. But you'd have to look at the assembly output for your particular, non-trivial code. If your CPU has a lot of registers, like an ARM, then i and j are very likely to be in registers, just the same as if those registers were storing the return value of a function directly. For example:
int i = func1();
int j = func2();
int result = i + j;
is almost certain to produce exactly the same machine code as:
int result = func1() + func2();
I suggest you use temporary variables if they make the code easier to understand and maintain, and if you're really trying to tighten a loop, you'll be looking into the assembly output anyway to figure out how to finesse as much performance out as possible. But don't sacrifice readability and maintainability for a few nanoseconds, if that's not necessary.
I have recently become a teaching assistant for a university course which primarily teaches C. The course standardized on C90, mostly due to widespread compiler support. One of the very confusing concepts to C newbies with previous Java experience is the rule that variable declarations and code may not be intermingled within a block (compound statement).
This limitation was finally lifted with C99, but I wonder: does anybody know why it was there in the first place? Does it simplify variable scope analysis? Does it allow the programmer to specify at which points of program execution the stack should grow for new variables?
I assume the language designers wouldn't have added such a limitation if it had absolutely no purpose at all.
In the very beginning of C the available memory and CPU resources were really scarce. So it had to compile really fast with minimal memory requirements.
Therefore the C language was designed to require only a very simple compiler which compiles fast. This in turn led to the "single-pass compiler" concept: the compiler reads the source file and translates everything into assembler code as soon as possible, usually while still reading the source file. For example, when the compiler reads the definition of a global variable, the appropriate code is emitted immediately.
This trait is visible in C up until today:
C requires "forward declarations" of anything and everything. A multi-pass compiler could look ahead and deduce the declarations of variables or functions in the same file by itself.
This in turn makes the *.h files necessary.
When compiling a function, the layout of the stack frame must be computed as soon as possible - otherwise the compiler would have to make several passes over the function body.
Nowadays no serious C compiler is still "single pass", because many important optimizations cannot be done within one pass. A little bit more can be found in Wikipedia.
The standards body took quite some time to relax that "single-pass" point with regard to the function body. I assume that other things were more important.
It was that way because it had always been done that way, it made writing compilers a little easier, and nobody had really thought of doing it any other way. In time people realised that it was more important to favour making life easier for language users rather than compiler writers.
I assume the language designers wouldn't have added such a limitation if it had absolutely no purpose at all.
Don't assume that the language designers set out to restrict the language. Often restrictions like this arise by chance and circumstance.
I guess it should be easier for a non-optimising compiler to produce efficient code this way:
int a;
int b;
int c;
...
Although 3 separate variables are declared, the stack pointer can be incremented at once without optimising strategies such as reordering, etc.
Compare this to:
int a;
foo();
int b;
bar();
int c;
To increment the stack pointer just once, this requires a kind of optimisation, although not a very advanced one.
Moreover, as a stylistic issue, the first approach encourages a more disciplined way of coding (no wonder that Pascal too enforces this) by being able to see all the local variables at one place and eventually inspect them together as a whole. This provides a clearer separation between code and data.
Requiring that variables declarations appear at the start of a compound statement did not impair the expressiveness of C89. Anything that one could legitimately do using a mid-block declaration could be done just as well by adding an open-brace before the declaration and doubling up the closing brace of the enclosing block. While such a requirement may sometimes have cluttered source code with extra opening and closing braces, such braces would not have been just noise--they would have marked the beginning and end of variables' scopes.
Consider the following two code examples:
{
do_something_1();
{
int foo;
foo = something1();
if (foo) do_something_1(foo);
}
{
int bar;
bar = something2();
if (bar) do_something_2(bar);
}
{
int boz;
boz = something3();
if (boz) do_something_3(boz);
}
}
and
{
do_something_1();
int foo;
foo = something1();
if (foo) do_something_1(foo);
int bar;
bar = something2();
if (bar) do_something_2(bar);
int boz;
boz = something3();
if (boz) do_something_3(boz);
}
From a run-time perspective, most modern compilers probably wouldn't care whether foo is syntactically in scope during the execution of do_something_3(), since they could determine that any value it held before that statement would not be used afterwards. On the other hand, encouraging programmers to write declarations in a way which would generate sub-optimal code in the absence of an optimizing compiler is hardly an appealing concept.
Further, while handling the simpler cases involving intermixed variable declarations would not be difficult (even a 1970's compiler could have done it, if the authors wanted to allow such constructs), things become more complicated if the block which contains intermixed declarations also contains any goto or case labels. The creators of C probably thought allowing intermixing of variable declarations and other statements would complicate the standards too much to be worth the benefit.
Back in C's youth, when Dennis Ritchie worked on it, computers (the PDP-11, for example) had very limited memory (e.g. 64K words), and the compiler had to be small, so it could optimize very few things, and only very simply. And at that time (I coded in C on a Sun-4/110 in the 1986-89 era), declaring register variables was really useful for the compiler.
Today's compilers are much more complex. For example, a recent version of GCC (4.6) has more than 5 or 10 million lines of source code (depending upon how you measure it), and does a great many optimizations which did not exist when the first C compilers appeared.
And today's processors are also very different (you cannot suppose that today's machines are just like machines from the 1980s, only thousands of times faster and with thousands of times more RAM and disk). Today, the memory hierarchy is very important: waiting on cache misses (that is, waiting for data from RAM) is what the processor spends much of its time doing. But in the 1980s access to memory was almost as fast (or as slow, by current standards) as execution of a single machine instruction. This is completely false today: to read your RAM module, your processor may have to wait for several hundred nanoseconds, while for data in L1 cache it can execute more than one instruction each nanosecond.
So don't think of C as a language very close to the hardware: this was true in the 1980s, but it is false today.
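A minimal sketch of how the memory hierarchy shows up in ordinary C code. The names ROWS, COLS, sum_row_major and sum_col_major are made up for illustration. Both functions compute the same sum. The first walks memory in the order the array is laid out, while the second jumps COLS ints between consecutive accesses. On large matrices the first order is typically friendlier to the cache, but no timing is claimed here, so measure on your own machine.

```c
#define ROWS 64
#define COLS 64

/* Visits elements in the order they are stored (C arrays are row-major). */
long sum_row_major(int m[ROWS][COLS])
{
    long total = 0;
    int r, c;
    for (r = 0; r < ROWS; r++)
        for (c = 0; c < COLS; c++)
            total += m[r][c];
    return total;
}

/* Visits elements column by column, striding through memory. */
long sum_col_major(int m[ROWS][COLS])
{
    long total = 0;
    int r, c;
    for (c = 0; c < COLS; c++)
        for (r = 0; r < ROWS; r++)
            total += m[r][c];
    return total;
}
```

Nothing in the C source says which version is faster; that depends on the cache behaviour of the machine it runs on.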
Oh, but you could (in a way) mix declarations and code, but declaring new variables was limited to the start of a block. For example, the following is valid C89 code:
void f()
{
int a;
do_something();
{
int b = do_something_else();
}
}