Global variables are initialized to "0" by default.
How much difference does it make (if any) when I explicitly assign value "0" to it.
Is any one of them faster/better/more optimized?
I tried with a small sample .c program but I do not see any change in executable size.
Edit:0 I just want to understand the behavior. Its not a bottleneck for me in any way.
The answer to your question is very implementation specific but typically all uninitialized global and static variables end up in the .bss segment. Explicitly initialized variables are located in some other data segment. Both of these will be copied over by the program loader before the execution of main(). So, there shouldn't be any performance difference between explicitly initializing to zero, and leaving the variable uninitialized.
IMO it is good practice to explicitly initialize globals and statics to zero, as it makes it clear that a zero initial value is expected.
When you say optimized, I am assuming you mean faster in execution. If so, then there won't be any difference. And the compiler might even remove the initialization of the global variable (not sure on the compiler internals). And if you mean the space utilization of the program - there won't be a difference in that either.
Bigger question though is - is there a specific reason you are trying to look to optimize via the initialization of global variables. Can you please explain a bit more.
Static objects without an explicit initializer are initialized to zero at startup. Whether you explicitly initialize the object to 0 or not will probably make no difference in term of performance as the compiler usually initialize all the zero objects in one go before main.
// File scope
// Same code is likely to be generated for the two objects initialization
int bla1;
int bla2 = 0;
On the other hand, if you assign 0 instead of initializing, it could make a difference because the compiler could not infer what was the previous value of the object.
void init(void)
{
bla1 = 0;
bla2 = 0;
}
I doubt there is a difference, but, even if there is, I have much more doubts about the fact that your program is so optimized that the bottleneck is that.
I'd rather suggest not to care at all about all this kind of issues and write the code as you like, maybe giving way to readability rather than speed, leaving optimization only as a final problem.
Premature optimization is the root of all evil
There is none. The optimizer sees that as a no-op.
Explicit initialization is more verbose and clearer to the untrained eye. If you have juniors in your team, I'd explicitly initialize these variables.
Related
Some coding standards mandate initializing each variable when you declare it, even if the value is meaningless and the variable will soon be over written.
For example:
main()
{
char rx_char;
:
:
rx_char = get_char();
}
My co-worker asks me to initialize rx_char, but I don't understand why. Could anybody point out the reason?
"Initializing each variable when you declare it, even if the value is meaningless" is a schoolbook example of a cargo cult behavior. It is a completely meaningless requirement, which should be avoided whenever possible. In many cases a meaningless initializer is not much better than a random uninitialized value. In cases when it can actually "save the day", it does so by sweeping the problem under the carpet, i.e. by hiding it. Hidden problems are always the worst. When the code has a problem in it, it is always better to have it manifest itself as quickly as possible, even through having the code to crash or behave erratically.
Additionally, such requirements can impede compiler optimizations by spamming the compiler with misleading information about the effective value lifetime associated with the variable. The optimizer can treat an uninitialized variable as non-existent, thus saving valuable resources (e.g. CPU registers). In order to acquire the same kind of knowledge about a variable initialized with a meaningless value, the optimizer has to be able to figure out that it is indeed meaningless, which is generally a more complicated task.
The same problem has to be solved by a person reading the code, which impedes readability. A person unfamiliar with the code cannot say right away whether the specific initializer is meaningful (i.e. the code below relies on the initial value) or meaningless (i.e. provided thoughtlessly by a follower of the aforementioned cargo cult).
It is worth noting also that some modern compilers can detect attempts to use uninitialized variable values at both compile-time and run-time. Such compilers issue compile-time warnings or run-time assertions (in debugging builds) when such attempt is detected. In my experience this feature is much more useful than it might appear at first sight. Meanwhile, thoughtless initialization of variables with meaningless values effectively defeats this compiler-provided safety feature and, again, hides the associated errors. It certainly does more harm than good from that point of view.
Fortunately, the problem is no longer as acute as it used to be. In the modern versions of the C language variables can be declared anywhere in the code, which significantly reduces the harmful effects of such coding standards. You no longer have to declare variables at the beginning of the block, which greatly increases the chances that by the time you are ready to declare the variable you are also ready to supply a meaningful initializer for it.
Initializers are good. But when you see that you cannot provide a meaningful initializer for a variable, it is always a better idea to attempt to reorganize the code in order to create an opportunity for a meaningful initializer, instead of giving up and just using a meaningless one. This is often possible even in pre-C99 code.
With modern compilers often giving a warning about the use of uninitialised variables I'd favour not initialising towards some bogus value. Then, if someone accidentally uses the variable before the initialisation you'll get a warning, rather than having to spend X time debugging and finding out the error.
It's good practice to initialize your variables, even if it's just to set them to the types "null" equivalent. If someone else comes along and wants to edit that code, and uses rx_char before you've initialized it, they will get non-deterministic behavior. The value of rx_char would be dependent on things outside of your control. Some compilers 0 initialize variables, but you should not depend on this!
By initializing it on declaration, you're guaranteeing deterministic behavior. Which is in general, easier to debug, than having rx_char be something completely random, as it could potentially be, if you left it uninitialized.
main() {
char rx_char;
//Some stuff
char your_coworkers_char = rx_char;
//Some other stuff
rx_char = get_char();
putchar(your_cowowarers_char); //Potentially prints randomness!
}
This is perhaps optimal, unless you're working with super old style compilers you should be fine with this.
main() {
//some stuff
char rx_char = get_char();
putchar(your_cowowarers_char); //Potentially prints randomness!
}
Anybody could point out the reason?
The first person you should ask is the co-worker that asked you to do it. Her reasons may be entirely different than what we come up with.
...
Most coding standards are as much about the future as they are about what happens today.
Today, yes, it doesn't make any difference if you have char rx_char or char rx_char = 0. But in the future, what happens if someone deletes the rx_char=getchar() line? Then rx_char is a random, uninitialized value. If rx_char is 0 then you'll still have a bug, but it won't be random.
I am working on some (embedded) device, recently I just started thinking maybe to use less memory, in case stack size isn't that big.
I have long functions (unfortunately).
And inside I was thinking to save space in this way.
Imagine there is code
1. void f()
2. {
3. ...
4. char someArray[300];
5. char someOtherArray[300];
6. someFunc(someArray, someOtherArray);
7. ...
8. }
Now, imagine, someArray and someOtherArray are never used in f function beyond line: 6.
Would following save some stack space??
1. void f()
2. {
3. ...
4. {//added
5. char someArray[300];
6. char someOtherArray[300];
7. someFunc(someArray, someOtherArray);
8. }//added
9. ...
8. }
nb: removed second part of the question
For the compiler proper both are exactly the same and thus makes no difference. The preprocessor would replace all instances of TEXT1 with the string constant.
#define TEXT1 "SomeLongStringLiteral"
someFunc(TEXT1)
someOtherFunc(TEXT1)
After the preprocessor's job is done, the above snippet becomes
someFunc("SomeLongStringLiteral");
someOtherFunc("SomeLongStringLiteral");
Thus it makes no difference performance or memory-wise.
Aside: The reason #define TEXT1 "SomeLongStringLiteral" is done is to have a single place to change all instances of TEXT1s usage; but that's a convinience only for the programmer and has no effect on the produced output.
recently I just started thinking maybe to use less memory, in case stack size isn't that big.
Never micro optimise or prematurely optimise. In case the stack size isn't that big, you'll get to know it when you benchmark/measure it. Don't make any assumptions when you optimise; 99% of the times it'd be wrong.
I am working on some device
Really? Are you? I wouldn't have thought that.
Now, imagine, someArray and someOtherArray are never used in f function beyond line 6. Would following save some stack space?
On a good compiler, it wouldn't make a difference. By the standard, it isn't specified if it saves or not, it isn't even specified if there is a stack or not.
But on a not so good compiler, the one with the additional {} may be better. It is worth a test: compile it and look at the generated assembler code.
it seems my compiler doesn't allow me to do this (this is C), so never mind...
But it should so. What happens then? Maybe you are just confusing levels of {} ...
I'll ask another one here.
Would better be a separate question...
someFunc("SomeLongStringLiteral");
someOtherFunc("SomeLongStringLiteral");
vs.
someFunc(TEXT1)
someOtherFunc(TEXT1)
A #define is processed before any compilation step, so it makes absolutely no difference.
If it happens within the same compilation unit, the compiler will tie them together anyway. (At least, in this case. On an ATXmega, if you use PSTR("whatever") for having them in flash space only, each occurrence of them will be put into flash separately. But that's a completely different thing...)
Modern compilers should push stack variables before they are used, and pop them when they are no longer needed. The old thinking with { ... } marking the start and end of a stack push/pop should be rather obsolete by now.
Since 1999, C allows stack variables to be allocated anywhere and not just immediately after a {. C++ allowed this far earlier. Today, where the local variable is declared inside the scope has little to do with when it actually starts to exist in the machine code. And similarly, the } has little to do with when is ceases to exist.
So regarding adding extra { }, don't bother. It is premature optimization and only adds pointless clutter.
Regarding the #define it absolutely makes no difference in terms of efficiency. Macros are just text replacement.
Furthermore, from the generic point-of-view, data must always be allocated somewhere. Data used by a program cannot be allocated in thin air! That's a very common misunderstanding. For example, many people incorrectly believe that
int x = func();
if(x == something)
consumes more memory than
if(func() == something)
But both examples compile into identical machine code. The result of func must be stored somewhere, it cannot be stored in thin air. In the first example, the result is stored in a memory segment that the programmer may refer to as x.
In the second example, it is stored in the very same memory segment, taking up the same amount of space, for the same duration of program execution. The only difference is that the memory segment is anonymous and the programmer has no name for it. As far as the machine code is concerned, that doesn't matter, since no variable names exist in machine code.
And this would be why every professional C programmer needs to understand a certain amount of assembler. You cannot hope to ever do any kind of manual code optimization if you don't.
(Please don't ask two questions in one, this is really annoying since you get two types of answer for your two different questions.)
For your first question. Probably putting {} around the use of a variable will not help. The lifetime of automatic variables that are not VLA (see below) is not bound to the scope in which it is declared. So compilers may have a hard time in figuring out how the use of the stack may be optimized, and maybe don't do such an optimization at all. In your case this is most likely the case, since you are exporting pointers to your data to a function that is perhaps not visible, here. The compiler has no way to figure out if there is a valid use of the arrays later on in the code.
I see two ways to "force" the compiler into optimizing that space, functions or VLA. The first, functions is simple: instead of putting the block around the code, put it in a static function. Function calls are quite optimized on modern platforms, and here the compiler knows exactly how he may clear the stack at the end.
The second alternative in your case is a VLA, variable length array, if you compiler supports that c99 feature. Arrays that have a size that doesn't depend on a compile time constant have a special rule for their lifetime. That lifetime exactly ends at the end of the scope where they are defined. Even a const-qualified variable could be used for that:
{
size_t const len = 300;
char someArray[len];
char someOtherArray[len];
someFunc(someArray, someOtherArray);
}
At the end, on a given platform, you'd really have to inspect what assembler your compiler produces.
If i have a test.c file with the following
#include ...
int global = 0;
int main() {
int local1 = 0;
while(1) {
int local2 = 0;
// Do some operation with one of them
}
return 0;
}
So if I had to use one of this variables in the while loop, which one would be preferred?
Maybe I'm being a little vague here, but I want to know if the difference in time/space allocation is actually relevant.
If you are wondering whether declaring a variable inside a for loop causes it to be created/destroyed at every iteration, there is nothing really to worry about. These variables are not dynamically allocated at runtime, nothing is being malloced here - just some memory is being set aside for use inside the loop. So having the variable inside is just the same as having it outside the loop in terms of performance.
The real difference here is scope not performance. Whether you use a global or local variable only affects where you want this variable to be visible.
In case you're wondering about performance differences: most likely there aren't any. If there are theoretical performance differences, you'll find it hard to actually devise a test to measure them.
A decision like this should not be based on performance but semantics. Unless the semantic behavior of a global variable is required, you should always use automatic (local non-static) variables.
As others have said and surely will say, there are unlikely to be any differences in performance. If there are, the automatic variable will be faster.
The C compiler will have an easier time making optimizations on the variables declared local to the function. The global variable would require an optimizer to perform "Inter-Procedural Data Flow Analysis", which isn't that commonly done.
As an example of the difference, consider that all your declarations initialize the variable to zero. However, in the case of the global variable, the compiler cannot use that information unless it verifies that no flow of control in your program can change the global prior to using it in your example function. In the case of the locally declared ("automatic") variables, there is no way the initial value can be changed by another function (in particular, the compiler verifies that their address is never passed to a sub-function) and the compiler can perform "killed definitions" and "value liveness" analysis to determine whether the zero value can be assumed in some code paths.
Of the two local variables, as a guideline, the optimizer will always have an easier time optimizing access to the variable with the smaller (more limited) scope.
Having stated the above, I would suggest that other answers concerning a bias toward semantics over optimizer-meta-optimization is correct. Use the variable which causes the code to read best, and you will be rewarded with more time returned to you than assisting the def-use optimization calculation.
In general, avoid using a global variable, or any variable which can be accessed more broadly than absolutely necessary. Limited scoping of variables helps prevent bugs from being introduced during later program maintenance.
There are three broad classes of variables: static (global), stack (auto), and register.
Register variables are stored in CPU registers. Registers are very fast word-sized memories, which are integrated in the CPU pipeline. They are free to access, but there are a very limited number of them (typically between 8 and 32 depending on your processor and what operations you're doing).
Stack variables are stored in an area of RAM called the stack. The stack is almost always going to be in the cache, so stack variables typically take 1-4 cycles to access.
Generally, local variables can be either in registers or on the stack. It doesn't matter whether they are allocated at the top of a function or in a loop; they will only be allocated once per function call, and allocation is basically free. The compiler will put variables in registers if at all possible, but if you have more active variables than registers, they won't all fit. Also, if you take the address of a variable, it must be stored on the stack since registers don't have addresses.
Global and static variables are a different beast. Since they are not usually accessed frequently, they may not be in cache, so it could take hundreds of cycles to access them. Also, since the compiler may not know the address of a global variable ahead of time, it may need to be looked up, which is also expensive.
As others have said, don't worry too much about this stuff. It's definitely good to know, but it shouldn't affect the way you write your programs. Write code that makes sense, and let the compiler worry about optimization. If you get into compiler development, then you can start worrying about it. :)
Edit: more details on allocation:
Register variables are allocated by the compiler, so there is no runtime cost. The code will just put a value in a register as soon as the value is produced.
Stack variables are allocated by your program at runtime. Typically, when a function is called, the first thing it will do is reserve enough stack space for all of its local variables. So there is no per-variable cost.
I'm currently developing a very fast algorithm, with one part of it being an extremely fast scanner and statistics function.
In this quest, i'm after any performance benefit.
Therefore, I'm also interested in keeping the code "multi-thread" friendly.
Now for the question :
i've noticed that putting some very frequently accessed variables and arrays into "Global", or "static local" (which does the same), there is a measurable performance benefit (in the range of +10%).
I'm trying to understand why, and to find a solution about it, since i would prefer to avoid using these types of allocation.
Note that i don't think the difference comes from "allocation", since allocating a few variables and small array on the stack is almost instantaneous. I believe the difference comes from "accessing" and "modifying" data.
In this search, i've found this old post from stackoverflow :
C++ performance of global variables
But i'm very disappointed by the answers there. Very little explanation, mostly ranting about "you should not do that" (hey, that's not the question !) and very rough statements like 'it doesn't affect performance', which is obviously incorrect, since i'm measuring it with precise benchmark tools.
As said above, i'm looking for an explanation, and, if it exists, a solution to this issue. So far, i've got the feeling that calculating the memory address of a local (dynamic) variable costs a bit more than a global (or local static). Maybe something like an ADD operation difference. But that doesn't help finding a solution...
It really depends on your compiler, platform, and other details. However, I can describe one scenario where global variables are faster.
In many cases, a global variable is at a fixed offset. This allows the generated instructions to simply use that address directly. (Something along the lines of MOV AX,[MyVar].)
However, if you have a variable that's relative to the current stack pointer or a member of a class or array, some math is required to take the address of the array and determine the address of the actual variable.
Obviously, if you need to place some sort of mutex on your global variable in order to keep it thread-safe, then you'll almost certainly more than lose any performance gain.
Creating local variables can be literally free if they are POD types. You likely are overflowing a cache line with too many stack variables or other similar alignment-based causes which are very specific to your piece of code. I usually find that non-local variables significantly decrease performance.
It's hard to beat static allocation for speed, and while the 10% is a pretty small difference, it could be due to address calculation.
But if you're looking for speed,
your example in a comment while(p<end)stats[*p++]++; is an obvious candidate for unrolling, such as:
static int stats[M];
static int index_array[N];
int *p = index_array, *pend = p+N;
// ... initialize the arrays ...
while (p < pend-8){
stats[p[0]]++;
stats[p[1]]++;
stats[p[2]]++;
stats[p[3]]++;
stats[p[4]]++;
stats[p[5]]++;
stats[p[6]]++;
stats[p[7]]++;
p += 8;
}
while(p<pend) stats[*p++]++;
Don't count on the compiler to do it for you. It might or might not be able to figure it out.
Other possible optimizations come to mind, but they depend on what you're actually trying to do.
If you have something like
int stats[256]; while (p<end) stats[*p++]++;
static int stats[256]; while (p<end) stats[*p++]++;
you are not really comparing the same thing because for the first instance you are not doing an initialization of your array. Written explicitly the second line is equivalent to
static int stats[256] = { 0 }; while (p<end) stats[*p++]++;
So to be a fair comparison you should have the first read
int stats[256] = { 0 }; while (p<end) stats[*p++]++;
Your compiler might deduce much more things if he has the variables in a known state.
Now then, there could be runtime advantage of the static case, since the initialization is done at compile time (or program startup).
To test if this makes up for your difference you should run the same function with the static declaration and the loop several times, to see if the difference vanishes if your number of invocations grows.
But as other said already, best is to inspect the assembler that your compiler produces to see what effective difference there are in the code that is produced.
I think this is wrong, it should start as NULL and not with a random value. In the case that you have a pointer with a random memory address as its default value it could be a very dangerous thing, no?
The variables start out uninitialized because that's the fastest way - why waste the CPU cycles on initialization if you're going to write another value there anyway?
If you want a variable to be initialized after creation, just initialize it. :)
About it being a dangerous thing: Every good compiler will warn you if you try to use a variable without initialization.
No. C is a very efficient language, one that has traditionally been faster that a lot of other languages. One of the reasons for this is that it doesn't do too much on it's own. The programmer controls this.
In the case of initialization, C variables are not initialized to a random value. Rather, they are not initialized and so they contain whatever was at the memory location before.
If you wanted to initialize a variable to, say, 1 in your program, then it would be inefficient if the variable had already been initialized to zero or null. That would mean it was initialized twice.
Execution speed and overhead (or lack thereof) are the main reasons why. C is notorious for letting you walk off the proverbial cliff because it always assumes that the user knows better than it does.
Note that if you declared the variable as static it actually is guaranteed to be initialized to 0.
Variables start out with a random value because you are just handed a block of memory and told to deal with it yourself. It has whatever value that block of memory had before hand. Why should the program waste time setting the value to some arbitrary default when you are likely going to set it yourself later?
The design choice is performance, and it is one of the many reasons why C isn't the preferred language for most projects.
This has nothing to do with "if C were being designed today" or with efficiency of one initialization. Instead think of something like
void foo()
{
struct bar *ptrs[10000];
/* do something where only a few indices end up actually getting used */
}
Any language that forces useless initialization on you is doomed to be slow as hell for algorithms that can make use of sparse arrays where you don't care about the majority of the values, and have an easy way of knowing which values you care about.
If you don't like my example with such a large object on the stack, substitute malloc instead. It has the same semantics with regard to initialization.
In either case, if you want zero-initialization, you can get it with {0} or calloc.
It was a design choice made many ears ago, probably for efficiency reasons.
Statically allocated variables (globals and statics) are initialized to 0 if there's no explicit initialization - this could be justified even taking efficiency into account becuase it only occurs once. I'd guess the thinking was that for automatic variables (locals) that are allocated each time a scope is entered, implicit initialization was considered something that might cost too much and therefore should be left to the programmer's responsibility.
If C were being designed today, I wouldn't be surprised if that design decision were changed - especially since compilers are intelligent enough today to be able to optimize away an initialization that gets overwritten before any other use (or potential use).
However, there are so many C compiler toolchains that follow the spec of not initializing automatically, it would be foolish for a compiler to perform implicit initialization to a 'useful' value (like 0 or NULL). That would just encourage people targeting that tool chain to write code that didn't work correctly on other tool chains.
However, compilers can initialize local variables, and they often do. It's just that they initialize the locals to a values that's not generally useful (especially, that doesn't set a pointer to the null pointer). That kind of initialization isn't useful in writing your programming logic against, and it's not intended for that. It's intended to cause deterministic and reproducible errors so that if you erroneously use values that have been set by implicit initialization, you'll be able to find it easily in test/debug.
Usually this compiler behavior is turned on only for debug builds; I could see an argument being made for turning it on in release builds as well - particular if the release build can still optimize it away when the compiler can prove that the implicit initialized value is never used.