Why is the value of
int array[10];
undefined when declared in a function and is 0-initialized when declared as static?
I have been reading the answer of this question and it is clear that
[the expression int array[10];] in a function means: take the ownership of 10-int-size area of memory without doing any initialization. If the array is declared as a global one or as static in a function, then all elements are initialized to zero if they aren't initialized already.
Question: why this behaviour? Did the compiler writers decide on it for a particular reason? Can a particular compiler do things differently?
Why I am asking this: I would like to make my code portable across architectures/compilers. To ensure that, I know I can always initialize the declared array explicitly, but that means losing precious time on this one operation. So, what is the right decision?
An automatic int array[10]; isn't implicitly zeroed because the zeroing takes time and you might not need it zeroed. Additionally, you'd pay that cost not just once but every time control passed the declaration.
A static/global int array[10]; is implicitly zeroed because statics/globals are allocated at load time. The memory will be fresh from the OS, and if the OS is security conscious at all, the memory will have been zeroed already. Otherwise the loading code (the OS or a dynamic linker) will have to zero it (because the C standard requires it), but it should be able to do so in one call to memset for all globals/statics, which is considerably more efficient than zeroing each static/global variable one at a time.
This initialization is done once. Even statics inside of functions are initialized just once, even if they have nonzero initializers (e.g., static int x = 42;); this is why C requires that the initializer of a static object be a constant expression.
Since the loadtime zeroing of all globals/statics is either OS-guaranteed or efficiently implementable, it might as well be standard-guaranteed and thereby make programmers' lives easier.
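A minimal sketch of both rules (the function and variable names are made up for illustration): the static object is zeroed once before the program starts and keeps its value across calls, while the automatic one must be written before it is read.

#include <stdio.h>

void count_calls(void)
{
    static int counter;    /* static storage: zero-initialized once, at load time */
    int scratch[10];       /* automatic storage: indeterminate until assigned     */

    scratch[0] = ++counter;            /* fine: written before it is read */
    printf("call #%d\n", scratch[0]);
}

int main(void)
{
    count_calls();   /* prints "call #1" */
    count_calls();   /* prints "call #2" - counter was not re-initialized */
    return 0;
}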
The values are not undefined but indeterminate, and it behaves this way because the standard says so.
Section 6.7.9p10 of the C standard regarding initialization states:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
So for any variable defined at file scope or declared static, you can safely assume the values are zero-initialized. For variables declared inside a function or block scope, you cannot make any assumptions about uninitialized variables.
As for why, global/static variables are initialized at program startup or even at compile time, while locals have to be initialized each time they come into scope and doing so would take time.
The reason for not defining the initial value of the variables in stack-allocated/local variables is efficiency. The C Standard expects your program to allocate your array and later fill it:
int array[10];
for (int i = 0; i < 10; ++i)
    array[i] = i * 42;
In this case, any initialization would be pointless, so the C Standard wants to avoid it.
If your program needs these values initialized to zero, you can do it explicitly:
int array[10] = {0}; // initialize to zero so the accumulation below works
while (condition)
{
    ... // some code
    for (int i = 0; i < 10; ++i)
        array[i] += other_array[i];
}
It is your decision whether to initialize or not, because you are supposed to know how your program behaves. This decision will be different for different arrays.
However, this decision will not depend on a compiler - they are all standard-compliant. One little detail regarding portability - if you don't initialize your array and still see all zeros in it when you use a particular compiler - don't be fooled; the values are still undefined; you cannot rely on them being 0.
Some other languages decided that zero initialization is cheap enough to do even if it's superfluous, and its advantage (safety) outweighs its disadvantage (performance). In C, performance is more important, so it decided otherwise.
The C philosophy is to a) always trust the programmer and b) prioritize execution speed over programmer convenience. C assumes that the programmer is in the best position to know whether an array (or any other auto variable) needs to be initialized to a specific value, and if so, is smart enough to write the code to do it themselves. Otherwise it won't waste the CPU cycles.
Same thing for bounds checking on array accesses, same thing for NULL checks on pointer dereferences, etc.
This is simultaneously C's greatest strength (fast code with a small footprint) and greatest weakness (lots of manual labor to make code safe and secure).
Related
Let's assume that I have a for loop, and a very large struct as a stack variable:
for (int x = 0; x < 10; x++)
{
    MY_STRUCT structVar = {0};
    …code using structVar…
}
Will every compiler actually zero out the struct at the start of every loop? Or do I need to use memset to zero it out?
This is a very large struct and I want to allocate it on the stack, and I need to make sure every member of it is zeroed out at the start of every iteration. So do I need to use memset?
I can manually inspect the executable that I compile, but I need to make sure if there is any standard for this, or it just depends on the compiler.
Note that this code does compile. I am using Visual Studio.
Will every compiler actually zero out the struct at the start of every loop?
Any compiler that conforms to the C Standard will do this. From the C11 draft Standard:
6.8 Statements and blocks
…
3 A block allows a set of declarations and statements to be grouped into one syntactic unit. The initializers of objects that have automatic storage duration, and the variable length array declarators of ordinary identifiers with block scope, are evaluated and the values are stored in the objects (including storing an indeterminate value in objects without an initializer) each time the declaration is reached in the order of execution, as if it were a statement, and within each declaration in the order that declarators appear.
In the case of a for or while loop, a declaration/initializer inside the loop's scope block is reached repeatedly on each and every iteration of the loop.
See 6.2.4, paragraph 6:
If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
Will every compiler actually zero out the struct at the start of every loop?
Yes, or it will produce machine code with equivalent functionality ("observable behavior") as if you had performed a zero-out.
As long as you initialize one single member in the struct, then the rest of them will get set to zero/null ("as if they had static storage duration"). Similarly, any padding bytes added to the struct by the compiler will get set to zero. This is guaranteed by the C standard ISO:9899:2018 6.7.9 §10, §19 and §21.
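For instance, partial initialization looks like this (a hypothetical stand-in for MY_STRUCT, since its real members aren't shown):

struct my_struct { int foo; int bar; char name[16]; };   /* stand-in for MY_STRUCT */

struct my_struct s = { .foo = 42 };   /* .bar becomes 0, name becomes all '\0',
                                         and any padding bytes are zeroed too  */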
Generally, the place where the zero-out actually occurs in the resulting executable depends on how the data is used. If you for example zero the struct at the beginning of the loop body, then write to various members and print it all at the end of the loop body, the compiler doesn't have many other choices but to zero out everything on each lap of the loop. Example:
for (int x = 0; x < 10; x++)
{
    MY_STRUCT structVar = {0};
    ...
    structVar.foo = a;
    structVar.bar = b;
    printf("%d %d\n", structVar.foo, structVar.bar);
}
On the other hand, the compiler might in this case be smart enough to realize that the struct is just a pointless middle man and replace this all with the equivalent printf("%d %d\n", a, b);, meaning that the struct would be removed entirely from the machine code.
Overall, discussing optimizations like this can't be done without a specific use-case, compiler and target system.
Or do I need to use memset to zero it out?
No. MY_STRUCT structVar = {0}; is functionally 100% equivalent to memset(&structVar, 0, sizeof structVar);.
This is a very large struct and I want to allocate it on the stack
That's a different matter than initialization. It is indeed unwise to allocate large objects on the stack. In that case consider replacing it with for example this:
MY_STRUCT *structVar = malloc(sizeof *structVar);
for (int x = 0; x < 10; x++)
{
    memset(structVar, 0, sizeof *structVar);
    ...
}
free(structVar);
For the variables (of automatic storage class) defined within the body of a loop, the variables are (notionally) recreated for each iteration of the loop and have indeterminate values for each iteration if not initialized.
Regarding use of the {0} initializer for an object with automatic storage duration, note the following:
As per 6.7.9/19 and 6.7.9/21, elements of the object that have no explicit initializer will be initialized implicitly, the same as objects of static storage duration. As per 6.7.9/10, those elements will be initialized to the value zero or a null pointer as appropriate, and any padding will be initialized to zero bits.
Using memset(ptr, 0, size) sets size bytes from address ptr onwards to 0, but note the following:
For some unusual execution environments, a pointer object with all bytes zero might not represent a null pointer value (so might not compare equal to 0).
For some unusual execution environments, a floating point object with all bytes zero might not represent a valid floating point value or might not compare equal to 0.0.
In summary, using the {0} initializer is the most portable way to set all elements of the object to compare equal to 0, and to set all padding bits or bytes to 0. Using memset instead is generally OK except for some weird execution environments.
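A small sketch of the distinction (the struct node type and the function wrapping it are made up for illustration):

#include <string.h>

struct node { struct node *next; double weight; };

void compare_zeroing(void)
{
    struct node a = {0};       /* next is a null pointer and weight compares equal
                                  to 0.0, whatever bit patterns those require      */

    struct node b;
    memset(&b, 0, sizeof b);   /* all bytes zero: the same thing on typical machines,
                                  but on an unusual machine b.next might not compare
                                  equal to NULL, nor b.weight equal to 0.0           */
    (void)a; (void)b;
}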
I know that declaring a char[] variable in a while loop is scoped, having seen this post: Redeclaring variables in C.
Going through a tutorial on creating a simple web server in C, I'm finding that I have to manually clear memory assigned to responseData in the example below, otherwise the contents of index.html are just continuously appended to the response and the response contains duplicated contents from index.html:
while (1)
{
    int clientSocket = accept(serverSocket, NULL, NULL);
    char httpResponse[8000] = "HTTP/1.1 200 OK\r\n\n";
    FILE *htmlData = fopen("index.html", "r");
    char line[100];
    char responseData[8000];
    while (fgets(line, 100, htmlData) != 0)
    {
        strcat(responseData, line);
    }
    strcat(httpResponse, responseData);
    send(clientSocket, httpResponse, sizeof(httpResponse), 0);
    close(clientSocket);
}
Corrected by:
while (1)
{
    ...
    char responseData[8000];
    memset(responseData, 0, strlen(responseData));
    ...
}
Coming from JavaScript, this was surprising. Why would I want to declare a variable and have access to the memory contents of a variable declared in a different scope with the same name? Why wouldn't C just reset that memory behind the scenes?
Also... Why is it that variables of the same name declared in different scopes get assigned the same memory addresses?
According to this question: Variable declared interchangebly has the same pattern of memory address that ISN'T the case. However, I'm finding that this is occurring pretty reliably.
Not completely correct. You don't need to clear the whole responseData array - clearing its first byte is just enough:
responseData[0] = 0;
As Gabriel Pellegrino notes in the comment, a more idiomatic expression is
responseData[0] = '\0';
It explicitly defines a character via its code point of zero value, while the former uses an int constant zero. In both cases the right-hand argument has type int, which is implicitly converted (truncated) to char for the assignment. (Paragraph fixed thanks to pmg's comment.)
You could know that from the strcat documentation: the function appends its second argument string to the first one. If you need the very first chunk to get stored into the buffer, you want to append it to an empty string, so you need to ensure the string in the buffer is empty. That is, it consists of the terminating NUL character only. memset-ting the whole array is overkill, hence a waste of time.
Additionally, using strlen on the array is asking for trouble. You can't know the actual contents of the memory block allocated for the array. If it has not been used yet, or was overwritten with some other data since your last use, it may contain no NUL character at all. Then strlen will run past the end of the array, causing Undefined Behavior. And even if it returns successfully, it may report a string length bigger than the size of the array. As a result memset will run past the end of the array, possibly overwriting some vital data!
Use sizeof whenever you memset an array!
memset(responseData, 0, sizeof(responseData));
EDIT
In the above I tried to explain how to fix the issue with your code, but I didn't answer your questions. Here they are:
Why do variables (...) in different scopes get assigned the same memory addresses?
In terms of execution, each iteration of the while(1) { ... } loop indeed creates a new scope. However, each scope terminates before the new one is created, so the compiler reserves an appropriate block of memory on the stack and the loop re-uses it in every iteration. That also simplifies the compiled code: every iteration is executed by exactly the same code, which simply jumps from the end back to the beginning. All instructions within the loop that access local variables use exactly the same (stack-relative) addressing in each iteration. So each variable in the next iteration has precisely the same location in memory as in all previous iterations.
I'm finding that I have to manually clear memory
Yes, automatic variables, allocated on the stack, are not initialized in C by default. We always need to explicitly assign an initial value before the first use – otherwise the value is indeterminate and may be nonsensical (for example, a floating-point variable may appear to be not-a-number, a character array may appear unterminated, an enum variable may hold a value outside the enum's definition, a pointer variable may not point at a valid, accessible location, etc.).
otherwise the contents (...) are just continuously appended
This one was answered above.
Coming from JavaScript, this was surprising
Yes, JavaScript apparently creates new variables at the new scope, hence each time you get a brand new array – and it is empty. In C you just get the same area of a previously allocated memory for an automatic variable, and it's your responsibility to initialize it.
Additionally, consider two consecutive loops:
void test()
{
int i;
for (i=0; i<5; i++) {
char buf1[10];
sprintf(buf1, "%d", i);
}
for (i=0; i<1; i++) {
char buf2[10];
printf("%s\n", buf2);
}
}
The first one prints a single-digit, character representation of five numbers into the character array, overwriting it each time - hence the last value of buf1[] (as a string) is "4".
What output do you expect from the second loop? Generally speaking, we can't know what buf2[] will contain, and printf-ing it causes UB. However we may suppose the same set of variables (namely a single 10-items character array) from both disjoint scopes will get allocated the same way in the same part of a stack. If this is the case, we'll get a digit 4 as an output from a (formally uninitialized) array.
This result depends on the compiler construction and should be considered a coincidence. Do not rely on it as this is UB!
Why wouldn't C just reset that memory behind the scenes?
Because it's not told to. The language was created to compile to efficient, compact code. It does as little 'behind the scenes' as possible. Among the things it does not do is initializing automatic variables unless it's told to. That means you need to add an explicit initializer to a local variable declaration, or add an initializing instruction (e.g. an assignment) before the first use. (This does not apply to global, module-scope variables; those are initialized to zeros by default.)
In higher-level languages some or all variables are initialized on creation, but not in C. That's its feature and we must live with it – or just not use this language.
With this line:
char responseData[8000];
You are saying to your compiler: Hey big C, give me an 8000-byte chunk and name it responseData.
At run time, if you don't say otherwise, no one will ever clean that chunk or hand you a "brand-new" one. That means the 8000-byte chunk you get each time around can hold any possible permutation of bits in those 8000 bytes. What can easily happen is that you get the same memory region every time, and thus the same bits big C gave you the first time. So, if you don't clean it, you get the impression that you're using the same variable, but you're not! You're just reusing the same (never cleaned) memory region.
I'd add that it's part of the programmer's responsibility to clear, when you need to, the memory you're working with, whether it was allocated dynamically or statically.
Why would I want to declare a variable and have access to the memory contents of a variable declared in a different scope with the same name? Why wouldn't C just reset that memory behind the scenes?
Objects with auto storage duration (i.e., block-scope variables) are not automatically initialized - their initial contents are indeterminate. Remember that C is a product of the early 1970s, and errs on the side of runtime speed over convenience. The C philosophy is that the programmer is in the best position to know whether something should be initialized to a known value or not, and is smart enough to do it themselves if needed.
While you're logically creating and destroying a new instance of responseData on each loop iteration, it turns out the same memory location is being reused each time through. We like to think that space is allocated for each block-scope object as we enter the block and released as we leave it, but in practice that's (usually) not the case - space for all block-scope objects within a function is allocated on function entry, and released on function exit (but see the note about variable-length arrays below).
Different objects in different scopes may map to the same memory behind the scenes. Consider something like
void bletch( void )
{
    if ( some_condition )
    {
        int foo = some_function();
        printf( "%d\n", foo );
    }
    else
    {
        int bar = some_other_function();
        printf( "%d\n", bar );
    }
}
It's impossible for both foo and bar to exist at the same time, so there's no reason to allocate separate space for both - the compiler will (usually) allocate space for one int object at function entry, and that space gets used for either foo or bar depending on which branch is taken.
So, what happens with responseData is that space for one 8000-character array is allocated on function entry, and that same space gets used for each iteration of the loop. That's why you need to clear it out on each iteration, either with a memset call or with an initializer like
char responseData[8000] = {0};
As M.M points out in a comment, this isn't true for variable-length arrays (and potentially other variably modified types) - space for those is set aside as needed, although where that space is taken from isn't specified by the language definition. For all other types, though, the usual practice is to allocate all necessary space on function entry.
As this answer on another question covers, using an aggregate initialization
struct foo {
    size_t a;
    size_t b;
};
struct foo bar = {0};
results in built-in types being initialized to zero.
Is there any difference between using the above and using
struct foo * bar2 = calloc(1, sizeof(struct foo));
leaving aside the fact that one variable is a pointer.
Looking at the debugger we can see that both a and b are indeed set to zero for both of the above examples.
What's the difference between two above examples, are there any gotchas or hidden issues?
Yes, there is a crucial difference (aside from storage-class of your object of type struct foo):
struct foo bar = {0};
struct foo * bar2 = calloc(1, sizeof *bar2);
Every member of bar is zero-initialized (and padding is zeroed for any sub-object without an initializer, or if bar has static or thread-local storage duration),
while all of *bar2 is zeroed out, which might have completely different results:
Neither null-pointers (T*)0 nor floating-point-numbers with value 0 are guaranteed to be all-bits-0.
(Actually, until some time after C99 it was only guaranteed for char, unsigned char and signed char (as well as some of the optional exact-size types from <stdint.h>) that all-bits-0 represents the value 0. A later technical corrigendum guaranteed it for all integer types.)
The floating-point-format might not be IEEE754.
(On most modern systems you can ignore that possibility though.)
Cite from c-faq (Thanks to Jim Balter for linking it):
The Prime 50 series used segment 07777, offset 0 for the null pointer, at least for PL/I.
struct foo bar = {0};
This defines an object of type struct foo named bar, and initializes it to zero.
"Zero" is defined recursively. All integer subobjects are initialized to 0, all floating-point subobjects to 0.0, and all pointers to NULL.
struct foo * bar2 = calloc(1, sizeof(struct foo));
IMHO this is better (but equivalently) written as:
struct foo *bar2 = calloc(1, sizeof *bar2);
By not repeating the type name, we avoid the risk of a mismatch when the code is changed later on.
This dynamically allocates an object of type struct foo (on the heap), initializes that object to all-bits-zero, and initializes bar2 to point to it.
calloc can fail to allocate memory. If it does, it returns a null pointer. You should always check for that. (The declaration of bar also allocates memory, but if it fails it's a stack overflow, and there's no good way to handle it.)
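A typical check might look like this, using the struct foo from the question (the fragment assumes <stdio.h> and <stdlib.h>, and the exact error handling is just one possible choice):

struct foo *bar2 = calloc(1, sizeof *bar2);
if (bar2 == NULL) {
    perror("calloc");        /* report the failure              */
    exit(EXIT_FAILURE);      /* and bail out, in this sketch    */
}
/* ... use *bar2, which starts out all-bits-zero ... */
free(bar2);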
And all-bits-zero is not guaranteed to be the same as "zero". For integer types (including size_t), it's very nearly guaranteed. For floating-point and pointer types, it's entirely legal for 0.0 or NULL to have some internal representation other than all-bits-zero. You're unlikely to run into this, and since all the members of your structure are integer you probably don't need to worry about it.
calloc gives you a dynamically allocated, zeroed memory zone on the heap (pointed to by your bar2). But an automatic variable (like bar, assuming its declaration is inside a function) is allocated on the call stack. See also calloc(3).
In C, you need to explicitly free heap allocated memory zone. But stack allocated data is popped when its function is returning.
Read also the wikipages on C dynamic memory allocation and on garbage collection. Reference counting is a widely used technique in C and in C++, and can be viewed as a form of GC. Think about circular references; they are hard to handle.
The Boehm conservative GC can be used in C programs.
Notice that the liveness of a memory zone is a global, program-wide property. You generally cannot claim that a given zone belongs to a particular function (or library). But you could adopt conventions about that.
When you code a function returning a heap-allocated pointer (i.e. some pointer to dynamic storage) you should document that fact and decide who is in charge of freeing it.
About initialization: a calloc-ed pointer points to zeroed memory (when calloc succeeds). An automatic variable initialized as {0} is also zeroed. In practice, some implementations handle calloc of big objects differently (by asking the kernel for whole zeroed pages, e.g. with mmap(2)) from small objects (by reusing, if available, a previously free-d zone and zeroing it). Zeroing a zone uses a fast equivalent of memset(3).
PS. I am ignoring the weird machines on which an all zero-bit memory zone is not a cleared data for the C standard, i.e. like {0}. I don't know such machines in practice, even if I know they are in principle possible (and in theory the NULL pointer might not be an all-zero-bit word)
BTW, the compiler may optimize an all-zero local structure (and perhaps not allocate it at all on the stack, since it would fit in registers).
(This answer focuses on the differences in initialization, in the case of a struct only containing integral types)
Both forms set a and b to 0. This is because the Standard defines that all-bits-zero for an integral type must represent a value of 0.
If there is structure padding, then the calloc version sets that but the zero-initialization may not. For example:
struct foo a = { 0 }, b = { 0 };
struct foo c, d; memset(&c, 0, sizeof c); memset(&d, 0, sizeof d);
if ( memcmp(&a, &b, sizeof a) )
printf("This line may appear.\n");
if ( memcmp(&c, &d, sizeof c) )
printf("This line must not appear.\n");
A technique you will sometimes see (especially in code designed to fit on systems with small amounts of storage) is that of using memcmp to compare two structs for equality. When there is padding between structure members, this is unreliable as the padding may be different even though the structure members are the same.
The programmer didn't want to compare structure members individually as that is too much code size, so instead, he will copy structs around using memcpy, initialize them using memset; in order to preserve the ability to use memcmp to check for equality.
In modern programming I'd strongly advise not doing this, and always using the { 0 } form of initialization. Another benefit of the latter is that there is no chance of making a mistake with the size argument and accidentally setting too much or too little memory.
There is a serious difference: allocation of automatic variables is done at compile time and comes for free (when the stack frame is reserved, the room is there). By contrast, dynamic allocation is done at run time and has an unpredictable and non-negligible cost.
As regards initialization, the compiler has opportunities for optimization with automatic variables (for instance by not clearing if unnecessary); this is not possible with a call of calloc.
If you like the calloc style, you also have the option of performing memset on the automatic variable.
memset(&bar, 0, sizeof bar);
UPDATE: allocation of automatic variables is quasi-done at compile-time.
I know that sometimes if you don't initialize an int, you will get a random number if you print the integer.
But initializing everything to zero seems kind of silly.
I ask because I'm commenting up my C project and I'm pretty straight on the indenting and it compiles fully (90/90 thank you Stackoverflow) but I want to get 10/10 on the style points.
So, the question: when is it appropriate to initialize, and when should you just declare a variable:
int a = 0;
vs.
int a;
There are several circumstances where you should not initialize a variable:
When it has static storage duration (static keyword or global var) and you want the initial value to be zero. Most compilers will actually store zeros in the binary if you explicitly initialize, which is usually just a waste of space (possibly a huge waste for large arrays).
When you will be immediately passing the address of the variable to another function that fills its value. Here, initializing is just a waste of time and may be confusing to readers of the code who wonder why you're storing something in a variable that's about to be overwritten.
When a meaningful value for the variable can't be determined until subsequent code has completed execution. In this case, it's actively harmful to initialize the variable with a dummy value such as zero/NULL, as this prevents the compiler from warning you if you have some code paths where a meaningful value is never assigned. Compilers are good at warning you about accessing uninitialized variables, but can't warn you about "still contains dummy value" variables.
Aside from these issues, I'd say it's generally good practice to initialize your non-static variables when possible.
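As a sketch of the third point above (the mode values and helper functions here are hypothetical):

int status;                   /* deliberately left uninitialized         */
if (mode == MODE_A)
    status = handle_a();      /* hypothetical helpers                    */
else if (mode == MODE_B)
    status = handle_b();
/* If a case is ever missed, the compiler can warn that status "may be used
   uninitialized"; writing int status = 0; instead would hide that warning. */
return status;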
A rule that hasn't been mentioned yet is this: when the variable is declared inside a function it is not initialised, and when it is declared in static or global scope it's set to 0:
int a; // is set to 0

void foo() {
    int b; // set to whatever happens to be in memory there
}
However - for readability I would usually initialise everything at declaration time.
If you're interested in learning this sort of thing in detail, I'd recommend this presentation and this book
I can think of a couple of reason off the top of my head:
When you're going to be initializing it later on in your code.
int x;
if (condition)
{
    func();
    x = 2;
}
else
{
    x = 3;
}
anotherFunc(x); // x will have been set a value no matter what
When you need some memory to store a value set by a function or another piece of code:
int x; // Would be pointless to give x a value here
scanf("%d", &x);
If the variable is in the scope of a function and not a member of a class, I always initialize it because otherwise you will get warnings. Even if this variable will be used later, I prefer to assign it on declaration.
As for member variables, you should initialize them in the constructor of your class.
For pointers, always initialize them to some default, particularly NULL; even if they are to be assigned later, they are dangerous when uninitialized.
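For example (a small sketch; the malloc call is just illustrative):

char *buf = NULL;    /* harmless default                                   */
/* ... later, perhaps: buf = malloc(size); ... */
free(buf);           /* free(NULL) is a no-op, so this stays safe even if
                        buf was never actually allocated                   */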
Also it is recommended to build your code with the highest level of warnings that your compiler supports, it helps to identify bad practices and potential errors.
Static and global variables will be initialized to zero for you so you may skip initialization. Automatic variables (e.g. non-static variables defined in function body) may contain garbage and should probably always be initialized.
If there is a non-zero specific value you need at initialization then you should always initialize explicitly.
It's always good practice to initialize your variables, but sometimes it's not strictly necessary. Consider the following:
int a;
for (a = 0; a < 10; a++) { } // a is initialized later
or
void myfunc(int *num) {
    *num = 10;
}

int a;
myfunc(&a); // myfunc sets, but does not read, the value in a
or
char a;
scanf(" %c", &a); // perhaps the most common example in code of where
                  // initialization isn't strictly necessary
These are just a couple of examples where it isn't strictly necessary to initialize a variable, since it's set later (but not accessed between declaration and initialization).
In general though, it doesn't hurt to always initialize your variables at declaration (and indeed, this is probably best practice).
In general, there's no need to initialize a variable, with 2 notable exceptions:
You're declaring a pointer (and not assigning it immediately) - you should always set these to NULL as good style and defensive programming.
If, when you declare the variable, you already know what value is going to be assigned to it. Further assignments use up more CPU cycles.
Beyond that, it's about getting the variables into the right state that you want them in for the operation you're going to perform. If you're not going to be reading them before an operation changes their value (and the operation doesn't care what state it is in), there's no need to initialize them.
Personally, I always like to initialize them anyway; if you forgot to assign it a value, and it's passed into a function by mistake (like a remaining buffer length) 0 is usually cleanly handled - 32532556 wouldn't be.
There is absolutely no reason why variables shouldn't be initialised; the compiler is clever enough to ignore the first assignment if a variable is assigned twice. It is easy for code to grow to a size where things you took for granted (such as assigning a variable before it is used) are no longer true. Consider:
int MyVariable;
void Simplistic(int aArg){
    MyVariable = aArg;
}

// Months later:
int MyVariable;
void Simplistic(int aArg){
    MyVariable += aArg; // Unsafe, since MyVariable was never explicitly initialized.
}
One is fine, the other lands you in a heap of trouble. Occasionally you'll have issues where your application runs fine in debug mode but the release build crashes; one reason for this is using an uninitialised variable.
As long as I have not read from a variable before writing to it, I have not had to bother with initializing it.
Reading before writing can cause serious and hard to catch bugs. I think this class of bugs is notorious enough to gain a mention in the popular SICP lecture videos.
Initializing a variable, even if it is not strictly required, is ALWAYS a good practice. The few extra characters (like "= 0") typed during development may save hours of debugging time later, particularly when it is forgotten that some variables remained uninitialized.
In passing, I feel it is good to declare a variable close to its use.
The following is bad:
int a; // line 30
...
a = 0; // line 40
The following is good:
int a = 0; // line 40
Also, if the variable is to be overwritten right after initialization, like
int a = 0;
a = foo();
it is better to write it as
int a = foo();
When declaring an array in C like this:
int array[10];
What is the initial value of the integers?? I'm getting different results with different compilers and I want to know if it has something to do with the compiler, or the OS.
If the array is declared in a function, then the value is undefined. int x[10]; in a function means: take the ownership of 10-int-size area of memory without doing any initialization. If the array is declared as a global one or as static in a function, then all elements are initialized to zero if they aren't initialized already.
As set by the standard, all global and function-static variables are automatically initialised to 0. Automatic variables are not initialised.
int a[10]; // global - all elements are initialised to 0

void foo(void) {
    int b[10];        // automatic storage - contains junk
    static int c[10]; // static - initialised to 0
}
However it is good practice to always manually initialise function variables, regardless of storage class. To set all array elements to 0 you only need to initialise the first array element to 0 - the omitted elements will be set to 0 automatically:
int b[10] = {0};
Why are function locals (auto storage class) not initialized when everything else is?
C is close to the hardware; that's its greatest strength and its biggest danger. The reason auto storage class objects have random initial values is because they are allocated on the stack, and a design decision was made not to automatically clear these (partly because they would need to be cleared on every function call).
On the other hand, the non-auto objects only have to be cleared once. Plus, the OS has to clear allocated pages for security reasons anyway. So the design decision here was to specify zero initialization. Why isn't security an issue with the stack, too? Actually it is cleared, at first. The junk you see is from earlier instances of your own program's call frames and the library code they called.
The end result is fast, memory-efficient code. All the advantages of assembly with none of the pain. Before dmr invented C, "HLL"s like Basic and entire OS kernels were really, literally, implemented as giant assembler programs. (With certain exceptions at places like IBM.)
According to the C standard, 6.7.8 (paragraph 10):
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
So it depends on the compiler. With MSVC, debug builds will initialize automatic variables with 0xcc, whereas non-debug builds will not initialize those variables at all.
A C variable declaration just tells the compiler to set aside and name an area of memory for you. For automatic variables, also known as stack variables, the values in that memory are not changed from what they were before. Global and static variables are set to zero when the program starts.
Some compilers in unoptimized debug mode set automatic variables to zero. However, it has become common in newer compilers to set the values to a known bad value so that the programmer does not unknowingly write code that depends on a zero being set.
In order to ask the compiler to set an array to zero for you, you can write it as:
int array[10] = {0};
Better yet is to set the array with the values it should have. That is more efficient and avoids writing into the array twice.
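For example, when the contents are known up front (the values here are just placeholders):

int array[10] = { 3, 1, 4, 1, 5, 9, 2, 6, 5, 3 };  /* written once, no wasted zeroing pass */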
In most modern compilers (e.g. gcc/vc++), partially initialized local array/structure members are default-initialized to zero (int), NUL (char/char strings), and 0.000000 (float/double).
Apart from partially initialized local array/structure data as above, static (global/local) and global objects also maintain the same property.
int a[5] = {0,1,2};
printf("%d %d %d\n",*a, *(a+2), *(a+4));
struct s1
{
int i1;
int i2;
int i3;
char c;
char str[5];
};
struct s1 s11 = {1};
printf("%d %d %d %c %s\n",s11.i1,s11.i2, s11.i3, s11.c, s11.str);
if (!s11.c)
    printf("s11.c is null\n");
if (!*(s11.str))
    printf("s11.str is null\n");
In gcc/vc++, the output should be (the %c prints the NUL character and the %s prints an empty string, so nothing visible follows the three integers):
0 2 0
1 0 0
s11.c is null
s11.str is null
Text from http://www.cplusplus.com/doc/tutorial/arrays/
SUMMARY:
Initializing arrays. When declaring a regular array of local scope (within a function, for example), if we do not specify otherwise, its elements will not be initialized to any value by default, so their content will be undetermined until we store some value in them. The elements of global and static arrays, on the other hand, are automatically initialized with their default values, which for all fundamental types means they are filled with zeros.
In both cases, local and global, when we declare an array, we have the possibility to assign initial values to each one of its elements by enclosing the values in braces { }. For example:
int billy [5] = { 16, 2, 77, 40, 12071 };
The relevant sections from the C standard (emphasis mine):
5.1.2 Execution environments
All objects with static storage duration shall be initialized (set to their initial values) before program startup.
6.2.4 Storage durations of objects
An object whose identifier is declared with external or internal linkage, or with the storage-class specifier static has static storage duration.
6.2.5 Types
Array and structure types are collectively called aggregate types.
6.7.8 Initialization
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules;
if it is a union, the first named member is initialized (recursively) according to these rules.
It depends on the location of your array.
If it is a global/static array, it will be part of the .bss section, which means it will be zero-initialized at startup by the C runtime startup code.
If it is a local array inside a function, it will be located on the stack and its initial value is not known.
If the array is declared inside a function, then it has an indeterminate value, but if the array is declared as a global or is static inside a function, then the array has a default value of 0.