Different ways of suppressing 'uninitialized variable warnings' in C

I have encountered multiple uses of the uninitialized_var() macro designed to get rid of warnings like:
warning: ‘ptr’ is used uninitialized in this function [-Wuninitialized]
For GCC (<linux/compiler-gcc.h>) it is defined this way:
/*
* A trick to suppress uninitialized variable warning without generating any
* code
*/
#define uninitialized_var(x) x = x
But I also discovered that <linux/compiler-clang.h> defines the same macro in a different way:
#define uninitialized_var(x) x = *(&(x))
Why do we have two different definitions? For what reason might the first way be insufficient? Is the first way insufficient just for Clang, or in some other cases too?
C example:
#include <stdio.h>

#define uninitialized_var(x) x = x

struct some {
    int a;
    char b;
};

int main(void) {
    struct some *ptr;
    struct some *uninitialized_var(ptr2);
    if (1)
        printf("%d %d\n", ptr->a, ptr2->a); // warning about ptr, not ptr2
}

Compilers are made to recognize certain constructs as indications that the author intended something deliberately, when the compiler would otherwise warn about it. For example, given if (b = a), GCC and Clang both warn that an assignment is being used as a conditional, but they do not warn about if ((b = a)) even though it is equivalent in terms of the C standard. This particular construct with extra parentheses has simply been set as a way to tell the compiler the author truly intends this code.
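For illustration, a minimal sketch of that parentheses idiom (the function and variable names are arbitrary):
void demo(int a) {
    int b;
    if (b = a)      /* warning: assignment used as a condition (-Wparentheses) */
        ;
    if ((b = a))    /* extra parentheses: no warning, the intent is explicit */
        ;
}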
Similarly, x = x has been set as a way to tell GCC not to warn about x being uninitialized. There are times where a function may appear to have a code path in which an object is used without being initialized, but the author knows the function is intended not to be used with parameters that would ever cause that particular code path to be executed and, for reasons of efficiency, they want to silence the compiler warning rather than add an initialization that is not actually necessary for program correctness.
Clang presumably does not recognize GCC's idiom for this, so a different method was needed.

Why do we have two different definitions?
Unclear, but I speculate that it's because Clang still produces a warning for x = x when x is uninitialized, but not for x = *(&(x)). Under almost every circumstance* in which one of those expressions has well-defined behavior, the other has the same well-defined behavior. Under the remaining circumstances, such as when the value of x is undefined or indeterminate, either both have undefined behavior, or the behavior of x = x is defined while that of x = *(&(x)) is undefined, so the latter provides no advantage.
For what reason might the first way be insufficient?
Because the behavior of both is undefined in the use cases for which they seem to be intended. It is not at all surprising, then, that different compilers handle them differently.
Is the first way insufficient just for Clang, or in some other cases too?
Both expressions' meaning and behavior are undefined. In one sense, then, one cannot safely conclude that either is sufficient for anything. In the empirical sense of whether using one or the other fools certain compilers into not emitting warnings that they otherwise would, and still should, emit, it is likely that there were, are, and/or will be compilers that handle the undefined behavior associated with both of those expressions differently than GCC and Clang do.
* The exception being when x is declared with register storage class, in which case the second expression has undefined behavior regardless of whether x has a well-defined value.
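As an aside illustrating that footnote: applying & to an object declared register is disallowed, so Clang's form of the macro cannot even be applied to such a variable. A hypothetical sketch:
#define uninitialized_var(x) x = *(&(x))

void demo(void) {
    register int r;
    uninitialized_var(r);   /* error: address of register variable requested */
}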


Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?

For example, can this
unsigned f(float x) {
    unsigned u = *(unsigned *)&x;
    return u;
}
cause unpredictable results on a platform where:
- unsigned and float are both 32-bit
- a pointer has a fixed size for all types
- unsigned and float can be stored to and loaded from the same part of memory.
I know about strict aliasing rules, but most examples showing problematic cases of violating strict aliasing are like the following.
static int g(int *i, float *f) {
    *i = 1;
    *f = 0;
    return *i;
}

int h() {
    int n;
    return g(&n, (float *)&n);
}
In my understanding, the compiler is free to assume that i and f are implicitly restrict. The return value of h could be 1 if the compiler thinks *f = 0; is redundant (because i and f can't alias), or it could be 0 if it takes into account that the values of i and f are the same. This is undefined behaviour, so technically, anything else can happen.
However, the first example is a bit different.
unsigned f(float x) {
    unsigned u = *(unsigned *)&x;
    return u;
}
Sorry for my unclear wording, but everything is done "in-place". I can't think of any other way the compiler might interpret the line unsigned u = *(unsigned *)&x;, other than "copy the bits of x to u".
In practice, all compilers for various architectures I tested in https://godbolt.org/ with full optimization produce the same result for the first example, and varying results (either 0 or 1) for the second example.
I know it's technically possible that unsigned and float have different sizes and alignment requirements, or should be stored in different memory segments. In that case even the first code won't make sense. But on most modern platforms where the following holds, is the first example still undefined behaviour (can it produce unpredictable results)?
- unsigned and float are both 32-bit
- a pointer has a fixed size for all types
- unsigned and float can be stored to and loaded from the same part of memory.
In real code, I write
#include <string.h>

unsigned f(float x) {
    unsigned u;
    memcpy(&u, &x, sizeof(x));
    return u;
}
The compiled result is the same as using pointer casting, after optimization. This question is about the interpretation of the standard's strict aliasing rules for code such as the first example.
Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?
Yes.
The rule is https://port70.net/~nsz/c/c11/n1570.html#6.5p7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type.
The effective type of the object x is float: it is defined with that type.
- unsigned is not compatible with float,
- unsigned is not a qualified version of float,
- unsigned is not the signed or unsigned type corresponding to float,
- unsigned is not the signed or unsigned type corresponding to a qualified version of float,
- unsigned is not an aggregate or union type,
- and unsigned is not a character type.
The "shall" is violated, it is undefined behavior (see https://port70.net/~nsz/c/c11/n1570.html#4p2 ). There is no other interpretation.
We also have https://port70.net/~nsz/c/c11/n1570.html#J.2 :
The behavior is undefined in the following circumstances:
An object has its stored value accessed other than by an lvalue of an allowable type (6.5).
As Kamil explains, it's UB. Even int and long (or long and long long) aren't alias-compatible, even when they're the same size. (But interestingly, unsigned int is allowed to alias int, being the corresponding unsigned type.)
It's nothing to do with being the same size, or using the same register-set as suggested in a comment; it's mainly a way to let compilers assume that different pointers don't point to overlapping memory when optimizing. They still have to support C99 union type-punning, not just memcpy. So, for example, a dst[i] = src[i] loop doesn't need to check for possible overlap when unrolling or vectorizing, if dst and src have different types (see footnote 1).
If you're accessing the same integer data, the standard requires that you use the exact same type, modulo only things like signed vs. unsigned and const. Or that you use (unsigned) char*, which is like GNU C __attribute__((may_alias)).
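For illustration, a sketch of those well-defined alternatives under the question's assumptions (the may_alias typedef is a GNU C extension; the function names are made up):
#include <stdint.h>
#include <string.h>

typedef uint32_t __attribute__((may_alias)) u32_alias;  /* GNU C only */

unsigned bits_memcpy(float x) {        /* portable ISO C */
    unsigned u;
    memcpy(&u, &x, sizeof u);
    return u;
}

unsigned bits_union(float x) {         /* C99 union type punning */
    union { float f; unsigned u; } pun = { .f = x };
    return pun.u;
}

unsigned bits_alias(float x) {         /* GNU C may_alias-qualified pointer */
    return *(u32_alias *)&x;
}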
The other part of your question seems to be why it appears to work in practice, despite the UB.
Your godbolt link forgot to link the actual compilers you tried.
https://godbolt.org/z/rvj3d4e4o shows GCC 4.1, from before GCC went out of its way to support "obvious" local compile-time-visible cases like this, in order not to break people's buggy code that uses such non-portable idioms.
It loads garbage from stack memory, unless you use -fno-strict-aliasing to make it movd to that location first. (Store/reload instead of movd %xmm0, %eax is a missed-optimization bug that's been fixed in later GCC versions for most cases.)
f:   # GCC4.1 -O3
    movl -4(%rsp), %eax
    ret

f:   # GCC4.1 -O3 -fno-strict-aliasing
    movss %xmm0, -4(%rsp)
    movl -4(%rsp), %eax
    ret
Even that old GCC version emits warning: dereferencing type-punned pointer will break strict-aliasing rules, which should make it obvious that GCC notices this and does not consider it well-defined. Later GCC versions that do choose to support this code still warn.
It's debatable whether it's better to sometimes work in simple cases but break other times, vs. always failing. But given that GCC -Wall does still warn about it, that's probably a good tradeoff for people dealing with legacy code or porting from MSVC. Another option would be to always break such code unless people use -fno-strict-aliasing, which they should use with codebases that depend on this behaviour.
Being UB doesn't mean required-to-fail
Just the opposite; it would take tons of extra work to actually trap on every signed overflow in the C abstract machine, for example, especially when optimizing stuff like 2 + c - 3 into c - 1. That's what gcc -fsanitize=undefined tries to do, adding x86 jo instructions after additions (except it still does constant-propagation so it's just adding -1, not detecting temporary overflow on INT_MAX. https://godbolt.org/z/WM9jGT3ac). And it seems strict-aliasing is not one of the kinds of UB it tries to detect at run time.
See also the clang blog article: What Every C Programmer Should Know About Undefined Behavior
An implementation is free to define behaviour the ISO C standard leaves undefined
For example, MSVC always defines this aliasing behaviour, like GCC/clang/ICC do with -fno-strict-aliasing. Of course, that doesn't change the fact that pure ISO C leaves it undefined.
It just means that on those specific C implementations, the code is guaranteed to work the way you want, rather than happening to do so by chance or by de-facto compiler behaviour if it's simple enough for modern GCC to recognize and do the more "friendly" thing.
Just like gcc -fwrapv for signed-integer overflows.
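As an illustrative sketch of that flag's effect (the function name is made up):
/* ISO C leaves signed overflow undefined, so a compiler may fold this to 1;
   with gcc -fwrapv, overflow wraps and the comparison is evaluated honestly. */
int always_true(int x) {
    return x + 1 > x;   /* false for INT_MAX only if overflow wraps */
}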
Footnote 1: example of strict-aliasing helping code-gen
#define QUALIFIER // restrict

void convert(float *QUALIFIER pf, const int *pi) {
    for (int i = 0; i < 10240; i++) {
        pf[i] = pi[i];
    }
}
Godbolt shows that with the -O3 defaults for GCC11.2 for x86-64, we get just a SIMD loop with movdqu / cvtdq2ps / movups and loop overhead. With -O3 -fno-strict-aliasing, we get two versions of the loop, and an overlap check to see if we can run the scalar or the SIMD version.
Are there actual cases where strict aliasing helps produce better code generation, where the same cannot be achieved with restrict?
You might well have a pointer that might point into either of two int arrays, but definitely not at any float variable, so you can't use restrict on it. Strict-aliasing will let the compiler still avoid spill/reload of float objects around stores through the pointer, even if the float objects are global vars or otherwise aren't provably local to the function. (Escape analysis.)
Or a struct node * that definitely isn't the same type as the payload in a tree.
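A hypothetical sketch of that escape-analysis point (the names are invented):
float scale = 1.5f;                /* global float: not provably local */

float accumulate(int *p, int n) {
    float sum = 0;
    for (int i = 0; i < n; i++) {
        p[i] = i;                  /* int store: under strict aliasing it cannot
                                      modify the float object 'scale' */
        sum += scale;              /* so 'scale' can stay in a register; with
                                      -fno-strict-aliasing it must be reloaded */
    }
    return sum;
}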
Also, most code doesn't use restrict all over the place. It could get quite cumbersome. Not just in loops, but in every function that deals with pointers to structs. And if you get it wrong and promise something that's not true, your code's broken.
The Standard was never intended to fully, accurately, and unambiguously partition programs into those that have defined behavior and those that don't(*); instead, it relies upon compiler writers to exercise a certain amount of common sense.
(*) If it was intended for that purpose, it fails miserably, as evidenced by the amount of confusion stemming from it.
Consider the following two code snippets:
/* Assume suitable declarations of u are available everywhere */
union test { uint32_t ww[4]; float ff[4]; } u;

/* Snippet #1 */
uint32_t proc1(int i, int j)
{
    u.ww[i] = 1;
    u.ff[j] = 2.0f;
    return u.ww[i];
}

/* Snippet #2, part 1, in one compilation unit */
uint32_t proc2a(uint32_t *p1, float *p2)
{
    *p1 = 1;
    *p2 = 2.0f;
    return *p1;
}

/* Snippet #2, part 2, in another compilation unit */
uint32_t proc2(int i, int j)
{
    return proc2a(u.ww+i, u.ff+j);
}
It is clear that the authors of the Standard intended that the first version of the code be processed meaningfully on platforms where that would make sense, but it's also clear that at least some of the authors of C99 and later versions did not intend to require that the second version be processed likewise (some of the authors of C89 may have intended that the "strict aliasing rule" only apply to situations where a directly named object would be accessed via pointer of another type, as shown in the example given in the published Rationale; nothing in the Rationale suggests a desire to apply it more broadly).
On the other hand, the Standard defines the [] operator in such a fashion that proc1 is semantically equivalent to:
uint32_t proc3(int i, int j)
{
    *(u.ww+i) = 1;
    *(u.ff+j) = 2.0f;
    return *(u.ww+i);
}
and there's nothing in the Standard that would imply that proc3() shouldn't have the same semantics. What gcc and clang seem to do is special-case the [] operator as having a different meaning from pointer dereferencing, but nothing in the Standard makes such a distinction. The only way to consistently interpret the Standard is to recognize that the form with [] falls in the category of actions which the Standard doesn't require implementations to process meaningfully, but relies upon them to handle anyway.
Constructs such as your example of using a directly-cast pointer to access storage associated with an object of the original pointer's type fall into a similar category of constructs which at least some authors of the Standard likely expected (and would have demanded, had they not expected it) that compilers handle reliably, with or without a mandate, since there was no imaginable reason why a quality compiler would do otherwise. Since then, however, clang and gcc have evolved to defy such expectations. Even if clang and gcc would normally generate useful machine code for a function, they seek to perform aggressive inter-procedural optimizations that make it impossible to predict which constructs will be 100% reliable. Unlike some compilers which refrain from applying potential optimizing transforms unless they can prove those transforms are sound, clang and gcc seek to perform any transforms that can't be proven to affect program behavior.

What is the expected output of initializing a variable to itself? [duplicate]

I noticed just now that the following code can be compiled with clang/gcc/clang++/g++, using c99, c11, c++11 standards.
int main(void) {
    int i = i;
}
and even with -Wall -Wextra, none of the compilers even reports warnings.
By modifying the code to int i = i + 1; and with -Wall, they may report:
why.c:2:13: warning: variable 'i' is uninitialized when used within its own initialization [-Wuninitialized]
    int i = i + 1;
        ~   ^
1 warning generated.
My questions:
Why is this even allowed by compilers?
What do the C/C++ standards say about this? Specifically, what's the behavior of this? UB or implementation-dependent?
Because i is uninitialized when used to initialize itself, it has an indeterminate value at that time. An indeterminate value can be either an unspecified value or a trap representation.
If your implementation supports padding bits in integer types and if the indeterminate value in question happens to be a trap representation, then using it results in undefined behavior.
If your implementation does not have padding in integers, then the value is simply unspecified and there is no undefined behavior.
EDIT:
To elaborate further, the behavior can still be undefined if i never has its address taken. This is detailed in section 6.3.2.1p2 of the C11 standard:
If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
So if you never take the address of i, then you have undefined behavior. Otherwise, the statements above apply.
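A minimal sketch of that distinction (assuming int has no trap representations on the target):
int main(void) {
    int i = i;      /* address never taken: undefined behavior per 6.3.2.1p2 */

    int j = j;      /* j's address is taken below, so 6.3.2.1p2 does not apply;
                       its value is merely indeterminate/unspecified */
    int *pj = &j;
    (void)pj;
    (void)i;
}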
This is a warning; it's not something the standard requires.
Warnings are heuristic, with an "optimistic" approach: a warning is issued only when the compiler is sure it's going to be a problem. In cases like this you have better luck with clang or the newest versions of gcc, as stated in the comments (see another related question of mine: why am I not getting a "used uninitialized" warning from gcc in this trivial example?).
Anyway, in the first case:
int i = i;
does nothing, since i == i already. It is possible that the assignment is optimized out entirely, as it's useless. With compilers that don't "see" self-initialization as a problem, you can do this without a warning:
int i = i;
printf("%d\n", i);
Whereas this triggers a warning all right:
int i;
printf("%d\n", i);
Still, it's bad not to be warned about this, since from then on i is treated as initialized.
In the second case:
int i = i + 1;
A computation between an uninitialized value and 1 must be performed. Undefined behaviour happens there.
I believe you are okay with getting the warning in the case of
int i = i + 1;
as expected; however, you expect the warning to be displayed in the case of
int i = i;
as well.
Why is this even allowed by compilers?
There is nothing inherently wrong with the statement. See the related discussions:
Why does the compiler allow initializing a variable with itself?
Why is initialization of a new variable by itself valid?
for more insight.
What do the C/C++ standards say about this? Specifically, what's the behavior of this? UB or implementation-dependent?
This is undefined behavior, as the type int can have trap representations and you have never taken the address of the variable in question. So, technically, you'll face UB as soon as you try to use the (indeterminate) value stored in variable i.
You should turn on your compiler warnings. In GCC, compile with -Winit-self to get a warning in C. For C++, -Winit-self is already enabled by -Wall.

behaviour of const static qualifier [duplicate]

I wrote something similar to this in my code:
const int x = 1;
int *ptr;
ptr = &x;
*ptr = 2;
Does this work on all compilers? Why doesn't the GCC compiler notice that we are changing a constant variable?
const actually doesn't mean "constant". Something that's "constant" in C has a value that's determined at compile time; a literal 42 is an example. The const keyword really means read-only. Consider, for example:
const int r = rand();
The value of r is not determined until program execution time, but the const keyword means that you're not permitted to modify r after it's been initialized.
In your code:
const int x=1;
int *ptr;
ptr = &x;
*ptr = 2;
the assignment ptr = &x; is a constraint violation, meaning that a conforming compiler is required to complain about it; you can't legally assign a const int* (pointer to const int) value to a non-const int* object. If the compiler generates an executable (which it needn't do; it could just reject it), then the behavior is not defined by the C standard.
For example, the generated code might actually store the value 2 in x -- but then a later reference to x might yield the value 1, because the compiler knows that x can't have been modified after its initialization. And it knows that because you told it so, by defining x as const. If you lie to the compiler, the consequences can be arbitrarily bad.
Actually, the worst thing that can happen is that the program behaves as you expect it to; that means you have a bug that's very difficult to detect. (But the diagnostic you should have gotten will have been a large clue.)
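A minimal sketch of that divergence (the cast is added so the snippet compiles; the write is still undefined behavior):
#include <stdio.h>

int main(void) {
    const int x = 1;
    int *ptr = (int *)&x;   /* the cast silences the constraint violation only */
    *ptr = 2;               /* undefined behavior: modifying a const object */
    /* An optimizing compiler may print "1 2" (folding x to 1), "2 2",
       or the program may crash if x was placed in read-only storage. */
    printf("%d %d\n", x, *ptr);
    return 0;
}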
Online C 2011 draft:
6.7.3 Type qualifiers
...
6 If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.133)
133) This applies to those objects that behave as if they were defined with qualified types, even if they are never actually defined as objects in the program (such as an object at a memory-mapped input/output address).
Emphasis added.
Since the behavior is left undefined, the compiler is not required to issue a diagnostic, nor is it required to halt translation. This would be difficult to catch in the general case; suppose you had a function like
void foo( int *p ) { *p = ...; }
defined in its own separate translation unit. During translation, the compiler has no way of knowing if p could be pointing to a const-qualified object or not. If your call is something like
const int x;
foo( &x );
you may get a warning like parameter 1 of 'foo' discards qualifiers or something similarly illuminating.
Also note that the const qualifier doesn't necessarily mean that the associated variable will be stored in read-only memory, so it's possible the above code would "work" (update the value in x) in that you'd successfully update x by doing an end-run around the const semantics. But then you might as well just not declare x to be const.
There is a good discussion of this here: Does the evil cast get trumped by the evil compiler?
I would expect gcc to compile this because:
- ptr is allowed to point to x, otherwise reading it would be impossible, although as the comment below says it's not exactly brilliant code and the compiler should complain. Warning options will (I guess) affect whether or not it's actually warned about.
- when you write to x, the compiler no longer "knows" that it is writing to a const; all this is in the coder's hands in C. However, it does really know, so it may well warn you, depending on the warning options you've selected.
- whether it works or not will depend on how the code is compiled, the way const is implemented for the compile options selected, and the target CPU and architecture. It may work. Or it may crash. Or you may write to a "random" bit of memory and cause (or not) some freaky effect. It's not a good coding strategy, but that wasn't your question :-)
Bad programmer. No Moon Pie!
If you need to modify a const, copy it to a non-const variable and then work with it. It's const for a reason; trying to "sneak" around a const can cause serious runtime issues, e.g. the optimizer has likely used the value inline.
const int x=1;
int non_const_x = x;
non_const_x = 2;


Variables defined and assigned at the same time

A coding style presentation that I attended lately in office advocated that variables should NOT be assigned (to a default value) when they are defined. Instead, they should be assigned a default value just before their use.
So, something like
int a = 0;
should be frowned upon.
Obviously, an example with int is simplistic, but the same applies to other types as well, like pointers.
Further, it was also mentioned that C99-compatible compilers now throw up a warning in the above-mentioned case.
The above approach looks useful to me only for structures, i.e. you memset them only before use. This would be efficient if the structure is used (or filled) only in an error leg, as in the sketch below.
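For illustration, a sketch of that pattern (the struct and names are hypothetical):
#include <string.h>

struct report { int code; char msg[64]; };

void process(int ok) {
    struct report r;              /* deliberately not initialized up front */
    if (!ok) {                    /* error leg: initialize just before use */
        memset(&r, 0, sizeof r);
        r.code = -1;
        /* ... fill msg and emit the report ... */
    }
    /* the hot path never pays for the memset */
}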
For all other cases, I find defining and assigning to a default value a prudent exercise as I have encountered a lot of bugs because of un-initialized pointers both while writing and maintaining code. Further, I believe C++ via constructors also advocates the same approach i.e. define and assign.
I am wondering why (or whether) the C99 standard frowns upon defining and assigning. Is there any considerable merit in doing what the coding style presentation advocated?
Usually I'd recommend initialising variables when they are defined if the value they should have is known, and leaving them uninitialised if it isn't. Either way, put them as close to their first use as scoping rules allow.
Instead, they should be assigned a default value just before their use.
Usually you shouldn't use a default value at all. In C99 you can mix code and declarations, so there's no point defining the variable before you assign a value to it. If you know the value it's supposed to take, then there is no point in having a default value.
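A minimal sketch of that style (the names are invented):
/* C99: declare at the point where the real value is known; no default needed */
int find(const int *table, int n, int key) {
    for (int i = 0; i < n; i++) {      /* i declared in the for statement */
        if (table[i] == key) {
            int hit = i;               /* declared exactly where it gets its value */
            return hit;
        }
    }
    return -1;
}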
Further, it was also mentioned that the C99 compatible compilers now throw up a warning in the above mentioned case.
Not for the case you show - you don't get a warning for having int x = 0;. I strongly suspect that someone got this mixed up. Compilers warn if you use a variable without assigning a value to it, and if you have:
... some code ...
int x;
if ( a )
    x = 1;
else if ( b )
    x = 2;
// oops, forgot the last case: else x = 3;
return x * y;
then you will get a warning that x may be used without being initialised, at least with gcc.
You won't get a warning if you assign a value to x before the if, but it is irrelevant whether the assignment is done as an initialiser or as a separate statement.
Unless you have a particular reason to assign the value twice for two of the branches, there's no point assigning the default value to x first, as it stops the compiler from warning you when you haven't covered every branch.
There's no such requirement (or even guideline that I'm aware of) in C99, nor does the compiler warn you about it. It's simply a matter of style.
As far as coding style is concerned, I think you took things too literally. For example, your statement is right in the following case...
int i = 0;
for (; i < n; i++)
    do_something(i);
... or even in ...
int i = 1;
[some code follows here]
while (i < a)
    do_something(i);
... but there are other cases that, in my mind, are better handled with an early "declare and assign". Consider structures constructed on the stack or various OOP constructs, like in:
struct foo {
    int bar;
    void *private;
};

int my_callback(struct foo *foo)
{
    struct my_struct *my_struct = foo->private;
    [do something with my_struct]
    return 0;
}
Or like in (C99 struct initializers):
void do_something(int a, int b, int c)
{
    struct foo foo = {
        .a = a,
        .b = b + 1,
        .c = c / 2,
    };
    write_foo(&foo);
}
I sort of concur with the advice, even though I'm not altogether sure the standard says anything about it, and I very much doubt the bit about compiler warnings is true.
The thing is, modern compilers can and do detect the use of uninitialised variables. If you set your variables to default values at initialisation, you lose that detection. And default values can cause bugs too; certainly in the case of your example, int a = 0;. Who says 0 is an appropriate value for a?
In the 1990s, the advice would've been wrong. Nowadays, it's correct.
I find it highly useful to pre-assign some default data to variables so that I don't have to do (as many) null checks in code.
I have seen so many bugs due to uninitialized pointers that I always advocate declaring each pointer as NULL_PTR and each primitive with some invalid/default value.
Since I work on RTOSes and high-performance but low-resource systems, it is possible that the compilers we use do not catch uninitialized usage. And I doubt even modern compilers can be relied on 100%.
In large projects where macros are extensively used, I have seen rare scenarios where even Klocwork/Purify have failed to find uninitialized usage.
So I say stick with it as long as you are using plain old C/C++.
Modern languages like .NET can guarantee to initialize variables, or give a compiler error for uninitialized variable usage. The following link does a performance analysis and finds that there is a 10-20% performance hit for .NET. The analysis is quite detailed and explained well.
http://www.codeproject.com/KB/dotnet/DontInitializeVariables.aspx
