Defining a variable before or within a loop - c

Is there a difference between the following two approaches to defining a for-loop variable in C?
int i;
for (i = 0; i < X; i++) {
// something
}
And:
for (int i = 0; i < X; i++) {
// something
}
My preference is to use the second approach if the i is just always a throw-away, but is there any reason that it wouldn't be a good idea to do that?

Yes.
As the i variable is usually used only to count the number of iterations needed, it doesn't make sense to have a variable live outside the scope of the loop. That, if you can, should be avoided.
As some comments to the question mention, there are some cases in which you can't use the second, but that's not the general case.
As for the compiler later compiling to the same assembly, that may be true, but conceptually the second is cleaner, and for someone reading the code from outside, it makes it clear that the variable is never used again.
Hope this helps!

but is there any reason that it wouldn't be a good idea to do that?
In C, you should prefer the second form, because it reduces the scope of the variable and makes it more obvious where it is going to be used. Unless...
...it goes against the coding guidelines of a given project. For instance, the Linux kernel declares all variables at the top of a function.
...you want to be conforming to C90: you cannot use loop initial declarations.
In C++, however, objects may have very expensive constructors, which means that, at times, you may want to re-use them rather than initialize a new one every time (e.g. if you construct a new one within the body of the loop).

There are a number of differences between the two.
The second form is illegal in the 1989 ANSI (1990 ISO) C standard. It is supported in C from 1999 and in standard C++. It is also supported by some C compilers older than 1999, either as a non-standard or optional extension or because those compilers were actually C++ compilers with a C mode.
In the first form, i exists after the loop, so its value can still be accessed, but redefining it results in a diagnostic (compile time error). In the second form, i does not exist after the loop, so accessing its value after the loop gives a diagnostic, but i can be redefined.
In general terms, it is advisable to ensure that variables only exist for as long as needed, and cease to exist when no longer needed. The second form explicitly allows that.
Obviously, variables that need to exist outside the loop, need to be defined outside it. But, if the variable i is not needed outside the loop, then I would favour the second form. This allows the compilers to catch problems, such as unintended use of variable i after the loop.
Some older C and C++ compilers (mainly dating from before the 1998 C++ standard was ratified, but some from the early 2000s) implement the second form so the variable i still exists after the loop. This effectively makes the two forms equivalent when using those compilers.
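As a rough sketch of the scoping difference described above (the function names sum_twice and count_up_to are made up for illustration, and a C99 or later compiler is assumed):

```c
/* Second form: each loop gets its own i, which disappears when the
   loop ends, so the name can be reused without a redefinition error. */
int sum_twice(int n)
{
    int total = 0;

    for (int i = 0; i < n; i++) {
        total += i;
    }
    /* i is not visible here; referring to it would be a compile-time error. */

    for (int i = 0; i < n; i++) {   /* a new, unrelated i */
        total += i;
    }
    return total;
}

/* First form: i outlives the loop, so its final value can still be read. */
int count_up_to(int n)
{
    int i;
    for (i = 0; i < n; i++) {
        /* loop body intentionally empty */
    }
    return i;   /* n when n >= 0, otherwise 0 */
}
```

In count_up_to, declaring a second int i in the same block after the loop would be rejected, which is the diagnostic mentioned above.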

Related

Is declaring a variable inside an if statement in c a bad habit?

My assumption is that this is going to mess with checkers and stack analysis.
I can't prove my assumption and I don't think C99 will complain. C89 probably won't either, because the definition comes immediately after the opening curly brace:
if(true == condition){
int i = 0;
/* do stuff with i */
}else{
foo();
}
The two paths will lead to different stack usage.
Declaring i outside the if/else statement will lead to a more defined stack usage (ok, I am branching to foo, so the stack will not be exactly the same in the two cases).
But Misra advises to limit the scope of a variable closest to its usage.
Am I overthinking it or could there be a rationale in my assumption?
Is declaring a variable inside an if statement in c a bad habit?
No.
A modern approach is to minimize the scope of the variables used, so that logical (hard-to-fix) and syntactical (easy-to-fix) errors are avoided.
Of course, there are people that still like to see all the variables defined at the topmost part of the code, because this was the convention in the past, as @Clifford commented.
BTW, your code should compile fine, both with C89 and C99.
This stack usage thought is the result of overthinking, and I suggest you follow the Ancient Hellenic phrase: Métron áriston.
The code is fine in any version of C (except C90 does not support true).
The two paths will lead to different stack usage.
This is mostly a myth. Modern compilers put a variable on the stack only if they determine that it is needed, regardless of where you place the declaration.
If the variable is allocated in a register, then it will only be allocated when the program takes the path where your example declares the variable. This is not because of where the declaration is placed, but because that path will be executed. So again, for the sake of performance, it doesn't matter where the variable is declared, as long as it is somewhere in local scope and not at file scope.
It is good practice to limit the scope of variables as much as possible. But this is to avoid unintentional bugs and namespace collisions.
But Misra advises to limit the scope of a variable closest to its usage.
No it doesn't, but some static analysers require you to do that, on top of the MISRA requirement. Both MISRA-C:2004 8.7 and MISRA-C:2012 8.9 only require that you place a variable at block scope, if it is only used by one function. That's it.
MISRA does however say:
Within a function, whether objects are defined at the outermost or innermost block is largely a matter of style
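To make the two placements concrete, here is a small sketch (the function names excess_inner and excess_outer are hypothetical). Both versions compute the same result; whether a compiler emits identical code for them is an expectation, not a guarantee:

```c
/* Variable declared in the innermost block that uses it. */
int excess_inner(int x, int limit)
{
    if (x > limit) {
        int excess = x - limit;   /* visible only in this branch */
        return excess;
    }
    return 0;
}

/* Same logic with the variable declared at the top of the function. */
int excess_outer(int x, int limit)
{
    int excess = 0;               /* visible in the whole function */
    if (x > limit) {
        excess = x - limit;
    }
    return excess;
}
```

The inner version makes it impossible to read excess by accident on the path where it was never given a meaningful value.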

How to check if a variable has been initialized in C?

Is there a way to check if a variable has been initialized or not in C?
Consider the following example,
int main(){
int a = 3, b = 7, c;
if ( a > b )
c = a-b;
// Now, how can I check if the variable "c" has some value or not
// I don't want check like this,
// if ( isalpha(c) ) or if ( isdigit(c) )
// or anything similar to like that
}
In other words, does C have something like defined in Perl? In Perl, I can simply write if (defined c), which checks whether the variable is defined, and it would return false for the example above. How can I achieve the same in C?
C does not have this ability. You have two main options:
A sentinel value
For example, if you know that the value of c will never be negative, then initialize it to -1 and test for that.
Add another variable
Create another variable bool we_set_c_to_something = false; and then set it to true when you write to c.
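A minimal sketch of both options, applied to the a > b example from the question. The function names are invented for illustration, and the sentinel version assumes a valid result is never negative:

```c
#include <stdbool.h>

/* Option 1: sentinel value. -1 means "c was never set". */
int diff_or_sentinel(int a, int b)
{
    int c = -1;
    if (a > b) {
        c = a - b;
    }
    return c;   /* caller tests for -1 */
}

/* Option 2: a separate flag that records whether c was written. */
bool diff_if_greater(int a, int b, int *out)
{
    bool we_set_c_to_something = false;
    int c = 0;

    if (a > b) {
        c = a - b;
        we_set_c_to_something = true;
    }
    if (we_set_c_to_something) {
        *out = c;   /* only written when c holds a real value */
    }
    return we_set_c_to_something;
}
```

The flag version works even when every int value is a legitimate result, at the cost of carrying a second variable around.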
C is a compiled language which doesn't support runtime variable binding, while Perl is an interpreted language which supports dynamic typing. So you can check the definition of a variable in Perl, but not in C.
When you declare a variable in C with int c;, the variable c is defined but not initialized. The declaration and definition are in one statement.
In C, whether a variable is defined is not something the programmer checks in code. The compiler does it for you. When you compile and link your C code, the compiler and linker check all variable definitions. If any undefined variable is found, an error is reported and the compile or link step stops.
Hope this helps you tell the difference.
Wrong question. You're not asking whether the variable is defined. If the variable is not defined then compilation fails. Look up the difference between "declaration" and "definition". In the case of those local variables, you have defined the variable c.
What you're looking for is initialisation. Many compilers will warn you about using variables before they're initialised, but if you persist in running that code then the assumption is that you know better than the compiler. And at that point it's your problem. :) Some languages (e.g. Perl) have an extra flag that travels along with a variable to say whether it's been initialised or not, and they hide from you that there's this extra flag hanging around which you may or may not need. If you want this in C, you need to code it yourself.
Since C++ allows operator overloading, it's relatively easy to implement this in C++. Boost provides an "optional" template which does it, or you could roll your own if you want a coding exercise. C doesn't have the concept of operator overloading though (hell, the concept didn't really exist, and the compilers of the day probably couldn't have supported it anyway) so you get what you get.
Perl is a special case because it rolls the two together, but C doesn't. It's entirely possible in C to have variables which are defined but not initialised. Indeed there are a lot of cases where we want that to be the case, particularly when you start doing low-level access to memory for drivers and stuff like that.
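If you do want to code the "extra flag that travels along with the variable" yourself in C, one possible shape is a small struct. This is only a sketch: optional_int and its helper functions are hypothetical names, not part of any standard library, and compound literals need C99 or later:

```c
#include <stdbool.h>

/* An int bundled with a flag saying whether it holds a value. */
struct optional_int {
    bool has_value;
    int  value;
};

/* Builds an "empty" optional. */
struct optional_int optional_none(void)
{
    return (struct optional_int){ false, 0 };
}

/* Builds an optional that holds v. */
struct optional_int optional_some(int v)
{
    return (struct optional_int){ true, v };
}

/* Returns the stored value, or fallback if nothing was stored. */
int optional_value_or(struct optional_int o, int fallback)
{
    return o.has_value ? o.value : fallback;
}
```

Unlike Perl, nothing here is hidden: the flag costs storage, and the programmer has to remember to check it.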

Difference of variable declaration in for statement in C [duplicate]

Let's assume that this is not only about Visual Studio but also about C99, C11, etc.
There are two different ways of declaring the variable "i" in a for statement.
1)
int i;
for(i = 0 ; i < index ; ++i)
2)
for(int i = 0 ; i < index ; ++i)
Both work the same, but I think there must be some difference between them.
Do you have any idea about that?
If yes, please let me know.
I'm just wondering about your opinion, and how they work differently.
Sorry. About the answers: I know that the scope of "i" is different.
Is there any difference from the system's point of view (I mean memory, etc.)? Does the compiler work differently, or is the generated assembly different, or something like that?
The only difference is that in the first case, the variable i is outside the for scope, so you could use it later on. There are no differences in terms of efficiency.
If you use i only once, then definitely the 2nd case is better:
for(int i = 0 ; i < index ; ++i)
If you have several loops that use the index i, then it might make sense to declare it once outside all of them.
But generally, the rule is to limit the scope of the variable, which is usually safer, so the 2nd case is better.
It's worth noting that the 2nd syntax only works with C99 or newer, such as C11 (it did not work with the old C89). So some compilers will complain if you declare the variable inside the for statement. For example, older versions of gcc, which default to a C89-based dialect, require an explicit flag such as -std=c99 to allow that syntax.
The scope and lifetime of i are different.
In the second example it is just inside the loop body. In the first, it extends beyond.
Apart from that, they are equal.
Declaring a new variable in the initialization of the for loop was introduced in C99.
C89 requires that variables be declared at the beginning of a block.
Semantically, declaring variables in the initialization portion of the loop would limit the variables' scope to the body of the loop.
Limiting the scope is often desired to avoid misuse of variables after the body of the for loop has executed. For example, if you are doing a simple iteration, you may not want your index to exist after the for loop.
There is no right answer on which to use. The question becomes what you want your scope to be, and what compilers/language versions you intended to support.
In C89 the only valid way is 1), because it requires variables to be declared at the start of a block, before any statements. It looks like your compiler supports several standards, so it makes no difference to it which construction you use, and both result in the same behavior. My personal preference is 2), because it reduces the scope of the variable i and also helps prevent use of an uninitialized value (less risky).

Declare variable in C - Many ways?

I'm discussing with a friend what's the correct way to declare some variables in C, exactly in the for loop.
He has a compiler I can't remember and I have Dev-C++.
He does:
for (int i = 0; i<10; i++)
// ... and it works
I do:
int i;
for (i = 0; i<10; i++)
// ... and it works
If I do it like he does, Dev-C++ gives me an error. What's the technically correct way to do this? I was taught to do it the way I do but now I'm confused because he does it in the other way and it works for him D:
Declaring the variable in the loop, like your friend does, is supported in C99 and in C++. It is likely that your friend is coming from a C++ background, where such style of declaration is the norm. Declaring the loop variable outside the loop, like you do, is correct in older C, such as C89, which is what your compiler apparently supports.
If you have access to a C99 compiler, which style to choose is mostly a matter of preference. Seasoned C programmers don't mind declaring variables outside loop bodies, but it is considered slightly cleaner to declare them inside because it restricts the scope of the variable to the least possible lexical region. Declaring the variable outside the loop body is, of course, necessary if you plan to use it after the loop is done — for example, to inspect how far the loop has progressed.
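A short sketch of the "inspect how far the loop has progressed" case (first_negative is a made-up helper, not from the question). Here the index has to be declared before the loop because its value is read after the loop ends:

```c
#include <stddef.h>

/* Returns the index of the first negative element,
   or n if no element is negative. */
size_t first_negative(const int *values, size_t n)
{
    size_t i;   /* declared outside: needed after the loop */

    for (i = 0; i < n; i++) {
        if (values[i] < 0) {
            break;   /* i now marks where the search stopped */
        }
    }
    return i;
}
```

With for (size_t i = 0; ...) the return statement would not compile, since i would no longer be in scope.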
It depends on which version of C you're using. ANSI C (C89, as well as the original Kernighan & Ritchie C) only supports declarations at the beginning of a block, while modern C (and any flavour of C++) allows mixing statements and declarations.
{
int a;
printf("Stuff");
int b; /* not allowed */
}
Declaring a variable inside the for-loop header causes an error in a compiler that predates C99 or runs in strict C89 mode. If you compile this with the C99 standard or newer, it will work just fine.
Formally, the physical difference between these two is performance. Putting the definition inside the parentheses after for may give the variable a better chance of being kept only in a register. On the other hand, many other factors determine the details of the optimization result, depending on the compiler's analysis. So the final result may be no different, or may even be the opposite.
There is one definite difference: if you define a variable inside the parentheses after 'for', that variable cannot be used outside that for loop.

Why are nested functions not supported by the C standard?

It doesn't seem like it would be too hard to implement in assembly.
gcc also has a flag (-fnested-functions) to enable their use.
It turns out they're not actually all that easy to implement properly.
Should an internal function have access to the containing scope's variables?
If not, there's no point in nesting it; just make it static (to limit visibility to the translation unit it's in) and add a comment saying "This is a helper function used only by myfunc()".
If you want access to the containing scope's variables, though, you're basically forcing it to generate closures (the alternative is restricting what you can do with nested functions enough to make them useless).
I think GCC actually handles this by generating (at runtime) a unique thunk for every invocation of the containing function, that sets up a context pointer and then calls the nested function. This ends up being a rather Icky hack, and something that some perfectly reasonable implementations can't do (for example, on a system that forbids execution of writable memory - which a lot of modern OSs do for security reasons).
The only reasonable way to make it work in general is to force all function pointers to carry around a hidden context argument, and all functions to accept it (because in the general case you don't know when you call it whether it's a closure or an unclosed function). This is inappropriate to require in C for both technical and cultural reasons, so we're stuck with the option of either using explicit context pointers to fake a closure instead of nesting functions, or using a higher-level language that has the infrastructure needed to do it properly.
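For reference, the "explicit context pointer" approach mentioned above usually looks something like this in standard C. It is a sketch with hypothetical names (visit_fn, for_each, sum_context), not a description of any particular library:

```c
#include <stddef.h>

/* A callback takes the element plus an opaque pointer to caller state. */
typedef int (*visit_fn)(int element, void *context);

/* Calls fn on each element; stops early if fn returns nonzero. */
void for_each(const int *values, size_t n, visit_fn fn, void *context)
{
    for (size_t i = 0; i < n; i++) {
        if (fn(values[i], context) != 0) {
            return;
        }
    }
}

/* State that a nested function would have captured from its parent. */
struct sum_context {
    long total;
};

/* Ordinary file-scope function standing in for the nested one. */
int add_to_sum(int element, void *context)
{
    struct sum_context *ctx = context;
    ctx->total += element;
    return 0;   /* keep iterating */
}

long sum_values(const int *values, size_t n)
{
    struct sum_context ctx = { 0 };
    for_each(values, n, add_to_sum, &ctx);
    return ctx.total;
}
```

The context pointer is passed by hand on every call, which is exactly the bookkeeping a language with real closures would hide.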
I'd like to quote something from the BDFL (Guido van Rossum):
This is because nested function definitions don't have access to the
local variables of the surrounding block -- only to the globals of the
containing module. This is done so that lookup of globals doesn't
have to walk a chain of dictionaries -- as in C, there are just two
nested scopes: locals and globals (and beyond this, built-ins).
Therefore, nested functions have only a limited use. This was a
deliberate decision, based upon experience with languages allowing
arbitraries nesting such as Pascal and both Algols -- code with too
many nested scopes is about as readable as code with too many GOTOs.
Emphasis is mine.
I believe he was referring to nested scope in Python (and as David points out in the comments, this was from 1993, and Python does support fully nested functions now) -- but I think the statement still applies.
The other part of it could have been closures.
If you have a function like this C-like code:
int (*foo(void))(void) {
int x = 5;
int bar(void) {
x = x + 1;
return x;
}
return &bar;
}
If you use bar in a callback of some sort, what happens with x? This is well-defined in many newer, higher-level languages, but AFAIK there's no well-defined way to track that x in C -- does bar return 6 every time, or do successive calls to bar return incrementing values? That could have potentially added a whole new layer of complication to C's relatively simple definition.
See C FAQ 20.24 and the GCC manual for potential problems:
If you try to call the nested function
through its address after the
containing function has exited, all
hell will break loose. If you try to
call it after a containing scope level
has exited, and if it refers to some
of the variables that are no longer in
scope, you may be lucky, but it's not
wise to take the risk. If, however,
the nested function does not refer to
anything that has gone out of scope,
you should be safe.
This is not really more severe than some other problematic parts of the C standard, so I'd say the reasons are mostly historical (C99 isn't really that different from K&R C feature-wise).
There are some cases where nested functions with lexical scope might be useful (consider a recursive inner function which doesn't need extra stack space for the variables in the outer scope without the need for a static variable), but hopefully you can trust the compiler to correctly inline such functions, i.e. a solution with a separate function will just be more verbose.
Nested functions are a very delicate thing. Will you make them closures? If not, then they have no advantage to regular functions, since they can't access any local variables. If they do, then what do you do to stack-allocated variables? You have to put them somewhere else so that if you call the nested function later, the variable is still there. This means they'll take memory, so you have to allocate room for them on the heap. With no GC, this means that the programmer is now in charge of cleaning up the functions. Etc... C# does this, but they have a GC, and it's a considerably newer language than C.
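To show what "the programmer is now in charge of cleaning up" means in practice, here is a sketch of the foo/bar counter written in standard C with the captured variable moved to the heap. The names counter_create, counter_next and counter_destroy are made up for this example:

```c
#include <stdlib.h>

/* Holds the variable that the nested function would have captured. */
struct counter {
    int x;
};

/* Plays the role of foo(): allocates the state that must outlive the call.
   Returns NULL if allocation fails. */
struct counter *counter_create(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->x = start;
    }
    return c;
}

/* Plays the role of bar(): increments and returns the captured value. */
int counter_next(struct counter *c)
{
    c->x = c->x + 1;
    return c->x;
}

/* Without a GC, the caller has to release the state explicitly. */
void counter_destroy(struct counter *c)
{
    free(c);
}
```

With the state made explicit, successive calls to counter_next on the same counter return incrementing values, and forgetting counter_destroy leaks memory.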
It also wouldn't be too hard to add member functions to structs, but they are not in the standard either.
Features are not added to the C standard based solely on whether or not they are easy to implement. It's a combination of many other factors, including the point in time at which the standard was written and what was common and practical then.
One more reason: it is not at all clear that nested functions are valuable. Twenty-odd years ago I used to do large scale programming and maintenance in (VAX) Pascal. We had lots of old code that made heavy use of nested functions. At first, I thought this was way cool (compared to K&R C, which I had been working in before) and started doing it myself. After a while, I decided it was a disaster, and stopped.
The problem was that a function could have a great many variables in scope, counting the variables of all the functions in which it was nested. (Some old code had ten levels of nesting; five was quite common, and until I changed my mind I coded a few of the latter myself.) Variables in the nesting stack could have the same names, so that "inner" function local variables could mask variables of the same name in more "outer" functions. A local variable of a function, that in C-like languages is totally private to it, could be modified by a call to a nested function. The set of possible combinations of this jazz was near infinite, and a nightmare to comprehend when reading code.
So, I started calling this programming construct "semi-global variables" instead of "nested functions", and telling other people working on the code that the only thing worse than a global variable was a semi-global variable, and please do not create any more. I would have banned it from the language, if I could. Sadly, there was no such option for the compiler...
ANSI C has been established for 20 years. Perhaps between 1983 and 1989 the committee may have discussed it in the light of the state of compiler technology at the time but if they did their reasoning is lost in dim and distant past.
I disagree with Dave Vandervies.
Defining a nested function is much better coding style than defining it in global scope, making it static and adding a comment saying "This is a helper function used only by myfunc()".
What if you needed a helper function for this helper function? Would you add a comment "This is a helper function for the first helper function used only by myfunc"? Where do you take the names from needed for all those functions without polluting the namespace completely?
How confusing can code be written?
But of course, there is the problem with how to deal with closuring, i.e. returning a pointer to a function that has access to variables defined in the function from which it is returned.
Either you don't allow references to local variables of the containing function in the contained one, and the nesting is just a scoping feature without much use, or you do. If you do, it is not such a simple feature: you have to be able to call a nested function from another one while accessing the correct data, and you also have to take recursive calls into account. That's not impossible -- the techniques are well known and were well mastered when C was designed (Algol 60 already had the feature). But it complicates the run-time organization and the compiler, and prevents a simple mapping to assembly language (a function pointer must carry information about that; well, there are alternatives such as the one gcc uses). It was out of scope for the system implementation language C was designed to be.

Resources