Is there a difference between the following two approaches to defining a for-loop variable in C?
int i;
for (i = 0; i < X; i++) {
// something
}
And:
for (int i = 0; i < X; i++) {
// something
}
My preference is to use the second approach if the i is just always a throw-away, but is there any reason that it wouldn't be a good idea to do that?
Yes.
As the i variable is usually used only to count the number of iterations needed, it doesn't make sense to have a variable live outside the scope of the loop. That, if you can, should be avoided.
As some comments to the question mention, there are some cases in which you can't use the second, but that's not the general case.
As for the compiler later compiling to the same assembly, that may be true, but conceptually the second is cleaner, and for someone reading the code from outside, it makes it clear that the variable is never used again.
Hope this helps!
but is there any reason that it wouldn't be a good idea to do that?
In C, you should prefer the second form, because it reduces the scope of the variable and makes it more obvious where it is going to be used. Unless...
...it goes against the coding guidelines of a given project. For instance, the Linux kernel declares all variables at the top of a function.
...you want to be conforming to C90: you cannot use loop initial declarations.
In C++, however, objects may have very expensive constructors, which means that, at times, you may want to re-use them rather than initialize a new one every time (e.g. if you construct a new one within the body of the loop).
There are a number of differences between the two.
The second form is illegal in the 1989 ANSI (1990 ISO) C standard. It is supported in C from 1999 and in standard C++. It is also supported by some C compilers older than 1999, either as a non-standard or optional extension or because those compilers were actually C++ compilers with a C mode.
In the first form, i exists after the loop, so its value can still be accessed, but redefining it results in a diagnostic (compile time error). In the second form, i does not exist after the loop, so accessing its value after the loop gives a diagnostic, but i can be redefined.
In general terms, it is advisable to ensure that variables only exist for as long as needed, and cease to exist when no longer needed. The second form explicitly allows that.
Obviously, variables that need to exist outside the loop, need to be defined outside it. But, if the variable i is not needed outside the loop, then I would favour the second form. This allows the compilers to catch problems, such as unintended use of variable i after the loop.
Some older C and C++ compilers (mainly dating from before the 1998 C++ standard was ratified, but some from the early 2000s) implement the second form so the variable i still exists after the loop. This effectively makes the two forms equivalent when using those compilers.
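The scoping difference described above can be sketched as follows. This is only an illustration: the helper names first_negative and sum_twice are made up for this example, and the code is written as C++ (the same loop forms are valid in C99 and later).

```cpp
#include <cstddef>

// Hypothetical helper: returns the index of the first negative value,
// or n if there is none. The index is needed after the loop, so it has
// to be declared outside the for statement (the first form).
std::size_t first_negative(const int* values, std::size_t n) {
    std::size_t i;
    for (i = 0; i < n; i++) {
        if (values[i] < 0) {
            break;
        }
    }
    return i;  // i still exists here
}

// Hypothetical helper: two loops, each with its own throw-away i
// (the second form). The second i is a new, unrelated variable.
int sum_twice(const int* values, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; i++) {
        total += values[i];
    }
    for (std::size_t i = 0; i < n; i++) {
        total += values[i];
    }
    // Using i here would be a compile-time error: it is out of scope.
    return total;
}
```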
My assumption is that this is going to mess with checkers and stack analysis.
I can't prove my assumption and I don't think C99 will complain. C89 probably won't either, because the definition comes immediately after the opening curly brace:
if(true == condition){
int i = 0;
/* do stuff with i */
}else{
foo();
}
The two paths will lead to different stack usage.
Declaring i outside the if/else statement will lead to a more predictable stack usage (OK, I am branching to foo, so the stack will not be exactly the same in the two cases).
But Misra advises to limit the scope of a variable closest to its usage.
Am I overthinking it or could there be a rationale in my assumption?
Is declaring a variable inside an if statement in c a bad habit?
No.
A modern approach is to minimize the scope of the variables used, so that logical (hard-to-fix) and syntactical (easy-to-fix) errors are avoided.
Of course, there are people who still like to see all the variables defined at the topmost part of the code, because this was the convention in the past, as @Clifford commented.
BTW, your code should compile fine, both with C89 and C99.
This stack usage thought is the result of overthinking, and I suggest you follow the Ancient Hellenic phrase: Métron áriston.
The code is fine in any version of C (except C90 does not support true).
The two paths will lead to different stack usage.
This is mostly a myth. Modern compilers stack a variable if they can determine that it is needed, regardless of where you place the declaration.
If the variable is allocated in a register, then it will only be allocated when the program takes the path where your example declares the variable. This is not because of where the declaration is placed, but because that path will be executed. So again, for the sake of performance, it doesn't matter where the variable is declared, as long as it is somewhere in local scope and not at file scope.
It is good practice to limit the scope of variables as much as possible. But this is to avoid unintentional bugs and namespace collisions.
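As a rough illustration of the point above, the two functions below differ only in where i is declared. The names scaled_outer and scaled_inner are hypothetical. Both behave identically; whether the generated machine code is also identical depends on the compiler and its optimization settings, so treat that part as an expectation rather than a guarantee.

```cpp
// Hypothetical example: i is declared at the top of the function.
int scaled_outer(bool condition, int x) {
    int i = 0;
    if (condition) {
        i = x * 2;
        return i + 1;
    }
    return x;
}

// The same logic with i declared inside the if block, where it is used.
int scaled_inner(bool condition, int x) {
    if (condition) {
        int i = x * 2;
        return i + 1;
    }
    return x;
}
```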
But Misra advises to limit the scope of a variable closest to its usage.
No it doesn't, but some static analysers require you to do that, on top of the MISRA requirement. Both MISRA-C:2004 8.7 and MISRA-C:2012 8.9 only require that you place a variable at block scope, if it is only used by one function. That's it.
MISRA does however say:
Within a function, whether objects are defined at the outermost or innermost block is largely a matter of style
This is not a lambda function question, I know that I can assign a lambda to a variable.
What's the point of allowing us to declare, but not define a function inside code?
For example:
#include <iostream>
int main()
{
// This is illegal
// int one(int bar) { return 13 + bar; }
// This is legal, but why would I want this?
int two(int bar);
// This gets the job done but man it's complicated
class three{
int m_iBar;
public:
three(int bar):m_iBar(13 + bar){}
operator int(){return m_iBar;}
};
std::cout << three(42) << '\n';
return 0;
}
So what I want to know is why would C++ allow two which seems useless, and three which seems far more complicated, but disallow one?
EDIT:
From the answers it seems that an in-code declaration may be able to prevent namespace pollution. What I was hoping to hear, though, is why the ability to declare functions has been allowed but the ability to define functions has been disallowed.
It is not obvious why one is not allowed; nested functions were proposed a long time ago in N0295 which says:
We discuss the introduction of nested functions into C++. Nested
functions are well understood and their introduction requires little
effort from either compiler vendors, programmers, or the committee.
Nested functions offer significant advantages, [...]
Obviously this proposal was rejected, but since we don't have meeting minutes available online for 1993 we don't have a possible source for the rationale for this rejection.
In fact this proposal is noted in Lambda expressions and closures for C++ as a possible alternative:
One article [Bre88] and proposal N0295 to the C++ committee [SH93]
suggest adding nested functions to C++. Nested functions are similar
to lambda expressions, but are defined as statements within a function
body, and the resulting closure cannot be used unless that function is
active. These proposals also do not include adding a new type for each
lambda expression, but instead implementing them more like normal
functions, including allowing a special kind of function pointer to
refer to them. Both of these proposals predate the addition of
templates to C++, and so do not mention the use of nested functions in
combination with generic algorithms. Also, these proposals have no way
to copy local variables into a closure, and so the nested functions
they produce are completely unusable outside their enclosing function
Considering we do now have lambdas we are unlikely to see nested functions since, as the paper outlines, they are alternatives for the same problem and nested functions have several limitations relative to lambdas.
As for this part of your question:
// This is legal, but why would I want this?
int two(int bar);
There are cases where this would be a useful way to call the function you want. The draft C++ standard section 3.4.1 [basic.lookup.unqual] gives us one interesting example:
namespace NS {
class T { };
void f(T);
void g(T, int);
}
NS::T parm;
void g(NS::T, float);
int main() {
f(parm); // OK: calls NS::f
extern void g(NS::T, float);
g(parm, 1); // OK: calls g(NS::T, float)
}
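One way to observe the effect of the block-scope declaration in the example above is to give each overload a distinct return value. The sketch below is a variation on the standard's example, not the standard's own text: the int return values and the two call_* wrapper functions are assumptions added for illustration.

```cpp
namespace NS {
    class T { };
    // Would be the better match for g(parm, 1) if it were considered.
    int g(T, int) { return 1; }
}

int g(NS::T, float) { return 2; }

// With a block-scope declaration of g, argument-dependent lookup is
// suppressed, so NS::g is never considered.
int call_with_local_declaration() {
    NS::T parm;
    extern int g(NS::T, float);
    return g(parm, 1);  // calls ::g(NS::T, float)
}

// Without the local declaration, argument-dependent lookup also finds
// NS::g(T, int), which is an exact match and wins overload resolution.
int call_without_local_declaration() {
    NS::T parm;
    return g(parm, 1);  // calls NS::g(NS::T, int)
}
```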
Well, the answer is "historical reasons". In C you could have function declarations at block scope, and the C++ designers did not see the benefit in removing that option.
An example usage would be:
#include <iostream>
int main()
{
int func();
func();
}
int func()
{
std::cout << "Hello\n";
return 0; // a non-void function must return a value
}
IMO this is a bad idea because it is easy to make a mistake by providing a declaration that does not match the function's real definition, leading to undefined behaviour which will not be diagnosed by the compiler.
In the example you give, int two(int bar) is being declared as an external function, with that declaration only being valid within the scope of the main function.
That's reasonable if you only wish to make the name two available within main() so as to avoid polluting the global namespace within the current compilation unit.
Example in response to comments:
main.cpp:
int main() {
int foo();
return foo();
}
foo.cpp:
int foo() {
return 0;
}
No need for header files. Compile and link with
c++ main.cpp foo.cpp
It'll compile and run, and the program will return 0 as expected.
You can do these things, largely because they're actually not all that difficult to do.
From the viewpoint of the compiler, having a function declaration inside another function is pretty trivial to implement. The compiler needs a mechanism to allow declarations inside of functions to handle other declarations (e.g., int x;) inside a function anyway.
It will typically have a general mechanism for parsing a declaration. For the guy writing the compiler, it doesn't really matter at all whether that mechanism is invoked when parsing code inside or outside of another function--it's just a declaration, so when it sees enough to know that what's there is a declaration, it invokes the part of the compiler that deals with declarations.
In fact, prohibiting these particular declarations inside a function would probably add extra complexity, because the compiler would then need an entirely gratuitous check to see if it's already looking at code inside a function definition and based on that decide whether to allow or prohibit this particular declaration.
That leaves the question of how a nested function is different. A nested function is different because of how it affects code generation. In languages that allow nested functions (e.g., Pascal) you normally expect that the code in the nested function has direct access to the variables of the function in which it's nested. For example:
int foo() {
int x;
int bar() {
x = 1; // Should assign to the `x` defined in `foo`.
}
}
Without local functions, the code to access local variables is fairly simple. In a typical implementation, when execution enters the function, some block of space for local variables is allocated on the stack. All the local variables are allocated in that single block, and each variable is treated as simply an offset from the beginning (or end) of the block. For example, let's consider a function something like this:
int f() {
int x;
int y;
x = 1;
y = x;
return y;
}
A compiler (assuming it didn't optimize away the extra code) might generate code for this roughly equivalent to this:
stack_pointer -= 2 * sizeof(int); // allocate space for local variables
x_offset = 0;
y_offset = sizeof(int);
stack_pointer[x_offset] = 1; // x = 1;
stack_pointer[y_offset] = stack_pointer[x_offset]; // y = x;
return_location = stack_pointer[y_offset]; // return y;
stack_pointer += 2 * sizeof(int);
In particular, it has one location pointing to the beginning of the block of local variables, and all access to the local variables is as offsets from that location.
With nested functions, that's no longer the case--instead, a function has access not only to its own local variables, but to the variables local to all the functions in which it's nested. Instead of just having one "stack_pointer" from which it computes an offset, it needs to walk back up the stack to find the stack_pointers local to the functions in which it's nested.
Now, in a trivial case that's not all that terrible either--if bar is nested inside of foo, then bar can just look up the stack at the previous stack pointer to access foo's variables. Right?
Wrong! Well, there are cases where this can be true, but it's not necessarily the case. In particular, bar could be recursive, in which case a given invocation of bar might have to look some nearly arbitrary number of levels back up the stack to find the variables of the surrounding function. Generally speaking, you need to do one of two things: either you put some extra data on the stack, so it can search back up the stack at run-time to find its surrounding function's stack frame, or else you effectively pass a pointer to the surrounding function's stack frame as a hidden parameter to the nested function. Oh, but there's not necessarily just one surrounding function either--if you can nest functions, you can probably nest them (more or less) arbitrarily deep, so you need to be ready to pass an arbitrary number of hidden parameters. That means you typically end up with something like a linked list of stack frames to surrounding functions, and access to variables of surrounding functions is done by walking that linked list to find its stack pointer, then accessing an offset from that stack pointer.
That, however, means that access to a "local" variable may not be a trivial matter. Finding the correct stack frame to access the variable can be non-trivial, so access to variables of surrounding functions is also (at least usually) slower than access to truly local variables. And, of course, the compiler has to generate code to find the right stack frames, access variables via any of an arbitrary number of stack frames, and so on.
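The "hidden parameter" approach described above can be imitated by hand in ordinary C++. This is only a sketch of the idea, not how any particular compiler implements nested functions: the OuterFrame struct stands in for the enclosing function's stack frame, and the names outer and inner are hypothetical.

```cpp
// Hypothetical stand-in for the local variables (stack frame) of outer().
struct OuterFrame {
    int x;
};

// The "nested" function, written as a free function. The frame pointer
// plays the role of the hidden parameter that links it to outer().
void inner(OuterFrame* enclosing, int depth) {
    if (depth > 0) {
        // A recursive call passes the same link along, so every level
        // can still reach outer's variables directly.
        inner(enclosing, depth - 1);
    } else {
        enclosing->x = 1;  // assigns to the x that belongs to outer()
    }
}

int outer(int depth) {
    OuterFrame frame = {0};
    inner(&frame, depth);
    return frame.x;
}
```

With deeper nesting, each frame would also need a link to its own enclosing frame, which is where the linked list of frames mentioned above comes from.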
This is the complexity that C was avoiding by prohibiting nested functions. Now, it's certainly true that a current C++ compiler is a rather different sort of beast from a 1970's vintage C compiler. With things like multiple, virtual inheritance, a C++ compiler has to deal with things of this same general nature in any case (i.e., finding the location of a base-class variable in such cases can be non-trivial as well). On a percentage basis, supporting nested functions wouldn't add much complexity to a current C++ compiler (and GCC already supports them as an extension, though only in its C mode).
At the same time, it rarely adds much utility either. In particular, if you want to define something that acts like a function inside of a function, you can use a lambda expression. What this actually creates is an object (i.e., an instance of some class) that overloads the function call operator (operator()) but it still gives function-like capabilities. It makes capturing (or not) data from the surrounding context more explicit though, which allows it to use existing mechanisms rather than inventing a whole new mechanism and set of rules for its use.
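For comparison, here is a minimal sketch of the earlier foo/bar example rewritten with a lambda (this needs C++11 or later). The capture list makes it explicit that bar refers to the x that belongs to foo.

```cpp
int foo() {
    int x = 0;
    // bar is an object with an overloaded operator(), not a nested
    // function. [&x] captures foo's x by reference.
    auto bar = [&x]() {
        x = 1;  // assigns to the x defined in foo
    };
    bar();
    return x;
}
```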
Bottom line: even though it might initially seem like nested declarations are hard and nested functions are trivial, more or less the opposite is true: nested functions are actually much more complex to support than nested declarations.
The first one is a function definition, and it is not allowed. Obviously so: what is the use of putting a definition of a function inside another function?
But the other two are just declarations. Imagine you need to use the int two(int bar); function inside main(), but it is defined below main(). The function declaration inside the function lets you call it before its definition has been seen.
The same applies to the third. A class definition inside the function allows you to use a class inside the function without providing a header or other reference.
int main()
{
// This is legal, but why would I want this?
int two(int bar);
//Call two
int x = two(7);
class three {
int m_iBar;
public:
three(int bar):m_iBar(13 + bar) {}
operator int() {return m_iBar;}
};
//Use class
three threeObj(7); // three has no default constructor, so pass an argument
return 0;
}
This language feature was inherited from C, where it served some purpose in C's early days (function declaration scoping maybe?).
I don't know if this feature is used much by modern C programmers and I sincerely doubt it.
So, to sum up the answer:
there is no purpose for this feature in modern C++ (that I know of, at least), it is here because of C++-to-C backward compatibility (I suppose :) ).
Thanks to the comment below:
Function prototype is scoped to the function it is declared in, so one can have a tidier global namespace - by referring to external functions/symbols without #include.
Actually, there is one use case which is conceivably useful. If you want to make sure that a certain function is called (and your code compiles), no matter what the surrounding code declares, you can open your own block and declare the function prototype in it. (The inspiration is originally from Johannes Schaub, https://stackoverflow.com/a/929902/3150802, via TeKa, https://stackoverflow.com/a/8821992/3150802).
This may be particularly useful if you have to include headers which you don't control, or if you have a multi-line macro which may be used in unknown code.
The key is that a local declaration supersedes previous declarations in the innermost enclosing block. While that can introduce subtle bugs (and, I think, is forbidden in C#), it can be used consciously. Consider:
// somebody's header
void f();
// your code
{ int i;
int f(); // your different f()!
i = f();
// ...
}
Linking may be interesting because chances are the headers belong to a library, but I guess you can adjust the linker arguments so that f() is resolved to your function by the time that library is considered. Or you tell it to ignore duplicate symbols. Or you don't link against the library.
This is not an answer to the OP question, but rather a reply to several comments.
I disagree with these points in the comments and answers: 1 that nested declarations are allegedly harmless, and 2 that nested definitions are useless.
1 The prime counterexample for the alleged harmlessness of nested function declarations is the infamous Most Vexing Parse. IMO the spread of confusion caused by it is enough to warrant an extra rule forbidding nested declarations.
2 The 1st counterexample to the alleged uselessness of nested function definitions is frequent need to perform the same operation in several places inside exactly one function. There is an obvious workaround for this:
private:
inline void bar(int abc)
{
// Do the repeating operation
}
public:
void foo()
{
int a, b, c;
bar(a);
bar(b);
bar(c);
}
However, this solution often enough contaminates the class definition with numerous private functions, each of which is used in exactly one caller. A nested function definition would be much cleaner.
Specifically answering this question:
From the answers it seems that an in-code declaration may be able to prevent namespace pollution. What I was hoping to hear, though, is why the ability to declare functions has been allowed but the ability to define functions has been disallowed.
Because consider this code:
int main()
{
int foo() {
// Do something
return 0;
}
return 0;
}
Questions for language designers:
Should foo() be available to other functions?
If so, what should be its name? int main(void)::foo()?
(Note that 2 would not be possible in C, the originator of C++)
If we want a local function, we already have a way - make it a static member of a locally-defined class. So should we add another syntactic method of achieving the same result? Why do that? Wouldn't it increase the maintenance burden of C++ compiler developers?
And so on...
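The "static member of a locally-defined class" approach mentioned above could look like the following sketch. The names compute, Local and add13 are hypothetical; the body of add13 mirrors the one function from the question.

```cpp
int compute() {
    // A class defined inside the function. Its static member function
    // acts as a function that is only visible inside compute().
    struct Local {
        static int add13(int bar) { return 13 + bar; }
    };
    return Local::add13(42);
}
```

Unlike a lambda with captures, add13 cannot access the local variables of compute(); anything it needs has to be passed in as a parameter.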
Just wanted to point out that GCC allows you to define functions inside functions, as an extension in GNU C (the extension is not available in GNU C++); see the "Nested Functions" section of the GCC manual. Also, with the introduction of lambdas to C++, this question is a bit obsolete now.
The ability to declare function headers inside other functions, I found useful in the following case:
#include <cstdlib> // std::abs
void do_something(int&);
int main() {
int my_number = 10 * 10 * 10;
do_something(my_number);
return 0;
}
void do_something(int& num) {
void do_something_helper(int&); // declare helper here
do_something_helper(num);
// Do something else
}
void do_something_helper(int& num) {
num += std::abs(num - 1337);
}
What do we have here? Basically, you have a function that is supposed to be called from main, so what you do is that you forward declare it like normal. But then you realize, this function also needs another function to help it with what it's doing. So rather than declaring that helper function above main, you declare it inside the function that needs it and then it can be called from that function and that function only.
My point is, declaring function headers inside functions can be an indirect method of function encapsulation, which allows a function to hide some parts of what it's doing by delegating to some other function that only it is aware of, almost giving an illusion of a nested function.
Nested function declarations are allowed probably for
1. Forward references
2. To be able to declare a pointer to function(s) and pass around other function(s) in a limited scope.
Nested function definitions are not allowed probably due to issues like
1. Optimization
2. Recursion (enclosing and nested defined function(s))
3. Re-entrancy
4. Concurrency and other multithread access issues.
From my limited understanding :)
Let's assume that this applies not only to Visual Studio but also to C99, C11, etc.
There are two different ways of declaring the variable "i" in a for statement.
1)
int i;
for(i = 0 ; i < index ; ++i)
2)
for(int i = 0 ; i < index ; ++i)
Both work the same, but I think there must be some difference between them.
Do you have any idea about that?
If so, please let me know.
I am just wondering about your opinion, and how they work differently.
Sorry: regarding the answers, I know that the scope of "i" is different.
Is there any difference from the system's point of view (memory, etc.)? Does the compiler work differently, or is the generated assembly different, or something like that?
The only difference is that in the first case, the variable i is outside the scope of the for loop, so you could use it later on. There are no differences in terms of efficiency.
If you use i only once, then definitely the 2nd case is better:
for(int i = 0 ; i < index ; ++i)
If you have several loops that use the index i, then it might make sense to declare it outside all the loops.
But generally, the rule is to limit the scope of the variable - so the 2nd case is better. It's usually safer to limit the scope of the variable.
It's worth noting that the 2nd syntax only works with C99 or newer, such as C11 (it did not work with old C89). So some compilers will complain if you declare the variable inside the loop header. For example, older versions of gcc, which default to a C89-based dialect, require the explicit flag -std=c99 to allow that syntax.
The scope and lifetime of i are different.
In the second example it is just inside the loop body. In the first, it extends beyond.
Apart from that, they are equal.
Declaring a new variable in the initialization of the for loop is a C99 feature.
C89 requires that variables be declared at the beginning of a block.
Semantically, declaring variables in the initialization portion of the loop would limit the variables' scope to the body of the loop.
Limiting the scope is often desired to avoid misuse of variables after the body of the for loop has executed. For example, if you are doing a simple iteration, you may not want your index to exist after the for loop.
There is no right answer on which to use. The question becomes what you want your scope to be, and what compilers/language versions you intended to support.
In C89 the correct way is No. 1), because it requires that variables be declared at the start of a block. It looks like your compiler supports several standards, so it accepts either construction, and both result in the same behavior. My personal preference in this case is 2), because it reduces the scope of the variable i and also guards against use of an uninitialized value (less risky).
I'm discussing with a friend what's the correct way to declare some variables in C, exactly in the for loop.
He has a compiler I can't remember and I have Dev-C++.
He does:
for (int i = 0; i<10; i++)
// ... and it works
I do:
int i;
for (i = 0; i<10; i++)
// ... and it works
If I do it like he does, Dev-C++ gives me an error. What's the technically correct way to do this? I was taught to do it the way I do but now I'm confused because he does it in the other way and it works for him D:
Declaring the variable in the loop, like your friend does, is supported in C99 and in C++. It is likely that your friend is coming from a C++ background, where such style of declaration is the norm. Declaring the loop variable outside the loop, like you do, is correct in older C, such as C89, which is what your compiler apparently supports.
If you have access to a C99 compiler, which style to choose is mostly a matter of preference. Seasoned C programmers don't mind declaring variables outside loop bodies, but it is considered slightly cleaner to declare them inside because it restricts the scope of the variable to the least possible lexical region. Declaring the variable outside the loop body is, of course, necessary if you plan to use it after the loop is done — for example, to inspect how far the loop has progressed.
It depends on which version of C you're using. Older C (K&R C and ANSI C89/C90) only supports declarations at the beginning of a block, while modern C (C99 and later) and any flavour of C++ allow mixing statements and declarations.
{
int a;
printf ("Stuff");
int b; /* not allowed */
}
Declaring a variable inside the for-loop header causes an error in any C compiler that only supports standards older than C99. If you compile this with the C99 standard or newer, it will work just fine.
In principle, the two forms could differ in performance: a variable defined within the parentheses after for may be more likely to be kept in a register only. On the other hand, many other factors determine the outcome of optimization, including the compiler's own analysis, so the final result may be no different, or may even be the opposite.
There is one definite difference: if you define a variable within the parentheses after 'for', that variable cannot be used outside that for loop.