I'm currently studying computer science in Germany but did work on several C/C++ opensource projects before.
Today we kind of started with C at school and my teacher said it would be a no go to modify a for loop variable inside the loop, which I absolutely agree with. However, I often use a for loop without the last incrementing part and then modify it only inside the loop, which he also did not like.
So basically it comes down to
for(int i=0; i<100;) {
[conditionally modify i]
}
vs
int i=0;
while(i<100) {
[conditionally modify i]
}
I know that they are essentially the same after compile, but I don't like using a while loop because:
It's common practice to limit variables to smallest possible scope
It can introduce bugs if you reuse i (which you have to because of larger scope)
You can not use a different type for i in a later loop without using a different name
Are there any style guides or common practices that say which one to choose?
If you answer with "I like while/for more" at least provide a reason why you do so.
I did already search for similar questions, however, I could not find any real answer to this case.
Style guides differ between different people / teams.
The C standard describes the syntax of for like this:
for ( clause-1 ; expression-2 ; expression-3 ) statement
and it's common practice to use for as soon as you have a valid use for "clause-1", and the reason to do so is indeed because of the limited scope:
[...] If clause-1 is a
declaration, the scope of any identifiers it declares is the remainder of the declaration and
the entire loop, including the other two expressions; [...]
So, your argumentation is fine, and you could try to convince your teacher. But don't try too hard. After all, questions of style rarely have one definitive answer, and if your teacher insists on his rules, just follow them when coding for that class.
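A minimal sketch of the scoping argument, assuming a hypothetical helper count_runs that is not part of the question. The index is declared in clause-1, expression-3 is left empty, and the body advances the index by a variable amount:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts runs of equal adjacent values in a[0..n). */
size_t count_runs(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    size_t runs = 0;

    /* clause-1 declares i; expression-3 is empty because the body
       moves i forward by one whole run at a time. */
    for (size_t i = 0; i < n; ) {
        size_t j = i + 1;
        while (j < n && a[j] == a[i])
            j++;
        runs++;
        i = j;  /* the only place where i changes */
    }

    /* i is out of scope here, so a later loop could declare its own i,
       even with a different type. */
    return runs;
}
```

For the array {1, 1, 2, 2, 2, 3} this counts three runs.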
There are some common practices which programmers follow while taking the decision on which loop to use - for or while!
The for loop seems most appropriate when the number of iterations is known in advance. For example -
/*N - size of array*/
for (i = 0; i < N; i++) {
...
}
But there are many problems where the number of iterations depends on a condition and can't be predicted beforehand; a while loop is preferable in those situations.
For example -
while( fgets( buf, MAXBUF, stdin ) != NULL)
However, both for and while loops are entry-controlled loops and are interchangeable. It's completely up to the programmer to decide which one to use.
If you are modifying the loop counter inside the body, I would recommend not using a for loop, because it is more likely to confuse the reader than a while loop.
You can work around the lack of scope limitation with an additional block:
{
int i = 0;
while (i < 100) {
[conditionally modify i]
}
}
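As a rough illustration of that workaround, here is a sketch with two consecutive while loops that both use the name i, each inside its own block. The function name and its job are made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the sum of a[0..n) and stores the number
   of zero elements in *zeros. Each loop variable lives in its own block. */
long sum_and_count_zeros(const int *a, size_t n, size_t *zeros)
{
    assert(a != NULL && zeros != NULL);
    long sum = 0;
    *zeros = 0;

    {
        size_t i = 0;          /* visible only inside this block */
        while (i < n) {
            if (a[i] == 0)
                (*zeros)++;
            i++;
        }
    }

    {
        const int *i = a;      /* same name, different type, no clash */
        while (i < a + n) {
            sum += *i;
            i++;
        }
    }

    return sum;
}
```

The extra braces cost one indentation level, which is the price compared to declaring the variable in the for header.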
for(int i=0; i<100;) {
[conditionally modify i]
}
Because this looks confusing; it is not the standard way to write a for loop. Also, conditionally modifying i is dangerous, and you don't want to do that. Someone reading your code would have problems understanding how you increment, why, when, etc. You want to keep your code clean and easy to understand.
I would personally never write a for loop that conditionally modifies its iterator. If you need that, you can use an additional counter or something like that. But you shouldn't control flow by conditionally changing the iterator; some people even avoid break and continue.
Related
How can I use the continue keyword to skip n iterations of a loop? In shell we can do continue 2 to skip 2 iterations.
How is it possible to achieve this in C?
You cannot do it using continue, but in a for loop you can skip n iterations by incrementing the loop variable.
for (int i = 0; i < 10; i++)
{
if (some_condition)
i += nSkip;
(...)
}
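If you would rather not touch the loop variable, one possible alternative is a separate counter that says how many iterations are still to be skipped. This is only a sketch; sum_with_skips and its rule (skip after a negative value) are invented for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums a[0..n), but whenever it meets a negative
   value it skips that element and the next n_skip elements. */
long sum_with_skips(const int *a, size_t n, size_t n_skip)
{
    assert(a != NULL || n == 0);
    long sum = 0;
    size_t to_skip = 0;    /* iterations still to be skipped */

    for (size_t i = 0; i < n; i++) {
        if (to_skip > 0) {
            to_skip--;
            continue;      /* i itself is only changed by the for header */
        }
        if (a[i] < 0) {
            to_skip = n_skip;
            continue;
        }
        sum += a[i];
    }
    return sum;
}
```

With {1, -1, 5, 6, 2} and n_skip = 2, the elements 5 and 6 are skipped and the result is 3.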
This is at its root a programming style question, and those are always prone to opinionated debate. Reader beware. :-)
C does not have break(n) and continue(n). This was a deliberate choice. It was felt that these constructs are too difficult to maintain. It's hard for a later programmer to count and keep track of the nested loops. It's too easy for a later programmer to insert or delete a nesting level, throwing off the counts.
Veering down to the root of one of the biggest style debates there is, there are those who say that goto is evil and should never be used. There are also those who say that break and continue are just goto's in disguise and that they should never be used, either. Personally, I don't agree with either of those positions, but I do agree that break(n) and continue(n) have little to no value over pure goto; they're at least as confusing and prone to error. So if you find yourself needing break(n) or continue(n) in C, and there's no other way around it, just bite the bullet and use a goto. The fact that you needed break(n) or continue(n) proves that you're doing something irretrievably ugly, so a goto won't make it any worse.
(Now, it's true, with that said, it's easy to replace break(n) with goto out;, but it's not nearly so easy, in general, to replace continue(n) with a goto. So you'll probably have to do something else, and it'll probably be ugly, but again, by the time you get here, you're doomed to that anyway.)
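For the easy case, here is a sketch of what replacing break(2) with goto out; can look like. The function find_cell and its parameters are hypothetical, chosen only to have two nested loops to leave:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: finds the first cell equal to target in a
   rows x cols matrix stored row by row. Returns 1 and sets *r and *c
   if the value is found, otherwise returns 0. */
int find_cell(const int *m, size_t rows, size_t cols, int target,
              size_t *r, size_t *c)
{
    assert(m != NULL && r != NULL && c != NULL);
    int found = 0;

    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            if (m[i * cols + j] == target) {
                *r = i;
                *c = j;
                found = 1;
                goto out;   /* leaves both loops at once */
            }
        }
    }
out:
    return found;
}
```

Because the label has a name, adding or removing a nesting level later does not change where the jump lands, which is the maintenance problem a numeric break(n) would have.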
See also this question and its answers, although there the discussion is specifically about break(n), not continue(n).
for (int i = 0; i < 10; i++)
{
if (n<=2)
continue;
else
<do your task>
}
Edit: If you fundamentally disagree with the Fedora guide here, please explain why this approach would be worse in an objective way than classic loops. As far as I know even the CERT standard doesn't make any statement on using index variables over pointers.
I'm currently reading the Fedora Defensive Coding Guide and it suggests the following:
Always keep track of the size of the array you are working with.
Often, code is more obviously correct when you keep a pointer past the
last element of the array, and calculate the number of remaining
elements by subtracting the current position from that pointer. The
alternative, updating a separate variable every time when the position
is advanced, is usually less obviously correct.
This means for a given array
int numbers[] = {1, 2, 3, 4, 5};
I should not use the classic
size_t length = 5;
for (size_t i = 0; i < length; ++i) {
printf("%d ", numbers[i]);
}
but instead this:
int *end = numbers + 5;
for (int *start = numbers; start < end; ++start) {
printf("%d ", *start);
}
or this:
int *start = numbers;
int *end = numbers + 5;
while (start < end) {
printf("%d ", *start++);
}
Is my understanding of the recommendation correct?
Is my implementation correct?
Which of the last 2 is safer?
Your understanding of what the text recommends is correct, as is your implementation. But regarding the basis of the recommendation, I think you are confusing safe with correct.
It's not that using a pointer is safer than using an index. The argument is that, in reasoning about the code, it is easier to decide that the logic is correct when using pointers. Safety is about failure modes: what happens if the code is incorrect (references a location outside the array). Correctness is more fundamental: that the algorithm provably does what it sets out to do. We might say that correct code doesn't need safety.
The recommendation might have been influenced by Andrew Koenig's series in Dr. Dobb's a couple of years ago, How C Makes It Hard To Check Array Bounds. Koenig says,
In addition to being faster in many cases, pointers have another big advantage over arrays: A pointer to an array element is a single value that is enough to identify that element uniquely. [...] Without pointers, we need three parameters to identify the range: the array and two indices. By using pointers, we can get by with only two parameters.
In C, referencing a location outside the array, whether via pointer or index, is equally unsafe. The compiler will not catch you out (absent use of extensions to the standard). Koenig is arguing that with fewer balls in the air, you have a better shot at getting the logic right.
The more complicated the construction, the more obvious it is that he's right. If you want a better illustration of the difference, write strcat(3) both ways. Using indexes, you have two names and two indexes inside the loop. It's possible to use the index for one with the name for the other. Using pointers, that's impossible. All you have are two pointers.
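Taking up that suggestion, here is a sketch of both versions. The names my_strcat_idx and my_strcat_ptr are made up to avoid clashing with the library function, and like strcat they assume dst has room for the result:

```c
#include <assert.h>
#include <stddef.h>

/* Index version: two array names (dst, src) and two indexes (d, s).
   Writing dst[s] or src[d] by mistake would still compile. */
char *my_strcat_idx(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t d = 0, s = 0;

    while (dst[d] != '\0')      /* find the end of dst */
        d++;
    while (src[s] != '\0') {    /* copy src behind it */
        dst[d] = src[s];
        d++;
        s++;
    }
    dst[d] = '\0';
    return dst;
}

/* Pointer version: only two moving pointers, nothing to mix up. */
char *my_strcat_ptr(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    char *p = dst;

    while (*p != '\0')          /* find the end of dst */
        p++;
    while (*src != '\0')        /* copy src behind it */
        *p++ = *src++;
    *p = '\0';
    return dst;
}
```

Both behave the same; the difference is only in how many names the reader has to keep apart inside the loops.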
Is my understanding of the recommendation correct?
Is my implementation correct?
Yes, so it seems.
The method for(type_t* start = array; start != end; start++) is sometimes used when you have arrays of more complex items. It is mostly a matter of style.
This style is sometimes used when you already have the start and end pointers available for some reason. Or in cases where you aren't really interested in the size, but just want to repeatedly compare against the end of the array. For example, suppose you have a ring buffer ADT with a start pointer and an end pointer and want to iterate through all items.
This way of doing loops is actually the very reason why C explicitly allows pointers to point 1 item out-of-bounds of an array: you can set an end pointer to one item past the array without invoking undefined behavior (as long as that item isn't de-referenced).
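A small sketch of the one-past-the-end idiom, with hypothetical helpers sum_range and remaining. The end pointer is only compared and subtracted, never dereferenced:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums the half-open range [begin, end). */
long sum_range(const int *begin, const int *end)
{
    assert(begin <= end);
    long sum = 0;
    for (const int *p = begin; p != end; ++p)
        sum += *p;
    return sum;
}

/* Number of elements left between the current position and end. */
size_t remaining(const int *pos, const int *end)
{
    assert(pos <= end);
    return (size_t)(end - pos);
}
```

For int numbers[] = {1, 2, 3, 4, 5}, the end pointer is numbers + 5, so sum_range(numbers, numbers + 5) gives 15 and remaining(numbers + 2, numbers + 5) gives 3.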
(It is the very same method as used by STL iterators in C++, although there's more of a rationale in C++, since it has operator overloading. For example iterator++ in C++ doesn't necessarily give an item adjacently allocated in the next memory cell. For example, iterators could be used for iterating through a linked list ADT, where the ++ would translate to node->next behind the scenes.)
However, to claim that this form is always the preferred one is just subjective nonsense. Particularly when you have an array of integers and know the size. Your first example is the most readable form of a loop in C and therefore always preferred whenever possible.
On some compilers/systems, the first form could also give faster code than the second form. Pointer arithmetic might give slower code on some systems. (And I suppose that the first form might give faster data cache access on some systems, though I'd have to verify that assumption with some compiler guru.)
Which of the last 2 is safer?
Neither form is safer than the other. To claim otherwise would be subjective opinion. The statement "...is usually less obviously correct" is nonsense.
Which style to pick varies from case to case.
Overall, those "Fedora" guidelines you link seem to contain lots of questionable code, questionable rules and blatant opinions. Seems more like someone wanted to show off various C tricks than a serious attempt to write a coding standard. Overall, it smells like the "Linux kernel guidelines", which I would not recommend reading either.
If you want a serious coding standard for/by professionals, use CERT-C or MISRA-C.
I'm writing a for loop for a variable start whose value has already been calculated elsewhere in the program.
Doing for(start; start<end; start++) gives a warning, and
for(start=start; start<end; start++) seems like an unnecessary assignment.
The other option I can think of would be the following--is this okay, or would you classify it as poor coding style?
for(; start<end; start++){
//do stuff
}
That's not poor coding style IMHO, but perhaps you want to use a while instead?
while (start < end)
{
//do stuff
++start;
}
It's just a matter of taste, really.
Leaving any part of the for loop out is OK. Leaving all parts out is OK too - in fact, it's the idiomatic way of expressing an infinite loop as shown in the K&R book.
You should carefully consider your other options though; it is possible that a while or a do / while loop presents a more readable alternative.
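Two short sketches of leaving parts out, using hypothetical helpers: one omits only clause-1 because the starting value already exists, the other omits everything and exits with break:

```c
#include <assert.h>
#include <stddef.h>

/* clause-1 omitted: start was computed by the caller. */
long sum_from(const int *start, const int *end)
{
    assert(start <= end);
    long sum = 0;
    for (; start < end; start++)
        sum += *start;
    return sum;
}

/* All three parts omitted: loops until the break is reached.
   Counts the Collatz steps needed to get from n down to 1. */
unsigned collatz_steps(unsigned long n)
{
    assert(n > 0);
    unsigned steps = 0;
    for (;;) {
        if (n == 1)
            break;
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}
```

The second function could just as well be written as while (n != 1), which is the kind of alternative worth weighing for readability.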
Yes, you can do this. However, I would suggest assigning start into a variable like i or something, because now you are changing the value of start as you iterate.
You can write your code as the following for the desired result:
while(start++<end){
// your code
}
You get to take advantage of the fact that adding "++" after the variable name increments it after it is checked in the condition statement.
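One thing to watch for with this form: the increment happens right after the comparison, so inside the body start is already one past the value that was tested, and after the loop it ends up at end + 1 rather than end. A sketch with integer bounds (the function names are invented) that shows the adjustment needed to get the same result:

```c
#include <assert.h>

/* Sums the integers in [start, end) with the increment in the for header. */
long sum_for(int start, int end)
{
    assert(start <= end);
    long sum = 0;
    for (; start < end; start++)
        sum += start;          /* body sees the value that was tested */
    return sum;
}

/* Same range with the increment folded into the condition. */
long sum_postinc(int start, int end)
{
    assert(start <= end);
    long sum = 0;
    while (start++ < end)
        sum += start - 1;      /* start is already one past the tested value */
    return sum;
}
```

Both return 9 for the range [2, 5), but only because the second version compensates with start - 1.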
Generally, it is good practice to avoid GOTOs. Keeping that in mind I've been having a debate with a coworker over this topic.
Consider the following code:
Line:
while( <> ) {
next Line if (insert logic);
}
Does using a loop label count as a goto?
Here is what perlsyn in perldoc has to say:
Here's how a C programmer might code up a particular algorithm in Perl:
for (my $i = 0; $i < @ary1; $i++) {
for (my $j = 0; $j < @ary2; $j++) {
if ($ary1[$i] > $ary2[$j]) {
last; # can't go to outer :-(
}
$ary1[$i] += $ary2[$j];
}
# this is where that last takes me
}
Whereas here's how a Perl programmer more comfortable with the idiom might do it:
OUTER: for my $wid (@ary1) {
INNER: for my $jet (@ary2) {
next OUTER if $wid > $jet;
$wid += $jet;
}
}
My take on this is no because you are explicitly telling a loop to short circuit and advance however my coworker disagrees, says that it is just a fancy GOTO and should be avoided. I'm looking for either a compelling argument or documentation that explains why this is or is not a GOTO. I'll also accept an explanation for why this is or is not considered good practice in perl.
Dijkstra's intent was never that anything resembling goto is to be considered harmful. It was that code where gotos are used as the main construct for almost any kind of program flow change will result in what we today call spaghetti code.
You should read the original article and keep in mind that it was written in 1968 when labeled jumps were the main flow control constructs in just about all programming languages.
https://www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF
The danger of GOTO labels is that they create spaghetti code and make the logic unreadable. Neither of those will happen in this case. There is a lot of validity in using GOTO statements, much of the defense coming from Donald Knuth [article].
Delving into the differences between your C and Perl example... If you consider what is happening at the assembly level with your C programs, it all compiles down to GOTOs anyway. And if you've done any MIPS or other assembly programming, then you've seen that most of those languages don't have any looping constructs, only conditional and unconditional branches.
In the end it comes down to readability and understandability. Both of which are helped an enormous amount by being consistent. If your company has a style guide, follow that, otherwise following the perl style guide sounds like a good idea to me. That way when other perl developers join your team in the future, they'll be able to hit the ground running and be comfortable with your code base.
Who cares whether it counts as goto as long as it makes the code easier to understand? Using goto can often be MORE readable than having a bunch of extra tests in if() and loop conditions.
IMO, your code comparison is unfair. The goal is readable code.
To be fair, you should compare an idiomatic Perl nested loop with labels against one without them. The C-style for and blocked if statement add noise that makes it impossible to compare the approaches.
Labels:
OUTER: for my $wid (@ary1) {
INNER: for my $jet (@ary2) {
next OUTER if $wid > $jet;
$wid += $jet;
}
}
Without labels:
for my $wid (@ary1) {
for my $jet (@ary2) {
last if $wid > $jet;
$wid += $jet;
}
}
I prefer the labeled version because it is explicit about the effect of the condition $wid > $jet. Without labels you need to remember that last operates on the inner loop and that when the inner loop is done, we move to the next item in the outer loop. This isn't exactly rocket-science, but it is real, demonstrable, cognitive overhead. Used correctly, labels make the code more readable.
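For comparison, C has no loop labels, so the closest thing to next OUTER there is an explicit goto. The following is only a sketch of how the labeled Perl loop might be carried over to C; the function name accumulate and the array parameters are assumptions for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Approximates "next OUTER" with a goto to a label placed at the very
   end of the outer loop body. */
void accumulate(int *ary1, size_t n1, const int *ary2, size_t n2)
{
    assert(ary1 != NULL && ary2 != NULL);
    for (size_t i = 0; i < n1; i++) {
        for (size_t j = 0; j < n2; j++) {
            if (ary1[i] > ary2[j])
                goto next_outer;
            ary1[i] += ary2[j];
        }
        /* code placed here would be skipped by the goto */
next_outer:
        ;   /* a label must be followed by a statement */
    }
}
```

Here the jump really is spelled goto, while the Perl version expresses the same control flow through the loop's own name.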
Update:
stocherilac asked what happens if you have code after the nested loop. It depends on whether you want to skip it based on the inner conditional or always execute it.
If you want to skip the code in the outer loop, the labeled code works as desired.
If you want to be sure it is executed every time, you can use a continue block.
OUTER: for my $wid (@ary1) {
INNER: for my $jet (@ary2) {
next OUTER if $wid > $jet;
$wid += $jet;
}
}
continue {
# This code will execute when next OUTER is called.
}
I think the distinction is somewhat fuzzy, but here's what the goto perldoc states about the (frowned upon) goto statement:
The goto-LABEL form finds the statement labeled with LABEL and resumes execution there.
...
The author of Perl has never felt the need to use this form of goto (in Perl, that is; C is another matter). (The difference is that C does not offer named loops combined with loop control. Perl does, and this replaces most structured uses of goto in other languages.)
The perlsyn perldoc, however, says this:
The while statement executes the block as long as the expression is true. The until statement executes the block as long as the expression is false. The LABEL is optional, and if present, consists of an identifier followed by a colon. The LABEL identifies the loop for the loop control statements next, last, and redo. If the LABEL is omitted, the loop control statement refers to the innermost enclosing loop. This may include dynamically looking back your call-stack at run time to find the LABEL. Such desperate behavior triggers a warning if you use the use warnings pragma or the -w flag.
The desperate behaviour bit doesn't look too good to me, but I may be misinterpreting its meaning.
The Learning Perl book (5th edition, page 162) has this to say:
When you need to work with a loop block that's not the innermost one, use a label.
...
Notice that the label names the entire block; it's not marking a target point in the code. [This isn't goto after all.]
Does that help clear things up? Probably not... :-)
Labeled loop jumps in Perl are GOTOs as much as C's break and continue are.
I would answer it like this, and I'm not sure if this is sufficiently different from what others have said:
Because you can only move inside of the current scope, or to a parent scope, they're much less dangerous than what is typically implied by goto, observe:
if (1) {
goto BAR;
die 'bar'
}
BAR:
This should work obviously, but this won't (can't move in this direction).
if (0) {
BAR:
die 'bar'
}
goto BAR;
Many use cases of labels differ from goto in that they're just more explicit variants of core flow control. To make a statement that they're categorically worse would be to imply that:
LOOP: while (1) {
next LOOP if /foo/;
}
is somehow worse than
while (1) {
next if /foo/;
}
which is simply illogical if you exclude style. But, speaking of style, the latter variant is much easier to read - and it does stop you from having to look up for the properly named label. The reader knows more with next (that you're restarting the loop in the current scope), and that is better.
Let's look at another example
while (1) {
while (1) {
last;
}
stuff;
}
-vs-
FOO: while (1) {
BAR: while (1) {
next FOO;
}
stuff;
}
In the latter example here, next FOO skips stuff. You might desire this, but it is a bad idea. It implies that the programmer has read a parent scope to completion, which is an assumption probably better avoided. In summary, labels aren't as bad as goto and sometimes they can simplify code, but in most cases they should be avoided. I usually rewrite loops without labels when I encounter them on CPAN.
gotos are bad because they create hard to understand code--particularly, what is often called "Spaghetti Code". What's hard to understand about next Line...??
You can call it a loop "name", and it really is something to help emphasize loop boundaries. You're not jumping into an arbitrary point in relation to the loop; you're going back to the top of a loop.
Sadly enough, if it is a group or house standard, there might be nothing to convince the group that it's not a goto. I had a manager who absolutely insisted that a ternary operator made things hard to read, and preferred I use if-blocks for everything. I had a pretty good argument anything can be done in the clauses of an if-else, but that a ternary made it explicit that you were looking for a particular value. No sale.
This kind of jump is a disciplined use of a goto-like statement. So it's certainly less harmful than undisciplined use of goto. (As kasperjj wrote, "Dijkstra's intent was never that anything resembling goto is to be considered harmful.")
IMO, this Perl kind of jump is even better design than C's "break" and "continue", because it makes clear what loop we break or continue, and it makes it more solid in the face of code changes. (Besides, it also allows breaking or continuing an outer loop.)
There are pundits who don't like break/continue and the like, but at some point there is a tradeoff to make between rules of thumb and readability, and a well-chosen break/continue or even goto may become more readable than "politically correct" code.
break/last and continue/next ARE gotos. I don't understand why anyone would go to such lengths to avoid a keyword yet use a different keyword that does the same thing...
4.4.4. Loop Control
We mentioned that you can put a LABEL on a loop to give it a name. The loop's LABEL identifies the loop for the loop-control operators next, last, and redo. The LABEL names the loop as a whole, not just the top of the loop. Hence, a loop-control operator referring to the loop doesn't actually "go to" the loop label itself. As far as the computer is concerned, the label could just as easily have been placed at the end of the loop. But people like things labeled at the top, for some reason.
Programming Perl
I just found ... AGAIN ... a real time wastage bug as follows
for (int i = 0; i < length; i++)
{ //...Lots of code
for (int j = 0; i < length; j++)
{
//...Lots of code
}
}
Did you notice straight away the inner i which SHOULD BE j? Neither did I. So from now on I am going to use:
for (int i = 0; i < length; i++)
{
for (int i1 = 0; i1 < length; i1++)
{
}
}
What are your tips for inner and outer while and for loops ?
Edit: Thanks for the valuable responses. Herewith short summary of the proposed tips:
use meaningful variable names for index variables (instead of i, use SomeObjCollectionLength)
place the contents of the inner loop into a separate method and call that method from the outer loop
not manageable amount of lines of code between the outer and inner loop is a strong signal for code smell
avoid copy pasting and rushing , write the index vars with care
You might want to check the summary by LBushkin for the following
use foreach and iterators whenever possible
initialize the variables just before entering the loops
Make each loop perform only one function. Avoid mixing responsibilities in a single loop
When possible, make your loops short enough to view all at once
Don't use i & j (or any other single letter variable) as index names. Use proper names and you will not get into this type of problems.
One of the simplest and cleanest solutions is to place the contents of the inner loop into a method so it becomes:
for (int i = 0; i < length; i++)
{
DoSomething(i);
}
private void DoSomething(int outerValue)
{
for (int i = 0; i < length; i++)
{
// Do something else
}
}
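The same idea sketched in C, with made-up function names. Once the inner loop lives in its own function it has exactly one index in scope, so there is no outer index to confuse it with:

```c
#include <assert.h>
#include <stddef.h>

/* Inner loop in its own function. Counts values in a[0..n) equal to key. */
static size_t count_equal(const int *a, size_t n, int key)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] == key)
            count++;
    }
    return count;
}

/* Outer loop: counts ordered pairs (x, y) with a[x] == a[y],
   including the pairs where x == y. */
size_t count_equal_pairs(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    size_t pairs = 0;
    for (size_t i = 0; i < n; i++)
        pairs += count_equal(a, n, a[i]);
    return pairs;
}
```

Both functions can use i without any risk, because the two variables are never visible at the same time.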
For me, the 'code smell' here is 'lots of code'.
If the amount of code in the loops is particularly large, the distance between the inner and outer loops means that they're not as likely to be compared against each other for correctness.
Admittedly, looking at the start of the inner loop in isolation should bring the issue to your attention, but having the main structure in as small a section of code as possible gives your brain less to digest.
It may be possible to extract the 'lots of code' sections into separate functions/methods, in order to reduce the size of the main structure - but this may not always be practical.
Also, I'd say that 'i1' isn't a particularly good choice of variable name, as that tends to encourage 'i2', 'i3' etc, which doesn't really lead to understandable code. Maybe replacing all of the loop variables with something more meaningful would help the clarity of the code, and reduce the chances of the original error.
My top advice (in no particular order) for writing better loop code (much of this is from the excellent book Code Complete):
Avoid multiple exit points for loops.
Use continue/break sparingly.
Refactor nested loops into separate routines, when possible.
Use meaningful variable names to make nested loops readable.
Use foreach() loops when possible, rather than for(i=...) loops.
Enter the loop from one location only. Don't jump into a loop with goto's. Ever.
Put initialization code immediately before the loop.
Keep loop initialization statements with the loop they are related to.
Avoid reusing variables between non-nested loops.
Limit the scope of loop-index variables to the loop itself.
Use while(true) for infinite loops, rather than for(;;)
In languages that provide block constructs (e.g. '{' and '}') use them rather than indenting to enclose the statements of a loop. Yes, even for single line loops.
Avoid empty loops.
Avoid placing housekeeping chores in the middle of a loop, place them at the beginning and/or end instead.
Make each loop perform only one function. Avoid mixing responsibilities in a single loop.
Make loop termination conditions obvious.
Don't monkey with the loop index variable of a for() loop to make it terminate.
Avoid code that depends on the loop indexer's final value.
Consider using safety counters in complex loops - they can be checked to make sure the loop doesn't execute too many, or too few times.
Use break statements, when possible, to terminate while loops.
When possible, make your loops short enough to view all at once.
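To make the safety counter item a little more concrete, here is a minimal sketch in C. The struct node type, the list_length function and the max_nodes limit are all assumptions for the example, not something taken from Code Complete:

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Hypothetical helper: returns the list length, or -1 if more than
   max_nodes nodes would be visited, which suggests a corrupted
   (for example cyclic) list. */
long list_length(const struct node *head, long max_nodes)
{
    assert(max_nodes >= 0);
    long count = 0;

    for (const struct node *n = head; n != NULL; n = n->next) {
        if (count >= max_nodes)
            return -1;     /* safety counter tripped */
        count++;
    }
    return count;
}
```

Without the counter, a list whose last node points back to the first would keep this loop running forever.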
That's a copy-paste mistake, avoid copy paste.
As for your solution, it's not much better. The mistake can still slip between tons of code. I tend to use meaningful names even for loop temporary variables.
leverage your IDE, on VS, try to use this: http://msdn.microsoft.com/en-us/library/z4c5cc9b(VS.80).aspx
sample: type for, then press Tab Tab successively
I came here to be smart and say "I just write it right the first time". But then I saw your example and, well, I've done that too many times myself.
When you need nested loops like that, my only solution is to be alert and thinking when you write the code.
Where possible, using iterators and for each loops is nice.
Also, I can't see how your suggested solution is going to be any better. And it doesn't look as nice either.
First of all, reduce the loop body size, i.e. move stuff to separate functions. It is generally a bad idea to have functions longer than what can fit into the screen, so loops should be even smaller.
Secondly, use meaningful variable names in cases like this. I would only use i and j in simple loops with a few lines of code. For instance, if you are going through a two-dimensional array, "col" and "row" would make much more sense, make the code easier to read ("which was which?") and easier to spot mistakes like this.
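A short sketch of the row and col naming in C, using a hypothetical transpose function over a matrix stored row by row:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes the transpose of a rows x cols matrix
   (stored row by row in src) into dst, which must have room for
   cols x rows elements. */
void transpose(const int *src, int *dst, size_t rows, size_t cols)
{
    assert(src != NULL && dst != NULL);
    for (size_t row = 0; row < rows; row++) {
        for (size_t col = 0; col < cols; col++) {
            /* element (row, col) of src becomes element (col, row) of dst */
            dst[col * rows + row] = src[row * cols + col];
        }
    }
}
```

A condition like row < cols in the inner loop header would stand out here in a way that i < length in place of j < length does not.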
You just have to take extra care with such issues; there's no magic bullet against this. Even with the "better naming" you propose, you will once in a while lose track of whether this is the Nth or (N+M)th level of nested loop and make an error.
If a nested loop is necessary, write it carefully. If it can be avoided by extracting the outer loop body into a function, that would be a good guard against index misuse.
In this, as in many things, there's some excellent advice in Steve McConnell's Code Complete. It would be well worth your time to read what he's got to say about building good looping code. I don't have my copy of the book handy here but the whole book is worth your time.
I use 'ii' and 'jj' for transient loop counters if I really need them - they are easier to search for than 'i' and 'j' and also easier to spot in examples like the above. To go one better you can actually use a real variable name. If you're looping over a string then you can call it characterIndex or something. It's more typing, but it documents itself and saves time on debugging obscure problems later.
Better still would be to avoid numerical counters and use named iterators over a collection. They make the intent clearer, in my opinion.
Finally, if possible it's nice to do away with the loop entirely: Boost::Foreach is one way of doing this in C++, although I generally prefer to use languages such as Python which natively allow direct iteration over the contents of a container without a need for incrementing an index value or iterator.
Try to use more declarative loop constructs. For instance, if you don't really need indices (those is and js) and your programming environment allows for it, you can use a foreach construct to iterate over the collection.