This question already has answers here:
Undefined behavior and sequence points
(5 answers)
Closed 9 years ago.
I am solving a problem that reads a set of text lines and prints the longest.
The problem is from K&R "The C programming language" section 1.9. The book provides a solution but I tried to solve it in my own way. The code below works but before I get it to work, I was getting a problem due to the line longestStr[i++] = tmpStr[j++] as I have previously used longestStr[i++] = tmpStr[i++] thinking that i would be incremented once the assignment was done. But that was not the case. Is this how postfix operator normally works?
#include <stdio.h>
#define MAXLINE 100
main()
{
int lineLength = 0, longestLine = 0;
int c, i, j;
char longestStr[MAXLINE];
char tmpStr[MAXLINE];
while((c = getchar()) != EOF)
{
tmpStr[lineLength++] = c;
if(c == '\n')
{
if( lineLength > longestLine)
{
longestLine = lineLength;
i = 0, j = 0;
while(i < lineLength)
{
longestStr[i++] = tmpStr[j++]; // I tried longestStr[i++] = tmpStr[i++] but it gives wrong result
}
}
lineLength = 0;
}
}
printf("Longest line is - %d long\n", longestLine-1);
for(i = 0; i < longestLine-1; i++)
printf("%c", longestStr[i]);
putchar('\n');
}
The postfix increments or decrements are operators that perform the increment or decrement after the use of the variable, such that if you wish to use a integer value to print out and add it at the same time you would use postfix.
int i = 0;
printf("%d",i++);
//prints out 0
Prefix increment/decrement however works in the opposite manner such that it performs the increment/decrement prior to the use of the variable, such that if you wish to increment/decrement a variable before printing it out, you would use prefix
int i = 0;
printf("%d",++i);
//prints out 1
The problem you have is not related to how the postfix operator works, but rather to what you intend to do in the code line which gives you a problem.
As asked in the comments, what is the meaning of the line as you wrote it initially?
longestStr[i++] = tmpStr[i++];
Because the C standard does not specify it, this line can be interpreted in several ways:
longestStr[i] = tmpStr[i+1];
i += 2;
or
longestStr[i+1] = tmpStr[i];
i += 2;
or
longestStr[i] = tmpStr[i];
i += 2;
In every case, you end up with a counter incremented twice, which will mess up your algorithm.
The correct way is to increment the counter in a separate line, or use the working solution (you are providing), which should be compiled down to the same code with any decent compiler.
Note that you should probably check for the counter not going beyond the maximum allowed line size MAXLINE (unless the problem states that this cannot happen with the input, but even in that case, it would be helpful for situations like yours, where the code wrongly increment the counter twice).
try to understand how post fix increment works.
f(i++) is equivalent to the operation 'call f on the current value of i, then increment i'
For example, when you use i++ twice, and initial value of i = 1, a(i++) = b(i++) means, a(1)=b(2) and the value of i after the operation is i=3.
If you want to eliminate one variable in what you are trying to do, you have to make sure you use increment only once. Do it like, a(i)=(b(i); i++
Whereas the postfix increment operator evaluates the original value, it is understandable to think the actual incrementating doesn't happen until sometime down the road. However this is not the case. You should conceptualize postfix as if the incrementing is happening while the original value is being returned.
The bigger issue though is that your statement increments i twice. This means i will increase by two with each iteration of your loop. Futhermore your index values will be off by one from each other, That's because if a variable changes during one part of statement evaluation, those changes become instantly visible to parts of the statement which have not yet evaluated.
Related
This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Order of operations for pre-increment and post-increment in a function argument? [duplicate]
(4 answers)
function parameter evaluation order
(5 answers)
Closed 4 months ago.
I'm new to C and I'm writting some simple stuff to get familiar. I just tried pre-increment on an array index and what I got was really not what I was expecting.
The code in question:
#include <stdio.h>
int main()
{
int array[5] = {1,2,3,4,5};
int i = 0;
printf("%d %d",array[i], array[++i]);
return 0;
}
which should display elements of array[0] and array[1], right?
But it doesn't, as the output is 2 2
What am I missing?
Here's the thing: You can't guarantee what order function arguments are evaluated in.
What this means is, in your printf call, there's no way to tell whether array[i] or array[++i] will be evaluated first. In your case it appears that array[++i] won, so what happens is that you end up printing array[1] twice.
That's not a good use of pre-increment.
If you want to look at array[i] and array[i+1], just do that:
printf("%d %d\n", array[i], array[i+1]);
Remember, i++ does not just take i's old value and add 1 to it. It takes i's old value, and adds 1 to it, and stores the new value back into i. When you were printing array[i] and array[i+1], did you really want to change the value of i?
If you did want to change the value of i — or, in any case, since that mention of array[i++] did change the value of i — you introduced undefined behavior. Once you've got an expression that's modifying the value of i, how do you know what array[i] will print? How do you know that array[i] won't end up using the new value of i? You probably assumed that things are evaluated from left to right, but it turns out that's not necessarily true. So weird, weird things can happen.
Here's an example that helps show how ++ works, and that's well-defined:
#include <stdio.h>
int main()
{
int array[5] = {1,2,3,4,5};
int i = 1;
array[i++] = 22;
array[++i] = 44;
for(i = 0; i < 5; i++)
printf("%d: %d\n", i, array[i]);
return 0;
}
Or, as you suggested in a comment, if you did want to modify i's value, perhaps to advance the variable i along the array, you sometimes have to take care to move the i++ into a separate statement. The simple rule is, if you're applying ++ or -- to a variable, you can only use that variable once in the same statement. So you can't have i++ + i++, since that has two i's in it and they're both modified. But you also can't have i + i++, or f(a[i], a[i++]), since those have a spot where you modify i, and a spot where you use i, and there's no way to know whether you use the old or the new value of i. (And, again, in C there's not a general left-to-right rule that helps you here.)
I'm new in C, stuck in understanding the expression s[i ++] = t[j ++], I don't know how it's possible that an array element gets accessed with a variable and then the variable increase itself and then the array element just accessed is again accessed with the original variable and then gets assigned to another array's element, I'm confused, I think to understand the exact process might involve some low-level knowledges, but I don't want to digress too far away, is there any way to understand it easily and clearly?
In C language, the expression i++ causes i to be incremented, but the expression i++ itself evaluates to the value i had before being incremented. So the expression s[i++] = t[j++] has the same behaviour as:
s[i] = t[j];
i = i + 1;
j = j + 1;
except that the precise order is not specified. For that last reason, the rule is that a variable should only be modified once: s[i++] = t[i++] would invoke Undefined Behaviour.
Like any other complicated-looking expression, it's easier to understand this if you break it down into parts.
The key is that innermost part (or "subexpression") i++. I assume you know what i++ does by itself, although in this example, we're hopefully going to get a deeper appreciation of what i++ is actually good for. Why would you want to "increment i, but return the old value"? What's the use of this? Well, the main use is that it's super useful for moving along an array.
Lets look at a simpler example. Suppose we have an array a that we want to store some numbers in. The most basic way is
int a[10];
a[0] = 12;
a[1] = 34;
a[2] = 5678;
Another very good way is to use a second variable like i to keep track of where we're storing:
i = 0;
a[i] = 12;
i = i + 1;
a[i] = 34;
i = i + 1;
a[i] = 5678;
i = i + 1;
I've written this out in "longhand", but of course in C, you would almost never write it this way, because the "C way" is the much more concise
i = 0;
a[i++] = 12;
a[i++] = 34;
a[i++] = 5678;
So first, make sure you understand that the "shorthand" and "longhand" forms work exactly the same way. Make sure you understand that when we say something like
a[i++] = 34;
what this means is "store 34 into the slot in array a indicated by i, and then update i to be one more than is used to be, so that it indicates the next slot."
In other words, we use an expression like a[i++] whenever we want to move along an array and do something with its elements, one by one, in order.
So far we were storing values into the array, but the idiom works just as well for fetching values out of an array. For example, this code prints those three elements, again one at a time, in order:
i = 0;
printf("%d\n", a[i++]);
printf("%d\n", a[i++]);
printf("%d\n", a[i++]);
My point is, again, that any time you see an expression like a[i++], you should think "we're moving along the array".
So now, finally, we can look at the expression you initially asked about:
s[i++] = t[j++];
Here we have two instances of the idiom. We're using i to move along the array s, and we're using j to move along the array t. We're fetching from t as we move along, and we're storing the values into s.
I don't know whether s and t are arrays of characters, or integers, or what. Also I don't know that s and t are truly arrays -- they might actually be pointers, pointing into some arrays. But I don't really have to know those things to know that the essential meaning of s[i++] = t[j++] is "copy elements from array t to array s, using j to keep where we are in t, and i to keep track of where we are in s".
[The above is an answer to your original question. The rest of this answer isn't directly related, but is essential to avoid inadvertently writing incorrect programs using ++ and --.]
As I said, the subexpression i++ and the idiom a[i++] are super useful for moving through arrays. But there are a couple things to beware of. (Actually it's just one thing, but it crops up in lots of different ways.)
Earlier I wrote the code
i = 0;
printf("%d\n", a[i++]);
printf("%d\n", a[i++]);
printf("%d\n", a[i++]);
to print the first three elements of the array a. But it prints them as bare, isolated numbers. What if I want to always see which array index each number comes from? That is, what if I'm tempted to write something like this:
i = 0;
printf("%d: %d\n", i, a[i++]); /* WRONG */
printf("%d: %d\n", i, a[i++]); /* WRONG */
printf("%d: %d\n", i, a[i++]); /* WRONG */
If I wrote this, my intent would be that I would see the obvious display
0: 12
1: 34
2: 56788
But when I actually tried it just now, I got this instead:
1: 12
2: 34
3: 5678
The numbers 12 and 34 and 5678 are right, but the indices 1, 2, and 3 are all wrong -- they're off by one! How did that happen?
And the answer is that although i++ is, as I said, "super useful", it turns out that there's a fine line between "super useful" and what's called undefined behavior.
That printf call
printf("%d: %d\n", i, a[i++]); /* WRONG */
looks fine, but it's not actually well-defined, because the compiler does not necessarily evaluate everything left-to-right, so it's not actually guaranteed that it will use the old value of i for the %d: part. The compiler might evaluate things from right to left, meaning that a[i++] will happen first, meaning that %d: will print the new value, instead -- which appears to be what happened when I tried it.
Here's another potential issue. Your original question was about
s[i++] = t[j++];
which, as we've seen, copies elements from t to s based on two possibly-different indices i and j. But what if we know we always want to copy t[1] to s[1], t[2] to s[2], t[3] to s[3], etc.? That is, what if we know that i and j will always be the same, so we don't even need separate i and j variables? How would we write that? Our first try might be
s[i++] = t[i++]; /* WRONG */
but that can't be right, because now we're incrementing i twice, and we'll probably do something totally broken like copying t[1] to s[2] and t[3] to s[4]. But if we want to only increment i once, should it be
s[i++] = t[i]; /* WRONG */
or
s[i] = t[i++]; /* WRONG */
But the answer is that neither of these will work. In expressions like these, which have i in one place and i++ in the other place, there's no way to tell whether i gets the old value or the new value. (In particular, there's no left-to-right or right-to-left rule that would tell us.)
So although expressions like i++ and a[i++] are indeed super useful, you have to be careful when you use them, to make sure you don't go over the edge and have too much happening at once, such that the evaluation order becomes undefined. Sometimes this means you have to back off, and not use the "super useful" idiom, after all. For example, a safe way to print those values would be
printf("%d: %d\n", i, a[i]); i++;
printf("%d: %d\n", i, a[i]); i++;
printf("%d: %d\n", i, a[i]); i++;
and a safe way to copy from t[1] to s[i] would be
s[i] = t[i]; i++;
You can read more in this answer about how to recognize well-defined expressions involving ++ and --, and how to avoid the undefined ones.
The evaluations of s[i++] and t[j++] are unsequenced relative to each other. Semantically, it's equivalent to:
t1 = i;
t2 = j;
s[t1] = t[t2];
i = i + 1;
j = j + 1;
with the caveat that the last three assignments can happen in any order, even simultaneously (either in parallel or interleaved)1. The compiler doesn't have to create temporaries, either - the whole thing can be evaluated as
s[i] = t[j];
i = i + 1;
j = j + 1;
Alternately, the side effects of i++ and j++ can be applied before the update to s:
t1 = j;
j = j + 1;
t2 = i;
i = i + 1;
s[t2] = t[t1];
The current values of i and j must be known before you can index into the arrays, and the value of t[j] must be known before it can be assigned to s[i], but beyond that there's no fixed order of evaluation or of the application of side effects.
This is why expressions like x = x++ or a = b++ * b++ or a[i] = i++ all invoke undefined behavior - there's no fixed order for evaluating or applying side effects, so the results can vary by compiler, compiler settings, even by the surrounding code, and the results don't have to be consistent from run to run.
I was trying to write a program that lists the factors of a given number. It works when I write this:
int main () {
int num, i;
printf("\nEnter a number: ");
scanf("%d", &num);
for(i == 0; i <= num; ++i) {
if (num % i == 0) {
printf("\n\t%d", i);
}
}
}
but not when the for loop is
for(i = 0; i <= num; ++i)
I'm confused as I thought this was the format of a for loop. Could someone point out my error please?
You should start the loop in this case with i=1 as in for(i = 1; i <= num; ++i), otherwise you are trying to divide a number by 0. Factor can never be 0.
i == 0 is an expression with no side effects, so writing for(i == 0; i <= num; ++i) is the same as writing for(; i <= num; ++i), i.e. it just does nothing in the initialization part. Now since you never initialize i anywhere this means that you invoke undefined behaviour.
If you do i = 0, i is initialized, but you still invoke undefined behaviour: num % i will cause division by zero because i will be 0 at the beginning.
Now it happened that on your system the division by zero caused your program to crash, while the version where you used i uninitialized happened to run without crashing. That's why it may have appeared to you that the version using i==0 "worked".
In C there are two separate operators that look somewhat similar - an assignment statement which uses a single equal sign, and an equality comparison, which uses two equal signs. The two should not be confused (although very often they are).
i == 0 is a comparison with zero, not an assignment of zero. It is a valid expression, though, so the compiler does not complain: for loop allows any kind of expression to be in its header, as long as the expression is formed correctly.
However, your code would not work, because it has undefined behavior: i remains uninitialized throughout the loop, making your program invalid.
The first part of the for loop is defining the starting value of your iterator (i).
Since you are not initializing it, it could have an undefined value, or sometimes just garbage value...
So when you use i==0 you're not assigning the value to i, just checking if it's equal to 0.
The value of i when not defined is usually very random, but it can have the integer max value, so incrementing it once actually makes it turn to the initger minimum value.
So actually the correct for is:
for(i=0; i<something; i++) // or ++i, doesn't mater
EDIT: oh yeah... and sinc the starting value of the for loop is 0, the something % i is equal to something % 0 which means you're making a division by zero.
EDIT 2: i see it was already mentioned...
I'm just learning some C, or rather, getting a sense of some of the arcane details. And I was using VTC advanced C programming in which I found that the sequence points are :
Semicolon
Comma
Logical OR / AND
Ternary operator
Function calls (Any expression used as an argument to a function call is finalized the call is made)
are all these correct ?. Regarding the last one I tried:
void foo (int bar) { printf("I received %d\n", bar); }
int main(void)
{
int i = 0;
foo(i++);
return 0;
}
And it didnt print 1, which according to what the VTC's guy said and if I undertood correctly, it should, right ?. Also, are these parens in the function call the same as the grouping parens ? (I mean, their precedence). Maybe it is because parens have higher precedence than ++ but I've also tried foo((i++)); and got the same result. Only doing foo(i = i + 1); yielded 1.
Thank you in advance. Please consider that I'm from South America so if I wasnt clear or nothing makes sense please, oh please, tell me.
Warmest regards,
Sebastian.
Your code is working like it should and it has nothing to do with sequence points. The point is that the postfix ++ operator returns the non-incremented value (but increments the variable by 1 nonetheless).
Basically:
i++ – Increment i by one and return the previous value
++i – Increment i by one and return the value after the increment
The position of the operator gives a slight hint for its semantics.
Sequence means i++ is evaluted before foo is invoked.
Consider this case (I am not printing bar!):
int i = 0;
void foo (int bar) { printf("i = %d\n", i); }
int main(void){
foo(i++);
return 0;
}
i = 1 must be printed.
C implements pass-by-value semantics. First i ++ is evaluated, and the value is kept, then i is modified (this may happen any time between the evaluation and the next sequence point), then the function is entered with the backup value as the argument.
The value passed into a function is always the same as the one you would see if using the argument expression in any other way. Other behavior would be fairly surprising, and make it difficult to refactor common subexpressions into functions.
When you do something like:
int i = 0, j;
j = i++;
the value of i is used first and then incremented. hence in your case the values of i which is 0 is used (hence passed to your function foo) and then incremented. the incremented values of i (now 1) will be available only for main as it is its local variable.
If you want to print 1 the do call foo this way:
foo(++i);
this will print 1. Rest you know, why!
guys i'm new at programming and i was surprised by the result of post increment value, now i'm bound by confusion after i found out and executed the code below, if for loop says
1. initialize
2. check for condition if false terminate
3. incrementation.
where does i++ happens? where does i value is equal to 1?
int main()
{
int i, j;
for (int i =0; i<1; i++)
{
printf("Value of 'i' in inner loo[ is %d \n", i);
j=i;
printf("Value of 'i' in outter loop is %d \n", j);
// the value of j=i is equals to 0, why variable i didn't increment here?
}
//note if i increments after the statement inside for loop runs, then why j=i is equals to 4226400? isn't spose to be 1 already? bcause the inside statements were done, then the incrementation process? where does i increments and become equals 1?
//if we have j=; and print j here
//j=i; //the ouput of j in console is 4226400
//when does i++ executes? or when does it becomes to i=1?
return 0;
}
if Post increment uses the value and add 1? i'm lost... Please explain... thank you very much.
I'm not really sure what you're asking, but sometimes it's easier for beginners to understand if rewritten as a while loop:
int i = 0;
while (i < 1)
{
...
i++; // equivalent to "i = i + 1", in this case.
}
Your loop declares new variable i and it shadows the i declared earlier in main(). So if you assign i to j outside of the loop, you are invoking undefined behaviour because i is not initialised in that context.
Prior to the first iteration, i is initialised to 0. This is the "initialize" phase, as you called it.
The loop condition is then evaluated. The loop continues on a true value.
The loop body is then executed. If there's a continue; statement, that will cause execution to jump to the end of the loop, just before the }.
The increment operator is then evaluated for it's side-effects.
Hence, it is after the first iteration that i changes to 1. i keeps the value 1 for the entirety of the second iteration.
It looks like you have a clash of variable names: i is declared before the loop, and also inside the loop.
The i that is declared in the for statement is the only one that will ever be 1. It will be one just after the body of the loop is executed.
Try setting a breakpoint and using the debugger to step through the loop whilst you watch the value of the variable (here's a video of what I mean by stepping with the debugger).
To remove the ambiguitity of having two variables called i you could change the for loop to:
for (i = 0; i < 1; i++) // remove the `int`
this will ensure that there is only one i in your code.
A comment on #CarlNorum's answer which wouldn't look good as a comment:
The C Standard defines
for ( A; B; C ) STATEMENT
as meaning almost the same thing as
{
A;
while (B) {
STATEMENT
C;
}
}
(where a {} block containing any number of statements is itself a kind of statement). But a continue; statement within the for loop will jump to just before the next statement C;, not to the next test of expression B.
for (i=0; i<x; i++)
Is equivalent to
i=0;
while(i<x) {
// body
i = i + 1;
}