Can't understand small part of strcmp function - c

I'm reading a book on C and came across these two strcmp implementations.
I have taught myself how the usual for loop works.
But these two for loops are new to me. I don't understand these parts:
for (i = 0; s[i] == t[i]; i++)
It has no length; instead it has s[i] == t[i].
for ( ; *s == *t; s++, t++) What does the lone ; mean here?
I understand the other parts, and I also know what these functions return.
/* strcmp: return <0 if s<t, 0 if s==t, >0 if s>t */
int strcmp(char *s, char *t)
{
int i;
for (i = 0; s[i] == t[i]; i++)
if (s[i] == '\0')
return 0;
return s[i] - t[i];
}
int strcmp(char *s, char *t)
{
for ( ; *s == *t; s++, t++)
if (*s == '\0')
return 0;
return *s - *t;
}

First, some basics.
The syntax of a for loop is
for ( expr1opt ; expr2opt ; expr3opt ) statement
Each of expr1, expr2, and expr3 is optional. The statement
for ( ; ; ) { /* do something */ }
will loop "forever", unless there's a break or return statement somewhere in the body of the loop.
expr1, if present, is evaluated exactly once before loop execution - it's used to establish some initial state (such as setting an index to 0, or assigning a pointer value, or something like that).
expr2, if present, is evaluated before each iteration of the loop body. It's the test condition for continuing loop execution. If the expression evaluates to a non-zero value, the loop body is executed; otherwise, the loop exits. If expr2 is missing, it is assumed to evaluate to 1 (true).
expr3, if present, is evaluated after each iteration of the loop body. It usually updates whatever is being tested in expr2.
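As a rough illustration of the three parts described above, here is a small sketch that writes the same loop twice, once with for and once with while. The function names sum_for and sum_while are made up for this example. The two forms behave the same here; they differ only if the body uses continue, which in a for loop still runs expr3.

```c
/* Sum 0 + 1 + ... + (n - 1) with a for loop. */
int sum_for(int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)   /* expr1; expr2; expr3 */
        total += i;
    return total;
}

/* The same loop spelled out with while. */
int sum_while(int n)
{
    int total = 0;
    int i;

    i = 0;                    /* expr1: runs once, before the loop */
    while (i < n) {           /* expr2: tested before every iteration */
        total += i;
        i++;                  /* expr3: runs after every iteration */
    }
    return total;
}
```

Calling sum_for(5) and sum_while(5) should both give 10.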
for (i = 0; s[i] == t[i]; i++) It have no length instead have this s[i] == t[i]
This loop will execute as long as s[i] == t[i]; as soon as t[i] is not equal to s[i], the loop will exit. By itself, this means the loop will run past the end of the string in case you have identical strings - if both s and t contain "foo", then the loop will run as
s[0] == t[0] == 'f'
s[1] == t[1] == 'o'
s[2] == t[2] == 'o'
s[3] == t[3] == 0
s[4] == t[4] // danger, past the end of the string
So, within the body of the loop, the code also checks to see if s[i] is 0 - if so, that means we've matched everything up to the 0 terminator, and the strings are identical.
So, basically, it goes...
s[0] == t[0] == 'f', s[0] != 0, keep going
s[1] == t[1] == 'o', s[1] != 0, keep going
s[2] == t[2] == 'o', s[2] != 0, keep going
s[3] == t[3] == 0, s[3] == 0, at end of s, strings match
for ( ; *s == *t; s++, t++)
does exactly the same thing as the first loop, but instead of using the [] operator to index into s and t, it just uses the pointers. Since there's nothing to initialize, the first expression is just left empty.
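To compare the two loops side by side, here is a sketch of both versions under different names, so they do not clash with the library's strcmp. The names cmp_index and cmp_pointer are invented for this example, and const has been added to the parameters so that string literals can be passed safely.

```c
/* Index version: i counts positions, s and t are left untouched. */
int cmp_index(const char *s, const char *t)
{
    int i;

    for (i = 0; s[i] == t[i]; i++)
        if (s[i] == '\0')
            return 0;          /* both strings ended at the same place */
    return s[i] - t[i];        /* first position where they differ */
}

/* Pointer version: nothing to initialize, so the first slot is empty. */
int cmp_pointer(const char *s, const char *t)
{
    for ( ; *s == *t; s++, t++)
        if (*s == '\0')
            return 0;
    return *s - *t;
}
```

Both should return 0 for ("foo", "foo"), a negative value for ("abc", "abd"), and a positive value for ("b", "a").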

In the first case, the code after the for statement is checking to see if the end-of-string marker has been found, and if so the function returns 0.
In the case of the second for statement, the initialization part of the for statement is not filled in, so the statement starts with for( ;. This is perfectly legitimate.
Best of luck.

A for loop has 3 parts: initialization, condition, and loop expression. All of these are optional.
So this loop:
for (i = 0; s[i] == t[i]; i++)
It runs as long as character s[i] is equal to t[i]. So this is the condition. If it is false, the loop ends.
The condition does not have to be based on a length.
And this one:
for ( ; *s == *t; s++, t++)
As we saw above, the initialization is optional; it is not present here, which is perfectly fine. The condition in this loop is the same, i.e. loop as long as the characters are equal.

for (i = 0; s[i] == t[i]; i++) // It has no length
Actually this code is slightly dangerous, as it assumes that the strings passed in are null-terminated (but read on). The loop continues only while the leading parts of the strings are equal, so inside the loop the only possible result is 0 (equal), returned when a '\0' is encountered (the loop condition ensures that both strings have the '\0' at the same position).
As for the length: to calculate it you would have to scan the whole string anyway, and twice (because there are two strings). This loop combines everything into one pass. Moreover, strings in C must be null-terminated, so a comparison like this has to rely on the terminator.
for ( ; *s == *t; s++, t++) // what means this guys
This is about the same as the previous one, but instead of indexing into s and t (and leaving them untouched), the pointers themselves are advanced to point at one character after another. I believe this is faster, but that depends on the compiler. Moreover, incrementing s and t makes you lose the start of the strings, but in this function that is not a problem.
About the syntax of for(;;), a comment already explained why it is written like this. The last part of the for(), between the second semicolon and the closing parenthesis, is evaluated after every iteration. In this case we need to increment two variables, so there are two expressions joined by the comma operator.
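The comma operator can appear in the first slot of a for as well as the last. As a small sketch of that (the function name reverse_in_place is made up for this example), here is a loop that moves two indexes toward each other:

```c
#include <string.h>

/* Reverse s in place. i walks up from the front, j walks down from the back. */
void reverse_in_place(char *s)
{
    size_t i, j;
    size_t len = strlen(s);

    if (len < 2)
        return;   /* nothing to swap; also keeps len - 1 from wrapping for "" */

    for (i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}
```

The argument must be a writable array (for example char buf[] = "hello";), not a string literal.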

It doesn't have to be a length. The for loop runs as long as the condition is true, so in this case it keeps running until s[i] is not equal to t[i].
for ( ; *s == *t; s++, t++)
; here means that the first clause of the for loop is omitted. As both s and t are defined outside of the for loop, there is no need to define them here.
It's allowed by the C standard:
for ( clause-1 ; expression-2 ; expression-3 ) statement
(...)
Both clause-1 and expression-3 can be omitted. An omitted expression-2 is
replaced by a nonzero constant.
Some compilers such as clang produce a warning when an already defined variable is put in the first clause. For example this code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int i = 0;
for (i; i < 10; i++)
puts("Hi");
return EXIT_SUCCESS;
}
compiled with clang produces a warning:
main.c:7:8: warning: expression result unused [-Wunused-value]
for (i; i < 10; i++)
^

Related

K&R 1.6 Array. Not understanding the code [closed]

I started reading K&R, The C Programming Language (2nd edition), and I got stuck on section 1.6, Arrays. I just can't seem to figure out what the code does (even though it says it counts digits, white space, and others!). Here is the code:
#include <stdio.h>
/* count digits, white space, others */
main()
{
int c, i, nwhite, nother;
int ndigit[10];
nwhite = nother = 0;
for (i = 0; i < 10; ++i)
ndigit[i] = 0;
while ((c = getchar()) != EOF)
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
else if (c == ' ' || c == '\n' || c == '\t')
++nwhite;
else
++nother;
printf("digits =");
for (i = 0; i < 10; ++i)
printf(" %d", ndigit[i]);
printf(", white space = %d, other = %d\n",nwhite, nother);
}
So first it defines integers (c, i, nwhite, nother).
After that it creates an array of 10 digits (0 - 9).
After that it sets nwhite and nother to 0.
The for loop sets i to 0; i < 10 means that while i is lower than 10, do i = i + 1.
ndigit[i] = 0? I don't quite understand it; isn't i already 0?
while ((c = getchar()) != EOF) means: whatever the input is, as long as it isn't the end of the file?
After that part I kind of got lost and I'm not sure what
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
does at all.
And I don't quite understand why the for (i = 0; i < 10; ++i) is repeated. I do understand English, but complicated wording will confuse me. So if you don't mind, please keep it basic for me. I really hope there is someone out there who can help me understand this code 100%. Because after all, who wants a programmer who can't even understand the code? :)
Let us step through the code and see what is happening.
main()
{
int c, i, nwhite, nother;
int ndigit[10];
nwhite = nother = 0;
In the first line of code we are declaring[0] (to the compiler) that c, i, nwhite and nother will be integer variables. At this point, while we have declared these variables, we have not given them any value.
In the next line we are declaring that ndigit will be an array of 10 integers. Again, no initialization is happening, so we have no idea what the values of those ten integers might be.
In the third line we are assigning zero to nwhite and nother; in other words, we are giving them an initial value.
for (i = 0; i < 10; ++i)
ndigit[i] = 0;
In this loop, we are initializing the variable i to zero, and we increment it by one every time through the loop until its value becomes ten. The body of the loop sets each element of the array to zero. This is a common C idiom for initializing the elements of an array.
while ((c = getchar()) != EOF)
{
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
else if (c == ' ' || c == '\n' || c == '\t')
++nwhite;
else
++nother;
}
The next block of code does the actual counting. While the code in K&R is syntactically correct, I prefer enclosing the body of the while loop in curly braces; I find it easier to read, but it is a personal thing [1].
The condition of the while loop, ((c = getchar()) != EOF), can be kind of confusing. The operation in the inner parentheses is performed first: c = getchar() gets the next character and assigns it to the variable c. (Remember that in C a character, i.e. a value of type char, is just a small integer; getchar() itself returns an int, so that it can also return EOF.) The assignment expression has a value[2]: the value that was stored in c. So the operation in parentheses yields the character just read, which is then compared to EOF, and if it doesn't equal EOF we enter the body of the while statement.
The first if statement checks to see if the character is a digit. In ASCII, the digits have the values 0x30 ('0') through 0x39 ('9'), so we check to see if the character is in that range. If it is, we increment the appropriate value in the ndigit array. For example, suppose that we have read in the character '5', which has an ASCII value of 0x35. Because 0x35 is between 0x30 and 0x39, we have a digit. Performing the subtraction c - '0' is equivalent to 0x35 - 0x30, which equals 0x05. We then use this as the index into the array, and increment the appropriate value with ++ndigit[c-'0'].
The next branch of the if-block checks to see if c is white space, i.e. the expression c == ' ' || c == '\n' || c == '\t' checks whether c is a space, a new-line, or a tab. If c is one of those characters, we increment nwhite.
Finally, the else branch is taken if we have neither a digit nor white space, and we then increment nother.
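The digit-indexing step described above can be tried in isolation. The following is only a sketch: the function name count_digits is made up, and it reads from a string instead of getchar() so that it is easy to call with known input.

```c
/* Tally how often each decimal digit occurs in text.
 * ndigit must have room for 10 counters. */
void count_digits(const char *text, int ndigit[10])
{
    int i;

    for (i = 0; i < 10; ++i)
        ndigit[i] = 0;               /* start every counter at zero */

    for (; *text != '\0'; ++text) {
        if (*text >= '0' && *text <= '9')
            ++ndigit[*text - '0'];   /* '5' - '0' == 5, and so on */
    }
}
```

For the input "a1b22c555", ndigit[1] should end up as 1, ndigit[2] as 2, and ndigit[5] as 3, with all other counters left at 0.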
printf("digits =");
for (i = 0; i < 10; ++i)
printf(" %d", ndigit[i]);
printf(", white space = %d, other = %d\n", nwhite, nother);
}
The last bit of code just prints out the results. Because we want to look at all ten elements of the ndigit array, we need to step through the array again so we use the for loop structure to look at each element of the array.
Hopefully, this clears up some stuff. Something you may want to try is to modify this code so that it also counts the letters that appear in the input. First just try to count letters without regard to case, and then see if you can count upper and lower case letters separately.
notes:
[0] Declaring a variable specifies its name and type, so int x; is a declaration: it gives the compiler enough information to check our usage of x. Inside a function, int x; is also a definition, since it causes storage to be allocated for x, but it does not set what that storage contains. Giving the variable a value later, as in x = 5;, is an assignment. The declaration and the initial value can be combined into a single line, int x = 5;, which is called an initialization.
[1] The C grammar says that curly braces are not needed for a while block if
it consists of a single statement, i.e.
while(n > 10)
n--;
and
while(n > 10)
{
n--;
}
are equivalent; I just find the second easier to read. Also, the C grammar
says that curly braces are not needed for the body of an if statement if the body consists of a single statement, so for example
if(n < 10)
n = n - 10;
and
if(n < 10)
{
n = n - 10;
}
are equivalent.
Finally, the else if and else are all part of the if statement, so the statement
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
else if (c == ' ' || c == '\n' || c == '\t')
++nwhite;
else
++nother;
is effectively a single statement, which is why the curly braces are not needed.
Also, for readability and maintainability I tend to use curly braces with if / else if / else blocks, but again it is a personal thing.
[2] An assignment is an expression whose value is the value stored in its left-hand side. In a simple statement such as a = 10; that value is just ignored. Having a value allows us to write something like a = b = c = 10, which has the effect of setting a, b and c to 10. In addition to having a value, the assignment operator is right-associative, so the above expression is interpreted as a = (b = (c = 10)).
-T.

String Incrementing Function in C

For my programming class, I am trying to write a function incrementstring() that takes the string 'str' passed in from a driver, and adds one to them. It should work with both letters and numbers (ex. '1' goes to '2', 'a' goes to 'b', 'z' goes to 'aa', 'ZZ' goes to 'AAA'). I have almost every test condition working, except for one bug that I can't seem to find a way around.
This is what I currently have:
void incrementstring(char* str){
int i;
int j;
int length = strlen(str);
for(i = strlen(str)-1; i >= 0; i--){
if (str[i] == '9'){
str[i] = '0';
if (str[0] == '0'){
for (j = strlen(str)-1; j>=0; j--){ //This loop is the problem
str[j+1] = str[j];
}
str[0] = '1';
}
}
else if (str[i] == 'z'){
if (str[0] == 'z'){
str[i] = 'a';
str[i+1] = 'a';
}
str[i] = 'a';
}
else if (str[i] == 'Z'){
if(str[0] == 'Z'){
str[i] = 'A';
str[i+1] = 'A';
}
str[i] = 'a';
}
else{
str[i]++;
return;
}
}
}
When I run the function, this is what the driver outputs:
1. testing "1"... = 2. Correct!
2. testing "99"... = 100. Correct!
3. testing "a"... = b. Correct!
4. testing "d"... = e. Correct!
5. testing "z"... = INCORRECT: we got "aa0". We should be getting "aa" instead.
6. testing "aa"... = ab. Correct!
7. testing "Az"... = Ba. Correct!
8. testing "zz"... = aaa. Correct!
9. testing "cw"... = cx. Correct!
10. testing "tab"... = tac. Correct!
11. testing "500"... = 501. Correct!
11 tests run.
I wrote a for loop in line 9 to handle the '99' to '100' condition. It takes every index of the string and shifts it one to the right, and then adds a '1' to the beginning of the string. However, this loop for some reason messes up the 5th test condition, as seen above. If I take the loop out, '99' will go to '00', but the 5th test will pass with no problems. I've hit a brick wall here and I was wondering if anybody can provide some insight.
I appreciate the help, thanks.
While also keeping track of the string length to make sure you do not write past its allocated space, add a null terminating character in each of your if() and else if segments:
str[0] = '1';
str[1] = 0;
...
str[i] = 'a';
str[i+1] = 0;
And so on.
This final statement may not be doing what you expect it should do.
I believe what you want to do is to move on to the next element of the memory owned by str.
Keep in mind that str is actually not an array. It is a pointer. The [...] notation you are using is a convenience provided in C to allow array-like referencing of pointers.
So, the expression str[i], for example, can also be expressed as *(str + i).
If it is the next memory location (where the next char is stored) that you want, the expression would be:
*(str + i++), which when using array notation translates to: str[i++]
Change the following from
else{
str[i]++;
to:
else{
str[i++]=0;
Your issue is that you're not null-terminating your string in the driver program. Running your code with my own driver program works perfectly, so any additional help would require you to share your driver program with us.
All you have to do is, after you fill the char * with the string, make the next character a '\0' character. Since the strlen function simply iterates over the array of chars until it reaches a null terminating character, you have to terminate all strings with that character before using them.
Does it work for "zaz" correctly?
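For reference, here is one possible way to structure the whole function so that the carry and the terminator are handled in one place. This is only a sketch under stated assumptions, not the assignment's official solution: the name increment_string is made up, the caller's buffer is assumed to have room for one extra character plus the '\0', a string that wraps completely is assumed to grow by the "first" character of its leading position's class ('1', 'a' or 'A'), and any other character is simply incremented as in the original code.

```c
#include <string.h>

/* Increment str in place, carrying from right to left.
 * Assumption: the buffer behind str can hold strlen(str) + 2 bytes. */
void increment_string(char *str)
{
    size_t len = strlen(str);
    size_t i = len;
    char lead = '\0';   /* character to prepend if the carry runs off the front */

    while (i > 0) {
        i--;
        if (str[i] == '9') {
            str[i] = '0';
            lead = '1';
        } else if (str[i] == 'z') {
            str[i] = 'a';
            lead = 'a';
        } else if (str[i] == 'Z') {
            str[i] = 'A';
            lead = 'A';
        } else {
            str[i]++;   /* no carry needed: done */
            return;
        }
    }

    if (len == 0)
        return;         /* empty string: nothing to increment */

    /* Every position wrapped. Shift the text and its '\0' one place right,
     * then put the new leading character in front. */
    memmove(str + 1, str, len + 1);
    str[0] = lead;
}
```

Because memmove copies len + 1 bytes, the terminator moves along with the text, so no stray character is left behind the result. With these assumptions, "z" becomes "aa", "99" becomes "100", and "zaz" becomes "zba".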

how the following for loop function differently

#include<stdio.h>
void main()
{
int a,b,c;
for(b = c = 10; a = "- FIGURE?, UMKC,XYZHello Folks,TFy!QJu ROo TNn(ROo)SLq SLq ULo+UHs UJq TNn*RPn/QPbEWS_JSWQAIJO^NBELPeHBFHT}TnALVlBLOFAkHFOuFETpHCStHAUFAgcEAelclcn^r^r\\tZvYxXyT|S~Pn SPm SOn TNn ULo0ULo#ULo-WHq!WFs XDt!"[b+++21];)
{
for(;a-->64;)
{
putchar((++c == 'Z') ? (c = c/9) : (33^b&1));
}
}
getch();
}
The program above, written in C, prints a map of India. The outer for loop uses two slots and leaves the third one empty. I understand how the program works, but my doubt is that the condition slot of the outer for loop is used for an assignment. Syntactically and logically this should be wrong, but it works. The ASCII code of the character at the given array index is assigned to the variable a.
How does this work?
The condition slot of the outer for loop assigns one of the characters of the string literal to a, at the same time incrementing b. Because the assignment expression also yields the assigned value, the condition of the outer for loop becomes the value of that character of the string literal. Because strings are '\0'-terminated in C, the condition is true until the expression b++ + 21 reaches the end of the string (then the terminating character of the string is returned, and it's equal to 0, thus evaluating as false).
In fact, this is an obfuscated and more complex version of a common C idiom for iterating a string, which looks like this:
char *string = "my string";
int i;
for (i = 0; string[i]; ++i)
/* do something with string[i] */
which can be rewritten as:
int i = 0;
for (; string[i++]; )
/* do something with string[i - 1]; i has already been incremented */
Moreover, the current character can be extracted to a separate char variable c:
int i = 0;
char c;
for (; c = string[i++]; )
/* do something with c */
A while loop can be used instead as well:
while (c = string[i++])
/* do something with c */
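As a concrete instance of the last form, here is a short sketch that counts how often a character occurs in a string. The function name count_char is made up for this example, and the explicit comparison with '\0' is added only to make the intent of the assignment inside the condition obvious.

```c
/* Count occurrences of target in string, using assignment in the condition. */
int count_char(const char *string, char target)
{
    int i = 0;
    int count = 0;
    char c;

    while ((c = string[i++]) != '\0') {   /* stops at the terminator */
        if (c == target)
            ++count;
    }
    return count;
}
```

For example, count_char("banana", 'a') should return 3.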

Interchanging statements in for loop

This is a line of code in C.
The condition of the loop here is ++i.
So how does the compiler decide which expression to treat as the condition, given that the other two also look like conditions?
char i=0;
for(i<=5&&i>-1;++i;i>0)
printf("%d",i);
output
1234..127-128-127....-2-1
The for statement works like this:
for (X; Y; Z)
{
...
}
translates to
X;
while (Y)
{
...
Z;
}
So your code changes from:
char i=0;
for(i<=5&&i>-1;++i;i>0)
printf("%d",i);
to:
char i = 0;
i<=5 && i>-1; // X
while (++i) // Y
{
printf("%d", i);
i > 0; // Z
}
As you can see, lines marked with X and Z are completely useless. Therefore:
char i = 0;
while (++i)
printf("%d", i);
This means it will print from 1 up until the result of ++i is zero.
If char is signed in your compiler, the result of incrementing past the largest char value is implementation-defined, though most likely it will wrap to a negative value and work its way up to zero.
If char is unsigned, this will print positive values up to the point where it wraps back to 0.
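The unsigned case is fully defined, so it can be checked directly. The sketch below counts iterations instead of printing; the function name count_until_wrap is made up, and the expected count of 255 assumes an 8-bit char.

```c
/* Count how many times the body of while (++i) runs for an unsigned char. */
int count_until_wrap(void)
{
    unsigned char i = 0;
    int iterations = 0;

    while (++i)          /* becomes false when i wraps from its maximum to 0 */
        ++iterations;

    return iterations;
}
```

On a platform with 8-bit chars this should return 255, one iteration for each value from 1 to 255.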
It doesn't. It evaluates the first part once (only its side effects matter, and here there are none), then before every iteration it tests the second part and terminates when it is false, in this case when ++i yields 0; after every iteration it runs the third part.
So the compiler essentially rewrites this as:
char i=0;
i<=5&&i>-1;
while ( (++i) != 0) {
printf("%d",i);
i>0;
}
hint: if char is signed, 8 bits wide and two's complement (common, but implementation-defined), i will go 1,2,3....127, -128,-127.... -1, 0
The loop termination condition here is ++i. There is no mystery about it. The loop will stop when i hits 0 (because then it will be 'false').

strncmp function does not stop checking at n characters?

My program compares the 2 strings entirely and does not stop once n number of characters are reached? Why does this happen?
int strncompare (const char* mystring1,const char* mystring2, int number)
{
int z;
z = number - 1;
while ((*mystring1==*mystring2) && (*mystring1 != '\0') && (*mystring2 != '\0'))
{
*mystring1++;
*mystring2++;
if ((*mystring1 == mystring1[z]) && (*mystring2 == mystring2[z]))
{
break;
}
}
return (mystring1++ - mystring2++);
}
Because you don't stop when you've compared number characters.
There are several ways to do this, but I would recommend changing your loop condition to
while (*mystring1 && *mystring2 && *mystring1 == *mystring2 && number-- > 0)
Also remove
if ((*mystring1 == mystring1[z]) && (*mystring2 == mystring2[z]))
{
break;
}
Because, although it seems like that was your attempt at making it stop, it's coded wrong; you don't care if the characters are the same, you only care if you've compared number characters. Also you use && which makes the condition even more restrictive than it already was.
Also change
*mystring1++;
*mystring2++;
To
mystring1++; // or better, ++mystring1
mystring2++; // or better, ++mystring2
The * dereferences the pointer but you're not doing anything with it so it's pointless (pun intended).
You also can remove the ++ from these:
return (mystring1++ - mystring2++);
So it would be
return mystring1 - mystring2;
However, that is undefined behaviour when the two pointers point to different arrays (which they probably always will). You need to be doing something else. What? I don't know because I don't know what your function should return.
You have no condition in your function that examines number, or z that you derive from it. What would make it stop?
Why don't you simply decrement number and break when it reaches 0 assuming the loop hasn't broken by that point
You should update z on each iteration and then check if it reaches zero, try adding this to your code:
if (z == 0)
break;
else
z -= 1;
Also, that check you have there is really faulty, if it worked it could stop at an unwanted time, for example on the strings "abcdec" and "xxcddc", where number = 6, it would stop at 3, because the characters at those indexes are the same as those on index 6.
Re-read your code very thoroughly and make sure you really understand it before taking any of these answers into account.
This will walk until it finds a difference, or the end of the string.
while(n > 0) {
if(*str1 != *str2 || *str1 == '\0'){
return *str1 - *str2; //they're different, or we've reached the end.
}
++str1; //until you understand how ++ works it's a good idea to leave them on their own line.
++str2;
--n;
}
return 0;// I originally had *str1 - *str2 here, but what if n came in as zero..
The problem with the z compare is that it's a moving target.
Think of [] as a + sign: mystring1[z] could be written as *(mystring1 + z).
That means the line above it, ++mystring1; (as it should be written), is moving the pointer and thus moving where z is looking.
It might help to think of pointers as addresses on a street: when you ++, you move up one house.
Say z = 1, the house that mystring1 points at is yours, and z is your neighbor. Add one to the house you're looking at, and mystring1 is now pointing at your neighbor, and z is pointing at his neighbor, because z still means "what you're pointing at + 1".
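Putting the loop from this answer into a complete function might look like the sketch below. The name bounded_compare is made up so it does not clash with the library's strncmp, and the casts to unsigned char are an addition that mirrors how the standard library compares characters.

```c
#include <stddef.h>

/* Compare at most n characters of s1 and s2.
 * Returns <0, 0 or >0, like strncmp. */
int bounded_compare(const char *s1, const char *s2, size_t n)
{
    for (; n > 0; s1++, s2++, n--) {
        if (*s1 != *s2 || *s1 == '\0')
            return (unsigned char)*s1 - (unsigned char)*s2;
    }
    return 0;   /* n characters matched, or n was 0 to begin with */
}
```

With this sketch, bounded_compare("abcdef", "abcxyz", 3) should return 0, while the same call with n = 4 should return a negative value.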
Thanks all...I fixed the error...added another condition to the while loop.
int i;
i=0;
z = number - 1;
while((*mystring1==*mystring2) && (*mystring1 !='\0') && (*mystring2 !='\0') && (i<z))
and then incrementing i till it comes out of this loop.

Resources