K&R 1.6 Array. Not understanding the code [closed] - c

I have started reading K&R, The C Programming Language (2nd edition), and I got stuck on section 1.6, Arrays. I just can't figure out what the code does (even though it says it counts digits, white space and other characters). Here is the code:
#include <stdio.h>
/* count digits, white space, others */
main()
{
int c, i, nwhite, nother;
int ndigit[10];
nwhite = nother = 0;
for (i = 0; i < 10; ++i)
ndigit[i] = 0;
while ((c = getchar()) != EOF)
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
else if (c == ' ' || c == '\n' || c == '\t')
++nwhite;
else
++nother;
printf("digits =");
for (i = 0; i < 10; ++i)
printf(" %d", ndigit[i]);
printf(", white space = %d, other = %d\n",nwhite, nother);
}
So first it defines the integers c, i, nwhite and nother.
After that it creates an array of 10 elements (0 to 9).
After that it sets nwhite and nother to 0.
The for loop sets i to 0; i < 10 means that while i is lower than 10, it adds 1 to i (i = i + 1).
ndigit[i] = 0? I don't quite understand it; isn't i already 0?
while ((c = getchar()) != EOF) means: whatever the input is, as long as it isn't the end of the file?
After that part I got lost, and I'm not sure what
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
does at all.
And I don't quite understand why the for (i = 0; i < 10; ++i) is repeated. I do understand English, but complicated wording confuses me, so if you don't mind, please keep it basic. I really hope someone out there can help me understand this code 100%. Because after all, who wants a programmer who can't even understand the code? :)

Let us step through the code and see what is happening.
main()
{
int c, i, nwhite, nother;
int ndigit[10];
nwhite = nother = 0;
In the first line of code we are declaring[0] (to the compiler) that c, i, nwhite and nother will be integer variables. At this point, while we have declared these variables, we have not given them any value.
In the next line we are declaring that ndigit will be an array of 10 integers; again no initialization is happening, so we have no idea what the values of those ten integers might be.
In the third line we are assigning zero to nwhite and nother; in other words, we are initializing them to a known value.
for (i = 0; i < 10; ++i)
ndigit[i] = 0;
In this loop, we initialize the variable i to zero and increment it by one every time through the loop, until its value becomes ten. The body of the loop sets each element of the array to zero. This is a common C idiom for initializing the elements of an array.
while ((c = getchar()) != EOF)
{
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
else if (c == ' ' || c == '\n' || c == '\t')
++nwhite;
else
++nother;
}
The next block of code does the actual counting. While the code in K&R is syntactically correct, I prefer enclosing the body of the while loop in curly braces; I find it easier to read, but it is a personal thing [1].
The condition of the while loop, ((c = getchar()) != EOF), can be kind of confusing. We perform the operation in the inner parentheses first, which is c = getchar(); this gets the next character and assigns it to the variable c. (Remember that in C a character, i.e. a value of type char, is just a small integer. getchar() actually returns an int, and c is declared as int so that it can hold every possible character as well as the special value EOF.) An assignment expression has a value[2], namely the value that was assigned, so the expression in parentheses yields the value returned by getchar(), which is then compared to EOF; if it doesn't equal EOF, we enter the body of the while statement.
The first if statement checks to see if the character is a digit. In ASCII, the digits have the values 0x30 ('0') through 0x39 ('9'), so we check to see if the character is in that range. If it is, we increment the appropriate element of the ndigit array. For example, suppose that we have read in the character '5', which has an ASCII value of 0x35. Because 0x35 is between 0x30 and 0x39, we have a digit. Performing the subtraction c - '0' is equivalent to 0x35 - 0x30, which equals 0x05. We then use this as the index into the array, and increment the appropriate element with ++ndigit[c-'0'].
The next branch of the if-block checks to see if c is white space, i.e. the expression c == ' ' || c == '\n' || c == '\t' checks whether c is a space, a newline or a tab. If c is one of those characters, we increment nwhite.
Finally, the else branch is taken if we do not have a digit or white space, and we then increment nother.
printf("digits =");
for (i = 0; i < 10; ++i)
printf(" %d", ndigit[i]);
printf(", white space = %d, other = %d\n", nwhite, nother);
}
The last bit of code just prints out the results. Because we want to look at all ten elements of the ndigit array, we need to step through the array again so we use the for loop structure to look at each element of the array.
Hopefully, this clears up some stuff. Something you may want to try is to modify this code so that it also counts the letters that appear in the input. First just try to count letters without regard to case, and then see if you can count upper and lower case letters separately.
notes:
[0] Declaring a variable specifies its name and type, so int x; is a declaration; it gives the compiler enough information to check our usage of x. Strictly speaking, inside a function int x; is also a definition in C terminology, because it reserves storage for x, but it does not set what that storage contains. Giving the variable its first value is initialization: x = 5; is an assignment that does this after the fact, and the two steps can be combined in a single line, int x = 5;.
[1] The C grammar says that the curly braces are not needed for a while block if it consists of a single statement, i.e.
while(n > 10)
c--;
and
while(n > 10)
{
c--;
}
are equivalent; I just find the second easier to read. Also, the C grammar says that curly braces are not needed for the body of an if statement if the body consists of a single statement, so for example
if(n < 10)
n = n - 10;
and
if(n < 10)
{
n = n - 10;
}
are equivalent.
Finally, the else if and else are all part of the if statement, so the statement
if (c >= '0' && c <= '9')
++ndigit[c-'0'];
else if (c == ' ' || c == '\n' || c == '\t')
++nwhite;
else
++nother;
is effectively a single statement, which is why the curly braces are not needed.
Also, for readability and maintainability I tend to use curly braces with if / else if / else blocks, but again it is a personal thing.
[2] An assignment expression has a value: the value of the left-hand side after the assignment. In a simple statement such as a = 10; that value is just ignored. Having a value allows us to write something like a = b = c = 10, which has the effect of setting a, b and c to 10. In addition, the assignment operator is right associative, so the above expression is interpreted as a = (b = (c = 10)).
-T.

Related

Can't understand small part of strcmp function

I'm reading a book on C and have seen these two strcmp implementations.
I have taught myself how the usual for loop works.
But these two for loops are new to me. I don't understand these parts:
for (i = 0; s[i] == t[i]; i++)
It has no length; instead it has this s[i] == t[i].
for ( ; *s == *t; s++, t++) what does this ; mean?
I understand the other parts, and I'm also aware of what these functions return.
/* strcmp: return <0 if s<t, 0 if s==t, >0 if s>t */
int strcmp(char *s, char *t)
{
int i;
for (i = 0; s[i] == t[i]; i++)
if (s[i] == '\0')
return 0;
return s[i] - t[i];
}
int strcmp(char *s, char *t)
{
for ( ; *s == *t; s++, t++)
if (*s == '\0')
return 0;
return *s - *t;
}
First, some basics.
The syntax of a for loop is
for ( expr1opt ; expr2opt ; expr3opt ) statement
Each of expr1, expr2, and expr3 are optional. The statement
for ( ; ; ) { /* do something */ }
will loop "forever", unless there's a break or return statement somewhere in the body of the loop.
expr1, if present, is evaluated exactly once before loop execution - it's used to establish some initial state (such as setting an index to 0, or assigning a pointer value, or something like that).
expr2, if present, is evaluated before each iteration of the loop body. It's the test condition for continuing loop execution. If the expression evaluates to a non-zero value, the loop body is executed; otherwise, the loop exits. If expr2 is missing, it is assumed to evaluate to 1 (true).
expr3, if present, is evaluated after each iteration of the loop body. It usually updates whatever is being tested in expr2.
for (i = 0; s[i] == t[i]; i++) It has no length; instead it has this s[i] == t[i]
This loop will execute as long as s[i] == t[i]; as soon as t[i] is not equal to s[i], the loop will exit. By itself, this means the loop will run past the end of the string in case you have identical strings - if both s and t contain "foo", then the loop will run as
s[0] == t[0] == 'f'
s[1] == t[1] == 'o'
s[2] == t[2] == 'o'
s[3] == t[3] == 0
s[4] == t[4] // danger, past the end of the string
So, within the body of the loop, the code also checks to see if s[i] is 0 - if so, that means we've matched everything up to the 0 terminator, and the strings are identical.
So, basically, it goes...
s[0] == t[0] == 'f', s[0] != 0, keep going
s[1] == t[1] == 'o', s[1] != 0, keep going
s[2] == t[2] == 'o', s[2] != 0, keep going
s[3] == t[3] == 0, s[3] == 0, at end of s, strings match
for ( ; *s == *t; s++, t++)
does exactly the same thing as the first loop, but instead of using the [] operator to index into s and t, it just uses the pointers. Since there's nothing to initialize, the first expression is just left empty.
In the first case, the code after the for statement is checking to see if the end-of-string marker has been found, and if so the function returns 0.
In the case of the second for statement, the initialization part of the for statement is not filled in, so the statement starts with for( ;. This is perfectly legitimate.
Best of luck.
A for loop has 3 parts: initialization, condition and loop expression. All of these are optional.
So this loop:
for (i = 0; s[i] == t[i]; i++)
runs as long as character s[i] is equal to t[i]. That is the condition; when it is false, the loop ends.
The condition does not have to be based on a length.
And this one:
for ( ; *s == *t; s++, t++)
As we saw above, initialization is optional, and it is not present here, which is perfectly fine. The condition in this loop is the same, i.e. loop while the characters are equal.
for (i = 0; s[i] == t[i]; i++) // It has no length
Actually this code is slightly dangerous, as it assumes that the passed strings are null-terminated (but read on). The loop continues only while the leading parts of the strings are equal, so inside the loop the only possible result to return is 0 (equal), when a terminating '\0' is encountered (the for condition ensures that both strings have the '\0' in the same position).
About the length: to calculate it you would have to scan the whole string anyway, and twice (because there are two strings). This loop instead combines everything in one pass. Moreover, strings in C must be null-terminated, so the terminator is the only end marker this comparison can rely on.
for ( ; *s == *t; s++, t++) // what means this guys
This is about the same as the previous one, but instead of accessing s and t through an index (and leaving them untouched), the pointers themselves are advanced to point to the characters, one after another. This may be faster, but that depends on the compiler. Moreover, incrementing s and t makes you lose the start of the strings; but in this function it is not a problem.
About the syntax of for(;;), a comment already explained why it is written like this. The last part of the for(), between the second semicolon and the closing parenthesis, is executed after every iteration. In this case we need to increment two variables, so there are two expressions joined by the comma operator.
It doesn't have to be a length. The for loop runs as long as the condition is true, so in this case it keeps running until s[i] is not equal to t[i].
for ( ; *s == *t; s++, t++)
; here means that the first clause of the for loop is omitted. As both s and t are defined outside of the for loop, there is no need to define them here.
It's allowed by the C standard:
for ( clause-1 ; expression-2 ; expression-3 ) statement
(...)
Both clause-1 and expression-3 can be omitted. An omitted expression-2 is
replaced by a nonzero constant.
Some compilers such as clang produce a warning when an already defined variable is put in the first clause. For example this code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int i = 0;
for (i; i < 10; i++)
puts("Hi");
return EXIT_SUCCESS;
}
compiled with clang produces a warning:
main.c:7:8: warning: expression result unused [-Wunused-value]
for (i; i < 10; i++)
^

Taking equations as user input in c

I've been racking my brain at this problem since yesterday and I was hoping someone could point me in the right direction.
I'm new to C and we must create a program where the user enters a series of linear equations that must be solved with Cramer's rule.
The math is not a problem, however I am not sure how to get the coefficients from an entire equation composed of chars and ints.
The user input should look like a series of linear equations such as:
-3x-3y-1z=6
2x+3y+4z=9
3x+2y+4z=10
This would be simple if we were allowed to only enter the coefficients but sadly the whole equation must be entered. You can assume that there are no spaces in the equation, the order of variables will be the same, and that the equations are valid.
I was thinking about storing the whole equation in an array and searching for each variable (x,y,z) then finding the int before the variable, but I cannot determine a way to convert these found variables into integers.
Any help is greatly appreciated. Thanks in advance.
You can split on x/y/z/= with strtok and then use atoi to convert each char* token to an int.
Read man strtok and man atoi for further information (strtok is declared in string.h, atoi in stdlib.h).
Your idea will work. I once did that on a very similar project at school, it was a nightmare but it (kinda) worked. You'll need some logic to read more than one number, unless you want to restrict yourself to coefficients lower than two digits. If I remember correctly, I started reading the characters until I found a variable in the expression, then I converted and assigned the value I found to that variable for resolving.
To transform your characters into integers you can use the atoi() function, which receives a string of characters and returns the corresponding integer.
If you're willing to invest extra time, and if you're working under *nix, you may want to dig into regular expression's territory with regex.h. You'll minimize your code, but it won't be easy if you haven't worked with regular expressions before.
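#include <stdio.h>
#include <stdlib.h>
int main(void)
{
/* ax+by+cz=d, -999 <= a,b,c,d <= 999 */
int a, b, c, d, i, j;
char A[5], B[5], C[5], D[5], str[22];
char *pntr;
printf("Enter equation: ");
if (fgets(str, sizeof str, stdin) == NULL) return 1;
pntr = A; /* start by collecting the characters of the first coefficient */
i = j = 0;
while(1){
/* a variable letter ends the current number and switches to the next buffer */
if(str[i] == 'x'){pntr[j] = '\0'; pntr = B; j = 0; i++; continue;}
else if(str[i] == 'y'){pntr[j] = '\0'; pntr = C; j = 0; i++; continue;}
/* after z, skip the '=' as well */
else if(str[i] == 'z'){pntr[j] = '\0'; pntr = D; j = 0; i += 2; continue;}
else if(str[i] == '\n' || str[i] == '\0'){pntr[j] = '\0'; break;}
pntr[j] = str[i];
i++;
j++;
}
/* convert each collected string to an integer */
a = atoi(A);
b = atoi(B);
c = atoi(C);
d = atoi(D);
printf("%d %d %d %d \n", a, b, c, d);
return 0;
}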
valter

The use of tolower and storing in an array

I am trying to trace through this program and cannot figure out how the star goes through the while loop and is stored in the array. Is * stored as 8 because of tolower? If anyone could please walk through from the first for loop to the second, I would be eternally grateful.
#include <stdio.h>
#include <ctype.h>
int main()
{
int index, freq[26], c, stars, maxfreq;
for(index=0; index<26; index++)
freq[index] = 0;
while ( (c = getchar()) != '7')
{
if (isalpha(c))
freq[tolower(c)-'a']++;
printf("%d", &freq[7]);
}
maxfreq = freq [25];
for (index = 24; index >= 0; index--)
{
if (freq[index] > maxfreq)
maxfreq = freq[index];
}
printf ("a b c d e f\n");
for (index = 0; index < 5; index++)
{
for (stars = 0; stars < (maxfreq - freq[index]); stars ++)
printf(" ");
for (stars = 0; stars < (freq[index]); stars++)
printf("*");
printf("%c \n", ('A' + index) );
printf(" \n");
}
return 0;
}
It seems that this code is a histogram of sorts that prints how many times a given character has been entered into the console before it reaches the character '7'.
The following code:
for(index=0; index<26; index++)
freq[index] = 0;
Is simply setting all of the values of the array to 0. This is because of the fact that in C, variables that are declared in block scope (that is, inside a function) and that are not static do not have a specific default value and as such simply contain the garbage that was in that memory before the variable was declared. This would obviously affect the results that are displayed each time it is run, or when it is run elsewhere, which is not what you want I'm sure.
while ( (c = getchar()) != '7')
{
if (isalpha(c))
freq[tolower(c)-'a']++;
printf("%d", &freq[7]);
}
This next section uses a while loop to keep accepting input with getchar() (which gets the next character of input from stdin) until the character '7' is reached. This works because an assignment such as c = getchar() has a value, which can then be compared using != '7'. We therefore keep looping until the character read from stdin is equal to '7', after which the while loop ends.
Inside the loop itself, the value that has been read is checked with isalpha(), which returns true if the character is an alphabetic letter. By calling tolower() and subtracting the character value of 'a' from the result, we find the numeric position of the character in the alphabet. Take the letter 'F' as an example. In ASCII, capital 'F' is stored as the value 70. tolower() checks whether it is an uppercase character and, if so, returns the lowercase version (in this case, 'f' == 102). Subtracting 'a' (stored as 97) gives 5, which, counting from 0, is the position of 'F' in the alphabet. This is then used to select that element of the array and increment it, which records that another 'F' or 'f' has been entered. A character such as '*' is not a letter, so isalpha() is false for it and it is never stored in the array.
maxfreq = freq [25];
for (index = 24; index >= 0; index--)
{
if (freq[index] > maxfreq)
maxfreq = freq[index];
}
This next section sets the variable "maxfreq" to the last value (how many times 'Z' was found), and iterates downwards, changing the value of maxfreq to the highest value that is found (that is, the largest number of any given character that is found in the array). This is later used to format the output to make sure that the letters line up correctly and the number of stars and spaces are correct.

how the following for loop function differently

#include<stdio.h>
void main()
{
int a,b,c;
for(b = c = 10; a = "- FIGURE?, UMKC,XYZHello Folks,TFy!QJu ROo TNn(ROo)SLq SLq ULo+UHs UJq TNn*RPn/QPbEWS_JSWQAIJO^NBELPeHBFHT}TnALVlBLOFAkHFOuFETpHCStHAUFAgcEAelclcn^r^r\\tZvYxXyT|S~Pn SPm SOn TNn ULo0ULo#ULo-WHq!WFs XDt!"[b+++21];)
{
for(;a-->64;)
{
putchar((++c == 'Z') ? (c = c/9) : (33^b&1));
}
}
getch();
}
The above C program prints a map of India. The outer for loop fills two of its slots and leaves the third one empty. I understand how the program works, but my doubt is that the condition slot of the outer for loop is used as an assignment. Syntactically and logically this looks wrong, but it works: depending on the array index, the ASCII code of the corresponding character is assigned to the variable a.
How does this work?
The condition slot of the outer for loop assigns one of the characters of the string literal to a, incrementing b at the same time. Because an assignment expression has the assigned value as its result, the condition of the outer for loop is the value of that character. Because strings are '\0'-terminated in C, the condition is true until the index b++ + 21 reaches the end of the string; at that point the terminating character of the string is assigned, and since it is equal to 0, the condition evaluates as false.
In fact, this is an obfuscated and more complex version of a common C idiom for iterating a string, which looks like this:
char *string = "my string";
int i;
for (i = 0; string[i]; ++i)
/* do something with string[i] */
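which can be rewritten with the increment moved into the condition (note that by the time the body runs, i has already been incremented):
int i = 0;
for (; string[i++]; )
/* do something with string[i - 1] */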
Moreover, the current character can be extracted to a separate char variable c:
int i = 0;
char c;
for (; c = string[i++]; )
/* do something with c */
A while loop can be used instead as well:
while (c = string[i++])
/* do something with c */

C loop won't exit

Good-morning one and all!
This is going to end up being one of those blindingly-easy questions in hindsight, but for the life of me I'm stumped. I'm going through some of the exercises in The C Programming Language, and I've managed to write some code to initialize an array. After some Googling, I found better ways of initializing an array to 0, but I don't understand why the loop that I wrote to do it doesn't finish. I've used the debugger to find out that it's because the 'c' variable never reaches 50; it gets to 49 and then rolls over to 0, but I can't figure out why it's rolling over. The code is attached below; does anyone know what's going on here?
#include <stdio.h>
#define IN 1
#define OUT 0
/* Write a program to print a histogram of the lengths of words in
its input. */
main()
{
int c=0;
int histogram[50]={0};
int current_length=0;
int state=OUT;
//Here we borrow C so we don't have to use i
printf("Initializing...\n");
while(c<51){
histogram[c] =0;
c=c+1;
}
c=0;
printf("Done\n");
while( (c=getchar()) != EOF){
if( (c==32 || c==10) && state==IN ){
//End of word
state=OUT;
histogram[current_length++];
}else if( (c>=33 && c<=126) && state==OUT ){
//Start of word
state=IN;
current_length=0;
}else if( (c>=33 && c<=126) && state==IN ){
//In a word
current_length++;
} else {
//Not in a word
//Example, " " or " \n "
;
}
}
//Print the histogram
//Recycle current_length to hold the length of the longest word
//Find longest word
for( c=0; c<50; c++){
if( c>histogram[c] )
current_length=histogram[c];
}
for( c=current_length; c>=0; c--){
for( state=0; state<=50; state++){
if( histogram[c]>=current_length )
printf("_");
else
printf(" ");
}
}
}
It's because histogram[c] = 0 writes past the end of the histogram array when c is 50. On your system the write to histogram[50] happens to land on c and sets it to 0.
This happens because arrays start from 0 in C, so the last valid index in a 50-element array is 49.
Technically, while this is interesting and exploitable, you can't rely on it. It's a manifestation of undefined behavior. The memory could easily have another layout, causing things to "just work" or do something funnier.
histogram has 50 elements: from index 0 to index 49.
You attempt to write to index 50, and all bets are off.
Use
while (c < 50)
or, to avoid magic constants,
while (c < sizeof histogram / sizeof *histogram)
You are accessing elements 0 to 50 in histogram, which only contains elements 0 to 49 (C/C++ use zero-indexing, so the maximum element of an array will always be size-1).
To avoid errors like this, you could define the histogram size as a constant, and use that for all operations relating to the histogram array:
#define HISTOGRAM_SIZE 50
Or (only works for C99 or C++, see below comment):
const int HISTOGRAM_SIZE = 50;
Then:
int histogram[HISTOGRAM_SIZE];
And:
while(c<HISTOGRAM_SIZE)
'#define' is a C preprocessor directive, and will be processed before compilation. To the compiler, it will just look as if you've written 50 everywhere HISTOGRAM_SIZE is used, so you won't get any extra overhead.
'const int' gives you a similar solution, which in many cases will give the same result as with the define (I'm not 100% certain under which circumstances though, others are free to elaborate), but will also give you the added bonus of type-checking.

Resources