Looking for a solution to exercise 1-9 from the K&R book (copy input to output, replacing each run of multiple spaces with a single space), I found this code on this site.
#include <stdio.h>
int main(void)
{
    int ch, lch;
    for (lch = 0; (ch = getchar()) != EOF; lch = ch)
    {
        if (ch == ' ' && lch == ' ')
            ;
        else
            putchar(ch);
    }
    return 0;
}
The program works, but how it works is not clear to me:
what is the variable lch for? Why does the program stop giving the correct output if I leave lch out of the third expression of the for loop and out of the if statement?
You need to replace several spaces with one space. So if the previously read character was a space and the current character is also a space, then you need to skip the current character.
So lch stores the value of the previously read character. Initially, when there has not yet been any input, lch is set to 0. Then at the end of each iteration lch is set to the current character, so that this if statement
if (ch == ' ' && lch == ' ')
can check whether the current character and the previous character are both spaces. If so, the program outputs nothing.
lch holds the old character: ch gets the result of getchar(), the loop body runs, and when it has finished, the value of ch is copied into lch.
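The previous-character idea described above can be sketched on a string instead of stdin, which makes it easy to check. This is only an illustration: the function name squeeze_prev and the string-based interface are assumptions, not part of the original program.

```c
#include <stddef.h>

/* Hypothetical helper: copies in to out, collapsing each run of spaces
   into a single space. out must have room for strlen(in) + 1 bytes.
   Returns the number of characters written (without the '\0'). */
size_t squeeze_prev(const char *in, char *out)
{
    size_t n = 0;
    int lch = 0;                  /* previous character; 0 = nothing read yet */

    for (; *in != '\0'; lch = *in, ++in) {
        if (*in == ' ' && lch == ' ')
            continue;             /* second or later space of a run: skip it */
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}
```

If the update lch = *in is dropped from the third expression, lch stays 0, the test never succeeds, and every space is copied unchanged.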
I'm doing Exercise 1-9 in the K&R book. While trying to find solutions I came across this code:
#include <stdio.h>
int main(void)
{
    int c;
    while ((c = getchar()) != EOF) {
        if (c == ' ') {
            while ((c = getchar()) == ' ');
            putchar(' ');
            if (c == EOF) break;
        }
        putchar(c);
    }
    return 0;
}
Why does the program still work when I input a letter, given the first if statement? From my understanding, its body will only execute if the character I input is a blank space.
By the way, the exercise is to write a program that replaces multiple consecutive blank spaces with a single one.
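This program prints every entered character other than the blank character ' ' unchanged, using the putchar(c) that follows the if statement, until the input ends. That is why a letter is still printed even though the body of the if statement is skipped for it.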
In this while loop
while ((c = getchar()) == ' ');
each blank character is read but not output. After the loop, only one blank character is output:
putchar(' ');
That is, the program collapses adjacent blank characters, leaving only one blank character in the output for each run of blanks entered by the user.
I've formatted and commented the code. Hope that'll help. The code converts every
run of spaces in stdin into one space:
"123   456  a  b   c  " -> "123 456 a b c "
Code:
int main(void) {
    int c;
    /* we read stdin character by character */
    while ((c = getchar()) != EOF) {
        /* if we have read a space */
        if (c == ' ') {
            /* we skip ALL the spaces that follow */
            while ((c = getchar()) == ' ')
                ; /* skipping ALL spaces: we do nothing */
            /* and then we print just ONE space instead of the many skipped */
            putchar(' ');
            /* if we are at the end of stdin, we have nothing more to print */
            if (c == EOF)
                break;
        }
        /* we print every non-space character */
        putchar(c);
    }
}
It loops as long as the input stream doesn't end (EOF = End Of File).
If the entered character is a space, it will ignore any follow-up spaces and only print one space afterwards.
Otherwise it will output the entered character.
int main()
{
    int c;
    while ((c = getchar()) != EOF) {
For each character of input...
        if (c == ' ') {
...if the character is a space...
            while ((c = getchar()) == ' ');
...this loop skips all the spaces that come after the first. Note the semicolon at the end of the line: the loop reads a character, checks that it is a space, and does nothing with it. Writing it this way is tricky, because it is easy to assume that the loop body is the next statement below, when in fact the loop has an empty body. After a while loop you can assume that the condition that let you enter the loop is false, so we have a true assertion: c certainly does not hold a space (it can still be EOF, which is not a character, so we need to test for it before printing, and we do that after the next statement).
            putchar(' ');
...after the loop, a single space is output, corresponding to the one tested in the first if statement you mentioned in your question. Remember that c is not a space character at this point (so we cannot use putchar(c); to print the space), because we skipped spaces until none remained. It can still be the EOF indicator, which is checked next.
            if (c == EOF) break;
...c is not a space here, but it may be the EOF indicator. In that case we need to get out of the loop so that we don't print it in the next statement...
        }
        putchar(c);
...as we have been reading characters into c until we got a non-space (and not the EOF indicator, since we left the loop above in that case), we need to print that character anyway. This putchar(c); statement always prints a non-space character. It is outside the if statement because it must run both for characters that were not spaces to begin with and for the character that follows a sequence of spaces.
    }
...As above, we can assume the test condition of the loop is false, so here we know that c holds EOF (but only because the break statement inside the loop also happens when c == EOF).
}
Et voila!!!
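The empty-body loop from the walkthrough can be written with explicit braces so that nobody mistakes the next statement for its body. A minimal sketch, assuming a made-up helper next_char() that plays the role of getchar() on a string so the behaviour can be checked without a terminal:

```c
#include <stdio.h>   /* for EOF */

/* Hypothetical stand-in for getchar(): returns the next character of
   *src and advances past it, or EOF at the end of the string. */
static int next_char(const char **src)
{
    if (**src == '\0')
        return EOF;
    return (unsigned char)*(*src)++;
}

/* Skips a run of spaces and returns the first character after it,
   which may be EOF. */
int skip_spaces(const char **src)
{
    int c;

    while ((c = next_char(src)) == ' ') {
        /* empty on purpose: all the work happens in the condition */
    }
    return c;   /* here c is guaranteed not to be a space */
}
```

Note that the character returned has already been consumed from the input, which is why the original program has to print it with putchar(c) afterwards.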
Note
You wrote: "while trying to find solutions I came across this code".
A final note: it is better for you to post your own attempt at programming a solution than to find ready-made solutions by searching. You'll learn more about programming and less about googling, which IMHO is not your primary interest.
Here is the exercise:
Write a program that prints its input one word per line.
My solution to this exercise is the following:
#include <stdio.h>
int main(void) {
    int c;
    while ((c = getchar()) != EOF) {
        if (c == ' ' || c == '\t')
            putchar('\n');
        else
            putchar(c);
    }
    return 0;
}
According to this link
it is a bad solution, but I'm not sure I understand why.
I would appreciate some help understanding this.
The problem appears when your input contains more than one newline, tab or space in a row.
Then the program starts a new line for each of them, although it shouldn't.
The requirement to "print one word per line" is then not fulfilled.
You need to keep track of whether the newline, tab or space comes after a sequence of non-whitespace characters or not. So we need a "STATE" variable which records the current state.
Chrismath's solution covers that:
// print input one word per line
#include <stdio.h>
#define IN 1
#define OUT 0
int main (void)
{
    int c, state;
    // start without a word
    state = OUT;
    while ((c = getchar()) != EOF) {
        // if the char is not blank, tab, newline
        if (c != ' ' && c != '\t' && c != '\n') {
            // inside a word
            state = IN;
            putchar(c);
        }
        // otherwise char is blank, tab, newline: the word has ended
        else if (state == IN) {
            state = OUT;
            putchar('\n');
        }
    }
    return 0;
}
The newline is only printed when state is IN, which means that a word of at least one character was printed on the line before moving on to the next one.
Someone could argue that a single character is not a word, but then we would need an explicit requirement for the minimum number of characters in a word. The task does not give one, so a one-character word is plausible and legitimate.
After comparing a few answers: the difference between the solution I wrote above and the correct solution in the link in the question is that the correct one does not create a newline for every blank, tab or newline character. It checks whether it has already output a newline for the current run of whitespace and does not output another one if more whitespace follows, which satisfies the "one word per line" requirement.
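To see the difference side by side, here is a small sketch that applies both approaches to a string rather than to stdin. The function names words_naive and words_state and the string interface are assumptions made for illustration; the logic follows the two programs quoted above.

```c
#include <stddef.h>

/* Naive approach: every blank or tab becomes a newline.
   out must have room for strlen(in) + 1 bytes. */
size_t words_naive(const char *in, char *out)
{
    size_t n = 0;

    for (; *in != '\0'; ++in)
        out[n++] = (*in == ' ' || *in == '\t') ? '\n' : *in;
    out[n] = '\0';
    return n;
}

/* State approach: a newline is written only when a word has just ended. */
size_t words_state(const char *in, char *out)
{
    size_t n = 0;
    int in_word = 0;              /* 1 while inside a word, 0 otherwise */

    for (; *in != '\0'; ++in) {
        if (*in != ' ' && *in != '\t' && *in != '\n') {
            in_word = 1;
            out[n++] = *in;
        } else if (in_word) {
            in_word = 0;
            out[n++] = '\n';
        }
    }
    out[n] = '\0';
    return n;
}
```

For the input "one  two\tthree" (two spaces after "one"), the naive version produces an empty line between "one" and "two", while the state version does not.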
The input of this function is s[], a char array, and lim, the maximum possible length of this array.
The function itself is used to determine the length of a char array that is entered in the console.
The question is, what's the main idea of the c != '\n' condition in the for loop?
I guess it's used to break out of the loop. That's quite clear. But I can't see how it can ever be triggered if I don't type \n in my input.
Is that a terminator at the end of an array, like \0?
If that's the case, why should we use the if (c == '\n') condition after that?
The code:
int getline(char s[], int lim)
{
    int c, i;
    for (i = 0; i < lim-1 && (c = getchar()) != EOF && c != '\n'; ++i)
        s[i] = c;
    if (c == '\n') {
        s[i] = c;
        ++i;
    }
    s[i] = '\0';
    return i;
}
When you press Enter, a \n newline character is added to the input buffer of stdin, so it is the last character in the buffer.
So it is only fitting that the loop continues until getchar() retrieves this last character.
The loop condition states that c must be:
Different from \n.
Different from EOF.
And the index i must be lower than lim - 1.
If one of these conditions is not met the loop ends, so it's possible that c is not equal to \n when it does.
After that, if c is equal to \n, it is added to s. The condition if (c == '\n') is there because, as stated, c might not be equal to \n, and in that case it shouldn't be added to s.
Lastly, s must be null terminated (s[i] = '\0') so it can be properly interpreted as a string, that is, a null-terminated char array. \n is not a null terminator, \0 is.
The \n is added only because the implementer wanted it to be; it's part of this implementation, and it would not have to be there.
It can be useful in some cases. For instance, the fgets library function behaves similarly: it stores the \n in the buffer and then null terminates it.
Well, you want to get a line. \0 terminates the string, but a string can consist of multiple lines. As the function's name indicates, you only want a single line. Lines are terminated with \n, and after that a new line begins.
The condition
c != '\n'
evaluates to true as long as the character read is not the newline character, and to false once a newline is encountered.
Since this is a getline() implementation, reaching a newline is a very good reason to stop reading from the input.
Please note how, after this check ends the loop, s[i] = '\0'; is executed in order to terminate the string one position after the newline character that was just stored.
So, how does getline() work?
The core of that function is the for-loop
for (i=0; i < lim-1 && (c=getchar())!=EOF && c!='\n'; ++i)
The condition i < lim-1 && (c=getchar())!=EOF && c!='\n' means: "Assign to c a character obtained with getchar() until...":
the read character is EOF
the read character is \n (newline)
the number of characters requested by the caller (lim) has been read
Because '\n' is only one of the conditions that can end the loop.
It can also be EOF, which the author does not want to save, or the character limit can be reached, in which case we do not want to write the nul terminator outside the bounds of the char array.
The nul termination is not related to any of those conditions and it always happens.
If the character read into c is a newline \n, then the for loop will stop, and the array will end with \n followed by \0.
And that's what the function is supposed to do, it's called getline(), and lines end with \n.
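The three ways out of the loop can be tried without a terminal. The following is a sketch, not the book's code: get_line_from and next_char are hypothetical names, and a string plays the role of stdin, with its end standing in for EOF.

```c
#include <stdio.h>   /* for EOF */

/* Hypothetical stand-in for getchar(): returns the next character of
   *src and advances past it, or EOF at the end of the string. */
static int next_char(const char **src)
{
    if (**src == '\0')
        return EOF;
    return (unsigned char)*(*src)++;
}

/* Same logic as the getline() above, reading from *src.
   lim must be at least 1. c starts as EOF so that the test after the
   loop is well defined even if the loop reads nothing. */
int get_line_from(const char **src, char s[], int lim)
{
    int c = EOF, i;

    for (i = 0; i < lim - 1 && (c = next_char(src)) != EOF && c != '\n'; ++i)
        s[i] = (char)c;
    if (c == '\n') {          /* the loop ended on a newline: keep it */
        s[i] = (char)c;
        ++i;
    }
    s[i] = '\0';              /* always terminate the string */
    return i;
}
```

With the source "ab\ncd", the first call stores "ab\n" and returns 3, the second stores "cd" and returns 2 because the input ended without a newline, and the third returns 0. With a limit of 3, only two characters of a longer line are stored before the terminator.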
The logic is simply this:
In the for loop we're saying that when the user hits Enter, stop reading; the loop stops because the condition c!='\n' becomes false, and the rest of the input is left unread.
The if statement does not resume reading. It only stores the newline that ended the loop in s, so that the caller can tell a complete line from one that was cut short by EOF or by the lim limit.
The rest of the input is read only when the caller calls getline() again, which picks up from where the previous call left off.
Shouldn't the number of input characters and the number of output characters be the same?
int ch;
while (ch != '\n')
{
    ch = getchar();
    putchar('K');
}
Others have mentioned that you're accessing an uninitialized variable, which causes undefined behavior. But even if you initialize it, the problem is that you're testing the variable ch before you read the character with getchar(). So the count will be off by 1.
Suppose you type 1\n. The first iteration will compare the initial value of ch to '\n'. They won't match, so then it executes
ch = getchar();
putchar('K');
That's 1 K printed. Then it compares the new value of ch with '\n'. Since ch == '1', they don't match, so it executes the loop body again. This reads the newline into ch, and prints a second K.
Then it repeats the loop. This time the test ch != '\n' fails, so the loop stops.
The number of K characters printed is the number of characters you typed including the newline.
One way to fix this is to call getchar() once before the loop.
int ch = getchar();
while (ch != '\n' && ch != EOF) {
    putchar('K');
    ch = getchar();
}
Notice that you need to declare ch as an int variable so you can properly compare it with EOF.
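Why the type matters can be shown with a small sketch. The function names are made up for illustration, and the comments assume a typical platform where unsigned char is narrower than int.

```c
#include <stdio.h>

/* Returns 1 if an EOF result still compares equal to EOF after being
   stored in an unsigned char, 0 otherwise. */
int eof_survives_unsigned_char(void)
{
    unsigned char ch = (unsigned char)EOF;  /* EOF is negative, so the value wraps */
    return ch == EOF;                       /* ch promotes to a non-negative int */
}

/* The same check with an int, the type getchar() actually returns. */
int eof_survives_int(void)
{
    int ch = EOF;
    return ch == EOF;
}
```

The first function returns 0 and the second returns 1, which is why a loop that stores getchar() in an unsigned character type can never see EOF.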
The variable ch is not initialized. So the initial condition in the while loop has undefined behavior because the variable ch has indeterminate value.
Change the code snippet in the following way:
int ch;
while ((ch = getchar()) != '\n' && ch != EOF)
{
    putchar('K');
}
Note that ch is declared with type int. Otherwise the comparison with EOF cannot work if the compiler treats the type char as unsigned char.
As for your question
Why output of number of 'K' are more than the input characters?
you first output a 'K' for the character just read and only after that check the character in the condition of the while statement. So the number of characters output is one greater than the number of characters input, if the newline character is not counted as input.
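The off-by-one can be checked by counting instead of printing. In this sketch a string stands in for the typed input; next_char and both function names are hypothetical, and the first function initializes ch and also stops at EOF so that it terminates on any input.

```c
#include <stdio.h>   /* for EOF */

/* Hypothetical stand-in for getchar(): returns the next character of
   *src and advances past it, or EOF at the end of the string. */
static int next_char(const char **src)
{
    if (**src == '\0')
        return EOF;
    return (unsigned char)*(*src)++;
}

/* Shape of the question's loop: test first, then read and "print". */
int count_k_test_first(const char *input)
{
    int ch = 0, k = 0;

    while (ch != '\n' && ch != EOF) {
        ch = next_char(&input);
        ++k;                      /* stands in for putchar('K') */
    }
    return k;
}

/* Shape of the fix: read inside the condition, then "print". */
int count_k_read_first(const char *input)
{
    int ch, k = 0;

    while ((ch = next_char(&input)) != '\n' && ch != EOF)
        ++k;                      /* stands in for putchar('K') */
    return k;
}
```

For the input "1\n" the first function counts 2 and the second counts 1, matching the walkthrough given earlier in this thread.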
The task is to make a C program that replaces multiple blanks with a single blank and I found this solution on another StackOverflow question:
int c;
while ((c = getchar()) != EOF) {
    if (c == ' ') {
        while ((c = getchar()) == ' ')
            ;
        putchar(' ');
        if (c == EOF) break;
    }
    putchar(c);
}
It works but I am puzzled by the second while loop:
while ((c = getchar()) == ' ')
;
How does this line even remove any white space? I thought it just does nothing until it comes across another non-blank character. So if a sentence had 4 blanks, I would expect it to turn into 5 blanks, because you're just adding another blank with putchar(' '). Does it remove the excess blank spaces in a way I'm not aware of?
while ((c = getchar()) == ' ')
;
This bit skips over spaces. After skipping all of them,
putchar(' ');
puts one space.
This replaces groups of spaces with one.
The program reads (using getchar) a string from standard input and then outputs (using putchar) the same string to standard output, but leaves out the excess spaces. So effectively it "removes" the extra spaces.
The while loop skips over a consecutive block of spaces in the input while outputting nothing, and then, after the loop is done, the program outputs a single space. That's how it "removes" the spaces.
getchar reads one character at a time. At each iteration the program reads a character and checks whether it is a space. If it is, the inner loop keeps reading until it finds a character that is not a space. After that loop ends, putchar(' '); is executed and prints a single space.
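The "4 blanks become 5" worry can be tested directly. Below is a sketch of the same loop applied to a string instead of stdin; squeeze_skip and next_char are hypothetical names, and the end of the string stands in for EOF. The skipped blanks are never written anywhere, so only the single blank added after the inner loop reaches the output.

```c
#include <stdio.h>    /* for EOF */
#include <stddef.h>

/* Hypothetical stand-in for getchar(): returns the next character of
   *src and advances past it, or EOF at the end of the string. */
static int next_char(const char **src)
{
    if (**src == '\0')
        return EOF;
    return (unsigned char)*(*src)++;
}

/* Copies in to out, replacing each run of spaces with one space.
   out must have room for strlen(in) + 1 bytes. */
size_t squeeze_skip(const char *in, char *out)
{
    size_t n = 0;
    int c;

    while ((c = next_char(&in)) != EOF) {
        if (c == ' ') {
            while ((c = next_char(&in)) == ' ')
                ;                 /* consume the rest of the run, write nothing */
            out[n++] = ' ';       /* one space for the whole run */
            if (c == EOF)
                break;
        }
        out[n++] = (char)c;       /* c is a non-space character here */
    }
    out[n] = '\0';
    return n;
}
```

Given "a    b" with four blanks, the result is "a b" with one blank: three characters in total.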