K&R C Exercise 1-18 no output/debugging issues - c

I wrote the code below in Code::Blocks as a response to K&R C exercise 1-18:
Write a program to remove trailing blanks and tabs from each line of
input, and to delete entirely blank lines.
I meant it to remove the blanks and tabs (I haven't tackled the blank line part yet). The while loop correctly saves input to the character array ip; however, the rest of the code doesn't seem to work, as sending EOF doesn't elicit any output at all.
#include <stdio.h>
#define MAXLINE 1000
main(){
    int c, z;
    char ip[MAXLINE];
    z = 0;
    while ((c = getchar()) != EOF) {
        ip[z] = c;
        ++z;
    }
    for (z = 0; ip[z] == ' ' || ip[z] == '\t'; ip[z] = ip[z + 1]);
    printf("%s", ip);
}
I was trying to use this issue as a way to learn the debugger, but after I add a breakpoint at line 14, open the watches window, and press the start arrow, nothing happens. No yellow arrow appears at my break point, my step options are greyed out, and neither my variable names nor their values show up in the watch window.
Advice on fixing my code is appreciated, but what I really want to know is what I'm doing or not doing that is preventing the debugger from helping me.

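If you aren't seeing any output, this is probably because your program is stuck in the for loop, for (z = 0; ip[z] ..., which happens if the string ip starts with two consecutive spaces and/or tabs: z is never incremented, so the condition keeps testing ip[0], which has just been overwritten with another blank. Note also that ip is never null-terminated, so printf("%s", ip) can read past the end of the stored input.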
The algorithm for removing a certain character is as follows:
Have two variables that index a position in the string, destination and source. The code will be inside an outer loop. The source index advances in an inner loop until it reaches a character that does not have to be removed (the null character also stops it). Then the character at the source index is assigned to the destination index. The code then checks whether the source index has reached the null character and breaks out of the outer loop if it has. Otherwise both indexes are incremented and the outer loop repeats.
In pseudo code:
remove = '\t'
string
source = 0
destin = 0
while
    while string at source equals remove and not-equals null character
        source++
    string at destin = string at source
    if string at source equals null character
        break
    source++
    destin++

Related

What index the i-th element of any array refer to in c?

I've stumbled upon an assignment in a piece of code, line[i] = '\0', which adds a null character to an array to explicitly mark it as a string. This raised a question for me: since the null character sits exactly at the end of any string, how do we know that storing \0 in the i-th element of line puts it at the last position? In my eyes, i could be any index into line, so does the i-th index of an array always refer to the last position, or what?
Code like this usually appears just after code that has used the same index variable i to construct the string.
For example:
char string[10];
int i = 0;
string[i++] = 'a';
string[i++] = 'b';
string[i++] = 'c';
string[i] = '\0';
Or, more realistically:
char line[100];
int i = 0;
int c;
while((c = getchar()) != EOF && c != '\n')
    line[i++] = c;
line[i] = '\0';
This second example reads one line of text from standard input and stores it in the line array as a proper, null-terminated string.
(In real code, of course, you also have to worry about the possibility of overflowing the array.)
To make things really clear, you can imagine writing code like this more explicitly, with a separate variable to hold the length of the string. For example:
i = 0;
while((c = getchar()) != EOF && c != '\n')
    line[i++] = c;
int length_of_string = i;
line[length_of_string] = '\0';
When you see that line
line[length_of_string] = '\0';
it makes it more obvious that the \0 terminator is being stored at a spot in the string that someone has actually determined to be the length of the string. But as you can see, since the variable length_of_string has just been set based on the value of i after the loop, it's perfectly equivalent to just write
line[i] = '\0';
There's sort of an academic-sounding term called loop invariant, but code like this ends up being a perfect example of what it means, and it's worth thinking about for a moment. A loop invariant is something you can say about a loop that's true at all times, for every trip through the loop, at the beginning or the end or in the middle of the loop. For the read-a-line loop I've just shown, the loop invariant is:
i always contains the number of characters that have been read into the string line.
Let's look at all of the ways this "loop invariant" is true. To make things very clear, I'm going to write the loop again, with some comments to make it clear what I mean by the "top" and "bottom" of the loop:
i = 0;
while((c = getchar()) != EOF && c != '\n') {
    /* top of loop */
    line[i++] = c;
    /* bottom of loop */
}
Before the loop runs, the string is empty, so i starts out as 0.
At the top of the loop, before the line[i++] = c step, i still has the value it did last time through the loop.
In the middle of the loop, the line line[i++] = c simultaneously stores the character c into the line array (and at the right spot!), and increments i.
At the bottom of the loop, after the line[i++] = c step, i contains the updated number of characters in the string.
After the loop (and this was your question), since i still contains the number of characters that have been read and stored into line, it's precisely the right index to use to null-terminate the string, with the line line[i] = '\0'.
The other thing that's worth paying attention to here is that the line in the middle of the loop, that simultaneously stores the next character into the line array, at the right spot, and increments i at the same time, is, once again:
line[i++] = c;
My question for you to think about is, what if I had instead written
line[++i] = c; /* WRONG */
It can be hard, at first, to really understand the difference between i++ and ++i, to understand why you would care, to understand why you might pick one over the other. This code here, I think, is an example that really makes the point.
(For extra credit, think about this: What if arrays in C were 1-based, instead of 0-based? What parts of the read-one-line loop would change, and is it still possible to maintain all facets of the loop invariant?)
If you have an already existing string and just want it terminated with \0 at the last+1 index, write a function to determine that position. E.g. check the char at the current position and check whether the next position contains a legitimate value. You can then go through the whole string to determine the last position, return a pointer to the last position+1, and set your terminator there. If you work with a variety of predefined strings, this would be the most scalable approach for me.

Why is the string loop is being ignored?

I am new to the programming world and I'm trying to make a string split by using a loop that should split all characters separately but it's being ignored and basically ends with showing the input of the user instead of using the loop to separate the individual letters/characters. Have I missed declaring something important in the loop?
for (i = 0; str[i] != '\0'; i++); <- there's a semicolon here, so your loop literally does nothing
Also note that str[i] != '\0' is only a safe way of iterating over your string when the string is actually null-terminated. If it doesn't contain a terminating null character, C will happily continue reading memory beyond the end of the array.
There are a few syntactical errors in what you posted.
/* This should be <stdio.h> */
...
/* Don't need a semi-colon here */
int main();
...
/* Calling getchar() will cause you to lose the first character of
   the input */
getchar();
...
/* Don't need a semi-colon here */
for (i = 0; str[i] != '\0'; i++);
...
system("pause");
}
With those adjustments you should find the code works.
Write text:
Hello world
Input: Hello world
H
e
l
l
o
w
o
r
l
d
sh: pause: command not found
I'm not on windows, so if the code on your end does not seem to work after making adjustments it is probably windows specific.
If you're using java, you can just split with an empty string and then loop through the list that's created from the split method.

How does this else-if loop track number of words entered in C

I'm learning C using the K&R book, on a windows machine. I am trying out the program(bare bones Unix word count) which counts lines, characters, and words. Although this program correctly counts the number of characters, the no. of lines and words in my output are always 0 and 1, irrespective of what I enter. I also am somewhat stumped by one part of the program, which I'll get to next-
#include<stdio.h>
#define IN 1
#define OUT 0
int main()
{
    int c,state, nc,nw,nl;
    nl=nw=nc=0;
    state=OUT;
    while(c=getchar()!=EOF)
    {
        ++nc;
        if(c=='\n')
            ++nl;
        if(c=='\n'||c=='\t'||c==' ')
            state=OUT;
        else if(state==OUT)
        {
            state=IN;
            ++nw;
        }
    }
    printf("\n No. of characters, lines and words are : %d,%d,%d\n",nc,nl,nw);
    return 0;
}
From what it looks, this program is using nc, nl and nw, respectively, to count the number of characters, lines and words entered in the input stream. My understanding of the program logic, thus far, is -
IN and OUT are two variables used to indicate the current state of the program. IN indicates that the program is currently 'inside' a word- in other words- no space, newline or tab has been encountered so far in the characters entered. Or so I think.
At the very beginning, before the while loop, the STATE is set to out. This indicates that right now, there is no word encountered.
When the while loop begins, with every character entered(unless it is EOF-Ctrl+Z), the number of character nc is incremented. In the first if statement, if the character is a newline '\n', nl is incremented. This should keep track of the number of lines encountered.
The second if statement is used to keep track of whether the program is currently inside a word or not, by setting the STATE to 0, whenever there is a blank, newline or tab. I've understood the logic thus far.
However, I am completely stumped coming to the else-if. Here, the program checks if STATE is OUT. Now, STATE will be out in two conditions:
when the program runs for the first time, and STATE is set to 0 before the while loop. Example- Consider the input WORD. Here, before W is encountered, STATE is set to 0.
Now that STATE is 0, and input is W we come to the else if statement. The next input after W is O. So, STATE is set to 1(indicating the program is inside a word), and the word count is incremented.
But, since the original input was WORD, what happens when R is encountered? What is the value of STATE now? Is it still 1 because it was set to 1 inside the last else-if statement? But then again, if that is 1, there is no condition for when STATE is 1.
Lastly, it's obvious that the program is flawed in some way, because in my sample output below, the number of lines and words are always fixed(0 and 1).
hello word
good morning
^Z
No. of characters, lines and words are : 24,0,1
I understand that my question is very long, but I'm really stumped and looking for answers to two major points:
How does the else-if statement logic work.
Why is the program throwing incorrect output.
Many thanks for your help
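You are getting wrong output because you are missing parentheses:
while((c=getchar())!=EOF)
      ^           ^
Without them you always compare the return value of getchar() with EOF and assign the result of this comparison to c. That is, c will always be either 1 or 0. Since c is then never '\n', ' ' or '\t', no line is ever counted and state switches to IN exactly once, which is why you always see 0 lines and 1 word.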
How does the else-if statement logic work.
The if statement checks whether the character is a newline, a space or a tab, any of which ends a word; if so, it sets the state variable to OUT.
On the next trip through the loop, if c is not a newline, tab or space and state is still OUT, the else if branch runs.
The else if increments nw, because the first non-blank character after a space, tab or newline starts a new word. It also sets state back to IN, so the characters that follow in the same word match neither branch and nw is not incremented again.
EXAMPLE:
"WORD" => "W" -> nc++ nw++ state=IN, "O" -> nc++ state=IN, "R" -> nc++ state=IN, "D" -> nc++ state=IN
"WO RD" => "W" -> nc++ nw++ state=IN, "O" -> nc++ state=IN, " " -> nc++ state=OUT, "R" -> nc++ nw++ state=IN, "D" -> nc++ state=IN
And if you want to understand it easily, add this just after the while statement:
while((c=getchar())!=EOF)
{
    printf("number of chars = %d, number of words = %d, number of lines = %d, state = %d\n",nc,nw,nl,state);
So you'll see what the code does on each trip through the loop.
Here is a very basic walk-through the fixed code. I hope it will answer all of the original questions.
The only other suggestion is to enable and check compiler warning messages, as they often have clues about potential sources of errors. In fact, both gcc and clang will warn about the original program and suggest the correct fix.
Include the standard (std) Input/Output header files
#include <stdio.h>
Use the preprocessor to define two (constant) macros, which are used to represent the state of either being IN-side a word, or OUT-side a word. In this program, being "outside" means that the current character (c) is a white space.
White space being a character that does not display anything, but may modify the output, such as moving to the next character location (space), to the next tab stop (tab), or advancing to the next line (newline).
#define IN 1
#define OUT 0
Being a simple program, the program is located in the main function. That is okay for a short program like this one, but not a good idea in larger, more complex programs.
int main(int argc, char* argv[])
{
int c; /* This is a 'current' character being read from input */
int state; /* The state of being either IN- or OUT-side of a word. */
int nc; /* Count of number of characters read */
int nw; /* Count of number of "words" */
int nl; /* Line count */
nl = nw = nc = 0; /* Initialize the counts to zero */
state = OUT; /* Begin with the word 'state' being OUT-side of a word */
Get a single character from standard input (stdin) and assign it to the variable c. This is done first because of the (added) parentheses enclosing the expression c = getchar(). Then the result of this assignment (which is equal to c) is compared to EOF (end of file).
While the contents of c are not equal to EOF, the while loop's body executes repeatedly, until getchar() does assign an EOF to c.
while ( EOF != (c = getchar()) )
{
Since you have a new character, increment the character count variable, nc, by one.
++nc;
If c is a newline, increment the number of lines, nl, count.
if (c == '\n')
++nl;
If the variable c is a newline, tab, or space, then set the state variable to OUT, because they indicate that c is not part of a "word."
if (c == '\n' || c== '\t' || c == ' ') {
state = OUT;
}
If the previous if statement did not evaluate to true, follow the else statement.
The else statement consists of a second if statement which evaluates whether state is equal to OUT. If so, then execute the next block.
else if (state == OUT)
{
This block contains the two statements, set state to IN, and increment the value of nw (word count).
state = IN;
++nw;
} /* end of "else if" block */
} /* end of while loop block */
After getchar() returns an EOF (end of file), and the while loop ends, the program prints this summary output before returning zero to the parent process (don't worry about that here, it's just house-keeping) and ending the program.
printf("\n No. of characters, lines and words are : %d, %d, %d\n", nc, nl, nw);
return 0;
} /* end of main */

Infinite loop when getting a line with getchar()

I was trying an exercise from K&R (ex 1-17), and I came up with my own solution.
The problem is that my program appears to hang, perhaps in an infinite loop. I omitted the NUL ('\0') character insertion as I find C generally automatically attaches it to the end of a string (Doesn't it?).
Can somebody please help me find out what's wrong?
I'm using the GCC compiler with Cygwin on win8(x64), if that helps..
Question - Print all input lines that are longer than 80 characters
#include<stdio.h>
#define MINLEN 80
#define MAXLEN 1000
/* getlin : inputs the string and returns its length */
int getlin(char line[])
{
    int c,index;
    for(index = 0 ; (c != '\n') && ((c = getchar()) != EOF) && (index < MAXLEN) ; index++)
        line[index] = c;
    return (index); // Returns length of the input string
}
main()
{
    int len;
    char chArr[MAXLEN];
    while((len = getlin(chArr))>0)
    {
        /* A printf here,(which I had originally inserted for debugging purposes) Miraculously solves the problem!!*/
        if(len>=MINLEN)
            printf("\n%s",chArr);
    }
    return 0;
}
And I omitted the null('\0') character insertion as I find C generally automatically attaches it to the end of a string (Doesn't it?).
No, it doesn't. You're using getchar() to read input characters one at a time. If you put the chars in an array yourself, you'll have to terminate it yourself.
The C functions that return a string will generally terminate it, but that's not what you're doing here.
Your input loop is a little weird. The logical AND operator only evaluates the right-hand side if the left-hand side evaluates to true (it's called "short-circuiting"), so in your version c is tested against '\n' before it has ever been assigned. Rearranging the order of the tests in the loop should help.
for(index = 0 ; (index < MAXLEN) && ((c = getchar()) != EOF) && (c != '\n'); index++)
    line[index] = c;
This way, c receives a value from getchar() before you perform tests on its contents.
I'm not positive about what's wrong, but you don't provide the input to the program so I'm guessing.
My guess is that in getlin your variable c gets set to '\n' and at that point it never gets another character. It just keeps returning and looping.
You never SET c to anything inside your getlin function before you test it, is the problem.
C does not insert a NUL terminator at the end of strings automatically. Some functions might do so (e.g. snprintf). Consult your documentation. Additionally, take care to initialize all your variables, like c in getlin().
1) C doesn't add a final \0 to your string. You are responsible for using an array of at least 81 chars and putting the final \0 after the last character you write in it.
2) You're testing the value of c before reading it
3) Your program doesn't print anything because printf uses a buffer for I/O which is flushed when you send \n. Modify this statement to print a final \n:
printf("\n%s",chArr);
to become:
printf("%s\n",chArr);
4) To send an EOF to your program you should press Ctrl+D under Unix; in the Windows console the equivalent is Ctrl+Z followed by Enter. Not sending it may be the reason why the program never ends.

K&R Chapter 1 - Exercise 22 solution, what do you think?

I'm learning C from the k&r as a first language, and I just wanted to ask, if you thought this exercise was being solved the right way, I'm aware that it's probably not as complete as you'd like, but I wanted views, so I'd know I'm learning C right.
Thanks
/* Exercise 1-22. Write a program to "fold" long input lines into two or
* more shorter lines, after the last non-blank character that occurs
* before the n-th column of input. Make sure your program does something
* intelligent with very long lines, and if there are no blanks or tabs
* before the specified column.
*
* ~svr
*
* [NOTE: Unfinished, but functional in a generic capacity]
* Todo:
* Handling of spaceless lines
* Handling of lines consisting entirely of whitespace
*/
#include <stdio.h>
#define FOLD 25
#define MAX 200
#define NEWLINE '\n'
#define BLANK ' '
#define DELIM 5
#define TAB '\t'
int
main(void)
{
    int line = 0,
        space = 0,
        newls = 0,
        i = 0,
        c = 0,
        j = 0;
    char array[MAX] = {0};
    while((c = getchar()) != EOF) {
        ++line;
        if(c == NEWLINE)
            ++newls;
        if((FOLD - line) < DELIM) {
            if(c == BLANK) {
                if(newls > 0) {
                    c = BLANK;
                    newls = 0;
                }
                else
                    c = NEWLINE;
                line = 0;
            }
        }
        array[i++] = c;
    }
    for(line = 0; line < i; line++) {
        if(array[0] == NEWLINE)
            ;
        else
            printf("%c", array[line]);
    }
    return 0;
}
I'm sure you're on the right track, but here are some pointers for readability:
comment your stuff
name the variables properly, and at least give a description if you refuse
be consistent: some of your single-line ifs use braces and some don't (IMHO, always use {} so it's more readable)
the if statement in the last for loop can be better, like
if(array[0] != NEWLINE)
{
    printf("%c", array[line]);
}
That's no good IMHO.
First, it doesn't do what you were asked for. You were supposed to find the last blank after a nonblank before the output line boundary. Your program doesn't even remotely try to do it; it seems to strive for finding the first blank after (margin - 5) characters (where did the 5 come from? what if all the words had 9 letters?). However, it doesn't do that either, because of your manipulation of the newls variable. Also, this:
for(line = 0; line < i; line++) {
    if(array[0] == NEWLINE)
        ;
    else
        printf("%c", array[line]);
}
is probably wrong, because you check for a condition that never changes throughout the loop.
And, last but not least, storing the whole file in a fixed-size buffer is not good, because of two reasons:
the buffer is bound to overflow on large files
even if it would never overflow, people still wouldn't like you for storing eg. a gigabyte file in memory just to cut it into 25-character chunks
I think you should start again, rethink your algorithm (incl. corner cases), and only after that, start coding. I suggest you:
process the file line-by-line (meaning output lines)
store the line in a buffer big enough to hold the largest output line
search for the character you'll break at in the buffer
then print it (hint: you can terminate the string with '\0' and print with printf("%s", ...)), copy what you didn't print to the start of the buffer, proceed from that
An obvious problem is that you statically allocate 'array' and never check the index limits while accessing it. Buffer overflow waiting to happen. In fact, you never reset the i variable within the first loop, so I'm kinda confused about how the program is supposed to work. It seems that you're storing the complete input in memory before printing it word-wrapped?
So, suggestions: merge the two loops together and print the output for each line that you have completed. Then you can re-use the array for the next line.
Oh, and better variable names and some comments. I have no idea what 'DELIM' is supposed to do.
It looks (without testing) like it could work, but it seems kind of complicated.
Here's some pseudocode for my first thought
const int MAXLINE = ??      -- maximum line length parameter
int chrIdx = 0              -- index of the current character being considered
int cand = -1               -- "candidate index", set to a potential break character
char linebuf[bufsiz]
int lineIdx = 0             -- index into the output line
char buffer[bufsiz]         -- a character buffer
read input into buffer
for ix = 0 to bufsiz - 1
do
    if buffer[ix] == ' ' then
        cand = lineIdx      -- remember where the blank lands in linebuf
    fi
    linebuf[lineIdx] = buffer[ix]
    lineIdx += 1
    if lineIdx >= MAXLINE then
        linebuf[cand] = '\0'    -- end the string
        print linebuf
        do something to move remnants to front of line (memmove?)
    fi
od
It's late and I just had a belt, so there may be flaws, but it shows the general idea — load a buffer, and copy the contents of the buffer to a line buffer, keeping track of the possible break points. When you get close to the end, use the breakpoint.
