The code below is my answer to exercise 1-13 in K&R The C Programming Language, which asks for a histogram for the length of words in its input. My question is regarding EOF. How exactly can I break out of the while loop without ending the program entirely? I have used Ctrl-Z which I have heard is EOF on Windows, but this ends the program, instead of just breaking the while loop. How can I get to the for loop after the while loop without ending the file? This is a general question, not just with my code below but for all the code in K&R that uses: while ((c = getchar()) != EOF). Thanks in advance!
#include <stdio.h>
#define MAXLENGTH 20 /* Max length of a word */
#define IN 1 /* In a word */
#define OUT 0 /* Out of a word */
int main() {
int c, i, j, len = 0;
int lenWords[MAXLENGTH];
int state = OUT; /* plain int: bool needs <stdbool.h> in C */
for (i = 0; i < MAXLENGTH; ++i) {
lenWords[i] = 0;
}
c = getchar();
while (c != EOF) {
if (c == ' ' || c == '\n' || c == '\t') {
if (state == IN && len <= MAXLENGTH) {
lenWords[len - 1] += 1; /* a length 5 word is in subscript 4 */
}
len = 0;
state = OUT;
}
else {
state = IN;
len += 1;
}
c = getchar();
}
/* Generating a histogram using _ and | */
for (i = 0; i < MAXLENGTH; ++i) { /* Underscores write over one another; not so efficient */
for (j = 0; j < lenWords[i]; ++j) {
putchar('_');
}
putchar('\n');
for (j = 0; j < lenWords[i]; ++j) {
putchar('_');
}
putchar('|');
printf("Length: %d, Frequency: %d\n", i + 1, lenWords[i]);
}
return 0;
}
I think your question belongs on another network.
Answers here: Equivalent to ^D (in bash) for cmd.exe?
No. Ctrl+D on *nix generates an EOF, which various
shells interpret as running exit. The equivalent for EOF on Windows
is Ctrl+Z, but cmd.exe does not interpret this
specially when typed at the prompt.
Ctrl+D sends EOF to standard input and stops the read on *nix.
Have a look here.
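It depends on whether you are on *nix or Windows.
On Windows, you signal EOF with Ctrl+Z (normally typed at the start of a line and followed by Enter), whereas on *nix you signal it with Ctrl+D. In both cases EOF only makes getchar() return EOF, which ends the while loop. It does not terminate the program, so the code after the loop still runs.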
I'm partway through solving this exercise and have hit behavior that I cannot explain.
The code snippet is below. I know it is only a few steps from finished, but I think it's worth figuring out why the result looks like this!
#include <stdio.h>
#define MAXLINE 1000
int my_getline(char line[], int maxline);
int main(){
int len;
char line[MAXLINE];/* current input line */
int j;
while((len = my_getline(line, MAXLINE)) > 0 ){
for (j = 0 ; j <= len-1 && line[j] != ' ' && line[j] != '\t'; j++){
printf("%c", line[j]);
}
}
return 0;
}
int my_getline(char s[], int limit){
int c,i;
for (i = 0 ; i < limit -1 && (c = getchar()) != EOF && c != '\n'; i++)
s[i] = c;
if (c == '\n'){
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
It compiles successfully with cc (cc code.c), but the result is surprising.
It works for lines without tabs and blanks:
hello
hello
but it does not work for the line in the picture:
I typed hel[blank][blank]lo[blank]\n:
Could anyone help me a bit? many thanks!
The problem is that you read a full line and then try to process it. It's better to process the input character by character (most of the K&R exercises work this way). Instead of printing blanks and tabs as you read them, save them in a buffer, and print the buffer only when the next character you read is neither a blank nor a tab. If a newline arrives first, the buffered blanks were trailing, so they are discarded. The same idea handles newlines: keep the last character printed (blanks do not count, because trailing blanks are eliminated before a newline) and check whether it is a newline. In that case the newline you have just read belongs to an empty line and is not printed, so in a sequence of two or more newlines only the first is printed. This is a complete sample program that does this:
#include <stdio.h>
#include <stdlib.h>
#define F(_f) __FILE__":%d:%s: "_f, __LINE__, __func__
int main()
{
char buffer[1000];
int bs = 0;
int last_char = '\n', c;
unsigned long
eliminated_spntabs = 0,
eliminated_nl = 0;
while((c = getchar()) != EOF) {
switch(c) {
case '\t': case ' ':
if (bs >= sizeof buffer) {
/* full buffer, cannot fit more blanks/tabs */
fprintf(stderr,
"we can only hold upto %d blanks/tabs in"
" sequence\n", (int)sizeof buffer);
exit(1);
}
/* add to buffer */
buffer[bs++] = c;
break;
default: /* normal char */
/* print intermediate spaces, if any */
if (bs > 0) {
printf("%.*s", bs, buffer);
bs = 0;
}
/* and the read char */
putchar(c);
/* we only update last_char on nonspaces and
* \n's. */
last_char = c;
break;
case '\n':
/* eliminate the accumulated spaces */
if (bs > 0) {
eliminated_spntabs += bs;
/* this trace to stderr to indicate the number of
* spaces/tabs that have been eliminated.
* Erase it when you are happy with the code. */
fprintf(stderr, "<<%d>>", bs);
bs = 0;
}
if (last_char != '\n') {
putchar('\n');
} else {
eliminated_nl++;
}
last_char = '\n';
break;
} /* switch */
} /* while */
fprintf(stderr,
F("Eliminated tabs: %lu\n"),
eliminated_spntabs);
fprintf(stderr,
F("Eliminated newl: %lu\n"),
eliminated_nl);
return 0;
}
The program prints (on stderr, so as not to interfere with the normal output) the number of eliminated tabs/spaces, surrounded by << and >>. At the end it also prints the total number of eliminated tabs/spaces and the total number of eliminated empty lines. A line containing only spaces is considered a blank line, and so it is eliminated. If you don't want lines made up only of spaces to be eliminated (their spaces will be removed anyway, as they are trailing), just assign the spaces/tabs seen to the variable last_char.
In addition to the good answer by @LuisColorado, there are several ways you can look at your problem that may simplify things for you. Rather than using multiple conditionals to check for c == ' ', c == '\t' and c == '\n', include ctype.h and use isspace() to determine whether the current character is whitespace. It is a much clearer way to go.
Next, consider the return type. POSIX getline() returns ssize_t, a signed type, which allows it to return -1 on error. While ssize_t is a bit of an awkward type, you can do the same with long (or int64_t for a guaranteed exact width).
I am a bit unclear on what you are trying to accomplish, but you appear to want to read a line of input and ignore whitespace. (While POSIX getline() and fgets() both store the trailing '\n', it may be more advantageous to read (consume) the '\n' but not include it in the buffer filled by my_getline(); that is up to you.) From your example output above, it looks like you want both "hello" and "hel lo " to be read and stored as "hello".
If that is the case, then you can simplify your function as:
long my_getline (char *s, size_t limit)
{
int c = 0;
long n = 0;
while ((size_t)n + 1 < limit && (c = getchar()) != EOF && c != '\n') {
if (!isspace (c))
s[n++] = c;
}
s[n] = 0;
return n ? n : c == EOF ? -1 : 0;
}
The return statement is just the combination of two ternary clauses which will return the number of characters read, including 0 if the line was all whitespace, or it will return -1 if EOF is encountered before a character is read. (a ternary simply being a shorthand if ... else ... statement in the form test ? if_true : if_false)
Also note the choice made above for handling the '\n' was to read the '\n' but not include it in the buffer filled. You can change that by removing the && c != '\n' from the while() test and handling the newline inside the loop instead: when c == '\n', store it and break. The store has to be explicit, because the isspace() filter would otherwise skip the '\n' like any other whitespace.
Putting together a short example, you would have:
#include <stdio.h>
#include <ctype.h>
#define MAXC 1024
long my_getline (char *s, size_t limit)
{
int c = 0;
long n = 0;
while ((size_t)n + 1 < limit && (c = getchar()) != EOF && c != '\n') {
if (!isspace (c))
s[n++] = c;
}
s[n] = 0;
return n ? n : c == EOF ? -1 : 0;
}
int main (void) {
char str[MAXC];
long nchr = 0;
fputs ("enter line: ", stdout);
if ((nchr = my_getline (str, MAXC)) != -1)
printf ("%s (%ld chars)\n", str, nchr);
else
puts ("EOF before any valid input");
}
Example Use/Output
With your two input examples, "hello" and "hel lo ", you would have:
$ ./bin/my_getline
enter line: hello
hello (5 chars)
Or with included whitespace:
$ ./bin/my_getline
enter line: hel lo
hello (5 chars)
Testing the error condition by pressing Ctrl + d (or Ctrl + z on windows):
$ ./bin/my_getline
enter line: EOF before any valid input
There are many ways to put these pieces together, this is just one possible solution. Look things over and let me know if you have further questions.
I am reading the book The C Programming Language by Dennis M. Ritchie and
trying to solve this question:
Write a program to print a histogram of
the lengths of words in
its input. It is easy to draw the histogram with the bars horizontal; a vertical
orientation is more challenging.
I think my solution works, but the problem is that if I don't press EOF, the terminal won't show the
result. I know that the condition specifies that exactly, but I am
wondering whether there is any way to make the program terminate after
reading a single line? (Sorry if my explanation of the problem is a bit shallow. Feel free to ask more.)
#include <stdio.h>
int main ()
{
int digits[10];
int nc=0;
int c, i, j;
for (i = 0; i < 10; i++) /* digits has 10 elements: indices 0 to 9 */
digits[i] = 0;
//take input;
while ((c = getchar ()) != EOF) {
++nc;
if (c == ' ' || c=='\n') {
++digits[nc-1];
//is it also counting the space in nc? i think it is,so we should do nc-1
nc = 0;
}
}
for (i = 1; i <= 5; i++) {
printf("%d :", i);
for (j = 1; j <= digits[i]; j++) {
printf ("*");
}
printf ("\n");
}
// I think this is a problem with getchar()
//the program doesn't exit automatically
//need to find a way to do it
}
You could try something like
while ((c = getchar ()) != EOF && c != '\n') {
and then add a check after the while loop to account for the last word:
if (c == '\n' && nc > 0) {
++digits[nc-1];
nc = 0;
}
There is also another problem in your program. ++digits[nc-1]; is correct, but for the wrong reason. The - 1 is needed because array indices start at zero: an array of length 10 goes from 0 to 9, so a word of length n belongs in element n - 1 (there are no words of length zero). The actual problem is that you are also counting the blank or newline that ends a word as part of its length, so if two blanks follow a word of length 4, the program records a word of length 5 plus a word of length 1. To avoid this, count only non-whitespace characters and record a word only when its length is greater than zero, something like this:
while ((c = getchar ()) != EOF) {
if (c == ' ' || c == '\n' || c == '\t') {
if (nc > 0 && nc <= 10) {
++digits[nc-1]; // arrays start at zero
}
nc = 0; // whitespace never counts toward a word
}
else {
++nc;
}
}
This is a C program that should accept the terminal's input and return the longest line of the input alongside the length of that line. I know it's not as efficient as it could be made, but I'm trying to write with the few functions I know right now. In running it, it returns a segmentation error. An online debugger points out line 30 (which is flagged in the code below) but doesn't specify the problem. I'm not sure of it either, and I've been looking. What is the source of this error?
By the way, I know that there might be other errors. I want to find those myself. I only need help with that segmentation error.
#include <stdio.h>
#define MAX 200
int start = 0;
int i, j, k, x, finish;
int longlength;
char text[MAX];
char longest[MAX];
int main()
{
fgets (text, MAX, stdin);
for (i = start; text[i] != EOF; i++)
{
if (text[i] == '\n')
{
finish = i - 1;
break;
}
}
for (j = start; j <= finish; j++)
{
longest[j - start] = text[j];
}
longlength = finish - start;
for (k = finish + 1; (text[k] = '\n') && (text[k] != EOF); k++)
{
start = k; //*****This is line 30*****
for (i = start; (text[(i + 1)] != '\n') && (text[(i + 1)] != EOF); i++)
{
}
finish = i;
if ((finish - start) > longlength)
{
longlength = (finish - start);
for (x = start; x <= finish; x++)
{
longest[(x - start)] = text[x];
}
}
}
printf ("This is the longest line : %s.\n Its length is %d.", longest, longlength);
return 0;
}
text[i] will never (or almost never) be EOF (which is usually defined to be -1), so your first loop won't terminate (unless the string contains a \n). Strings in C are null-terminated, and you should be checking for '\0', the null character.
You should try to run the code in a debugger, to see what's going on.
Strings in C are terminated by '\0', which is why you have to use that as the loop termination condition.
In your case:
for (i = start; text[i] != '\0'; i++)
{
//statements
}
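Why not use EOF:
EOF is the value that input functions such as getchar() return when there is no more input, for instance at the end of a .csv or plain .txt file. It is never stored in a char array, so comparing array elements against it does not work.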
Working my way through K&R I stumbled upon this unexpected behaviour. Consider the following code:
#include <stdio.h>
#define MAXWLEN 10 /* maximum word length */
#define MAXHISTWIDTH 10 /* maximum histogram width */
#define IN 1 /* inside a word */
#define OUT 0 /* outside a word */
int main()
{
int c, i, state;
int wlen[MAXWLEN];
for (i = 0; i < MAXWLEN; ++i)
wlen[i] = 0;
i = 0; /* length of current word */
state = OUT; /* start outside of words */
while ((c = getchar()) != EOF)
{
if (c == ' ' || c == '\t' || c == '\n')
{
state = OUT;
if (i > 0 && i < MAXWLEN)
++wlen[i];
i = 0;
}
else if (state == OUT) /* beginning of word */
{
state = IN;
i = 1;
}
else /* in word */
++i;
}
if (i > 0 && i < MAXWLEN) /* count a final word not followed by whitespace */
++wlen[i];
printf("\nwordlen\toccurrences\n");
for (i = 1; i < MAXWLEN; ++i)
{
printf("%6d:\t", i);
if (wlen[i] > MAXHISTWIDTH)
wlen[i] = MAXHISTWIDTH;
for (int j = 0; j < wlen[i]; ++j)
printf("#");
printf("\n");
}
}
This counts the lengths of all words in a given input and prints a histogram of the result. The result is as expected.
But I have to press Ctrl-D twice if the last character I entered was not a newline (Enter). I'm running my program in zsh, and compiled the file with cc.
Can somebody explain why this happens, or is it just an error that occurs on my machine?
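This is not the behaviour of your program but rather of the terminal.
The terminal normally buffers input line by line and passes it to the program one line at a time. Ctrl-D is not a character that ends up in the stream. It tells the terminal to hand over whatever it has buffered right away. In the middle of a line, the first Ctrl-D delivers the partial line and getchar() keeps waiting for more. A second Ctrl-D (or a Ctrl-D at the start of a line) delivers zero bytes, and a read of zero bytes is what the C library reports as EOF.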
#include <stdio.h>
#define MAXLENGTH 77
int getline(char[], int);
main() {
char in_line[MAXLENGTH];
int count, in_line_length;
in_line_length = 0;
for (count = 0; count < MAXLENGTH; ++count)
in_line[count] = 0;
in_line_length = getline(in_line, MAXLENGTH);
printf("%s", in_line);
}
int getline(char line[], int max_length) {
int count, character;
character = 0;
for (count = 0; count < (max_length -2) && (character = getchar()) != EOF && character != '\n'; count++) {
line[count] = character;
}
if (character = '\n') {
line[count++] = '\n';
}
line[count] = '\0';
return count;
}
I'm having a peculiar issue with the above code. As a little background, I'm compiling the above C code with the Tiny C Compiler (TCC) and running the executable in a Windows XP command prompt (cmd.exe).
The problem that appears to be occurring has to do with the EOF character. When running, if I issue the EOF character (CTRL+Z) and then hit return, the loop seems to exit correctly, but then character is populated with a newline character (\n).
So, something like this (as input):
test^Ztesttwo
Would output:
test\n
Due to the fact that the if statement following the for loop is always executed. What I'm wondering, is why the loop successfully exits upon recognition of the EOF character if it continues to grab another character from the stream?
Your problem is this line:
if (character = '\n') {
It is an assignment instead of a comparison. It should be:
if (character == '\n') {
You can prevent errors like this by putting the constant on the left; that way you get a compiler error if you write = by mistake.
if ('\n' = character) {
// ^ won't compile