getchar() with EOF not behaving as expected - c

Working my way through K&R, I stumbled upon this unexpected behaviour. Consider the following code:
#include <stdio.h>
#define MAXWLEN 10 /* maximum word length */
#define MAXHISTWIDTH 10 /* maximum histogram width */
#define IN 1 /* inside a word */
#define OUT 0 /* outside a word */
int main()
{
int c, i, state;
int wlen[MAXWLEN];
for (i = 0; i < MAXWLEN; ++i)
wlen[i] = 0;
i = 0; /* length of current word */
state = OUT; /* start outside of words */
while ((c = getchar()) != EOF)
{
if (c == ' ' || c == '\t' || c == '\n')
{
state = OUT;
if (i > 0 && i < MAXWLEN)
++wlen[i];
i = 0;
}
else if (state == OUT) /* beginning of word */
{
state = IN;
i = 1;
}
else /* in word */
++i;
}
if (i > 0 && i < MAXWLEN) /* count the last word if input ended without whitespace */
++wlen[i];
printf("\nwordlen\toccurrences\n");
for (i = 1; i < MAXWLEN; ++i)
{
printf("%6d:\t", i);
if (wlen[i] > MAXHISTWIDTH)
wlen[i] = MAXHISTWIDTH;
for (int j = 0; j < wlen[i]; ++j)
printf("#");
printf("\n");
}
}
This counts the length of all words in a given input and prints a histogram of the result. The result is as expected.
But I have to press CTRL-D twice if the last character I entered was not a newline (Enter). I'm running my program in zsh and compiled the file with cc.
Can somebody explain why this happens, or is it just an error that occurs on my machine?

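This is the behaviour of the terminal, not of your program.
In its default (canonical) mode the terminal buffers input line by line and hands it to the program only when the line is complete. Ctrl-D does not send a character; it tells the terminal to pass whatever is buffered to the program immediately. In the middle of a line, the first Ctrl-D therefore only delivers the partial line, and getchar() keeps waiting for more. Only when Ctrl-D is pressed with nothing buffered (at the start of a line, or a second time in a row) does the read return zero bytes, which stdio reports as EOF.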

Related

C program won't run after EOF character on Linux even though it did on Windows cmd

Following K&R, I wrote this program as an exercise. It was a while ago, when I was still using Windows. Since then I have switched to Linux, and the program won't run the way it used to, even when simulating cmd in WINE.
#include <stdio.h>
#define IN 1 /* inside a word */
#define OUT 0 /* outside a word */
int main(){
int c, state, nchar;
int i, j; /* for loop variables */
int charcount[15]; /* record word lengths from 1 to 15 characters */
state = IN; /* start inside the word, if whitespace inputted on first key-press ++charcount[-1], which doesn't exist */
nchar = 0; /* initialise word character count variable */
for(i = 0; i<= 14; ++i) /* initialise word length array */
charcount[i] = 0;
printf("Input your text, then type ^Z(Ctrl + Z) on a new line for a word length distribution histogram.\n");
printf("Special characters will be counted as part of a word.\n\n");
while((c = getchar()) != EOF){
if(state == IN){
if(c == ' ' || c == '\n' || c == '\t'){
if ((nchar - 1) <= 14) /* check if character count is above the limit of 15 */
++charcount[nchar-1]; /* increase number of characters of this length, not count the last character inputed(space)*/
else
++charcount[14]; /* if character count > 15, increase 15+ section of the array(charcount[14]) */
state = OUT; /* stop counting character */
}
++nchar; /* increase character count of word if input isn't a whitespace */
}
else if(c != ' ' && c != '\n' && c != '\t'){ /* && not || because the latter is always true, fuuuuck i'm an idiot... */
state = IN; /* go back to recording character count */
nchar = 1; /* count latest character */
}
}
for(i = 0; i< 14; ++i){ /* print histogram by looping through word length record up until 14 word character */
printf("\n%4d:", i+1); /* define histogram section names */
for(j = 0; j < charcount[i]; ++j) /* print bar for each section */
putchar('-');
}
printf("\n%3d+:", 15); /* print bar for words 15+ characters long */
for(j = 0; j < charcount[i]; ++j) /* print bar for 15+ section */
putchar('-');
return 0;
}
The program is supposed to print a histogram of the individual word lengths of an input text. To simulate the EOF character from the book, I found out you have to press Control+Z on Windows. When I did that, the for loops at the end ran and printed the histogram. When I do the equivalent on Linux (Control+D), the program simply stops. How should I solve this? I am guessing that using EOF as a trigger is a bad idea despite its use in K&R, so how should I change my program to make it reliable?

K&R C programming book exercise 1-18

I'm working on the exercise and am about halfway there, but I ran into a result I find weird and cannot figure out.
The code snippet follows. I know it is still steps away from finished, but I think it's worth figuring out why the result looks like this!
#define MAXLINE 1000
int my_getline(char line[], int maxline);
int main(){
int len;
char line[MAXLINE];/* current input line */
int j;
while((len = my_getline(line, MAXLINE)) > 0 ){
for (j = 0 ; j <= len-1 && line[j] != ' ' && line[j] != '\t'; j++){
printf("%c", line[j]);
}
}
return 0;
}
int my_getline(char s[], int limit){
int c,i;
for (i = 0 ; i < limit -1 && (c = getchar()) != EOF && c != '\n'; i++)
s[i] = c;
if (c == '\n'){
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
It compiles successfully with cc: cc code.c. But the result is puzzling!
It works for lines without \t and blanks:
hello
hello
but it does not work for the line in the picture:
I typed hel[blank][blank]lo[blank]\n:
Could anyone help me a bit? Many thanks!
You are stuck because you try to read a full line and then process it. It is better to process the input character by character, as most K&R exercises expect. Instead of printing blanks and tabs as you read them, save them in a buffer, and print the buffer only when a non-blank character follows; if a newline follows instead, discard it. That way trailing blanks never reach the output. Newlines work the same way: remember the last non-blank character read (blanks before a newline have already been dropped). If it was a newline, the newline you have just read is not printed, so of a sequence of two or more newlines only the first is printed. Here is a complete sample program that does this:
#include <stdio.h>
#include <stdlib.h>
#define F(_f) __FILE__":%d:%s: "_f, __LINE__, __func__
int main()
{
char buffer[1000];
int bs = 0;
int last_char = '\n', c;
unsigned long
eliminated_spntabs = 0,
eliminated_nl = 0;
while((c = getchar()) != EOF) {
switch(c) {
case '\t': case ' ':
if (bs >= sizeof buffer) {
/* full buffer, cannot fit more blanks/tabs */
fprintf(stderr,
"we can only hold upto %d blanks/tabs in"
" sequence\n", (int)sizeof buffer);
exit(1);
}
/* add to buffer */
buffer[bs++] = c;
break;
default: /* normal char */
/* print intermediate spaces, if any */
if (bs > 0) {
printf("%.*s", bs, buffer);
bs = 0;
}
/* and the read char */
putchar(c);
/* we only update last_char on nonspaces and
* \n's. */
last_char = c;
break;
case '\n':
/* eliminate the accumulated spaces */
if (bs > 0) {
eliminated_spntabs += bs;
/* this trace to stderr to indicate the number of
* spaces/tabs that have been eliminated.
* Erase it when you are happy with the code. */
fprintf(stderr, "<<%d>>", bs);
bs = 0;
}
if (last_char != '\n') {
putchar('\n');
} else {
eliminated_nl++;
}
last_char = '\n';
break;
} /* switch */
} /* while */
fprintf(stderr,
F("Eliminated tabs: %lu\n"),
eliminated_spntabs);
fprintf(stderr,
F("Eliminated newl: %lu\n"),
eliminated_nl);
return 0;
}
The program prints (on stderr, so as not to interfere with the normal output) the number of eliminated tabs/spaces, surrounded by << and >>. At the end it also prints the total number of eliminated tabs/spaces and the number of eliminated blank lines. A line containing only spaces is considered a blank line, and so it is eliminated. If you don't want lines containing only spaces to be eliminated (the spaces themselves are dropped anyway, as they are at the end of the line), just assign the spaces/tabs seen to the variable last_char.
In addition to the good answer by @LuisColorado, there are several ways to look at your problem that may simplify things for you. Rather than using multiple conditionals to check for c == ' ' and c == '\t' and c == '\n', include ctype.h and use the isspace() macro to determine whether the current character is whitespace. It is a much clearer way to go.
As for the return value: POSIX getline() uses ssize_t as a signed return type, which allows it to return -1 on error. That type is a bit awkward, but you can do the same with long (or int64_t for a guaranteed exact width).
While I am a bit unclear on what you are trying to accomplish, you appear to want to read a line of input and ignore whitespace. (POSIX getline() and fgets() both keep the trailing '\n', but it may be more advantageous to read (consume) the '\n' without including it in the buffer filled by my_getline(); that is up to you.) So from the example output provided above, it looks like you want both "hello" and "hel lo " to be read and stored as "hello".
If that is the case, then you can simplify your function as:
long my_getline (char *s, size_t limit)
{
int c = 0;
long n = 0;
while ((size_t)n + 1 < limit && (c = getchar()) != EOF && c != '\n') {
if (!isspace (c))
s[n++] = c;
}
s[n] = 0;
return n ? n : c == EOF ? -1 : 0;
}
The return statement is just the combination of two ternary clauses which will return the number of characters read, including 0 if the line was all whitespace, or it will return -1 if EOF is encountered before a character is read. (a ternary simply being a shorthand if ... else ... statement in the form test ? if_true : if_false)
Also note that the choice made above for handling the '\n' was to read the '\n' but not include it in the buffer. You can change that by removing the && c != '\n' from the while() test and adding if (c == '\n') { s[n++] = c; break; } at the very end of the loop body. Since isspace() treats '\n' as whitespace, the newline has to be stored explicitly.
Putting together a short example, you would have:
#include <stdio.h>
#include <ctype.h>
#define MAXC 1024
long my_getline (char *s, size_t limit)
{
int c = 0;
long n = 0;
while ((size_t)n + 1 < limit && (c = getchar()) != EOF && c != '\n') {
if (!isspace (c))
s[n++] = c;
}
s[n] = 0;
return n ? n : c == EOF ? -1 : 0;
}
int main (void) {
char str[MAXC];
long nchr = 0;
fputs ("enter line: ", stdout);
if ((nchr = my_getline (str, MAXC)) != -1)
printf ("%s (%ld chars)\n", str, nchr);
else
puts ("EOF before any valid input");
}
Example Use/Output
With your two input examples, "hello" and "hel lo ", you would have:
$ ./bin/my_getline
enter line: hello
hello (5 chars)
Or with included whitespace:
$ ./bin/my_getline
enter line: hel lo
hello (5 chars)
Testing the error condition by pressing Ctrl + d (or Ctrl + z on Windows):
$ ./bin/my_getline
enter line: EOF before any valid input
There are many ways to put these pieces together, this is just one possible solution. Look things over and let me know if you have further questions.

Why is this loop (C) producing a segmentation error?

This is a C program that should accept the terminal's input and return the longest line of the input alongside the length of that line. I know it's not as efficient as it could be made, but I'm trying to write with the few functions I know right now. In running it, it returns a segmentation error. An online debugger points out line 30 (which is flagged in the code below) but doesn't specify the problem. I'm not sure of it either, and I've been looking. What is the source of this error?
By the way, I know that there might be other errors. I want to find those myself. I only need help with that segmentation error.
#include <stdio.h>
#define MAX 200
int start = 0;
int i, j, k, x, finish;
int longlength;
char text[MAX];
char longest[MAX];
int main()
{
fgets (text, MAX, stdin);
for (i = start; text[i] != EOF; i++)
{
if (text[i] == '\n')
{
finish = i - 1;
break;
}
}
for (j = start; j <= finish; j++)
{
longest[j - start] = text[j];
}
longlength = finish - start;
for (k = finish + 1; (text[k] = '\n') && (text[k] != EOF); k++)
{
start = k; //*****This is line 30*****
for (i = start; (text[(i + 1)] != '\n') && (text[(i + 1)] != EOF); i++)
{
}
finish = i;
if ((finish - start) > longlength)
{
longlength = (finish - start);
for (x = start; x <= finish; x++)
{
longest[(x - start)] = text[x];
}
}
}
printf ("This is the longest line : %s.\n Its length is %d.", longest, longlength);
return 0;
}
text[i] will never (or almost never) be EOF (which is usually defined to be -1), so your first loop won't terminate (unless the string contains a \n). Strings in C are null-terminated, and you should be checking for '\0', the null character.
You should try to run the code in a debugger, to see what's going on.
Strings in C are terminated by '\0', so that is what the loop condition has to test for.
In your case:
for (i = start; text[i] != '\0'; i++)
{
//statements
}
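Why not EOF:
EOF is the value that input functions such as getchar() return when there is no more input, for instance at the end of a .csv or plain .txt file. It is never stored in a char array, so comparing array elements against it does not work.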

C Program to Get Characters into Array and Reverse Order

I'm trying to create a C program that accepts a line of characters from the console, stores them in an array, reverses the order in the array, and displays the reversed string. I'm not allowed to use any library functions other than getchar() and printf(). My attempt is below. When I run the program and enter some text and press Enter, nothing happens. Can someone point out the fault?
#include <stdio.h>
#define MAX_SIZE 100
main()
{
char c; // the current character
char my_strg[MAX_SIZE]; // character array
int i; // the current index of the character array
// Initialize my_strg to null zeros
for (i = 0; i < MAX_SIZE; i++)
{
my_strg[i] = '\0';
}
/* Place the characters of the input line into the array */
i = 0;
printf("\nEnter some text followed by Enter: ");
while ( ((c = getchar()) != '\n') && (i < MAX_SIZE) )
{
my_strg[i] = c;
i++;
}
/* Detect the end of the string */
int end_of_string = 0;
i = 0;
while (my_strg[i] != '\0')
{
end_of_string++;
}
/* Reverse the string */
int temp;
int start = 0;
int end = (end_of_string - 1);
while (start < end)
{
temp = my_strg[start];
my_strg[start] = my_strg[end];
my_strg[end] = temp;
start++;
end--;
}
printf("%s\n", my_strg);
}
It seems like in this while loop:
while (my_strg[i] != '\0')
{
end_of_string++;
}
you should increment i, otherwise if my_strg[0] is not equal to '\0', that's an infinite loop.
I'd suggest setting a breakpoint and looking at what your code is doing.
I think you should look at your second while loop and ask yourself where i is being incremented, because to me it looks like my_strg[i] always reads index zero...

Condition using EOF in C

The code below is my answer to exercise 1-13 in K&R The C Programming Language, which asks for a histogram for the length of words in its input. My question is regarding EOF. How exactly can I break out of the while loop without ending the program entirely? I have used Ctrl-Z which I have heard is EOF on Windows, but this ends the program, instead of just breaking the while loop. How can I get to the for loop after the while loop without ending the file? This is a general question, not just with my code below but for all the code in K&R that uses: while ((c = getchar()) != EOF). Thanks in advance!
#include <stdio.h>
#define MAXLENGTH 20 /* Max length of a word */
#define IN 1 /* In a word */
#define OUT 0 /* Out of a word */
int main() {
int c, i, j, len = 0;
int lenWords[MAXLENGTH];
int state = OUT; /* bool would need <stdbool.h> in C; int matches the IN/OUT macros */
for (i = 0; i < MAXLENGTH; ++i) {
lenWords[i] = 0;
}
c = getchar();
while (c != EOF) {
if (c != ' ' && c != '\n' && c != '\t') {
if (state == IN) {
lenWords[len - 1] += 1; /* a length 5 word is in subscript 4 */
len = 0;
}
state = OUT;
}
else {
state = IN;
}
if (state == IN) {
len += 1;
}
c = getchar();
}
/* Generating a histogram using _ and | */
for (i = 0; i < MAXLENGTH; ++i) { /* Underscores write over one another; not so efficient */
for (j = 0; j < lenWords[i]; ++j) {
putchar('_');
}
putchar('\n');
for (j = 0; j < lenWords[i]; ++j) {
putchar('_');
}
putchar('|');
printf("Length: %d, Frequency: %d", i + 1, lenWords[i]);
}
return 0;
}
I think your question belongs on another network.
Answers here: Equivalent to ^D (in bash) for cmd.exe?
No. Ctrl+D on *nix generates an EOF, which various
shells interpret as running exit. The equivalent for EOF on Windows
is Ctrl+Z, but cmd.exe does not interpret this
specially when typed at the prompt.
Ctrl+D sends EOF to standard input and stops the read on *nix.
Have a look here.
You have to check whether you are on *nix or Windows.
On Windows, EOF is signalled with Ctrl+Z, whereas on *nix it is signalled with Ctrl+D.
