K&R C Exercise 4-9: Why ignore EOF? - c

Just a little confusion I'm hoping someone can clear up - this question asks:
"Our getch and ungetch do not handle a pushed-back EOF correctly. Decide what their properties ought to be if an EOF is pushed back, then implement your design".
With the code as it is, an EOF is pushed back, refetched with getch(), which causes a loop such as:
while ((c = getch()) != EOF)
putchar(c);
to terminate when it is encountered from the buffer. I fail to see how this behaviour is incorrect. Surely as an EOF will in theory (mostly) only ever be encountered once, if it is pushed back and then read from a buffer in this way, it doesn't really matter? I hope someone could clear up the purpose of this question for me - I get that most solutions involve programming ungetch() to ignore EOF, I just don't see the point.
I'm sure there is one, as Dennis Ritchie and Brian Kernighan are a lot brighter than little old me - just hoping someone could point it out. Thanks :-)
Regards,
Phil

The definition of buf is char buf[BUFSIZE]; ,according to the content in the book, page 19:
We must declare c to be a type big enough to hold any value that
getchar returns. We can't use char since c must be big enough to hold
EOF in addition to any possible char. Therefore we use int.
Then we get the answer:
int buf[BUFSIZE];

Related

C getchar() and EOF behavior [duplicate]

Why Ctrl+Z does not trigger the loop to finish on the following small program?
#include <stdio.h>
main()
{
int c;
while ((c = getchar()) != EOF)
{
//nothing
}
return 0;
}
If I enter: test^ZEnter, it does not get out of the loop.
I found related questions around (here and here) but none to explain it for C (not C++) under Windows.
Note: I use Visual Studio 2015 PRE on a windows 8.1
You need to hit Enter and then use ctrl+Z and then Enter again.
or, you may also use F6
EOF like you use it is not a character. It's the status in which that stream is.
I mean, heck, you even link this question, so you might as well read the accepted answer:
The underlying form of an EOF is a zero-length read.
It's not an "EOF character".
http://www.c-faq.com/stdio/getcharc.html cites a different case than yours, where someone stored the return value of getchar in a char. The underlying problem still occurs occasionally: different runtimes implement different values for the EOF integer (which is why I said, it's not an EOF character), and things love to go wrong. Especially in Visual C++, which is not a "real" C compiler but a C++ compiler with a compatibility mode, it seems things can go wrong.

Solve ÿ end of file in C

When I copy the content of a file to another in C at the end of the output file I have this character ÿ. I understand thanks to this forum that it is the EOF indicator but I don't understand what to do in order to get rid of it in the output file.
This is my code:
second_file = fopen(argv[2], "w+");
while (curr_char != EOF)
{
curr_char = fgetc(original_file);
fputc(curr_char, second_file);
}
printf("Your file has been successfully copy\n");
fclose(second_file);
fclose(original_file);
For each character you read, you have two things to do:
Check to see if it's EOF.
If not, write it to the output.
Your problem is you're doing these two things in the wrong order.
There are potentially several different ways of solving this. Which one you pick depends on how much you care about your program looking good, as opposed to merely working.
One. Starting with the code you wrote, we could change it to:
while (curr_char != EOF)
{
curr_char = getc(original_file);
if(curr_char == EOF) break;
putc(curr_char, second_file);
}
Here, we explicitly test to see if the character is EOF, immediately after reading it, before writing it. If it's EOF, we break out of the loop early. This will work, but it's ugly: there are two different places where we test for EOF, and one of them never "fires". (Also, as a commentator reminded me, there's the problem that the first time through the loop, we're testing curr_char before we've ever set it.)
Two. You could rearrange it like this:
curr_char = getc(original_file);
while (curr_char != EOF)
{
putc(curr_char, second_file);
curr_char = getc(original_file);
}
Here, we read an initial character, and as long as it's not EOF, we write it and read another. This will work just fine, but but it's still a little bit ugly, because this time there are two different places where we read the character.
Three. You could rearrange it like this:
while ((curr_char = getc(original_file)) != EOF)
{
putc(curr_char, second_file);
}
This is the conventional way of writing a character-copying loop in C. The call to getc and the assignment to curr_char are buried inside of the controlling expression of the while loop. It depends on the fact that in C, an assignment expression has a value just like any other expression. That is, the value of the expression a = b is whatever value we just assigned to a (that is, b's value). So the value of the expression curr_char = getc(original_file) is the character we just read. So when we say while ((curr_char = getc(original_file)) != EOF), what we're actually saying is, "Call getc, assign the result to curr_char, and if it's not equal to EOF, take another trip around the loop."
(If you're still having trouble seeing this, I've written other explanations in these notes and this writeup.)
This code is both good and bad. It's good because we've got exactly one place we read characters, one place we test characters, and one place we write characters. But it's a little bit bad because, let's admit it, it's somewhat cryptic at first. It's hard to think about that assignment-buried-inside-the-while-condition. It's code like this that gives C a reputation as being full of obscure gobbledegook.
But, at least in this case, it really is worth learning the idiom, and becoming comfortable with it, because the reductions to just one read and one test and one write really are virtues. It doesn't matter so much in a trivial case like this, but in real programs which are complicated for other reasons, if there's some key piece of functionality that happens in two different places, it's extremely easy to overlook this fact, and to make a change to one of them but forget to make it to the other.
(In fact, this happened to me just last week at work. I was trying to fix a bug in somebody else's code. I finally figured out that when the code did X, it was inadvertently clearing Y. I found the place where it did X, and I added some new code to properly recreate Y. But when I tested my fix, it didn't work! It turned out there were two separate places where the code did X, and I had found and fixed the wrong one.)
Finally, here's an equivalently minimal but unconventional way of writing the loop:
while (1)
{
curr_char = getc(original_file);
if(curr_char == EOF) break;
putc(curr_char, second_file);
}
This is kind of like number 1, but it gets rid of the redundant condition in the while loop, and replaces it with the constant 1, which is "true" in C. This will work just fine, too, and it shares the virtue of having one read, one test, and one write. It actually ends up doing exactly the same operations and in exactly the same order as number 3, but by being laid out linearly it may be easier to follow.
The only problem with number 4 is that it's an unconventional, "break in the middle" loop. Personally, I don't have a problem with break-in-the-middle loops, and I find they come up from time to time, but if I wrote one and someone said "Steve, that's ugly, it's not an idiom anyone recognizes, it will confuse people", I'd have to agree.
P.S. I have replaced your calls to fgetc and fputc with the more conventional getc and putc. I'm not sure who told you to use fgetc and fputc, and there are obscure circumstances where you need them, but they're so rare that in my opinion one might as well forget that the "f" variants exist, and always use getc and putc.

Why Ctrl-Z does not trigger EOF?

Why Ctrl+Z does not trigger the loop to finish on the following small program?
#include <stdio.h>
main()
{
int c;
while ((c = getchar()) != EOF)
{
//nothing
}
return 0;
}
If I enter: test^ZEnter, it does not get out of the loop.
I found related questions around (here and here) but none to explain it for C (not C++) under Windows.
Note: I use Visual Studio 2015 PRE on a windows 8.1
You need to hit Enter and then use ctrl+Z and then Enter again.
or, you may also use F6
EOF like you use it is not a character. It's the status in which that stream is.
I mean, heck, you even link this question, so you might as well read the accepted answer:
The underlying form of an EOF is a zero-length read.
It's not an "EOF character".
http://www.c-faq.com/stdio/getcharc.html cites a different case than yours, where someone stored the return value of getchar in a char. The underlying problem still occurs occasionally: different runtimes implement different values for the EOF integer (which is why I said, it's not an EOF character), and things love to go wrong. Especially in Visual C++, which is not a "real" C compiler but a C++ compiler with a compatibility mode, it seems things can go wrong.

fseek(stdin,0,SEEK_SET) and rewind(stdin) REALLY do flush the input buffer "stdin".Is it OK to use them? [duplicate]

This question already has an answer here:
Can fseek(stdin,1,SEEK_SET) or rewind(stdin) be used to flush the input buffer instead of non-portable fflush(stdin)?
(1 answer)
Closed 8 years ago.
I was thinking since the start that why can't fseek(stdin,0,SEEK_SET) and rewind(stdin) flush the input buffer since it is clearly written in cplusplusreference that calling these two functions flush the buffer(Input or Output irrespective).But since the whole idea seemed new,I had put it in a clumsy question yesterday.
Can fseek(stdin,1,SEEK_SET) or rewind(stdin) be used to flush the input buffer instead of non-portable fflush(stdin)?
And I was skeptical about the answers I got which seemed to suggest I couldn't do it.Frankly,I saw no reason why not.Today I tried it myself and it works!! I mean, to deal with the problem up the newline lurking in stdin while using multiple scanf() statments, it seems like I can use fseek(stdin,0,SEEK_SET) or rewind(stdin) inplace of the non-portable and UB fflush(stdin).
Please tell me if this is a correct approach without any risk.Till now, I had been using the following code to deal with newline in stdin: while((c = getchar()) != '\n' && c != EOF);. Here's my code below:
#include <stdio.h>
int main ()
{
int a,b;
char c;
printf("Enter 2 integers\n");
scanf("%d%d",&a,&b);
printf("Enter a character\n");
//rewind(stdin); //Works if activated
fseek(stdin,0,SEEK_SET); //Works fine
scanf("%c",&c); //This scanf() is skipped without fseek() or rewind()
printf("%d,%d,%c",a,b,c);
}
In my program, if I don't use either of fseek(stdin,0,SEEK_SET) or rewind(stdin),the second scanf() is skipped and newline is always taken up as the character.The problem is solved if I use fseek(stdin,0,SEEK_SET) or rewind(stdin).
I'm not sure where you read on cplusplusreference (whatever that is) that flushing to end of line is the mandated behaviour.
The closest matches I could find, http://www.cplusplus.com/reference/cstdio/fseek/ and http://www.cplusplus.com/reference/cstdio/rewind, don't mention flushing at all, other than in reference to fflush().
In any case, there's nothing in the C standard which mandates this behaviour either. C11 7.20.9.2 fseek and 7.20.9.5 rewind (which is, after all, identical to fseek with zero offset and SEEK_SET) also make no mention of flushing.
All they state is that the file pointer is moved to the relevant position in the stream.
So, to the extent this works in your environment, all we can say is that this works in your environment. It may not work elsewhere, it may even stop working in your envirnment at an indeterminate point in the future.
If you really want robust input, you should be using a two-stage approach, fgets to retrieve a line followed by sscanf to get what you want from that line. Mixing the two paradigms of input (scanf and getchar) is frequently problematic.
A good (robust, error-checking, and clearing to end of line if needed) input function can be found here.
I tested it right ago, and I checked that fseek doesn't work on stdin. fseek() usually works on the file on the disk so that it seems to be prohibited to access to stdin by the kernel for some secure reasons. Anyway, it was so happy to see who thought like me. Tnx for good question.

why gets() is not working?

I am programming in C in Unix,
and I am using gets to read the inputs from keyboard.
I always get this warning and the program stop running:
warning: this program uses gets(), which is unsafe.
Can anybody tell me the reason why this is happening?
gets is unsafe because you give it a buffer, but you don't tell it how big the buffer is. The input may write past the end of the buffer, blowing up your program fairly spectacularly. Using fgets instead is a bit better because you tell it how big the buffer is, like this:
const int bufsize = 4096; /* Or a #define or whatever */
char buffer[bufsize];
fgets(buffer, bufsize, stdin);
...so provided you give it the correct information, it doesn't write past the end of the buffer and blow things up.
Slightly OT, but:
You don't have to use a const int for the buffer size, but I would strongly recommend you don't just put a literal number in both places, because inevitably you'll change one but not the other later. The compiler can help:
char buffer[4096];
fgets(buffer, (sizeof buffer / sizeof buffer[0]), stdin);
That expression gets resolved at compile-time, not runtime. It's a pain to type, so I used to use a macro in my usual set of headers:
#define ARRAYCOUNT(a) (sizeof a / sizeof a[0])
...but I'm a few years out of date with my pure C, there's probably a better way these days.
As mentioned in the previous answers use fgets instead of gets.
But it is not like gets doesn't work at all, it is just very very unsafe. My guess is that you have a bug in your code that would appear with fgets as well so please post your source.
EDIT
Based on the updated information you gave in your comment I have a few suggestions.
I recommend searching for a good C tutorial in your native language, Google is your friend here. As a book I would recommend The C Programming Language
If you have new information it is a good idea to edit them into your original post, especially if it is code, it will make it easier for people to understand what you mean.
You are trying to read a string, basically an array of characters, into a single character, that will of course fail. What you want to do is something like the following.
char username[256];
char password[256];
scanf("%s%s", username, password);
Feel free to comment/edit, I am very rusty even in basic C.
EDIT 2 As jamesdlin warned, usage of scanf is as dangerous as gets.
man gets says:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many characters
gets() will read, and because
gets() will continue to store
characters past the end of the buffer,
it is extremely dangerous to use. It
has been used to break computer
security. Use fgets() instead.
gets() is unsafe. It takes one parameter, a pointer to a char buffer. Ask yourself how big you have to make that buffer and how long a user can type input without hitting the return key.
Basically, there is no way to prevent a buffer overflow with gets() - use fgets().

Resources