I'm making a program that reads in a file from stdin, does something to it and sends it to stdout.
As it stands, I have a line in my program:
while((c = getchar()) != EOF){
where c is an int.
However, the problem is that I want to use this program on ELF executables. It appears that the byte that represents EOF for ASCII files must occur inside the executable, which results in the output being truncated (correct me if I'm wrong here - this is just my hypothesis).
What is an effective general way to go about doing this? I could dig up documents on the ELF format and then just check for whatever comes at the end. That would be useful, but I think it would be better if I could still apply this program to any kind of file.
You'll be fine - the EOF constant doesn't contain a valid ASCII value (it's typically -1).
For example, below is an excerpt from stdio.h on my system:
/* End of file character.
Some things throughout the library rely on this being -1. */
#ifndef EOF
# define EOF (-1)
#endif
You might want to go a bit lower level and use system functions like open(), close() and read(). That way you can do what you like with the input, since it is stored in your own buffer.
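A minimal sketch of that lower-level approach, assuming a POSIX system. The helper name copy_fd and the 4096-byte buffer size are my own choices for illustration, not anything from the question. The point is that read() reports end of input by returning 0, so no byte value in the data can be mistaken for it:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy every byte from in_fd to out_fd.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd(int in_fd, int out_fd)
{
    unsigned char buf[4096];   /* buffer size is an arbitrary choice */
    long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return total;      /* 0 bytes read means end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: retry */
            return -1;
        }

        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

For the program in the question, calling copy_fd(STDIN_FILENO, STDOUT_FILENO) from main would be the equivalent of the getchar() loop.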
You are doing it correctly.
EOF is not a character. There is no way c will have EOF to represent any byte in the stream. If / when c indeed contains EOF, that particular value did not originate from the file itself, but from the underlying library / OS. EOF is a signal that no more input could be read, either because the end of the stream was reached or because a read error occurred.
Make sure c is an int though
Oh ... and you might want to read from a stream under your control. In the absence of code to do otherwise, stdin is subject to "text translation" which might not be desirable when reading binary data.
FILE *mystream = fopen(filename, "rb");
if (mystream) {
/* use fgetc() instead of getchar() */
while((c = fgetc(mystream)) != EOF) {
/* ... */
}
fclose(mystream);
} else {
/* error */
}
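To convince yourself that no byte value can collide with EOF, here is a small sketch along the same lines. The function name copy_stream is hypothetical; it copies one stream to another byte by byte and keeps the fgetc() result in an int:

```c
#include <stdio.h>

/* Hypothetical helper: copy in to out one byte at a time.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_stream(FILE *in, FILE *out)
{
    long count = 0;
    int c;   /* int, not char: must hold every byte value and EOF */

    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF)
            return -1;
        count++;
    }
    /* fgetc returns EOF for both end-of-file and errors; tell them apart. */
    return ferror(in) ? -1 : count;
}
```

Fed a binary stream that contains all 256 byte values, including 0xFF, this copies all of them and only stops at the real end of the data.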
From the getchar(3) man page:
Character values are returned as an
unsigned char converted to an int.
This means that a character value read via getchar can never be equal to the signed integer -1. This little program demonstrates it:
#include <stdio.h>
int main(void)
{
int a;
unsigned char c = EOF;
a = (int)c;
//output: 000000ff - 000000ff - ffffffff
printf("%08x - %08x - %08x\n", a, c, -1);
return 0;
}
I have written a code
#include <stdio.h>
#include <stdlib.h>
int main()
{
FILE *fp;
fp=fopen("lets.txt","r+");
if(fp==NULL)
{
printf("ERROR");
exit(1);
}
else
{
char ch,ch1;
while(!feof(fp))
{
ch= fgetc(fp);
printf("%c",ch);
}
printf("\n\nYou want to write something? (1/0)");
int n;
scanf("%d",&n);
if(n==1)
{
fputs("Jenny",fp);
ch1 = fgetc(fp);
printf("%c\n", ch1);
while(ch1 != EOF)
{
ch1=fgetc(fp);
printf("%c",ch1);
}
fclose(fp);
}
else{
printf("File Closed ");
fclose(fp);
}
}
}
I have tried to insert a string inside an already existing file "lets.txt"
but when I run this code, this is shown in the Terminal
I was expecting this to just put Jenny into the final file but it's also adding other text which was present before it and lots of NULL.
Is this because of something like temporary memory storage or something like that or just some mistake in the code?
First of all, the lines
char ch,ch1;
while(!feof(fp))
{
ch= fgetc(fp);
printf("%c",ch);
}
are wrong.
If you want ch to be guaranteed to be able to represent the value EOF and also want to be able to distinguish it from every possible character code, then you must store the return value of fgetc in an int, not a char. Please note that fgetc returns an int, not a char. See this other answer for more information on this issue.
Also, the function feof will only return a non-zero value (i.e. true) if a previous read operation has already failed due to end-of-file. It does not provide any indication of whether the next read operation will fail. This means that if fgetc returns EOF, you will print that value as if fgetc were successful, which is wrong. See the following question for further information on this issue:
Why is “while( !feof(file) )” always wrong?
For the reasons stated above, I suggest that you change these lines to the following:
int ch, ch1;
while ( ( ch = fgetc(fp) ) != EOF )
{
printf( "%c", ch );
}
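A small sketch of the difference, using two hypothetical helper functions that only count loop iterations. With the feof pattern the body runs one extra time, on the EOF value that fgetc returned:

```c
#include <stdio.h>

/* Broken pattern: feof() only becomes true after a read has already failed,
 * so the final iteration handles EOF as if it were data. */
long iterations_with_feof(FILE *fp)
{
    long n = 0;
    while (!feof(fp)) {
        int ch = fgetc(fp);   /* may be EOF; the loop body cannot tell */
        (void)ch;
        n++;
    }
    return n;
}

/* Suggested pattern: test the return value of fgetc itself. */
long iterations_with_check(FILE *fp)
{
    long n = 0;
    int ch;
    while ((ch = fgetc(fp)) != EOF)
        n++;
    return n;
}
```

For a file containing the three bytes "abc", the first function should report 4 iterations and the second 3. Note that the first loop would also never terminate after a read error, because feof stays false in that case.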
Another issue is that when a file is opened in update mode (i.e. it is opened with a + in the mode string, for example "r+" as you are doing), you cannot freely change between reading and writing. According to §7.21.5.3 ¶7 of the ISO C11 standard,
output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and
input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file.
If you break any of these rules, then your program will be invoking undefined behavior, which means that anything can happen, which includes the possibility that you get invalid output.
For this reason, I suggest that you change the lines
fputs("Jenny",fp);
ch1 = fgetc(fp);
to:
fseek( fp, 0, SEEK_CUR );
fputs("Jenny",fp);
fflush( fp );
ch1 = fgetc(fp);
In contrast to the line fflush( fp );, which is absolutely necessary, the line fseek( fp, 0, SEEK_CUR ); actually isn't necessary according to the rules stated above, because you encountered end-of-file. But it probably is a good idea to keep that line anyway, for example in case you later change your program to stop reading for some other reason besides end-of-file. In that case, that line would be required.
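Putting the rules together, here is a sketch of one way to structure the read, write, read sequence. The helper name append_and_reread is hypothetical, and rewinding before the final read is my own assumption about what you want: once "Jenny" has been written at the end of the file, the position is at end-of-file, so reading from there would return EOF immediately.

```c
#include <stdio.h>

/* Hypothetical helper: read fp to end-of-file, append text, then read the
 * whole file back into buf (NUL-terminated). fp must be open in update mode.
 * Returns 0 on success, -1 on error. */
int append_and_reread(FILE *fp, const char *text, char *buf, size_t bufsize)
{
    int ch;
    size_t len = 0;

    while ((ch = fgetc(fp)) != EOF)      /* input phase */
        ;
    if (ferror(fp))
        return -1;

    if (fseek(fp, 0, SEEK_CUR) != 0)     /* input -> output: positioning call */
        return -1;
    if (fputs(text, fp) == EOF)
        return -1;

    if (fflush(fp) != 0)                 /* output -> input: flush first */
        return -1;
    rewind(fp);                          /* then re-read from the start */

    while (len + 1 < bufsize && (ch = fgetc(fp)) != EOF)
        buf[len++] = (char)ch;
    if (bufsize > 0)
        buf[len] = '\0';
    return ferror(fp) ? -1 : 0;
}
```

If the file initially holds "Hello, " and text is "Jenny", buf should end up containing "Hello, Jenny".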
Re: "I changed the condition for the while loop to this simple form-
ch = fgetc(fp); while(ch != EOF) But it is still showing the same
result."
The value returned by fgetc() must be stored in an int:
ch= fgetc(fp);
ch has been declared as a char. Storing the value in a char makes testing for EOF unreliable. C17 states that EOF has a negative int value. On some implementations, char is unsigned, hence it can't represent negative values.
On implementations where the type char is signed, assuming EOF is defined as -1 (which is the case on most implementations), it's impossible to distinguish EOF from the character code 255 (which would be stored as the value -1 in a char, but as 255 in an int).
From the man page:
fgetc(), getc(), and getchar() return the character read as an
unsigned char cast to an int or EOF on end of file or error.
It further states:
If the integer value returned by getchar() is stored into a
variable of type char and then compared against the integer
constant EOF, the comparison may never succeed, because sign-
extension of a variable of type char on widening to integer is
implementation-defined.
which is relevant to fgetc as well.
Possible fix:
Declare ch as an int.
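The three cases can be sketched as follows. The function names are hypothetical; each one stores a value that fgetc could return in a different type and then reports whether it compares equal to EOF. The conversion of 255 to signed char is implementation-defined, so the signed case describes what common compilers do rather than a guarantee:

```c
#include <stdio.h>

/* Stored in an int: only the real EOF compares equal to EOF. */
int eof_test_int(int value)
{
    int ch = value;
    return ch == EOF;
}

/* Stored in an unsigned char: the value widens to a non-negative int,
 * so it can never compare equal to the negative EOF. */
int eof_test_unsigned_char(int value)
{
    unsigned char ch = (unsigned char)value;
    int widened = ch;
    return widened == EOF;
}

/* Stored in a signed char: with EOF defined as -1, the byte 0xFF typically
 * becomes -1 as well and is mistaken for EOF. */
int eof_test_signed_char(int value)
{
    signed char ch = (signed char)value;  /* implementation-defined for 255 */
    int widened = ch;
    return widened == EOF;
}
```

Only the int version gives the right answer for both EOF and the byte 0xFF.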
You haven't repositioned the file pointer when switching between reading and writing. The MSVC documentation for fopen says:
However, when you switch from reading to writing, the input operation must encounter an EOF marker. If there's no EOF, you must use an intervening call to a file positioning function. The file positioning functions are fsetpos, fseek, and rewind. When you switch from writing to reading, you must use an intervening call to either fflush or to a file positioning function.
I was given an assignment in C language about reading and writing in a file.
I have read different codes on different websites and also their explanations. but there is one question that remained unanswered! Following is the general code I found on different sites:
#include <stdio.h>
void main()
{
FILE *fp;
int c;
fp = fopen("E:\\maham work\\CAA\\TENLINES.TXT","r");
c = getc(fp) ;
while (c!= EOF)
{
putchar(c);
c = getc(fp);
}
fclose(fp);
}
My questions are simple and straight.
in line c = getc(fp) ; what is it that c receives? An address? A character? An ASCII code?
and
while (c!= EOF)
{
putchar(c);
c = getc(fp);
}
here how is c able to read the file character by character? There is no increment operator... does the pointer "fp" help in reading the file?
lastly, why is putchar(c); used for printing? Why not use printf("%c", c);?
getc() returns the integer value of the byte at the current position in the file handle, then advances that position by one byte.
putchar() is simpler than printf.
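To see the "advances that position" part in action, here is a small sketch. The helper name trace_getc is hypothetical, and it assumes a stream opened in binary mode so that ftell reports a plain byte offset. It records each byte together with the stream position after reading it:

```c
#include <stdio.h>

/* Hypothetical helper: read up to n bytes from fp, storing each byte value
 * in bytes[i] and the stream position after that read in pos[i].
 * Returns how many bytes were read. */
size_t trace_getc(FILE *fp, int *bytes, long *pos, size_t n)
{
    size_t i = 0;
    int c;

    while (i < n && (c = getc(fp)) != EOF) {
        bytes[i] = c;          /* the byte value, e.g. 72 for 'H' in ASCII */
        pos[i] = ftell(fp);    /* getc has already moved the position on */
        i++;
    }
    return i;
}
```

For a file containing "Hi\n", the recorded bytes are 'H', 'i', '\n' and the positions 1, 2, 3. The increment you were looking for happens inside the stream that fp points to, not in your own code.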
1 minute googling got me this.
C++ reference
tutorial points
wikipedia
Quoting reference documentation (C++ here, but probably very similar in C).
int getc ( FILE * stream );
Get character from stream
Returns the character currently pointed by the internal file position indicator of the specified stream. The internal file position indicator is then advanced to the next character.
If the stream is at the end-of-file when called, the function returns EOF and sets the end-of-file indicator for the stream (feof).
If a read error occurs, the function returns EOF and sets the error indicator for the stream (ferror).
getc and fgetc are equivalent, except that getc may be implemented as a macro in some libraries. See getchar for a similar function that reads directly from stdin.
Further reading gives us:
On success, the character read is returned (promoted to an int value).
The return type is int to accommodate for the special value EOF, which indicates failure: If the position indicator was at the end-of-file, the function returns EOF and sets the eof indicator (feof) of stream.
If some other reading error happens, the function also returns EOF, but sets its error indicator (ferror) instead.
Here we read
This function returns the character read as an unsigned char cast to an int or EOF on end of file or error.
And on wikipedia
Reads a byte/wchar_t from a file stream
I was reading K&R book and wanted to test out printf() and putchar() functions in ways that I never tried. I encountered several unexpected events and would like to hear from more experienced programmers why that happens.
char c;
while((c = getchar()) != EOF) {
//putchar(c);
printf("%d your character was.\n", c);
}
1. How would one get EOF (end of file) in the input stream (getchar() or scanf() functions)? Could an unexpected key that is not recognized by the getchar()/scanf() function produce it?
2. In my book, it says that c has to be an integer, because it needs to store EOF and the variable has to be big enough to hold EOF in addition to any possible char. This doesn't make sense to me, because EOF is a constant integer with a value of -1, which even a char can store. Can anyone clarify what was meant by this?
3. What happens when I send "hello" or 'hello' to the putchar() function? It expects to get an integer, but produces weird output, such as EE or oo, if I send the latter string or char sequence.
4. Why do I get two outputs when I use the printf() call written above? One is for the character I entered and the other one is an integer, which in ASCII is end of line. Does it produce the second output because I press enter, which it treats as a second character?
Thanks.
1. On Linux, you can send it with Ctrl+D.
2. You need an int, otherwise you can't tell the difference between EOF and the last possible character (0xFFFF is not the same as 0x00FF).
3. putchar wants a character, not a string. If you try to give it a string, it will print part of the string's address.
4. You only get one output: the ASCII value of the character you entered. The other "input" is what you typed in the terminal.
edit - more details about 2
You need an int because getchar can return both the character 0xFF (as the int 0x00FF) and the integer -1 (0xFFFF), and they don't have the same meaning: the character 0xFF is a valid character (for instance, it's ÿ in Latin-1), while the integer -1 is EOF in this context.
Here's a simple program that shows the difference:
#include <stdio.h>
int main(int argc, char ** argv) {
{
char c = 0xFF; /* this is a valid char */
if (c == EOF) printf("wrong end of file detection\n");
}
{
int c = 0xFF; /* this is a valid char */
if (c == EOF) printf("wrong end of file detection\n");
}
}
The first test succeeds (on implementations where char is signed) because 0xFF == -1 for char, while the second test fails because 0x00FF != -1 for int.
I hope that makes it a bit clearer.
1. You must close the input stream to get EOF. Usually that is done with Ctrl-D on UNIX, but see your tty config (stty -a).
2. Already answered.
3. Same.
4. Your tty echoes what you type by default. If you don't want this, set it to noecho mode (stty -echo). Be careful, as some shells set it back to echo. Try with sh. You must be aware that the tty also buffers your input until RETURN is typed (see the stty manual for raw mode if you need it).
I have created a file and named it "file.txt" in Unix. I tried to read the file content from my C program. I am not able to receive the EOF character. Unix doesn't store EOF character on file creation? If so what is the alternative way to read the EOF from a Unix created file using C.
Here's the code sample
int main(){
File *fp;
int nl,c;
nl =0;
fp = fopen("file.txt", "r");
while((c = fgetc(fp)) != EOF){
if (c=='\n')
nl++;
}
return 0;
}
If I explicitly give CTRL + D the EOF is detected even when I use char c.
This can happen if the type of c is char (and char is unsigned in your compiler; you can check this by examining the value of CHAR_MIN in <limits.h>) and not int.
The value of EOF is negative according to the C standard.
So, implicitly casting EOF to unsigned char will lose the true value of EOF and the comparison will always fail.
UPDATE: There's a bigger problem that has to be addressed first. In the expression c = fgetc(fp) != EOF, fgetc(fp) != EOF is evaluated first (to 0 or 1) and then the value is assigned to c. As long as a character could be read, fgetc(fp) != EOF evaluates to 1, so c is always 1 inside the loop and the test c == '\n' never succeeds. You need to add parentheses, like so: (c = fgetc(fp)) != EOF.
Missing parentheses. Should be:
while((c = fgetc(fp)) != EOF)
Remember: fgetc() returns an int, not a char. It has to return an int because its set of return values includes all possible valid characters plus a separate (negative) EOF indicator.
There are two possible traps if you use type char for c instead of int:
If the type char is signed with your compiler, you will detect a valid character as EOF. Often, the character ÿ (y-umlaut, officially known in Unicode as LATIN SMALL LETTER Y WITH DIAERESIS, U+00FF, hex code 0xFF in the ISO 8859-1 aka Latin 1 code set) will be detected as equivalent to EOF, when it is a valid character.
If the type char is unsigned, then the comparison will never be true.
Both problems are serious, and both are avoided by using the correct type:
FILE *fp = fopen("file.txt", "r");
if (fp != 0)
{
int c;
int nl = 0;
while ((c = fgetc(fp)) != EOF)
if (c == '\n')
nl++;
printf("Number of lines: %d\n", nl);
}
Note that the type is FILE and not File. Note that you should check that the file was opened before trying to read via fp.
If I explicitly give CTRL + D, the EOF is detected even when I use char c.
This means that your compiler provides you with char as a signed type. It also means you will not be able to count lines accurately in files which contain ÿ.
Unlike CP/M and DOS, Unix does not use any character to indicate EOF; you reach EOF when there are no more characters to read. What confuses many people is that if you type a certain key combination at the terminal, programs detect EOF. What actually happens is that the terminal driver recognizes the character and sends any unread characters to the program. If there are no unread characters, the program gets 0 bytes returned, which is the same result you get when you've reached the end of file. So, the character combination (often, but not always, Ctrl-D) appears to 'send EOF' to the program. However, the character is not stored in a file if you are using cat >file; further, if you read a file which contains a control-D, that is a perfectly fine character with byte value 0x04. If a program generates a control-D and sends that to a program, that does not indicate EOF to the program. It is strictly a property of Unix terminals (tty and pty — teletype and pseudo-teletype — devices).
You do not show how you declare the variable c; it should be of type int, not char.
What do you type to end the program? Entering -1 doesn't work:
#include <stdio.h>
//copy input to output
main() {
char c;
c = getchar();
while(c != EOF) {
putchar(c);
c = getchar();
}
}
Macro: int EOF
This macro is an integer value that is returned by a number of functions to indicate an end-of-file condition, or some other error situation. With the GNU library, EOF is -1. In other libraries, its value may be some other negative number.
The documentation for getchar is that it returns the next character available, cast to an unsigned char and then returned in an int return value.
The reason for this, is to make sure that all valid characters are returned as positive values and won't ever compare as equal to EOF, a macro which evaluates to a negative integer value.
If you put the return value of getchar into a char, then depending on whether your implementation's char is signed or unsigned you may get spurious detection of EOF, or you may never detect EOF even when you should.
Signaling EOF to the C library typically happens automatically when redirecting the input of a program from a file or a piped process. To do it interactively depends on your terminal and shell, but typically on unix it's achieved with Ctrl-D and on windows Ctrl-Z on a line by itself.
you should use int and not char
I agree with all other people in this thread by saying use int c not char.
To end the loop (at least on *nix like systems) you would press Ctrl-D to send EOF.
In addition, if you like to get your characters echoed instantly rewrite your code like this:
#include<stdio.h>
int
main(void)
{
int c;
c = getchar();
while (c != EOF)
{
putchar(c);
c = getchar();
fflush(stdout); /* optional, instant feedback */
}
return 0;
}
If the integer value returned by getchar() is stored into a variable of type char and then compared against the integer constant EOF, the comparison may never succeed, because sign-extension of a variable of type char on widening to integer is implementation-defined.
-- opengroup POSIX standard
If char is unsigned by default for your compiler (or by whatever options are being used to invoke the compiler), it's likely that
(c == EOF)
can never be true. If sizeof(unsigned char) < sizeof( int), which is pretty much always true, then the promotion of the char to an int will never result in a negative value, and EOF must be a negative value.
That's one reason why all (or at least many if not all) the functions in the C standard that deal with or return characters specify int as the parameter or return type.
EOF is not an actual character or a sequence of characters. EOF denotes the end of the input file or stream, i.e., the situation when getchar() tries to read a character beyond the last one.
On Unix, you can close an interactive input stream by typing CTRL-D. That situation causes getchar() to return EOF. But if a file contains a character whose ASCII code is 4 (i.e., CTRL-D), getchar() will return 4, not EOF.
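A quick sketch to check this yourself. The helper name count_byte is hypothetical, and the stream is assumed to be opened in binary mode. It reads to the real end of the stream and counts how often a given byte value occurs on the way:

```c
#include <stdio.h>

/* Hypothetical helper: count how often the byte value target occurs in fp
 * and store the total number of bytes read in *total. */
long count_byte(FILE *fp, int target, long *total)
{
    long hits = 0;
    int c;

    *total = 0;
    while ((c = getc(fp)) != EOF) {   /* stops only at the real end */
        if (c == target)
            hits++;
        (*total)++;
    }
    return hits;
}
```

For a stream holding the six bytes 'a', 0x04, 'b', 0x04, 0x1A, 'c', calling count_byte with target 4 returns 2 and sets total to 6: the embedded CTRL-D bytes are ordinary data and do not end the loop.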
It can still appear to work with the char data type where char is signed (a 0xFF byte is then mistaken for EOF), but the reliable approach is to test the int value returned by getchar() in the loop condition.
First, let's check it. If you write the following code:
printf("%d",getchar());
and then type A on the keyboard, you should see 65, which is the ASCII value of A; if you type CTRL-D instead, you should see -1.
Implementing this logic, the code becomes:
#include<stdio.h>
int main()
{
char c;
while ((c = getchar()) != EOF){
putchar(c);
//printf("%c",c); // this is another way for output
}
return 0;
}
Windows: Ctrl+z
Unix: Ctrl+d
reference:EOF
Hi, I think it's because when you type -1, the stream receives not one but two characters ('-' and '1'), and the ASCII code of neither of them is -1 or whatever value is used for EOF.