I am new to file I/O in C. For practice, I decided to write a simple C program that copies a file to a new file:
#include <stdio.h>
void main(int argc, char* argv[])
{
if (argc != 3)
{
printf("Usage: ./myFile source destination");
exit(-1);
}
FILE * src = fopen(argv[1], "r");
if (src == NULL)
{
printf("source file not found", argv[1]);
exit(-1);
}
FILE* dest = fopen(argv[2], "w");
unsigned char c;
do {
c = fgetc(src);
fputc(c, dest);
} while (c != EOF);
}
However, I am getting an infinite loop. Is this because I never actually hit a character called EOF?
Also, is there a faster way to write this script aside from reading each character 1 at a time?
Declare c as an int and it'll work.
EOF is not a valid value for a character, because if it were, the presence of that character in a file could mislead code into thinking that file has ended when it actually hasn't. That's precisely why fgetc() actually returns an int, not a char.
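A small sketch of why the variable's type matters. The two helper names are hypothetical, and the comments assume the usual case where EOF is -1 and unsigned char is 8 bits wide.

```c
#include <stdio.h>

/* Returns 1 if EOF is still recognisable after being stored in an
   unsigned char, 0 otherwise. With EOF == -1 the stored value is 255,
   which is promoted back to the int 255 and never equals EOF. */
int eof_survives_unsigned_char(void)
{
    unsigned char c = (unsigned char)EOF;
    return c == EOF;
}

/* Returns 1 if EOF is still recognisable after being stored in an int. */
int eof_survives_int(void)
{
    int c = EOF;
    return c == EOF;
}
```

The first function returning 0 is exactly the situation in the question's do/while loop: the test c != EOF can never become false.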
Edit: Your code also has another bug: when fgetc() does return EOF, you pass that value to fputc() before ending the loop, causing an extra character to appear at the end of your output file. (The extra character will be whatever you get when you cast EOF to unsigned char on your system, typically character 255 == 0xFF == (unsigned char) -1.) To fix that, you can rewrite your loop like this:
int c;
while ((c = fgetc(src)) != EOF) {
    fputc(c, dest);
}
or, if you don't like assignments in loop conditions:
while (1) {
    int c = fgetc(src);
    if (c == EOF) break;
    fputc(c, dest);
}
Anyway, it would be much more efficient to read and write the data in chunks using fread() and fwrite(), e.g. like this:
unsigned char buf[65536];
while (1) {
    size_t n = fread(buf, 1, sizeof(buf), src);  /* fread() returns a size_t, not an int */
    fwrite(buf, 1, n, dest);
    if (n < sizeof(buf)) break; /* end of file or read error */
}
Also, it would be a good idea to include some error checking, since both reading and writing a file can fail for a variety of unexpected reasons. You can use ferror() to tell whether an error has occurred on a particular I/O stream.
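A minimal sketch of the chunked copy with the error checks added. The function name copy_stream and its return convention are assumptions made for illustration; only fread(), fwrite(), ferror() and fflush() come from the standard library.

```c
#include <stdio.h>

/* Copies everything from src to dest in 64 KiB chunks.
   Returns 0 on success, -1 if a read or write error occurred. */
int copy_stream(FILE *src, FILE *dest)
{
    static unsigned char buf[65536];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dest) != n) {
            return -1;          /* short write: disk full, I/O error, ... */
        }
    }
    if (ferror(src)) {
        return -1;              /* fread() stopped because of an error, not EOF */
    }
    if (fflush(dest) != 0) {
        return -1;              /* buffered data could not be written out */
    }
    return 0;
}
```

A caller would open both files (ideally with "rb" and "wb"), call copy_stream(src, dest), and report a failure with perror() if it returns -1.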
EOF is not an unsigned char but an int. See the prototype of fgetc:
int fgetc(FILE *stream);
I have a very long file, "dict.txt".
Its size is about 2400273 bytes (measured with fseek and SEEK_END).
The file contains many entries like 'apple = 사과' (similar to a dictionary).
The main problem is that reading the file takes a very long time.
I couldn't find a solution to this problem on Google.
I guess the cause is related to my use of fgets(), but I don't know exactly.
Please help me.
Here is my code, written in C:
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
int line = 0;
char txt_str[50];
FILE* pFile;
pFile = fopen("dict_test.txt", "r");
if (pFile == NULL) {
printf("file doesn't exist or there is problem to open your file\n");
}
else {
do{
fgets(txt_str, 50, pFile);;
line++;
} while (txt_str != EOF);
}
printf("%d", line);
}
Output
I couldn't see a result because the program kept running.
Expected
the number of lines of this txt file
Major
OP's code fails to test the return value of fgets(). The code needs to check that return value to know when to stop.
do{
fgets(txt_str, 50, pFile);; // fgets() return value not used.
Other
The line count should not be incremented when fgets() returns NULL.
The line count should not be incremented when fgets() reads only part of a line, i.e. when the line is 50 characters or longer. It is reasonable to use a buffer wider than 50.
The line count may exceed INT_MAX. There is always some upper bound, yet it is trivial to use a wider type.
It is good practice to close the stream.
Another approach to counting lines would use fread() to read chunks of the file and then look for the start of each line. (Not shown)
I recommend printing a '\n' after the line count.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
  FILE* pFile = fopen("dict_test.txt", "r");
  if (pFile == NULL) {
    printf("File doesn't exist or there is problem to open your file.\n");
    return EXIT_FAILURE;
  }
  unsigned long long line = 0;
  char txt_str[4096];
  while (fgets(txt_str, sizeof txt_str, pFile)) {
    // Buffer full and last stored character not '\n'? Then only part of
    // a line was read. The last character is at index size - 2, because
    // index size - 1 holds the terminating '\0'.
    if (strlen(txt_str) == sizeof txt_str - 1 &&
        txt_str[sizeof txt_str - 2] != '\n') {
      continue;
    }
    line++;
  }
  fclose(pFile);
  printf("%llu\n", line);
}
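The fread() approach mentioned above as "not shown" could be sketched roughly as follows. The name count_newlines is hypothetical, and the sketch counts '\n' bytes (as wc -l does), so a final line without a trailing newline is not counted.

```c
#include <stdio.h>
#include <string.h>

/* Counts '\n' bytes in fp by reading 64 KiB chunks with fread()
   and scanning each chunk with memchr(). */
unsigned long long count_newlines(FILE *fp)
{
    static char buf[65536];
    unsigned long long lines = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        const char *p = buf;
        const char *end = buf + n;
        /* Jump from newline to newline inside the current chunk. */
        while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            lines++;
            p++;
        }
    }
    return lines;
}
```

Because the data is handled a chunk at a time, line length no longer matters and no partial-line test is needed.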
fgets returns NULL on EOF.
You never look at the return value of
fgets(txt_str, 50, pFile);
and txt_str != EOF compares the address of the array with EOF, which is always true. Your program therefore never sees the end of the file and enters an endless loop.
Try something like this:
char* p_str;
do{
p_str = fgets(txt_str, 50, pFile);
} while (p_str != NULL);
On numerous sources, you can find a simple C program to count the number of lines in a file. I'm using one of these.
#include <stdio.h>
int main(int argc, char* argv[]) {
FILE *file;
long count_lines = 0;
char chr;
file = fopen(argv[1], "r");
while ((chr = fgetc(file)) != EOF)
{
count_lines += chr == '\n';
}
fclose(file); //close file.
printf("%ld %s\n", count_lines, argv[1]);
return 0;
}
However, it fails to count the num. of lines in Top2Billion-probable-v2.txt. It stops on the line
<F0><EE><E7><E0><EB><E8><FF>
and outputs
1367044 Top2Billion-probable-v2.txt
when it should output 1973218846 lines. wc -l somehow avoids the problem (and is amazingly faster).
Should I give up with a correct C implementation of counting the number of lines of a file or how should I space the special characters as wc does?
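fgetc() returns the character read as an unsigned char converted to an int, or EOF. With chr declared as char (evidently signed on your platform), the byte 0xFF, which is the last byte of the line where counting stops, becomes -1 and compares equal to EOF (typically -1), so the loop ends early. Hence declaring chr as int instead of char should solve the issue.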
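A minimal sketch of the counting loop with that fix applied. The function name count_lines is an assumption; the loop body is the one from the question.

```c
#include <stdio.h>

/* Counts '\n' bytes in fp. The result of fgetc() is kept in an int,
   so the byte 0xFF (returned as 255) stays distinct from EOF. */
long count_lines(FILE *fp)
{
    long count = 0;
    int c;

    while ((c = fgetc(fp)) != EOF) {
        count += (c == '\n');
    }
    return count;
}
```

For a file with roughly two billion lines, a wider counter such as long long may be worth considering, since long is only 32 bits on some platforms.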
I want to copy the contents of file1 to file2 exactly as they are (keeping spaces and newlines). I specifically want to copy these contents one small block of chars at a time(this is a small segment of a larger project so bear with me).
I have attempted the following:
#include <stdio.h>
#include <stdlib.h>
#define MAX 5
int main(int argc, char *argv[]) {
FILE *fin, *fout;
char buffer[MAX];
int length;
char c;
if((fin=fopen(argv[1], "r")) == NULL){
perror("fopen");
exit(EXIT_FAILURE);
}
if((fout=fopen(argv[2], "w")) == NULL){
perror("fopen");
exit(EXIT_FAILURE);
}
while(1){
length = 0;
while((c = fgetc(fin)) != EOF && length < MAX){
buffer[length++] = (char) c;
}
if(length == 0){
break;
}
fprintf(fout, "%s", buffer);
}
fclose(fout);
fclose(fin);
}
However, this causes incorrect output to my file2. Any input would be appreciated.
Your buffer is not zero-terminated. Use fwrite instead of fprintf:
fwrite(buffer, 1, length, fout);
And you should check for errors too. Compare the return value of fwrite to length: fwrite returns the number of items actually written (a size_t, so never negative), and a smaller value means a write error occurred, which you can report via perror("fwrite").
Additionally, you may consider opening the files in binary mode, which makes a difference on Windows, i.e. pass "rb" and "wb" to fopen.
Last but not least, instead of looping and getting one character at a time, consider using fread:
length = fread(buffer, 1, MAX, fin);
Here is a simple example (with minimal error checking).
You should use fwrite(), since the data you write to the file is not null-terminated. Also note that the "b" mode is passed to fopen(), which opens the file as a binary file.
#include <stdio.h>
#include <stdlib.h>
#define FILE_BLOCK_SIZE 50
int main(void)
{
    FILE *fin, *fout;
    size_t BufContentSz;
    unsigned char *BufContent = malloc(FILE_BLOCK_SIZE);
    if (BufContent == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    if ((fin = fopen("E:\\aa.txt", "rb")) == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    if ((fout = fopen("E:\\bb.txt", "wb")) == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    /* fread() returns the number of bytes actually read; 0 means EOF or error. */
    while ((BufContentSz = fread(BufContent, sizeof(unsigned char), FILE_BLOCK_SIZE, fin)) > 0)
    {
        fwrite(BufContent, sizeof(unsigned char), BufContentSz, fout);
    }
    fclose(fout);
    fclose(fin);
    free(BufContent);  /* memory from malloc() is released with free(), not delete */
    return 0;
}
First off, change char c; to int c;, because plain char can be either signed or unsigned, depending on your implementation. If it is unsigned, c = EOF gives c a large positive value, so the comparison with EOF never succeeds and the loop never ends. If it is signed, a legitimate byte such as 0xFF compares equal to EOF and the loop ends too early. An int is large enough to hold every character value and EOF. Leave buffer as a char array, though: an int array would not be written out correctly by a byte-wise fwrite().
Then, change your
fprintf(fout, "%s", buffer);
to
fwrite(buffer, 1, length, fout);
This is because fprintf(fout, "%s", buffer); expects a C-style string, which ends with a '\0', but your buffer isn't zero-terminated. As a result, the program will keep copying whatever follows the buffer on the stack until a '\0' is met, leaving lots of garbage in file2.
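A sketch of the block-wise loop with these changes applied. The function name copy_in_blocks and the BLOCK constant are hypothetical. It also swaps the order of the two loop conditions: in the question's loop, fgetc() is called before length < MAX is tested, so one byte is read and thrown away every time the buffer fills up.

```c
#include <stdio.h>

#define BLOCK 5   /* hypothetical block size, same value as MAX in the question */

/* Copies fin to fout in blocks of at most BLOCK bytes.
   Returns 0 on success, -1 on a read or write error. */
int copy_in_blocks(FILE *fin, FILE *fout)
{
    unsigned char buffer[BLOCK];
    size_t length;
    int c = 0;                  /* int, so EOF stays distinguishable */

    do {
        length = 0;
        /* Test the length first, so no byte is read and then dropped. */
        while (length < BLOCK && (c = fgetc(fin)) != EOF) {
            buffer[length++] = (unsigned char)c;
        }
        /* Write exactly the bytes collected; no terminator is needed. */
        if (fwrite(buffer, 1, length, fout) != length) {
            return -1;
        }
    } while (c != EOF);

    return ferror(fin) ? -1 : 0;
}
```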
I use the code below to read a char from a file and replace it with another,
but there is an error: the loop never reaches the end of the file.
What is wrong?
I tested this code on Linux (NetBeans IDE) and it worked correctly, but when I tried VS 2008 on Windows, I got an endless loop.
//address = test.txt
FILE *fp;
fp=fopen(address,"r+");
if(fp == 0)
{
printf("can not find!!");
}
else
{
char w = '0'; /// EDIT : int w;
while(1)
{
if((w = fgetc(fp)) != EOF)
{
if((w = fgetc(fp)) != EOF)
{
fseek(fp,-2,SEEK_CUR);
fprintf(fp,"0");
}
}
else
{
break;
}
}
}
fclose(fp);
You are storing the result of fgetc in a char, instead of an int.
char w = '0'; /* Wrong, should be int. */
Incidentally, this problem is mentioned in the C FAQ.
If type char is unsigned, an actual EOF value will be truncated (by having its higher-order bits discarded, probably resulting in 255 or 0xff) and will not be recognized as EOF, resulting in effectively infinite input.
EDIT
Reading your question again, it's highly fishy the way you seek back two characters and write one character. That could well lead to an infinite loop.
EDIT2
You (likely) want something like this (untested):
while ((w = getc(fp)) != EOF) {
fseek(fp, -1, SEEK_CUR);
fprintf(fp, "0");
fflush(fp); /* Apparently necessary, see the answer of David Grayson. */
}
The fopen documentation on cplusplus.com says:
For the modes where both read and writing (or appending) are allowed (those which include a "+" sign), the stream should be flushed (fflush) or repositioned (fseek, fsetpos, rewind) between either a reading operation followed by a writing operation or a writing operation followed by a reading operation.
We can add an fflush call after the fprintf to satisfy that requirement.
Here is my working code. It creates a file named example.txt and after the program exits that file's contents will be 000000000000n.
#include <stdio.h>
int main(int argc, char **argv)
{
FILE * fp;
int w;
fp = fopen("example.txt","w");
fprintf(fp, "David Grayson");
fclose(fp);
fp = fopen("example.txt","r+");
while(1)
{
if((w = fgetc(fp)) != EOF)
{
if((w = fgetc(fp)) != EOF)
{
fseek(fp,-2,SEEK_CUR);
fprintf(fp,"0");
fflush(fp); // Necessary!
}
}
else
{
break;
}
}
fclose(fp);
}
This was tested with MinGW in Windows.
I have a text file that has strings on each line. I want to increment a number for each line in the text file, but when it reaches the end of the file it obviously needs to stop. I've tried doing some research on EOF, but couldn't really understand how to use it properly.
I'm assuming I need a while loop, but I'm not sure how to do it.
How you detect EOF depends on what you're using to read the stream:
function   result on EOF or error
--------   ----------------------
fgets()    NULL
fscanf()   number of successful conversions
           less than expected
fgetc()    EOF
fread()    number of elements read
           less than expected
Check the result of the input call for the appropriate condition above, then call feof() to determine if the result was due to hitting EOF or some other error.
Using fgets():
char buffer[BUFFER_SIZE];
while (fgets(buffer, sizeof buffer, stream) != NULL)
{
// process buffer
}
if (feof(stream))
{
// hit end of file
}
else
{
// some other error interrupted the read
}
Using fscanf():
char buffer[BUFFER_SIZE];
while (fscanf(stream, "%s", buffer) == 1) // expect 1 successful conversion
{
// process buffer
}
if (feof(stream))
{
// hit end of file
}
else
{
// some other error interrupted the read
}
Using fgetc():
int c;
while ((c = fgetc(stream)) != EOF)
{
// process c
}
if (feof(stream))
{
// hit end of file
}
else
{
// some other error interrupted the read
}
Using fread():
char buffer[BUFFER_SIZE];
while (fread(buffer, sizeof buffer, 1, stream) == 1) // expecting 1
// element of size
// BUFFER_SIZE
{
// process buffer
}
if (feof(stream))
{
// hit end of file
}
else
{
// some other error interrupted read
}
Note that the form is the same for all of them: check the result of the read operation; if it failed, then check for EOF. You'll see a lot of examples like:
while(!feof(stream))
{
fscanf(stream, "%s", buffer);
...
}
This form doesn't work the way people think it does, because feof() won't return true until after you've attempted to read past the end of the file. As a result, the loop executes one time too many, which may or may not cause you some grief.
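To make the difference concrete, here is a small sketch with two hypothetical helper functions that count whitespace-separated words. For a file containing "one two" followed by a newline, the feof()-controlled version returns 3 and the checked version returns 2.

```c
#include <stdio.h>

/* Flawed pattern: feof() only becomes true after a read has already
   failed, so the body runs once more than there are words. */
int count_words_feof(FILE *fp)
{
    char word[64];
    int count = 0;

    while (!feof(fp)) {
        int ignored = fscanf(fp, "%63s", word);  /* result never examined: the flaw */
        (void)ignored;
        count++;
    }
    return count;
}

/* Recommended pattern: the result of the read controls the loop. */
int count_words_checked(FILE *fp)
{
    char word[64];
    int count = 0;

    while (fscanf(fp, "%63s", word) == 1) {
        count++;
    }
    return count;
}
```

If the file does not end with a newline, the last successful read already hits the end of the file and both versions agree, which is why the problem only shows up some of the time.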
One possible C loop would be:
#include <stdio.h>
int main()
{
int c;
while ((c = getchar()) != EOF)
{
/*
** Do something with c, such as check against '\n'
** and increment a line counter.
*/
}
}
For now, I would ignore feof and similar functions. Experience shows that it is far too easy to call it at the wrong time and process something twice in the belief that EOF hasn't yet been reached.
Pitfall to avoid: using char for the type of c. getchar returns the next character cast to an unsigned char and then to an int. This means that on most [sane] platforms the value of EOF and valid "char" values in c don't overlap so you won't ever accidentally detect EOF for a 'normal' char.
You should check for EOF after reading from the file:
fscanf_s // read from file
while(condition) // check EOF
{
fscanf_s // read from file
}
I would suggest using the fseek and ftell functions.
FILE *stream = fopen("example.txt", "r");
if(!stream) {
puts("I/O error.\n");
return;
}
fseek(stream, 0, SEEK_END);
long size = ftell(stream);
fseek(stream, 0, SEEK_SET);
while(1) {
if(ftell(stream) == size) {
break;
}
/* INSERT ROUTINE */
}
fclose(stream);