writing multiple lines back to a file, wrong file length - c

I want to read a multiline text file's content and write it back to the file.
However, I have an issue with the file length.
The null terminator (0) that I add is after some strange characters.
Something wrong with my f_length ?
Edit : If I set the line-endings of my file to Unix (LF), I don't have the issue. So it seems that my code is incompatible with Windows line endings. How can I account for Windows text files ?
int main()
{
FILE *fp = NULL;
int f_length;
char *buffer = NULL;
size_t size = 0;
fp = fopen(FILENAME, "r+");
fseek(fp, 0, SEEK_END);
f_length = ftell(fp);
rewind(fp);
buffer = malloc((f_length + 1) * sizeof(*buffer));
fread(buffer, f_length, 1, fp);
buffer[f_length] = 0;
printf("%s\n", buffer);
fp = fopen(FILENAME, "w+");
fputs(buffer, fp);
fclose(fp);
return 0;
}

Use fp = fopen(FILENAME, "rb+"); instead. For text files, newline characters are translated while reading (as you've already noticed in the comments). The translated form can be shorter (a single "\n" where the file itself contains "\r\n"), so f_length will be larger than the number of bytes actually read.
Alternatively, you can use line-by-line reading functions; they are made for text-mode files.

You are assuming the size of the file on disk is going to be equal to the number of bytes you will read. That is a valid assumption for a clean, binary file. It is not a valid assumption for a text file.
I'd suggest using the return value from fread instead of f_length as it reports the number of objects you actually read after any required read processing. You'll need to adjust your fread parameters to read 1-byte sized objects.

regarding:
buffer[f_length] = 0;
This places the '\0' too far into the buffer, which is why you see garbage characters. It is much better to capture the value returned by the call to fread() and set the '\0' using:
buffer[ <returnedValue> ] = '\0';
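Putting the suggestions from these answers together, here is a minimal sketch, not a definitive implementation. It assumes a regular file on a common desktop platform where fseek()/ftell() in binary mode report a byte count. The name read_whole_file and its out_len parameter are hypothetical, introduced only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a freshly
 * allocated, NUL-terminated buffer. Returns NULL on failure.
 * The caller frees the result. */
char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");      /* binary mode: no \r\n translation */
    if (fp == NULL)
        return NULL;

    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long f_length = ftell(fp);
    if (f_length < 0) { fclose(fp); return NULL; }
    rewind(fp);

    char *buffer = malloc((size_t)f_length + 1);
    if (buffer == NULL) { fclose(fp); return NULL; }

    /* Read 1-byte objects so the return value is a byte count. */
    size_t n = fread(buffer, 1, (size_t)f_length, fp);
    buffer[n] = '\0';                  /* terminate after the bytes actually read */

    fclose(fp);
    if (out_len != NULL)
        *out_len = n;
    return buffer;
}
```

Because the terminator goes at index n rather than at f_length, the string stays well formed even if fewer bytes arrive than ftell() reported.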

Related

how to download an unrestricted line from a file using the fread function?

How can I read a line of text from a file without specifying the size of the line? I would rather not use fgets, because you have to give fgets the number of characters in advance. I can load the whole file, but not a single line.
FILE *f;
long lSize;
char *buffer;
size_t result;
f = fopen("file.txt", "r");
fseek(f, 0, SEEK_END);
lSize = ftell(f);
rewind (f);
buffer = (char*)malloc(sizeof(char)*lSize);
result = fread(buffer,1,lSize, f);
fclose(f);
free(buffer);
1. Use malloc() to set an initial buffer for your text line. Say, 16 chars.
2. Loop over the file and retrieve one character at a time with fgetc(). Store it into your buffer, at the appropriate place. If it is a newline, put a NUL character instead in the buffer and exit the loop.
3. When the buffer is about to get full, realloc() it and expand it for 16 more chars. If the realloc is successful, go to step 2.
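The steps above can be sketched roughly as follows. The function name read_line is hypothetical, and the fixed growth of 16 chars simply follows the suggestion in the answer; treat it as an illustration rather than a tuned implementation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line of any length from `f`.
 * Returns a malloc'd string without the trailing newline,
 * or NULL at end of file or on allocation failure. */
char *read_line(FILE *f)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;                               /* int, so EOF is distinguishable */
    while ((c = fgetc(f)) != EOF && c != '\n') {
        if (len + 1 == cap) {            /* keep one slot for the terminator */
            char *tmp = realloc(buf, cap + 16);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap += 16;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {          /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Call it in a loop until it returns NULL, and free() each line once you are done with it. On POSIX systems, getline() does much the same job.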

character by character reading from a file in C

How to read text from a file into a dynamic array of characters?
I found a way to count the number of characters in a file and create a dynamic array, but I can't figure out how to assign characters to the elements of the array?
FILE *text;
char* Str;
int count = 0;
char c;
text = fopen("text.txt", "r");
while(c = (fgetc(text))!= EOF)
{
count ++;
}
Str = (char*)malloc(count * sizeof(char));
fclose(text);
There is no portable, standard-conforming way in C to know in advance how many bytes may be read from a FILE stream.
First, the stream might not even be seekable - it can be a pipe or a terminal or even a socket connection. On such streams, once you read the input it's gone, never to be read again. You can push back one char value, but that's not enough to be able to know how much data remains to be read, or to reread the entire stream.
And even if the stream is to a file that you can seek on, you can't use fseek()/ftell() in portable, strictly-conforming C code to know how big the file is.
If it's a binary stream, you can not use fseek() to seek to the end of the file - that's explicitly undefined behavior per the C standard:
... A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
Footnote 268 even says:
Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream ...
So you can't portably use fseek() in a binary stream.
And you can't use ftell() to get a byte count for a text stream. Per the C standard again:
For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read.
Systems do exist where the value returned from ftell() is nothing like a byte count.
The only portable, conforming way to know how many bytes you can read from a stream is to actually read them, and you can't rely on being able to read them again.
If you want to read the entire stream into memory, you have to continually reallocate memory, or use some other dynamic scheme.
This is a very inefficient but portable and strictly-conforming way to read the entire contents of a stream into memory (all error checking and header files are omitted for algorithm clarity and to keep the vertical scrollbar from appearing - it really needs error checking and will need the proper header files):
// get input stream with `fopen()` or some other manner
FILE *input = ...
size_t count = 0;
char *data = NULL;
for ( ;; )
{
int c = fgetc( input );
if ( c == EOF )
{
break;
}
data = realloc( data, count + 1 );
data[ count ] = c;
count++;
}
// optional - terminate the data with a '\0'
// to treat the data as a C-style string
data = realloc( data, count + 1 );
data[ count ] = '\0';
count++;
That will work no matter what the stream is.
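For comparison, here is a sketch of the same read-until-EOF idea with error checking added and the buffer doubled when it fills up, instead of reallocating for every character. The name read_stream, the out_len parameter and the 4096-byte starting size are assumptions made for this example only.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from `input` until EOF.
 * Returns a malloc'd, NUL-terminated buffer, or NULL on failure;
 * stores the byte count (excluding the terminator) in *out_len. */
char *read_stream(FILE *input, size_t *out_len)
{
    size_t cap = 4096, len = 0;
    char *data = malloc(cap);
    if (data == NULL)
        return NULL;

    for (;;) {
        /* Leave one byte free for the terminator. */
        size_t n = fread(data + len, 1, cap - len - 1, input);
        len += n;
        if (n == 0)
            break;                       /* EOF or read error */
        if (len == cap - 1) {            /* buffer full: double it */
            char *tmp = realloc(data, cap * 2);
            if (tmp == NULL) { free(data); return NULL; }
            data = tmp;
            cap *= 2;
        }
    }

    if (ferror(input)) { free(data); return NULL; }
    data[len] = '\0';
    *out_len = len;
    return data;
}
```

It never seeks, so it should behave the same on pipes and terminals as on regular files.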
On a POSIX-style system such as Linux, you can use fileno() and fstat() to get the size of a file (again, all error checking and header files are omitted):
char *data = NULL;
FILE *input = ...
int fd = fileno( input );
struct stat sb;
fstat( fd, &sb );
if ( S_ISREG( sb.st_mode ) )
{
// sb.st_size + 1 for C-style string
// assign to the outer data; redeclaring it here would shadow it and leave it NULL
data = malloc( sb.st_size + 1 );
data[ sb.st_size ] = '\0';
}
// now if data is not NULL you can read into the buffer data points to
// if data is NULL, see above code to read char-by-char
// this tries to read the entire stream in one call to fread()
// there are a lot of other ways to do this
size_t totalRead = 0;
while ( totalRead < sb.st_size )
{
size_t bytesRead = fread( data + totalRead, 1, sb.st_size - totalRead, input );
if ( bytesRead == 0 )
{
// EOF or error: stop instead of looping forever
break;
}
totalRead += bytesRead;
}
The above code should work on Windows, too. You may get some compiler warnings or have to use _fileno(), _fstat() and struct _stat instead.
You may also need to define the S_ISREG() macro on Windows:
#define S_ISREG(m) (((m) & S_IFMT) == S_IFREG)
For a binary file, you can use fseek and ftell to know the size without reading the file, allocate the memory and then read everything:
...
text = fopen("text.txt", "r");
fseek(text, 0, SEEK_END);
char *ix = Str = malloc(ftell(text));
// go back to the start before reading
rewind(text);
// c should be declared as int so that EOF can be detected reliably
while((c = fgetc(text)) != EOF)
{
*ix++ = c;
}
count = ix - Str; // get the exact count...
...
For a text file, on a system that has a multi-byte end of line (like Windows which uses \r\n), this will allocate more bytes than required. You could of course scan the file twice, first time for the size and second for actually reading the characters, but you can also just ignore the additional bytes, or you could realloc:
...
count = ix - Str;
Str = realloc(Str, count);
...
Of course, in a real-world program you should check the return values of all I/O and allocation functions: fopen, fseek, ftell, malloc and realloc...
To just do what you asked for, you would have to read the whole file again:
...
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read
size_t readsize = fread(Str, sizeof(char), count, text); // fread returns size_t
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
// do stuff with it
// ...
fclose(text);
But your string is not null terminated this way. That will get you in some trouble if you try to use some common string functions like strlen.
To properly null terminate your string you would have to allocate space for one additional character and set that last one to '\0':
...
// allocate count + 1 (for the null terminator)
Str = (char*)malloc((count + 1) * sizeof(char));
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read
size_t readsize = fread(Str, sizeof(char), count, text); // fread returns size_t
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
// add null terminator
Str[count] = '\0';
// do stuff with it
// ...
fclose(text);
Now, if you want to know the number of characters in the file without counting them one by one, you can get that number in a more efficient way:
...
text = fopen("text.txt", "r");
// seek to the end of the file
fseek(text, 0L, SEEK_END);
// get your current position in that file
count = ftell(text);
// allocate count + 1 (for the null terminator)
Str = (char*)malloc((count + 1) * sizeof(char));
...
Putting it all together in a more structured form:
// open file
FILE *text = fopen("text.txt", "r");
// seek to the end of the file
fseek(text, 0L, SEEK_END);
// get your current position in that file
long count = ftell(text); // ftell returns long
// allocate count + 1 (for the null terminator)
char* Str = (char*)malloc((count + 1) * sizeof(char));
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read
size_t readsize = fread(Str, sizeof(char), count, text); // fread returns size_t
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
fclose(text);
// add null terminator
Str[count] = '\0';
// do stuff with it
// ...
Edit:
As Andrew Henle pointed out not every FILE stream is seekable and you can't even rely on being able to read the file again (or that the file has the same length/content when reading it again). Even though this is the accepted answer, if you don't know in advance what kind of file stream you're dealing with, his solution is definitely the way to go.

Reading 7M data from a file fails [duplicate]

I am trying to read 7 MB of data from a file, but it is failing. When I googled, I found that there is no limit on how much data can be read.
My code given below is failing with segmentation fault.
char *buf = malloc(7008991);
FILE *fp = fopen("35mb.txt", "rb");
long long i = 0;
long long j = 0;
while(fgets(buf+i, 1024, fp)) {
i+=strlen(buf);
if(i==7008991)break;
}
printf("read done");
printf("ch=%s\n", buf);
Need some help
If you want to read the content of a large file into memory, you may:
1. actually read it
2. mmap it.
I'll cover how to actually read it, and assume using binary mode and no text-mode mess.
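FILE* fp;
long flen;
char* buffer;
// Open the file
fp = fopen ("35mb.txt", "rb");
if ( fp == NULL ) return -1; // Fail
// Get file length; there are other ways to do this, such as fstat
// TODO: check failure
fseek ( fp, 0, SEEK_END );
flen = ftell ( fp );
fseek ( fp, 0, SEEK_SET );
// Allocate a buffer large enough for the whole file
buffer = malloc ( flen );
if ( buffer == NULL ) { fclose ( fp ); return -1; } // Fail
if ( fread ( buffer, flen, 1, fp ) != 1 ) {
// Fail
}
fclose ( fp );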
There are a few things that could go wrong here.
Firstly, on this line, memory allocation can fail (malloc can return a NULL pointer; you should check this). You should also check that the file opened without error.
char *buf = malloc(7008991);
Next, in the loop. Remember that fgets reads one line, regardless of how long that is, up to a maximum of 1024-1 bytes (and appends a null character). Please note that for binary input, using fread is probably more appropriate.
while(fgets(buf+i, 1024, fp)) {
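After that, advancing the offset by the length just read is the right idea, as you really do not know how long a line is. Note, though, that strlen(buf) measures from the start of the buffer rather than from buf+i, so the offset grows too quickly after the first line.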
i+=strlen(buf);
This line however is probably why you are failing.
if(i==7008991)break;
You are requiring the size to be exactly 7008991 bytes to break. That is rather unlikely unless you are very, very sure about the formatting of your file. This line should probably read if ( i >= 7008991 ) break;
You should probably replace your explicit size with a named constant as well.
Most probably the size of your file is exactly 7008991 bytes. But each call to fgets is allowed to write up to 1024 bytes, and near the end of the buffer there is no longer that much room. Suppose you have already read 7008990 bytes: you should then call fgets with fgets(buf+i, 1, fp), because your buffer has no more than one byte left.
Another issue is that you want to print the buffer at the end of your program. For this to work your buffer must be NUL terminated. So you need to allocate one more byte than the file size. fgets will automatically append the NUL byte.
Yet another issue is the way you increment your counter: i += strlen(buf) is wrong; the correct code is i = strlen(buf).
All of this assumes there are no NUL bytes in your file. As already explained in the comments, it is wiser to use fgets only when dealing with text files. When reading binary files, you had better use fread.
The corrected code would be:
unsigned long FILE_SIZE = 7008991+1;
char *buf = malloc(FILE_SIZE);
FILE *fp = fopen("35mb.txt", "rb");
long long i = 0;
long long j = 0;
while(fgets(buf+i, FILE_SIZE-i, fp)) {
i = strlen(buf);
if(i==7008991)break;
}
printf("read done");
printf("ch=%s\n", buf);
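As a variation on the corrected loop, here is a small sketch that passes the remaining space to every fgets call and advances by the length of the piece just read. The helper name read_lines_bounded and its parameters are hypothetical. It assumes the input contains no NUL bytes, since it relies on strlen.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends lines from `fp` into `buf`, which has room
 * for `cap` bytes including the terminator. Returns the number of
 * characters stored (excluding the terminator). */
size_t read_lines_bounded(FILE *fp, char *buf, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return 0;
    buf[0] = '\0';

    while (cap - used > 1) {             /* fgets needs room for 1 char + NUL */
        size_t room = cap - used;
        if (room > INT_MAX)
            room = INT_MAX;              /* fgets takes an int size */
        if (fgets(buf + used, (int)room, fp) == NULL)
            break;                       /* EOF or error */
        used += strlen(buf + used);      /* length of the piece just read */
    }
    return used;
}
```

With a buffer of file size + 1 bytes, this reads the whole file and stops by itself, so no exact-size comparison is needed to leave the loop.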

Read contents of a file into string while skipping first two lines

I'm trying to read the contents of a file into one string without reading in the first two lines.
Right now I have:
char* LoadDocument(char* name) {
char* buffer = 0;
long length;
FILE* f = fopen(name, "r");
if(f) {
fseek(f, 0, SEEK_END);
length = ftell(f);
fseek(f, 0, SEEK_SET);
buffer = malloc(length);
if (buffer) {
fgets (buffer, 100, f);
}
fclose (f);
}
return buffer;
}
But I'm not sure how to skip the first two lines. Also, it appears my malloc is insufficient to hold the whole file here, because it's not getting the whole file.
One solution is to read the complete file into the buffer, manually find the end of the second line, and move the remaining data to the beginning of the buffer.
Also, don't forget to add the string terminator, if you want to use the buffer as a string.
The easiest way is just to read two lines before your "real" reading begins:
char line[1024];
fgets(line, sizeof line, f);
fgets(line, sizeof line, f);
You should probably error-check this too, since the file might be shorter in which case you won't get the expected results. Also the length might be too short. If you really want to support any length, read single characters until you find the end of line, twice.
Your final fgets() should probably be a fread() call, to read the entire rest of the file. You might want to compensate for the lost length due to the initial skipping, too.
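A rough sketch combining these suggestions: skip two lines character by character, then fread() the remainder into a terminated buffer. The function name load_document_skip2 is made up for this example. It opens the file in binary mode on the assumption that ftell() offsets are then byte counts, which holds on common platforms but is not guaranteed by the C standard.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical rewrite of LoadDocument: returns the file contents after the
 * first two lines as a malloc'd, NUL-terminated string, or NULL on failure. */
char *load_document_skip2(const char *name)
{
    FILE *f = fopen(name, "rb");
    if (f == NULL)
        return NULL;

    /* Skip two lines of any length, one character at a time. */
    int c, skipped = 0;
    while (skipped < 2 && (c = fgetc(f)) != EOF) {
        if (c == '\n')
            skipped++;
    }

    long start = ftell(f);               /* where the third line begins */
    if (start < 0 || fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long end = ftell(f);
    if (end < start || fseek(f, start, SEEK_SET) != 0) { fclose(f); return NULL; }

    size_t rest = (size_t)(end - start); /* length minus the skipped part */
    char *buffer = malloc(rest + 1);     /* +1 for the string terminator */
    if (buffer != NULL) {
        size_t n = fread(buffer, 1, rest, f);
        buffer[n] = '\0';
    }
    fclose(f);
    return buffer;
}
```

If the file has fewer than two lines, this returns an empty string rather than NULL.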

fread() puts weird things into char array

I have a file that I want to be read from and printed out to the screen. I'm using XCode as my IDE. Here is my code...
fp=fopen(x, "r");
char content[102];
fread(content, 1, 100, fp);
printf("%s\n", content);
The content of the file is "Bacon!" What it prints out is \254\226\325k\254\226\234.
I have Googled all over for this answer, but the documentation for file I/O in C seems to be sparse, and what little there is is not very clear. (To me at least...)
EDIT: I switched to just reading, not appending and reading, and switched the two middle arguments in fread(). Now it prints out Bacon!\320H\320 What do these things mean? Things as in backslash number number number or letter. I also switched the way to print it out as suggested.
You are opening the file for appending and reading. You should be opening it for reading, or moving your read pointer to the place from which you are going to read (the beginning, I assume).
FILE *fp = fopen(x, "r");
or
FILE *fp = fopen(x, "a+");
rewind(fp);
Also, fread(...) does not zero-terminate your string, so you should terminate it before printing:
size_t len = fread(content, 1, 100, fp);
content[len] = '\0';
printf("%s\n", content);
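Wrapped up with the missing checks, that could look something like the sketch below. The helper name read_prefix and its parameters are hypothetical; the point is only that at most cap - 1 bytes are read, so the terminator always fits.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to cap - 1 bytes from the file at `path`
 * into `content` and NUL-terminates it. Returns the number of bytes read,
 * or 0 if the file could not be opened. */
size_t read_prefix(const char *path, char *content, size_t cap)
{
    if (cap == 0)
        return 0;
    content[0] = '\0';

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;

    size_t len = fread(content, 1, cap - 1, fp);
    content[len] = '\0';                 /* fread does not terminate for us */
    fclose(fp);
    return len;
}
```

With char content[102], calling read_prefix(x, content, sizeof content) and then printing content with "%s" should show only what the file contains.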
I suppose, you meant this:
printf("%s\n", content);
Maybe:
fp = fopen(x, "a+");
if(fp)
{
char content[102];
memset(content, 0 , 102);
// arguments are swapped.
// See : http://www.cplusplus.com/reference/clibrary/cstdio/fread/
// You want to read 1 byte, 100 times
fread(content, 1, 100, fp);
printf("%s\n", content);
}
A possible reason is that you do not terminate the data you read, so printf prints the buffer until it finds a string terminator.
