Difference between specifications of fread and fgets? - c

What is the difference between fread and fgets when reading in from a file?
I use the same fwrite statement, however when I use fgets to read in a .txt file it works as intended, but when I use fread() it does not.
I've switched from fgets/fputs to fread/fwrite when reading from and writing to a file. I've used fopen with "rb"/"wb" to read in binary mode rather than text mode. I understand that fread will also read \0 null bytes, rather than just single lines.
//while (fgets(buff,1023,fpinput) != NULL) //read in from file
while (fread(buff, 1, 1023, fpinput) != 0) // read from file
I expect to read in from a file to a buffer, put the buffer in shared memory, and then have another process read from shared memory and write to a new file.
When I use fgets() it works as intended with .txt files, but when I use fread it puts a single line of roughly 300 characters into the buffer, with a newline. I can't for the life of me figure out why.

fgets will stop when encountering a newline. fread does not. So fgets is typically only useful for text files, while fread can be used for both text and binary files.
From the C11 standard:
7.21.7.2 The fgets function
The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.
7.21.8.1 The fread function
The fread function reads, into the array pointed to by ptr, up to nmemb elements whose size is specified by size, from the stream pointed to by stream. For each object, size calls are made to the fgetc function and the results stored, in the order read, in an array of unsigned char exactly overlaying the object. The file position indicator for the stream (if defined) is advanced by the number of characters successfully read. If an error occurs, the resulting value of the file position indicator for the stream is indeterminate. If a partial element is read, its value is indeterminate.
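To make the contrast concrete, here is a minimal sketch of the two calls side by side. The helper names read_line and read_chunk are hypothetical, chosen only for illustration. The point is that fgets hands back a null-terminated string that ends at the first newline, while fread hands back a byte count and leaves the buffer unterminated.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line with fgets and returns its length (0 at end-of-file
   or on error). The result in buf is always a null-terminated string. */
size_t read_line(FILE *fp, char *buf, int cap)
{
    if (fgets(buf, cap, fp) == NULL)
        return 0;
    return strlen(buf);
}

/* Reads up to cap bytes with fread and returns how many were read.
   Nothing is appended to buf, so the return value is the only
   reliable length. */
size_t read_chunk(FILE *fp, char *buf, size_t cap)
{
    return fread(buf, 1, cap, fp);
}
```

In a loop like the one in the question, this suggests keeping the value returned by fread and passing that count on to whatever consumes the buffer (for example fwrite), rather than treating the buffer as a string.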
This snippet may make things clearer for you. It just copies a file in chunks.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char ** argv)
{
if(argc != 3) {
fprintf(stderr, "Usage: ./a.out src dst\n");
fprintf(stderr, "Copies file src to dst\n");
exit(EXIT_FAILURE);
}
const size_t chunk_size = 1024;
FILE *in, *out;
if(! (in = fopen(argv[1], "rb"))) exit(EXIT_FAILURE);
if(! (out = fopen(argv[2], "wb"))) exit(EXIT_FAILURE);
char * buffer;
if(! (buffer = malloc(chunk_size))) exit(EXIT_FAILURE);
size_t bytes_read;
do {
// fread returns the number of successfully read elements
bytes_read = fread(buffer, 1, chunk_size, in);
/* Insert any modifications you may */
/* want to do here */
// write bytes_read bytes from buffer to output file
if(fwrite(buffer, 1, bytes_read, out) != bytes_read) exit(EXIT_FAILURE);
// When we read less than chunk_size we are either done or an error has
// occurred. This error is not handled in this program.
} while(bytes_read == chunk_size);
free(buffer);
fclose(out);
fclose(in);
}
You mentioned in a comment below that you wanted to use this for byteswapping. Well, you can just use the following snippet. Just insert it where indicated in code above.
for(size_t i=0; i + 1 < bytes_read; i+=2) {
char tmp = buffer[i];
buffer[i] = buffer[i+1];
buffer[i+1] = tmp;
}
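The same loop can be packaged as a small function, which makes the odd-length case explicit. This is only a sketch, and the name swap_byte_pairs is hypothetical.

```c
#include <stddef.h>

/* Swaps each adjacent pair of bytes in buf[0..len-1].
   If len is odd, the final byte is left untouched. */
void swap_byte_pairs(char *buf, size_t len)
{
    for (size_t i = 0; i + 1 < len; i += 2) {
        char tmp = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
}
```

Called as swap_byte_pairs(buffer, bytes_read) inside the copy loop, it relies on chunk_size being even: every full chunk then contains whole pairs, and only the last, shorter chunk can end with an unpaired byte.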

Related

How to read a line in a file, without knowing how long it is (C language)

I have a file ("career.txt") where each line is composed this way:
serial number(long int) name(string) surname(string) exam_id(string) exam_result(string);
Each line can have from 1 to 25 pairs, each made up of an exam_id (e.g. INF070) and an exam_result (e.g. 30L).
Ex line:
333145 Name Surname INF120 24 INF070 28 INF090 R INF100 30L INF090 24
33279 Name Surname GIU123 28 GIU280 27 GIU085 21 GIU300 R
I don't know how many pairs there are in a line (there could be 1, 5 or 25), so I can't use a for loop with a fixed count, and I have to read all of them.
How can I read the entire line?
I tried using getline and fgets, and then splitting the string with strtok, but I don't know why it doesn't work.
If you don't know the maximum number of characters in a line of the text file, one way to read the entire line is to dynamically allocate memory for the buffer as you read the file.
You can use the getline function to do this which is a POSIX function.
Here is an example of how you can use getline to read a line from a file:
#include <stdio.h>
#include <stdlib.h>
int main() {
char *buffer = NULL;
size_t bufsize = 0;
ssize_t characters;
FILE *file = fopen("file.txt", "r");
if (file == NULL) {
printf("Could not open file\n");
return 1;
}
while ((characters = getline(&buffer, &bufsize, file)) != -1) {
printf("Line: %s", buffer);
}
free(buffer);
fclose(file);
return 0;
}
The getline function allocates the buffer automatically, so you don't have to specify a fixed buffer size. It takes three arguments: a pointer to a pointer to a character (in this case, &buffer), a pointer to a size_t variable (in this case, &bufsize), and a pointer to a FILE object representing the file you want to read from. The function reads a line from the file and stores it in the buffer, reallocating it if more memory is needed. It returns the number of characters read, including the newline at the end of the line if there is one, or -1 on failure, which includes reaching end-of-file. In the example above, each line is read and printed. It is important to release the memory once you have finished with the buffer, here with free(buffer).
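Once a whole line is in memory, splitting it into fields is a separate step. Below is a minimal sketch of one way to do that with strtok for the line format described in the question. The struct career, its field sizes, and the function parse_career_line are all hypothetical, and the sketch assumes fields are separated by spaces or tabs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_EXAMS 25

/* Hypothetical record type; the field sizes are assumptions. */
struct career {
    long serial;
    char name[64];
    char surname[64];
    int  n_exams;
    char exam_id[MAX_EXAMS][16];
    char exam_result[MAX_EXAMS][8];
};

/* Parses one line in place (strtok modifies it).
   Returns 0 on success, -1 if the three leading fields are missing. */
int parse_career_line(char *line, struct career *out)
{
    const char *delim = " \t\r\n";

    char *tok = strtok(line, delim);
    if (tok == NULL) return -1;
    out->serial = strtol(tok, NULL, 10);

    tok = strtok(NULL, delim);
    if (tok == NULL) return -1;
    snprintf(out->name, sizeof out->name, "%s", tok);

    tok = strtok(NULL, delim);
    if (tok == NULL) return -1;
    snprintf(out->surname, sizeof out->surname, "%s", tok);

    /* Keep taking (exam_id, exam_result) pairs until the line runs out. */
    out->n_exams = 0;
    while (out->n_exams < MAX_EXAMS) {
        char *id  = strtok(NULL, delim);
        char *res = strtok(NULL, delim);
        if (id == NULL || res == NULL)
            break;
        snprintf(out->exam_id[out->n_exams], sizeof out->exam_id[0], "%s", id);
        snprintf(out->exam_result[out->n_exams], sizeof out->exam_result[0], "%s", res);
        out->n_exams++;
    }
    return 0;
}
```

Inside the getline loop above, this would be called as parse_career_line(buffer, &record) for each line. Because the number of pairs is discovered while tokenizing, it does not need to be known in advance.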

Does fgets() hold somehow where it stopped reading from a FILE *?

I am trying to get a sample (shell script) program on how to write to a file:
#include <unistd.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv){
char buff[1024];
size_t len, idx;
ssize_t wcnt;
for (;;){
if (fgets(buff,sizeof(buff),stdin) == NULL)
return 0;
idx = 0;
len = strlen(buff);
do {
wcnt = write(1,buff + idx, len - idx);
if (wcnt == -1){ /* error */
perror("write");
return 1;
}
idx += wcnt;
} while (idx < len);
}
}
So my problem is this: Let's say I want to write a file of 20000 bytes so every time I can only write (at most) 1024 (buffer size).
Let's say that in my first attempt everything is going perfect and fgets() reads 1024 bytes and in my first do while I write 1024 bytes.
Then, since we wrote "len" bytes we exit the do-while loop.
So now what?? The buffer is full from our previous reading. It seems to me that for some reason it is implied that fgets() will now continue reading from the point it reached in in-file the last time. (buf[1024] here).
How come, fgets() knows where it stopped reading in the in-file?
I checked the man page :
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored in the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
fgets() returns s on success, and NULL on error or when the end of file occurs while no characters have been read.
So from that, I get that it returns a pointer to the first element of buf, which is always buf[0],
that's why I am confused.
When using a FILE stream, it contains information about the position in the file (among other things). fgets and other functions like fread or fwrite merely utilize this information and update it when an operation is performed.
So, whenever fgets reads from the stream, the stream will be updated to maintain the position, so that the next operation starts off where the previous ended.
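A small sketch can make this visible. The helper two_reads below is hypothetical: it calls fgets twice on the same stream, with no seeking in between, and records what ftell reports after each call. On a binary stream the ftell value is a byte offset, so the two positions show the stream advancing on its own.

```c
#include <stdio.h>

/* Calls fgets twice on the same stream and reports the position
   (via ftell) after each call. Returns 0 on success, -1 on failure. */
int two_reads(FILE *fp, char *first, char *second, int cap,
              long *pos_after_first, long *pos_after_second)
{
    if (fgets(first, cap, fp) == NULL)
        return -1;
    *pos_after_first = ftell(fp);

    /* No seeking here: the stream itself remembers where it is. */
    if (fgets(second, cap, fp) == NULL)
        return -1;
    *pos_after_second = ftell(fp);
    return 0;
}
```

The pointer returned by fgets is always the start of the caller's buffer. The position lives in the FILE object, not in that return value.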

character by character reading from a file in C

How to read text from a file into a dynamic array of characters?
I found a way to count the number of characters in a file and create a dynamic array, but I can't figure out how to assign characters to the elements of the array?
FILE *text;
char* Str;
int count = 0;
char c;
text = fopen("text.txt", "r");
while(c = (fgetc(text))!= EOF)
{
count ++;
}
Str = (char*)malloc(count * sizeof(char));
fclose(text);
There is no portable, standard-conforming way in C to know in advance how many bytes may be read from a FILE stream.
First, the stream might not even be seekable - it can be a pipe or a terminal or even a socket connection. On such streams, once you read the input it's gone, never to be read again. You can push back one char value, but that's not enough to be able to know how much data remains to be read, or to reread the entire stream.
And even if the stream is to a file that you can seek on, you can't use fseek()/ftell() in portable, strictly-conforming C code to know how big the file is.
If it's a binary stream, you can not use fseek() to seek to the end of the file - that's explicitly undefined behavior per the C standard:
... A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
Footnote 268 even says:
Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream ...
So you can't portably use fseek() to find the end of a binary stream.
And you can't use ftell() to get a byte count for a text stream. Per the C standard again:
For a text stream, its file position indicator contains unspecified information, usable by the fseek function for returning the file position indicator for the stream to its position at the time of the ftell call; the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read.
Systems do exist where the value returned from ftell() is nothing like a byte count.
The only portable, conforming way to know how many bytes you can read from a stream is to actually read them, and you can't rely on being able to read them again.
If you want to read the entire stream into memory, you have to continually reallocate memory, or use some other dynamic scheme.
This is a very inefficient but portable and strictly-conforming way to read the entire contents of a stream into memory (all error checking and header files are omitted for algorithm clarity and to keep the vertical scrollbar from appearing - it really needs error checking and will need the proper header files):
// get input stream with `fopen()` or some other manner
FILE *input = ...
size_t count = 0;
char *data = NULL;
for ( ;; )
{
int c = fgetc( input );
if ( c == EOF )
{
break;
}
data = realloc( data, count + 1 );
data[ count ] = c;
count++;
}
// optional - terminate the data with a '\0'
// to treat the data as a C-style string
data = realloc( data, count + 1 );
data[ count ] = '\0';
count++;
That will work no matter what the stream is.
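If the per-character realloc is too slow, one common alternative is to read in larger pieces with fread and double the buffer whenever it fills up. The following is a minimal sketch of that idea, not part of the original answer. The function name read_all and the initial capacity of 4096 are assumptions, and overflow of the capacity is not checked.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads everything from fp into a malloc'd, null-terminated buffer.
   Stores the byte count (excluding the terminator) in *out_len.
   Returns NULL on allocation failure or read error; caller frees. */
char *read_all(FILE *fp, size_t *out_len)
{
    size_t cap = 4096, len = 0;
    char *data = malloc(cap);
    if (data == NULL)
        return NULL;

    for (;;) {
        /* Keep one byte in reserve for the terminator. */
        size_t n = fread(data + len, 1, cap - len - 1, fp);
        len += n;
        if (len < cap - 1)
            break;              /* short read: end-of-file or error */

        size_t new_cap = cap * 2;
        char *tmp = realloc(data, new_cap);
        if (tmp == NULL) {
            free(data);
            return NULL;
        }
        data = tmp;
        cap = new_cap;
    }

    if (ferror(fp)) {
        free(data);
        return NULL;
    }
    data[len] = '\0';
    *out_len = len;
    return data;
}
```

Like the character-by-character loop, this only ever reads forward, so it does not depend on the stream being seekable.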
On a POSIX-style system such as Linux, you can use fileno() and fstat() to get the size of a file (again, all error checking and header files are omitted):
char *data = NULL;
FILE *input = ...
int fd = fileno( input );
struct stat sb;
fstat( fd, &sb );
if ( S_ISREG( sb.st_mode ) )
{
// sb.st_size + 1 for C-style string
data = malloc( sb.st_size + 1 ); // assign to the outer data; redeclaring it here would shadow it
data[ sb.st_size ] = '\0';
}
// now if data is not NULL you can read into the buffer data points to
// if data is NULL, see above code to read char-by-char
// this tries to read the entire stream in one call to fread()
// there are a lot of other ways to do this
size_t totalRead = 0;
while ( totalRead < sb.st_size )
{
size_t bytesRead = fread( data + totalRead, 1, sb.st_size - totalRead, input );
if ( bytesRead == 0 )
{
break; // end-of-file or error: stop instead of looping forever
}
totalRead += bytesRead;
}
The above code should work on Windows, too. You may get some compiler warnings or have to use _fileno(), _fstat() and struct _stat instead.
You may also need to define the S_ISREG() macro on Windows:
#define S_ISREG(m) (((m) & S_IFMT) == S_IFREG)
For a binary file, you can use fseek and ftell to know the size without reading the file, allocate the memory and then read everything:
...
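text = fopen("text.txt", "r");
fseek(text, 0, SEEK_END);
char *ix = Str = malloc(ftell(text));
rewind(text); // go back to the start before reading
int ch; // int rather than char, so EOF stays distinguishable
while((ch = fgetc(text)) != EOF)
{
*ix++ = ch;
}
count = ix - Str; // get the exact count...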
...
For a text file, on a system that has a multi-byte end of line (like Windows which uses \r\n), this will allocate more bytes than required. You could of course scan the file twice, first time for the size and second for actually reading the characters, but you can also just ignore the additional bytes, or you could realloc:
...
count = ix - Str;
Str = realloc(Str, count);
...
Of course, for a real-world program, you should check the return values of all I/O and allocation functions: fopen, fseek, ftell, malloc and realloc...
To just do what you asked for, you would have to read the whole file again:
...
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read
size_t readsize = fread(Str, sizeof(char), count, text); // fread returns size_t
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
// do stuff with it
// ...
fclose(text);
But your string is not null terminated this way. That will get you in some trouble if you try to use some common string functions like strlen.
To properly null terminate your string you would have to allocate space for one additional character and set that last one to '\0':
...
// allocate count + 1 (for the null terminator)
Str = (char*)malloc((count + 1) * sizeof(char));
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read
size_t readsize = fread(Str, sizeof(char), count, text); // fread returns size_t
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
// add null terminator
Str[count] = '\0';
// do stuff with it
// ...
fclose(text);
Now, if you want to know the number of characters in the file without counting them one by one, you could get that number in a more efficient way:
...
text = fopen("text.txt", "r");
// seek to the end of the file
fseek(text, 0L, SEEK_END);
// get your current position in that file
count = ftell(text);
// allocate count + 1 (for the null terminator)
Str = (char*)malloc((count + 1) * sizeof(char));
...
Putting this together in a more structured form:
// open file
FILE *text = fopen("text.txt", "r");
// seek to the end of the file
fseek(text, 0L, SEEK_END);
// get your current position in that file
long count = ftell(text); // ftell returns long
// allocate count + 1 (for the null terminator)
char* Str = (char*)malloc((count + 1) * sizeof(char));
// go back to the beginning
fseek(text, 0L, SEEK_SET);
// read
size_t readsize = fread(Str, sizeof(char), count, text); // fread returns size_t
if(readsize != (size_t)count) {
printf("woops - something bad happened\n");
}
fclose(text);
// add null terminator
Str[count] = '\0';
// do stuff with it
// ...
Edit:
As Andrew Henle pointed out not every FILE stream is seekable and you can't even rely on being able to read the file again (or that the file has the same length/content when reading it again). Even though this is the accepted answer, if you don't know in advance what kind of file stream you're dealing with, his solution is definitely the way to go.

How to read a complete file with scanf maybe something like %[^\EOF] without loop in single statement

I want to know if I can read a complete file with single scanf statement. I read it with below code.
#include<stdio.h>
int main()
{
FILE * fp;
char arr[200],fmt[6]="%[^";
fp = fopen("testPrintf.c","r");
fmt[3] = EOF;
fmt[4] = ']';
fmt[5] = '\0';
fscanf(fp,fmt,arr);
printf("%s",arr);
printf("%d",EOF);
return 0;
}
After everything had run, it produced this message:
"*** stack smashing detected ***: terminated
Aborted (core dumped)"
Interestingly, printf("%s",arr); worked but printf("%d",EOF); is not showing its output.
Can you let me know what has happened when I tried to read upto EOF with scanf?
If you really, really must (ab)use fscanf() into reading the file, then this outlines how you could do it:
open the file
use fseek() and ftell() to find the size of the file
rewind() (or fseek(fp, 0, SEEK_SET)) to reset the file to the start
allocate a big buffer
create a format string that reads the correct number of bytes into the buffer and records how many characters are read
use the format with fscanf()
add a null terminating byte in the space reserved for it
print the file contents as a big string.
If there are no null bytes in the file, you'll see the file contents printed. If there are null bytes in the file, you'll see the file contents up to the first null byte.
I chose the anodyne name data for the file to be read — there are endless ways you can make that selectable at runtime.
There are a few assumptions made about the size of the file (primarily that the size isn't bigger than can be fitted into a long with signed overflow, and that it isn't empty). It uses the fact that the %c format can accept a length, just like most of the formats can, and it doesn't add a null terminator at the end of the string it reads and it doesn't fuss about whether the characters read are null bytes or anything else — it just reads them. It also uses the fact that you can specify the size of the variable to hold the offset with the %n (or, in this case, the %ln) conversion specification. And finally, it assumes that the file is not shrinking (it will ignore growth if it is growing), and that it is a seekable file, not a FIFO or some other special file type that does not support seeking.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
const char filename[] = "data";
FILE *fp = fopen(filename, "r");
if (fp == NULL)
{
fprintf(stderr, "Failed to open file %s for reading\n", filename);
exit(EXIT_FAILURE);
}
fseek(fp, 0, SEEK_END);
long length = ftell(fp);
rewind(fp);
char *buffer = malloc(length + 1);
if (buffer == NULL)
{
fprintf(stderr, "Failed to allocate %ld bytes\n", length + 1);
exit(EXIT_FAILURE);
}
char format[32];
snprintf(format, sizeof(format), "%%%ldc%%ln", length);
long nbytes = 0;
if (fscanf(fp, format, buffer, &nbytes) != 1 || nbytes != length)
{
fprintf(stderr, "Failed to read %ld bytes (got %ld)\n", length, nbytes);
exit(EXIT_FAILURE);
}
buffer[length] = '\0';
printf("<<<SOF>>\n%s\n<<EOF>>\n", buffer);
free(buffer);
return(0);
}
This is still an abuse of fscanf() — it would be better to use fread():
if (fread(buffer, sizeof(char), length, fp) != (size_t)length)
{
fprintf(stderr, "Failed to read %ld bytes\n", length);
exit(EXIT_FAILURE);
}
You can then omit the variable format and the code that sets it, and also nbytes. Or you can keep nbytes (maybe as a size_t instead of long) and assign the result of fread() to it, and use the value in the error report, along the lines of the test in the fscanf() variant.
You might get warnings from GCC about a non-literal format string for fscanf(). It's correct, but this isn't dangerous because the programmer is completely in charge of the content of the format string.

Length of character array after fread is smaller than expected

I am attempting to read a file into a character array, but when I try to pass in a value for MAXBYTES of 100 (the arguments are FUNCTION FILENAME MAXBYTES), the length of the string array is 7.
FILE * fin = fopen(argv[1], "r");
if (fin == NULL) {
printf("Error opening file \"%s\"\n", argv[1]);
return EXIT_SUCCESS;
}
int readSize;
//get file size
fseek(fin, 0L, SEEK_END);
int fileSize = ftell(fin);
fseek(fin, 0L, SEEK_SET);
if (argc < 3) {
readSize = fileSize;
} else {
readSize = atof(argv[2]);
}
char *p = malloc(fileSize);
fread(p, 1, readSize, fin);
int length = strlen(p);
filedump(p, length);
As you can see, the memory allocation for p is always equal to fileSize. When I use fread, I am trying to read in the 100 bytes (readSize is set to 100, as it should be) and store them in p. However, strlen(p) results in 7 if I pass in that argument. Am I using fread wrong, or is there something else going on?
Thanks
That is the limitation with attempting to read text with fread. There is nothing wrong with doing so, but you must know whether the file contains something other than ASCII characters (such as the nul-character) and you certainly cannot treat any part of the buffer as a string until you manually nul-terminate it at some point.
fread does not guarantee the buffer will contain a nul-terminating character at all -- and it doesn't guarantee that the first character read will not be the nul-character.
Again, there is nothing wrong with reading an entire file into an allocated buffer. That's quite common, you just cannot treat what you have read as a string. That is a further reason why there are character oriented, formatted, and line oriented input functions (getchar, fgetc, fscanf, fgets and POSIX getline, to list a few). The formatted and line oriented functions guarantee a nul-terminated buffer; otherwise, you are on your own to account for what you have read, and to ensure you nul-terminate your buffer -- before treating it as a string.
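As a minimal sketch of that last point, the hypothetical helper below reads with fread and then terminates the buffer at the number of bytes actually read. It assumes the caller provides a buffer of at least cap bytes with cap >= 1; in the question's code that would mean allocating fileSize + 1 bytes.

```c
#include <stdio.h>

/* Reads up to cap - 1 bytes from fp into buf and nul-terminates at the
   number of bytes actually read. Returns that number. cap must be >= 1. */
size_t fread_terminated(char *buf, size_t cap, FILE *fp)
{
    size_t n = fread(buf, 1, cap - 1, fp);
    buf[n] = '\0';
    return n;
}
```

The returned count is the real length of the data. If the file contains a nul byte early on, strlen on the buffer will still stop there and report a smaller number, which is consistent with the 7 seen in the question.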
