I am trying to understand how the fread() function in <stdio.h> works and I am confused about the return value of this function. In the man pages it says
RETURN VALUE
On success, fread() and fwrite() return the number of items read
or written. This number equals the number of bytes transferred only
when size is 1. If an error occurs, or the end of the file is
reached, the return value is a short item count (or zero).
fread() does not distinguish between end-of-file and error, and
callers must use feof(3) and ferror(3) to determine which occurred.
Could someone please explain to me what is meant by number of items read or written in this context. Also can anyone provide me with some example return values and their meanings?
fread() will read a count of items of a specified size from the provided IO stream (FILE* stream). It returns the number of items successfully read from the stream. If it returns a number less than the requested number of items, the IO stream can be considered empty (or otherwise broken).
The number of bytes read will be equal to the number of items successfully read times the provided size of the items.
Consider the following program.
#include <stdio.h>
int main() {
char buf[8];
size_t ret = fread(buf, sizeof(*buf), sizeof(buf)/sizeof(*buf), stdin);
printf("read %zu bytes\n", ret*sizeof(*buf));
return 0;
}
When we run this program, depending on the amount of input provided, different outcomes can be observed.
We can provide no input at all. The IO stream will be empty to begin with (EOF). The return value will be zero. No items have been read. Zero is returned.
$ : | ./a.out
read 0 bytes
We provide fewer input as requested. Some items will be read before EOF is encountered. The number of items read is returned. No more items are available. Thereafter the stream is empty.
$ echo "Hello" | ./a.out
read 6 bytes
We provide equal or more input as requested. The number of items as requested will be returned. More items may be available.
$ echo "Hello World" | ./a.out
read 8 bytes
Related reading:
What is the rationale for fread/fwrite taking size and count as arguments?
When there are less bytes in the stream than consitute an item, the number of bytes consumed from the stream might however be greater than the number of bytes read as calculated by above formula. This answer to above linked question (and the comment to it) I find especially insightful in this matter:
https://stackoverflow.com/a/296305/1025391
The syntax for fread() is
size_t fread(void *ptr, size_t size, size_t nmemb, FILE * stream );
which means,
The function fread() reads nmemb elements of data, each size bytes long, from the stream pointed to by stream, storing them at the location given by ptr.
So, the total number of bytes read will be nmemb * size.
Now, by saying
on success, fread() and fwrite() return the number of items read or written. This number equals the number of bytes transferred only when size is 1.
it means that, the return value will equal the nmemb when the size is 1.
Logic is same, in case of fwrite() also.
EDIT
for example, a fully successful call to fread() like
fread(readbuf, sizeof(int), 5 , stdin);
will return 5 while it will read sizeof(int) * 5 bytes. If we assume sizeof(int) is 4, then total bytes read will be 5 * 4 or 20. As you can see, here, the number of items read or written is not equal to the number of bytes transferred.
OTOH, another fully successful call to fread() like
fread(readbuf, sizeof(char), 5 , stdin);
will also return 5 while it will read sizeof(char) * 5 bytes, i.e., 5 bytes. In this case, as sizeof(char) is 1, so, here, the number of items read or written is equal to the number of bytes transferred. i.e., 5.
Related
This question already has answers here:
How does fread really work?
(7 answers)
Closed 9 years ago.
So I was trying to write something to stdout from my p->payload which is supposed to be a pointer to a data. And block_size is 256.
My problem is my stdout would always be exactly the same if I switch p->block_size and sizeof(char). Can someone tell my why? Thanks.
fwrite(p->payload, p->block_size,sizeof(char),stdout);
The difference is the value that it returns.
fwrite returns the number of data elements written.
If you execute
count = fwrite(ptr, 1, N, stdout);
then count will contain the exact number of bytes successfully written. If you do:
count = fwrite(ptr, N, 1, stdout);
then count will contain 1 if the entire write succeeded, or 0 if any of it failed.
The distinction may be useful if you need to distinguish between partial success and complete failure. If a partial write is no better than not writing anything at all, you can use the second form.
(There's not likely to be any significant difference in performance.)
See also this similar question about fread.
In general, just assume size * items bytes will be written.
This seems to be an attempt to mirror the fread API, which also has size and bytes parameters. For fread, this is more useful, as (for instance) you can ask it to read one item of 1000 bytes--meaning, "I want exactly 1000 bytes"--or you can ask it to read 1000 items of one byte apiece--meaning "I'll take up to 1000 bytes; less data than that is OK." Obviously, lots of useful combinations can be used here if your data being read is fixed-size.
In the case of fwrite, though, the only time you'd expect less than size * items bytes to be written would be in an error condition. In the case of "disk full" errors, you could hypothetically choose size and items such that you write as many complete items as possible, but never write a half-item; however, I've never seen anyone actually write code that works this way and doesn't just fail on disk full.
According to fwrite(3):
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
The function fwrite() writes nmemb elements of data, each size bytes long, to the stream pointed to by stream, obtaining them from the location given by ptr.
This means fwrite() will write 'size * nmemb' bytes from ptr to stream, because 'size * nmemb' equals 'nmemb * size', so you have the same number of bytes written to stream after switched size and nmemb.
#define MAXSIZE 256
fread(buff, sizeof(MAXSIZE), 1, infp);
Say at most we need to read 3 times, and after reading 2 times, the remaining stuff in infp is less than the size of MAXSIZE. How do we determine the size of information at the last read?
You can just check the return value of fread():
Return value
Number of objects read successfully, which may be less than count if an error or end-of-file condition occurs.
Like this:
size_t num = fread(...);
P.S.: as #chux commented, you are actually need to use fread(buff, MAXSIZE, 1, infp) instead.
From fread man page
On success, fread() and fwrite() return the number of items read or written. This number equals the number of bytes transferred only when size is 1. If an error occurs, or the end of the file is reached, the return value is a short item count (or zero).
fread() does not distinguish between end-of-file and error, and callers must use feof(3) and ferror(3) to determine which occurred.
Man fread
isn't it possible to read bytes left in a file that is smaller than buffer size?
char * buffer = (char *)malloc(size);
FILE * fp = fopen(filename, "rb");
while(fread(buffer, size, 1, fp)){
// do something
}
Let's assume size is 4 and file size is 17 bytes. I thought fread can handle last operation as well even if bytes left in file is smaller than buffer size, but apparently it just terminates while loop without reading one last byte.
I tried to use lower system call read() but I couldn't read any byte for some reason.
What should I do if fread cannot handle last part of bytes that is smaller than buffer size?
Yep, turn your parameters around.
Instead of requesting one block of size bytes, you should request size blocks of 1 bytes. Then the function will return how many blocks (bytes) it was able to read:
int nread;
while( 0 < (nread = fread(buffer, 1, size, fp)) ) ...
try using "man fread"
it clearly mention following things which itself answers your question:
SYNOPSIS
size_t fread(void *ptr, size_t size, size_t nitems, FILE *stream);
DESCRIPTION
fread() copies, into an array pointed to by ptr, up to nitems items of
data from the named input stream, where an item of data is a sequence
of bytes (not necessarily terminated by a null byte) of length size.
fread() stops appending bytes if an end-of-file or error condition is
encountered while reading stream, or if nitems items have been read.
fread() leaves the file pointer in stream, if defined, pointing to the
byte following the last byte read if there is one.
The argument size is typically sizeof(*ptr) where the pseudo-function
sizeof specifies the length of an item pointed to by ptr.
RETURN VALUE
fread(), return the number of items read.If size or nitems is 0, no
characters are read or written and 0 is returned.
The value returned will be less than nitems only if a read error or
end-of-file is encountered. The ferror() or feof() functions must be
used to distinguish between an error condition and an end-of-file
condition.
I have a question:
I am using fread to read a file.
typedef struct {
int ID1;
int ID2;
char string[256];
} Reg;
Reg *A = (Reg*) malloc(sizeof(Reg)*size);
size = FILESIZE/sizeof(Reg);
fread (A, sizeof(Reg), size, FILEREAD);
Using a loop, consecutively call this call, to make me read my entire file.
What will happen when I get near the end of the file, and I can not read "size" * sizeof (Reg), or if you are only available to read half this amount, what will happen with my array A. It will be completed? The function will return an error?
Knowing how the file was read by the fread through?
Edi1: Exactly, if the division is not exact, when I read the last bit smaller file size that I'll read things that are not on file, I'm wondering with my vector resize to the amount of bytes that I can read, or develop a dynamic better.
fread returns the number of records it read. Anything beyond that in your buffer may be mangled, do not rely on that data.
fread returns the number of full items actually read, which may be
less than count if an error occurs or if the end of the file is
encountered before reaching count.
The function will not read past the end of the file : the most likely occurrence is that you will get a bunch of full buffers and then a (final) partial buffer read, unless the file size is an exact multiple of your buffer length.
Your logic needs to accommodate this - the file size gives you the expected total number of records so it should not be hard to ignore trailing data in the buffer (after the final fread call) that corresponds to uninitialized records. A 'records remaining to read' counter would be one approach.
fread() returns the number of elements it could read. So you have to check the return value of fread() to find out how many elements in your array are valid.
It will return a short item count or zero if either an error occurred or EOF has is reached. You will have to use feof() ond ferror() in this case to check what condition is met.
The declaration of fread is as following:
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
The question is: Is there a difference in reading performance of two such calls to fread:
char a[1000];
fread(a, 1, 1000, stdin);
fread(a, 1000, 1, stdin);
Will it read 1000 bytes at once each time?
There may or may not be any difference in performance. There is a difference in semantics.
fread(a, 1, 1000, stdin);
attempts to read 1000 data elements, each of which is 1 byte long.
fread(a, 1000, 1, stdin);
attempts to read 1 data element which is 1000 bytes long.
They're different because fread() returns the number of data elements it was able to read, not the number of bytes. If it reaches end-of-file (or an error condition) before reading the full 1000 bytes, the first version has to indicate exactly how many bytes it read; the second just fails and returns 0.
In practice, it's probably just going to call a lower-level function that attempts to read 1000 bytes and indicates how many bytes it actually read. For larger reads, it might make multiple lower-level calls. The computation of the value to be returned by fread() is different, but the expense of the calculation is trivial.
There may be a difference if the implementation can tell, before attempting to read the data, that there isn't enough data to read. For example, if you're reading from a 900-byte file, the first version will read all 900 bytes and return 900, while the second might not bother to read anything. In both cases, the file position indicator is advanced by the number of characters successfully read, i.e., 900.
But in general, you should probably choose how to call it based on what information you need from it. Read a single data element if a partial read is no better than not reading anything at all. Read in smaller chunks if partial reads are useful.
According to the specification, the two may be treated differently by the implementation.
If your file is less than 1000 bytes, fread(a, 1, 1000, stdin) (read 1000 elements of 1 byte each) will still copy all the bytes until EOF. On the other hand, the result of fread(a, 1000, 1, stdin) (read 1 1000-byte element) stored in a is unspecified, because there is not enough data to finish reading the 'first' (and only) 1000 byte element.
Of course, some implementations may still copy the 'partial' element into as many bytes as needed.
That would be implementation detail. In glibc, the two are identical in performance, as it's implemented basically as (Ref http://sourceware.org/git/?p=glibc.git;a=blob;f=libio/iofread.c):
size_t fread (void* buf, size_t size, size_t count, FILE* f)
{
size_t bytes_requested = size * count;
size_t bytes_read = read(f->fd, buf, bytes_requested);
return bytes_read / size;
}
Note that the C and POSIX standard does not guarantee a complete object of size size need to be read every time. If a complete object cannot be read (e.g. stdin only has 999 bytes but you've requested size == 1000), the file will be left in an interdeterminate state (C99 ยง7.19.8.1/2).
Edit: See the other answers about POSIX.
fread calls getc internally. in Minix number of times getc is called is simply size*nmemb so how many times getc will be called depends on the product of these two. So Both fread(a, 1, 1000, stdin) and fread(a, 1000, 1, stdin) will run getc 1000=(1000*1) Times.
Here is the siimple implementation of fread from Minix
size_t fread(void *ptr, size_t size, size_t nmemb, register FILE *stream){
register char *cp = ptr;
register int c;
size_t ndone = 0;
register size_t s;
if (size)
while ( ndone < nmemb ) {
s = size;
do {
if ((c = getc(stream)) != EOF)
*cp++ = c;
else
return ndone;
} while (--s);
ndone++;
}
return ndone;
}
There may be no performance difference, but those calls are not the same.
fread returns the number of elements read, so those calls will return different values.
If an element cannot be completely read, its value is indeterminate:
If an error occurs, the resulting value of the file position indicator for the stream is
indeterminate. If a partial element is read, its value is indeterminate. (ISO/IEC 9899:TC2 7.19.8.1)
There's not much difference in the glibc implementation, which just multiplies the element size by the number of elements to determine how many bytes to read and divides the amount read by the member size in the end. But the version specifying an element size of 1 will always tell you the correct number of bytes read. However, if you only care about completely read elements of a certain size, using the other form saves you from doing a division.
One more sentence form http://pubs.opengroup.org/onlinepubs/000095399/functions/fread.html is notable
The fread() function shall read into the array pointed to by ptr up to nitems elements whose size is specified by size in bytes, from the stream pointed to by stream. For each object, size calls shall be made to the fgetc() function and the results stored, in the order read, in an array of unsigned char exactly overlaying the object.
Inshort in both case data will be accessed by fgetc()...!
I wanted to clarify the answers here. fread performs buffered IO. The actual read block sizes fread uses are determined by the C implementation being used.
All modern C libraries will have the same performance with the two calls:
fread(a, 1, 1000, file);
fread(a, 1000, 1, file);
Even something like:
for (int i=0; i<1000; i++)
a[i] = fgetc(file)
Should result in the same disk access patterns, although fgetc would be slower due to more calls into the standard c libraries and in some cases the need for a disk to perform additional seeks which would have otherwise been optimized away.
Getting back to the difference between the two forms of fread. The former returns the actual number of bytes read. The latter returns 0 if the file size is less than 1000, otherwise it returns 1. In both cases the buffer would be filled with the same data, i.e. the contents of the file up to 1000 bytes.
In general, you probably want to keep the 2nd parameter (size) set to 1 such that you get the number of bytes read.