I use fread to read into a char buffer.
char buffer[50];
int nbytes = fread(buffer, 1, 50, fp);
The file I read from contains exactly the word Hello, i.e. 5 bytes.
In the above example, nbytes equals 6. Why?
Additionally, reading from a zero-byte file (i.e. it's empty) returns 0.
My guess is that whatever wrote the file you are reading included either a newline (if it's a text file) or a 0 byte after the string. If you are on Unix, run the following command:
od -c filename
Which will print the entire contents of the file including non-printables.
You can also run:
wc --bytes filename
Which will print the length of the file in bytes (along with the filename).
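If od is not at hand, the same check can be done from C. This is a minimal sketch, not a replacement for od; the helper name dump_bytes is made up for this example, and it assumes the stream was opened in binary mode so no bytes are translated:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copies every byte of fp to out, showing
   non-printable bytes as \xNN escapes, and returns the byte count. */
size_t dump_bytes(FILE *fp, FILE *out)
{
    assert(fp != NULL && out != NULL);
    size_t total = 0;
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (isprint(c))
            fputc(c, out);
        else
            fprintf(out, "\\x%02X", (unsigned)c);
        total++;
    }
    fputc('\n', out);
    return total;
}
```

Calling dump_bytes(fp, stdout) on a file that holds Hello followed by a newline would show the trailing byte as \x0A and return 6.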
I'm learning file handling in C and I ran into a problem.
I wrote the following code:
# include <stdio.h>
# include <stdlib.h>
int main(void)
{
FILE * file;
errno_t err = fopen_s(&file,"f.txt","r");
fseek(file, 0, SEEK_END);
int size = ftell(file);
fseek(file, 0, SEEK_SET);
char *tmp;
tmp = malloc(size);
printf("%d\n", size);
for (int i = 0; !feof(file); i++)
{
fread(tmp + i, 1, 1, file);
size = i + 1;
}
printf("%d\n", size);
fclose(file);
free(tmp);
return 0;
}
However, the two outputs of size are not the same (first: 78, second: 76). What is the reason behind this?
I suspect you are using Microsoft Windows. In Microsoft’s C/C++ implementation, binary streams and text streams are different. If you had opened the file with "rb" passed to fopen_s as its third parameter, the file would be opened with a binary stream, and fread would return the actual bytes in the file.
Since you opened the file with "r", it was opened as a text stream. In this mode, some processing is performed when reading and writing the file. Notably, Windows uses two characters, a carriage-return '\r' followed by a new-line '\n', at the end of each line. When reading the file as a text stream, these two characters are reduced to a single '\n'. Conversely, when writing a text stream, writing a '\n' produces '\r' and '\n' in the file.
For a binary stream, ftell gives the number of bytes from the beginning of the file. For a text stream, the C standard only specifies that ftell is usable for resetting the stream position using fseek—it is not necessarily the number of bytes (in the actual file) or characters (appearing in the stream) from the beginning of the file. A C implementation might implement ftell so that it gives the number of bytes from the beginning of the file (and that is the 78 you are seeing), but, even if it does, you cannot easily use that to know how many characters are in the text stream.
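One way to see the difference is to count what the stream actually delivers instead of relying on ftell. The following is only a sketch: count_delivered is a hypothetical name, and it uses plain fopen rather than fopen_s so that it also builds outside Microsoft's toolchain.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes a stream delivers when the file
   at path is opened with the given mode ("r" or "rb").
   Returns -1 if the file cannot be opened or a read error occurs. */
long count_delivered(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;

    char chunk[256];
    long total = 0;
    size_t n;
    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (long)n;

    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : total;
}
```

On Windows, count_delivered("f.txt", "r") would be expected to report fewer bytes than count_delivered("f.txt", "rb") when the file uses two-character line endings. On systems that do not distinguish text and binary streams, both calls give the same number.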
Additionally, as others have noted in comments, this code is wrong:
for (int i = 0; !feof(file); i++)
{
fread(tmp + i, 1, 1, file);
size = i + 1;
}
The standard library routines do not know the end of the file has been reached until you attempt a read and it fails because the end of the file was reached. For example, if there is one character in the file, and you read it, feof(file) is still false: the end of the file has not been encountered. It is not until you try to read a second character and fread fails that feof(file) becomes true.
Because of this, the above loop ultimately sets size to one more than the number of characters read because, at the beginning of the final iteration, !feof(file) was true, so fread was attempted, it failed, and then size was set to i + 1 even though no byte was just read.
Because this is how feof works, you cannot use it to control a loop like this. Instead, you should write the loop so that it tests the result of fread and, if it fails to read any characters, the code exits the loop. That code could be something like:
int i = 0;
do
{
size_t result = fread(tmp + i, 1, 1, file);
if (result == 0)
break;
i++;
} while (1);
size = i;
(Note that, if you were reading more than one byte at a time with fread, additional code would be needed to handle the case where the number of bytes read was between zero and the number requested.)
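The multi-byte case mentioned in that note could look roughly like this. It is a sketch under the assumption that the caller supplies a buffer of cap bytes; read_chunks and its parameters are names invented for the example:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads up to cap bytes from fp into buf, asking
   for at most chunk bytes per fread call, and returns how many bytes
   were stored. */
size_t read_chunks(FILE *fp, char *buf, size_t cap, size_t chunk)
{
    assert(fp != NULL && buf != NULL && chunk > 0);
    size_t total = 0;

    while (total < cap) {
        size_t want = cap - total < chunk ? cap - total : chunk;
        size_t got = fread(buf + total, 1, want, fp);
        total += got;       /* a partial chunk still counts */
        if (got < want)     /* short read: end of file or error */
            break;
    }
    return total;
}
```

The point of the design is that the count is updated before the loop decides to stop, so the bytes of a final, partly filled chunk are not lost.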
Once that loop is fixed, you should see the number of characters in the stream reported as 75. Most likely, your file f.txt contains three lines of text with 72 characters total excluding the line endings. When read as a text stream, there are three '\n' characters, so the total is 75. When read as a binary stream, there are three '\n' characters and three '\r' characters, so the total is 78.
I am writing some code that is supposed to read blocks of 16 bytes at a time from an input file. I am using fread to do this, but I run into problems when I get to the last few bytes of the file.
size_t bytesread=1;
while(bytesread > 0){
bytesread = fread(buffer,16,1,inputfile);
buffer[16]='\0';
printf("Read in line: \"%s\"\n", buffer);
}
Say for example my text file is "This is a testfile. Here are some words".
It would print out
Read in line: "This is a testfi"
Read in line: "le. Here are som"
Read in line: "e words
are som"
I can't figure out why it adds on the extra characters when reading in the last line. I understand that I am reading in a block of 16 bytes but how would I deal with the last block where I only want to read in the last 7 bytes?
fread(buffer,16,1,inputfile); attempts to read one block of 16 bytes. If fewer than 16 bytes remain, fread returns zero, indicating that zero complete blocks were read, even though the leftover bytes may already have been copied into buffer.
You do not want this; you want to know how many characters were read. So use this code, which attempts to read 16 blocks of one byte each:
bytesread = fread(buffer, 1, 16, inputfile);
After this code, bytesread contains the number of bytes read. You can use this to put an end-of-string marker after the last byte read:
buffer[bytesread] = '\0';
Then printf("Read in line: \"%s\"\n", buffer); will print the bytes just read and no more.
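Put together, the corrected loop might look like the sketch below. The function name print_chunks and the CHUNK macro are made up for the example, and the buffer is assumed to need one extra byte for the terminator:

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK 16

/* Prints the contents of in to out in pieces of at most CHUNK bytes and
   returns the number of pieces printed. */
int print_chunks(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    char buffer[CHUNK + 1];   /* one extra byte for the terminator */
    size_t bytesread;
    int pieces = 0;

    while ((bytesread = fread(buffer, 1, CHUNK, in)) > 0) {
        buffer[bytesread] = '\0';
        fprintf(out, "Read in line: \"%s\"\n", buffer);
        pieces++;
    }
    return pieces;
}
```

Because the loop condition tests the return value of fread directly, nothing is printed for the final call that reads zero bytes.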
I'm trying to read data from a file into a buffer. The file holds 900K bytes (found by seeking to the end of the file and calling ftell()). I allocated a buffer of 900K + 1 bytes (the extra byte is for the null terminator). fread() returns 900K, but strlen(buffer) shows a smaller value, and at the end of the buffer I can see something like ".....(truncated)". Why does this happen? Is there a limit beyond which fread() cannot read into the buffer and truncates it? Also, why does the return value of fread() say 900K even though it has actually read less?
strlen does something along these lines:
int strlen(char *str)
{
int len = 0;
while(*str++) len++;
return len;
}
If your file contains binary data (or if it's a text file with a UTF encoding and unused upper bytes) strlen is going to stop at the first 0x00 byte it encounters and return how many bytes into the buffer that byte was found. If you read a text file in a single-byte encoding like ANSI, there won't be a null terminator in the data, and calling strlen invokes undefined behavior unless you append one yourself.
If you want to determine how many bytes fread successfully read out of the file, check its return value.[1]
If you want to determine the file size before reading a file, do this:
size_t len;
fseek(fp, 0, SEEK_END);
len = ftell(fp);
rewind(fp);
len will contain the file's size in bytes.
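Note that ftell returns a long and reports failure as -1, which an unsigned size_t cannot represent. A slightly more defensive variant is sketched below; stream_size is a hypothetical name, and the stream is assumed to be open in binary mode:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the size of a binary stream in bytes, or
   -1 on failure. The read position is moved back to the start. */
long stream_size(FILE *fp)
{
    assert(fp != NULL);
    if (fseek(fp, 0, SEEK_END) != 0)
        return -1;
    long len = ftell(fp);     /* -1L on failure */
    rewind(fp);
    return len;
}
```

A caller would check for a negative result before converting the value to size_t for malloc or fread.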
[1] Assuming you called fread with parameter 2 set to 1 byte per element and didn't try to read more bytes than are actually in the file.
Your main question has already been answered, but it's worth noting that strlen is not designed to measure the size of an array, only the length of a null-terminated string. It probably reports a lower value because strlen returns the number of characters that appear before the first null character, so if there are null characters ('\0') anywhere in your data, strlen stops at the first one.
You should trust fread's return value.
EDIT: as a note, fread MAY read fewer bytes than requested, and this can be caused by an error or by reaching the end of the file. You can check for these with ferror and feof, respectively.
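As an illustration of that check, here is a small sketch. The function name read_and_classify and its return codes are invented for the example:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: fills buf with up to want bytes, stores the
   count in *got, and reports why the read stopped:
   0 = all bytes read, 1 = end of file reached, -1 = read error. */
int read_and_classify(FILE *fp, char *buf, size_t want, size_t *got)
{
    assert(fp != NULL && buf != NULL && got != NULL);
    *got = fread(buf, 1, want, fp);
    if (*got == want)
        return 0;
    if (ferror(fp))
        return -1;
    return 1;   /* short read without an error: feof(fp) is set */
}
```

Either way, *got holds the number of bytes that are valid in buf, which is the value to use instead of strlen.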
I'm just starting in C/C++. I'm able to write a binary buffer to a file:
FILE *myFile= fopen("/mnt/music.mp3", "ab+"); // Doesn't exist
fwrite(binaryBuffer, sizeOfBuffer, 1, myFile);
All I want is to get a new "binaryBuffer" from "myFile".
How can I do that?
Thanks !
Use the fread function, which works just like fwrite:
char buffer[BUFFER_SIZE]; // declare a buffer
fread(buffer, length, 1, file); //read length amount of bytes into buffer
If you don't know how many bytes to read you can seek to the end of the file to find the length.
(If you read from the same file you just wrote to you will want to rewind)
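A fuller sketch that combines the two hints (find the length, then read) is shown below. The name load_file is hypothetical, and the approach assumes the whole file fits in memory:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: loads the whole file at path into a malloc'd
   buffer. Stores the byte count in *len and returns the buffer, or
   NULL on failure. The caller frees the result. */
unsigned char *load_file(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);
    FILE *fp = fopen(path, "rb");      /* binary mode: no translation */
    if (fp == NULL)
        return NULL;

    unsigned char *buf = NULL;
    if (fseek(fp, 0, SEEK_END) == 0) {
        long size = ftell(fp);
        rewind(fp);
        /* +1 so that an empty file does not become malloc(0) */
        if (size >= 0 && (buf = malloc((size_t)size + 1)) != NULL) {
            *len = fread(buf, 1, (size_t)size, fp);
            if (*len != (size_t)size) {    /* short read: give up */
                free(buf);
                buf = NULL;
            }
        }
    }
    fclose(fp);
    return buf;
}
```

For the file in the question this could be called as load_file("/mnt/music.mp3", &sizeOfBuffer), assuming sizeOfBuffer is a size_t.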
http://www.cplusplus.com/reference/cstdio/fread/
The user supplies file names on the command line, and the program reads each one from the argv[] array. I have to perform error checking, etc.
I want to read each file. For example, if argv[2] is 'myfile.txt', the program should read the content of 'myfile.txt', store it in char buffer[BUFSIZ], and then write the content of buffer into another file.
However, before the content is written, the program should also write the name of the file and its size, so that the file can easily be extracted later. A bit like tar.
Depending on the number of files added by the user, the output file should contain a string like:
myfile.txt256Thisisfilecontentmyfile2.txt156Thisisfile2content..............
My questions are:
1) How do I write the value of argv[2] into the file using write()? I am having problems writing a char array. What should I put as the size (sizeof(?)) in write(), given that I don't know the length of the file name entered by the user? See below.
2) Do I use '&' to write an integer value into the file after the name, for example to write 4 bytes after the file name for the size of the file?
Here is the code I have written,
char buffer[BUFSIZ];
int numfiles=5; //say this is no of files user entered at command
open(file.....
lseek(fdout, 0, SEEK_SET); //start at beginning of file and move along each file in some for loop
for(i=0; ......
//for each file write filename,filesize,data....filename,filesize,data......
int bytesread=read(argv[i],buffer,sizeof(buffer));
write(outputfile, argv[i], sizeof(argv)); //write filename size of enough to store value of filename
write(outputfile, &bytesread, sizeof(bytesread));
write(outputfile, buffer, sizeof(buffer));
But the code is not working as I expected.
Any suggestions?
Since argv consists of null-terminated strings, the length to write is strlen(argv[2])+1, which covers both the argument and its null terminator:
size_t sz = strlen (argv[2]);
write (fd, argv[2], sz + 1);
Alternatively, if you want the length followed by the characters, you can write the size_t itself returned from strlen followed by that many characters.
size_t sz = strlen (argv[2]);
write (fd, &sz, sizeof (size_t));
write (fd, argv[2], sz);
You probably also need to write the length of the file as well so that you can locate the next file when reading it back.
1. You can write the string the following way:
size_t size = strlen(string);
write(fd, string, size);
However, most of the time it's not this simple: you will need the size of the string so you'll know how much you need to read. So you should write the string size too.
2. An integer can be written the following way:
write(fd, &integer, sizeof(integer));
This is simple, but if you plan to use the file on different architectures, you'll need to deal with endianness too.
It sounds like your best bet is to use a binary format. In your example, is the file called myfile.txt with a content length of 256, or myfile.txt2 with a content length of 56, or myfile.txt25 with a content length of 6? There's no way to distinguish between the end of the filename and the start of the content length field. Similarly there is no way to distinguish between the end of the content length and the start of the content. If you must use a text format, fixed width fields will help with this. I.e. 32 characters of filename followed by 6 digits of content length. But binary format is more efficient.
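If a text format is required, the fixed-width layout described above (32 characters of filename, 6 digits of length) could be produced roughly as follows. This is a sketch only; format_header and the macro names are invented here:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_WIDTH 32
#define SIZE_WIDTH 6
#define HEADER_LEN (NAME_WIDTH + SIZE_WIDTH)

/* Formats a fixed-width text header into out (HEADER_LEN + 1 bytes):
   the name left-justified and space-padded to 32 characters, then the
   length as 6 zero-padded digits. Returns 0 on success, -1 if the name
   or the length does not fit. */
int format_header(char out[HEADER_LEN + 1], const char *name,
                  unsigned long len)
{
    assert(out != NULL && name != NULL);
    if (strlen(name) > NAME_WIDTH || len > 999999UL)
        return -1;
    int n = snprintf(out, HEADER_LEN + 1, "%-32s%06lu", name, len);
    return n == HEADER_LEN ? 0 : -1;
}
```

A reader can then always take the first 32 bytes as the name (trimming trailing spaces) and the next 6 as the length, which removes the ambiguity described above.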
You get the filename length using strlen(), don't use sizeof(argv) as you will get completely the wrong result. sizeof(argv[i]) will also give the wrong result.
So write 4 bytes of filename length followed by the filename then 4 bytes of content length followed by the content.
If you want the format to be portable you need to be aware of byte order issues.
Lastly, if the file won't all fit in your buffer then you are stuffed. You need to get the size of the file you are reading to write it to your output file first, and then make sure you read that number of bytes from the first file into the second file. There are various techniques to do this.
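The length-prefixed layout suggested above could be sketched like this with POSIX read() and write(). All function names are hypothetical. The lengths are stored in the machine's native byte order, so this version is not portable across architectures, and interrupted system calls (EINTR) are simply treated as errors:

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Writes exactly n bytes to fd; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *data, size_t n)
{
    const char *p = data;
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0)
            return -1;
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Reads exactly n bytes from fd; returns 0 on success, -1 otherwise. */
static int read_all(int fd, void *data, size_t n)
{
    char *p = data;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Writes one length-prefixed field: a 4-byte length, then the bytes. */
int write_field(int fd, const void *data, uint32_t len)
{
    assert(data != NULL);
    if (write_all(fd, &len, sizeof len) != 0)
        return -1;
    return write_all(fd, data, len);
}

/* Reads one field into buf (capacity cap); returns its length or -1. */
long read_field(int fd, void *buf, uint32_t cap)
{
    assert(buf != NULL);
    uint32_t len;
    if (read_all(fd, &len, sizeof len) != 0 || len > cap)
        return -1;
    if (read_all(fd, buf, len) != 0)
        return -1;
    return (long)len;
}
```

Each archived file would be stored as two fields, the name and then the content. When reading back, the buffer itself (for example filename, not filename[namelen]) is what gets passed to read().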
Thanks for the replies, guys.
I decided not to use size_t and instead used plain int and char types, so I know the exact number of bytes to read() back. That is, I start at the beginning of the file and read 4 bytes (an int) to get the length of the filename, which I then use as the size in the next read().
So, when I copy the user's input file to the output file, I write it as one long string (without spaces; they are only here for readability):
filenamesize filename filecontentsize filecontent
ie 10 myfile.txt 5 hello
So when it comes to reading that data back, I start at the beginning of the file using lseek(), and I know the first 4 bytes are an int holding the length of the filename, so I read that into int namelen using read().
My problem is that I want to use that value (the first 4 bytes) to declare an array of the right length to store the filename. How do I pass this array to read() so that read() stores the value inside that char array? See below, please:
int namelen; //value read from first 4 bytes of file: length of filename, used in the next read()
char filename[namelen];
read(fd, filename[namelen], namelen);//filename should have 'myfile.txt' if user entered that filename
So my question is: once I have read the first 4 bytes from the file, giving me the length of the filename in namelen, how do I read namelen bytes to get the original file's name, so I can create the copied file inside the directory?
Thanks