I can't understand why function fread() behaves differently in these 2 examples:
1)
I have a structure with a short and a char (size is 4 bytes including padding) and an array of three such structures. If I write each short and char of each structure separately with fwrite() and then read that file with fread() into a variable whose type is that structure, I will read 4 bytes at a time (there will be 9 bytes in the file), so you can see that one byte will be left in the 3rd iteration (and one byte will be lost in each iteration). What happens is that there is no 3rd read because I'm left with one byte and fread has to read 4 bytes.
2)
A simpler example, if I write a 1 byte char to a file with fwrite() and then put the content of that file into a 4 byte int with fread(), the integer will get that data.
Why does this happen? Why does the data get read in one case but not in the other if EOF is reached?
Here is the first example:
int main()
{
struct X { short int s; char c; } y, x[]=
{{0x3132,'3'},{0x3435,'6'},{0x3738,'9'}};
FILE *fp=fopen("FILE.DAT","wb+");
if (fp)
{
for(int i=0;i<sizeof(x)/sizeof(x[i]);)
{
fwrite(&x[i].s,sizeof(x[i].s),1,fp);
fwrite(&x[i].c,sizeof(x[i].c),1,fp);
i++;
}
rewind(fp);
for(int i=0;fread(&y,sizeof(y),1,fp);)
printf("%d:%x %c\n",++i, y.s, y.c);
fclose(fp);
}
return 0;
}
Second example:
int main()
{
FILE *fp=fopen("FILE.DAT","wb+");
char c = 'a';
fwrite(&c, sizeof(c), 1, fp);
rewind(fp);
int num;
fread(&num, sizeof(num), 1, fp);
fclose(fp);
return 0;
}
Why does the data get read in one case but not in the other if EOF is reached?
"What happens is that there is no 3rd read because I'm left with one byte and fread has to read 4 bytes." is a questionable premise.
The 1st code did read 3 times. There are no bytes left to read.
In both codes, the last read was a partial read with a fread() return value of 0.
(The first code did not print the result of the 3rd read.)
With fread(), a return value of 0 does not mean "end-of-file" was immediately encountered - nothing read. Instead, 0 means a complete read did not occur due to:
* "end-of-file" or partial read.
* rare I/O error.
Why does this happen?
In the 2nd code, results may differ because the value read is indeterminate:
fread() ... "If a partial element is read, its value is indeterminate." [1] C11dr §7.21.8.1 ¶2
fread(&num, sizeof(num), 1, fp) result may or may not be as expected.
A more informative example
#include <stdio.h>
#include <stdlib.h>
int main(void) {
FILE *fp = fopen("FILE.DAT", "wb+");
char c = 'a';
printf(" %8X\n", c);
fwrite(&c, sizeof(c), 1, fp);
rewind(fp);
unsigned num = rand();
printf(" %8X\n", num);
size_t len = fread(&num, sizeof(num), 1, fp);
printf("%zu %8X\n", len, num);
len = fread(&num, sizeof(num), 1, fp);
printf("%zu\n", len);
fclose(fp);
return 0;
}
Output
61 as expected
5851F42D as expected - some random value
0 5851F461 Indeterminate! (in this case, looks like the LSByte was replaced.)
0 as expected
Moral of the story: assess the return value of fread() before relying on what was read into the buffer.
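For the first example, one way to act on this is to read the file the same way it was written, field by field, and to treat anything short of a full record as a failure. A minimal sketch, assuming the 3-bytes-per-record layout produced by the two fwrite() calls in the question; read_record is a hypothetical helper name, not part of <stdio.h>:

```c
#include <stdio.h>

struct X { short int s; char c; };

/* Hypothetical helper: reads one record written as a short followed by a
   char (no padding in the file). Returns 1 only if both fields were read
   completely; *out is left untouched otherwise. */
static int read_record(FILE *fp, struct X *out)
{
    struct X tmp;

    if (fread(&tmp.s, sizeof tmp.s, 1, fp) != 1)
        return 0;
    if (fread(&tmp.c, sizeof tmp.c, 1, fp) != 1)
        return 0;
    *out = tmp;
    return 1;
}
```

Used as the loop condition in place of fread(&y,sizeof(y),1,fp), this consumes exactly 3 bytes per iteration, so the nine-byte file yields three complete records and the fourth call reports failure.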
[1] indeterminate value: either an unspecified value or a trap representation
... when EOF is reached ...
EOF isn't "reached". Many <stdio.h> functions return EOF as a signal that something went wrong, giving no indication what that something is. If you want to know what went wrong after receiving the signal, test with feof() and/or ferror().
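A rough sketch of that test applied to fread(), which signals trouble through a short count rather than through EOF. The enum and the helper name read_unsigned are made up for this illustration:

```c
#include <stdio.h>

enum read_status { READ_OK, READ_EOF, READ_ERROR };

/* Hypothetical helper: reads one unsigned value and reports why a read
   fell short. *out is only written after a complete read. */
static enum read_status read_unsigned(FILE *fp, unsigned *out)
{
    unsigned tmp;

    if (fread(&tmp, sizeof tmp, 1, fp) == 1) {
        *out = tmp;
        return READ_OK;
    }
    /* fread() returned 0: ask the stream what happened. */
    if (ferror(fp))
        return READ_ERROR;
    return READ_EOF;   /* feof(fp) is set: the file ended, possibly mid-element */
}
```

Reading into a temporary first means a partial element never reaches the caller's variable, which sidesteps the indeterminate value discussed above.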
I'm dabbling in a bit of Game Boy save state hacking, and I'm currently getting my head around reading from a binary file in C. I understand that char is one byte so I'm reading in chars using the following code
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
int main () {
FILE *fp = NULL;
char buffer[100];
int filesize;
fp = fopen("save file.srm", "r+"); //open
if(fp == NULL) printf("File loading error number %i\n", errno);
fseek(fp, 0L, SEEK_END); //seek to end to get file size
filesize = ftell(fp);
printf("Filesize is %i bytes\n",filesize);
fseek(fp, 0, SEEK_SET); //set read point at start of file
fread(buffer, sizeof buffer, 1, fp);//read into buffer
for(int i=70;i<80;i++) printf("%x\n", buffer[i]); //display
fclose(fp);
return(0);
}
The output I get is
Filesize is 32768 bytes
ffffff89
66
ffffff85
2
2
ffffff8b
44
ffffff83
c
0
I'm trying to load in bytes, so I want each row of the output to have maximum value 0xff. Can someone explain why I'm getting ffffff8b?
if(fp == NULL) printf("File loading error number %i\n", errno);
When you detect an error, do not just print a message. Either exit the program or do something to correct for the error.
char buffer[10];
Use unsigned char for working with raw data. char may be signed, which can cause undesired effects.
fread(buffer, strlen(buffer)+1, 1, fp);
buffer has not been initialized at this point, so the behavior of strlen(buffer) is not defined by the C standard. In any case, you do not want to use the length of the string currently in buffer as the size for fread. You want the size of the array. So use sizeof buffer (without the +1).
for(int i=0;i<10;i++)
Do not iterate to ten. Iterate to the number of bytes put into the buffer by fread. fread returns a size_t value that is the number of items read. If you call it as size_t n = fread(buffer, 1, sizeof buffer, fp);, the number of items (in n) will be the number of bytes read, since passing 1 as the second argument says each item to read is one byte.
printf("%x\n", buffer[i]);
To print an unsigned char, use %hhx. Because your buffer had signed char elements, some of them were negative. When used in this printf, they were promoted to negative int values. Then, because of the %x, printf attempted to print them as unsigned int values. All the extra bits from the negative values in two’s complement form showed up.
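Putting these points together, a minimal sketch might look like the following. dump_hex is a hypothetical helper, and the %02hhx format zero-pads each byte to two digits, which the original program did not do:

```c
#include <stdio.h>

/* Hypothetical helper: dumps up to 100 bytes from `in` as two-digit hex,
   one byte per line, and returns how many bytes were printed. */
static size_t dump_hex(FILE *in, FILE *out)
{
    unsigned char buffer[100];
    size_t n = fread(buffer, 1, sizeof buffer, in);  /* n = bytes actually read */

    for (size_t i = 0; i < n; i++)
        fprintf(out, "%02hhx\n", buffer[i]);
    return n;
}
```

Because the loop bound is n rather than a constant, a file shorter than the buffer never causes uninitialized elements to be printed.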
Very simply: char can be signed or unsigned by default: that's down to the compiler. In your case, it appears to be signed.
When you pass the char of buffer[i] to printf(), it is promoted to int, and sign-extended if the original char value had its top bit set. Hence anything that's in the range 0x80-0xff gets a lot of fs prefixing the value.
If you declare buffer to be unsigned char, this problem should not occur. But you should, in combination with that, use "%hhx" rather than "%x" for your printf() format, since the hh length modifier forces printf() to mask the input value so that only those bits applicable to an unsigned char (given that you're using the x specifier) are used.
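The sign extension can be seen without any I/O. This is only an illustration, assuming 8-bit bytes and a 32-bit int (true on most current desktop platforms); both function names are invented for the example:

```c
/* What printf("%x") effectively sees when handed a negative signed char:
   the value is first promoted to int, keeping its sign. */
static unsigned promoted_bits(signed char v)
{
    int promoted = v;            /* integer promotion: -117 stays -117 */
    return (unsigned)promoted;   /* reduced modulo 2^32: 0xffffff8b for -117 */
}

/* Converting to unsigned char first keeps only the low 8 bits. */
static unsigned byte_bits(signed char v)
{
    return (unsigned char)v;     /* 0x8b for -117 */
}
```

The byte 0x8b stored in a signed char has the value -117, which is why the first form shows the leading fs from the question's output and the second does not.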
These are the contents of my file, 'unsorted.txt' :
3 robert justin trump
This is my code:
#include <stdio.h>
int main(void) {
FILE *f = fopen("unsorted.txt", "r");
char n;
printf("%d\n", ftell(f));
fscanf(f, "%s", &n);
int l = n - '0';
printf("%d %d\n", l, ftell(f));
return 0;
}
on execution it gives the following output:
0
3 -1
Why did it return -1 in the second case? It should move from 0 to 1, right?
NOTE: the file can be opened, because then how would it print 0 in the first call and the first character from the file without being able to be opened?
fscanf(f,"%s",&n);
is very wrong, since you declared char n; (of only one byte). You got undefined behavior. Be very scared (and next time, be ashamed).
I recommend:
Test that fopen doesn't fail (exit needs <stdlib.h>):
FILE *f = fopen("unsorted.txt","r");
if (!f) { perror("fopen unsorted.txt"); exit(EXIT_FAILURE); }
Declare a buffer of reasonable size (80 was the size of punched cards in the 1970s).
char buf[80];
clear it (you want defensive programming):
memset(buf, 0, sizeof(buf));
Then read carefully about fscanf. Read that documentation several times. Use it with a fixed size and test its result:
if (fscanf(f, "%72s", buf) > 0) {
(72 was the usable size in PL/1 programs of punched cards; it is less than 80)
Don't forget to read documentation of other functions, including ftell.
Important hint:
compile with all warnings and debug info (gcc -Wall -Wextra -g with GCC), improve the code to get no warnings, use the debugger gdb to run it step by step.
PS. As an exercise, find the possible content of unsorted.txt which made your initial program run correctly. Could you in that case predict its output? If not, why??
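The recommendations about a bounded buffer and a checked fscanf result could be combined roughly as follows. This is a sketch, not the only way to do it; read_word and WORD_BUF_SIZE are names invented here, and the width in the format string has to be kept in step with the buffer size by hand:

```c
#include <stdio.h>

#define WORD_BUF_SIZE 80   /* buffer size, including the terminating '\0' */

/* Hypothetical helper: reads one whitespace-delimited word of at most
   WORD_BUF_SIZE - 1 characters. Returns 1 on success, 0 on EOF or error. */
static int read_word(FILE *f, char buf[WORD_BUF_SIZE])
{
    buf[0] = '\0';
    return fscanf(f, "%79s", buf) == 1;   /* 79 == WORD_BUF_SIZE - 1 */
}
```

With the file from the question, the first call would store the string "3" in buf, and the count can then be taken from buf[0] - '0' without writing past a single char.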
There are multiple problems in your code:
You do not test the return value of fopen(). Calling ftell() with a NULL pointer has undefined behavior. You cannot draw conclusions from observed behavior.
printf("%d\n", ftell(f)); is incorrect because the return value of ftell() is a long. You should use the format %ld.
fscanf(f, "%s", &n); is incorrect because you pass the address of a single char for fscanf() to store a null-terminated string. fscanf() will access memory beyond the size of the char, which has undefined behavior. Define an array of char such as char buf[80]; and pass the maximum number of characters to store as: fscanf(f, "%79s", buf); and check the return value, or use %c to read a single byte.
int l = n - '0'; is not strictly incorrect, but it is error prone: avoid naming a variable l as it looks confusingly similar to 1.
printf("%d %d\n", l, ftell(f)); is incorrect for the same reason as the previous call to printf: use the conversion specifier %ld for the return value of ftell().
Note also that the return value of ftell() on a text stream is not necessarily the byte offset in the file.
Here is a corrected version:
#include <stdio.h>
int main(void) {
FILE *f = fopen("unsorted.txt", "r");
char c;
if (f != NULL) {
printf("%ld\n", ftell(f));
if (fscanf(f, "%c", &c) == 1) {
int diff = c - '0';
printf("%d %ld\n", diff, ftell(f));
}
}
return 0;
}
Output:
0
3 1
Aloha,
I'm new here, so please take it easy on me.
I'm trying to read a file with the function read() and then write() to a file or a file descriptor. My function successfully reads a file, but a problem occurs when I try to read a larger file (in my example, 40,000 bytes).
I think that I must write a while loop that keeps reading until the end of the file, but I am stuck on how to do it.
(I open a file or file descriptor in main of the program)
My function (it also converts the binary input char data and writes it out as ASCII):
void function(int readFrom,int writeOn){
char buffer[100];
int x = read(readFrom, buffer, sizeof(buffer));
int size= x/8;
int i;
for(i=0; i<size; i++){
char temp[sizeof(int)-1];
sprintf(temp,"%d",buffer[i]);
write(writeOn, temp, sizeof(temp));
}
}
You need to check the return values of read and write. They return the number of bytes read/written, which may be less than the number that you passed as the third argument. Both read and write must be done in a loop like:
int bytesRead = 0;
while (bytesRead < sizeof(buffer)) {
int ret = read(readFrom, buffer + bytesRead, sizeof(buffer) - bytesRead);
if (ret == 0)
break; /* EOF */
if (ret == -1) {
/* Handle error */
}
bytesRead += ret;
}
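The loop above covers read(); the write() side needs the same treatment. A minimal POSIX-only sketch, where write_all is a hypothetical helper name and retrying on EINTR is a design choice of this example rather than something the original answer prescribes:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling write() until all `len` bytes are
   written. Returns 0 on success, -1 on error (errno is set by write()). */
static int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;

    while (len > 0) {
        ssize_t ret = write(fd, p, len);
        if (ret == -1) {
            if (errno == EINTR)
                continue;          /* interrupted before writing: retry */
            return -1;
        }
        p += ret;                  /* skip what was written */
        len -= (size_t)ret;
    }
    return 0;
}
```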
You use sprintf() to convert characters from buffer into a very small buffer temp. On most current systems, int is 4 bytes, so temp holds only 3 bytes and your sprintf() call overflows it for char values greater than 99 (ASCII letter 'c') or less than -9. Note that char can be signed by default, so negative values less than -99 will require 5 bytes for the string conversion: 3 digits, a minus sign and a null terminator.
You should make this buffer larger.
Furthermore, I don't understand why you only handle x/8 bytes from the buffer read by the read() function. The purpose of your function is obscure.
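As a sketch of a large enough buffer, assuming 8-bit bytes so that "-128" is the longest possible text; byte_to_decimal and BYTE_TEXT_SIZE are names made up for this example:

```c
#include <stdio.h>

/* "-128" is the longest result for an 8-bit signed char:
   a minus sign, three digits and the terminating '\0'. */
#define BYTE_TEXT_SIZE 5

/* Hypothetical helper: formats one byte as decimal text. Returns the
   number of characters written, not counting the '\0'. */
static int byte_to_decimal(char out[BYTE_TEXT_SIZE], signed char value)
{
    return snprintf(out, BYTE_TEXT_SIZE, "%d", value);
}
```

The return value is also the right length to hand to write(), so that only the digits are written instead of sizeof(temp) bytes every time.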
I want to read from a .txt file which contains English sentences and store them in a character array, character by character. I tried but got segmentation fault: 11. I have trouble with fscanf and reading from a file in C.
#include<stdio.h>
#include<math.h>
#include<limits.h>
int main()
{
FILE* fp = fopen("file1.txt","r");
char c , A[INT_MAX];
int x;
while(1)
{
fscanf("fp,%c",&c);
if(c == EOF)
{break;}
A[x] = c;
x++;
}
int i;
for (i=0;i<x;i++)
printf("%c",A[i]);
return 0;
}
Problem 1: Putting the array onto the stack as A[INT_MAX] is bad practice; it allocates an unreasonable amount of space on the stack (and will crash on machines where INT_MAX is large relative to the size of memory). Get the file size, then malloc space for it.
fseek(fp, 0L, SEEK_END);
long size = ftell(fp);
rewind(fp);
char *A = malloc((size_t) size); // assumes size_t and long are the same size
if (A == NULL) {
// handle error
}
Problem 2: The fscanf call is wrong. If you insist on using fscanf (which is not a good way to read an entire file; see problem 4), you should change:
fscanf("fp,%c",&c);
to
int count = fscanf(fp, "%c",&c);
if (count <= 0)
break;
Problem 3: Your x counter is not initialized. If you insist on using fscanf, you'd need to initialize it:
int x = 0;
Problem 4: The fscanf is the wrong way to read the entire file. Assuming you've figured out how large the file is (see problem 1), you should read the file with an fread, like this:
size_t bytes_read = fread(A, 1, (size_t) size, fp);
if (bytes_read < (size_t) size) {
// something went wrong
}
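Problems 1 and 4 can be folded into one function. The following is a sketch under the assumption that the stream is seekable; read_all is a hypothetical name, and the extra byte for a terminating '\0' is an addition of this example so the result can be printed as a string:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole seekable stream into a malloc'ed,
   '\0'-terminated buffer. Returns NULL on any failure; the caller
   frees the result. */
static char *read_all(FILE *fp, size_t *out_len)
{
    if (fseek(fp, 0L, SEEK_END) != 0)
        return NULL;
    long size = ftell(fp);
    if (size < 0)
        return NULL;
    rewind(fp);

    char *data = malloc((size_t)size + 1);
    if (data == NULL)
        return NULL;

    /* In text mode fewer bytes than `size` may come back, so only a
       stream error is treated as failure. */
    size_t n = fread(data, 1, (size_t)size, fp);
    if (n < (size_t)size && ferror(fp)) {
        free(data);
        return NULL;
    }
    data[n] = '\0';
    *out_len = n;
    return data;
}
```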
My initial answer, and a good general rule:
You need to check the return value, because your c value can never be EOF, because EOF is an int value that doesn't fit into a char. (You should always check return values, even when it seems like errors shouldn't happen, but I haven't consistently done that in the code above.)
From http://www.cplusplus.com/reference/cstdio/fscanf/ :
Return Value
On success, the function returns the number of items of the argument list successfully filled. This count can match the expected number of items or be less (even zero) due to a matching failure, a reading error, or the reach of the end-of-file.
If a reading error happens or the end-of-file is reached while reading, the proper indicator is set (feof or ferror). And, if either happens before any data could be successfully read, EOF is returned.
If an encoding error happens interpreting wide characters, the function sets errno to EILSEQ.
Hi, you should declare how far the program is allowed to read. You can still access every character even if you read the input as strings.
Try it out:
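#include<stdio.h>
#include<string.h>
#define MAX_WORDS 100
#define MAX_LEN 100
int main()
{
    FILE* fp = fopen("file1.txt","r");
    char A[MAX_WORDS][MAX_LEN]; /* one row per word */
    int j = 0;
    if (fp == NULL)
        return 1;
    /* %99s leaves room for the terminating '\0' in each row */
    while(j < MAX_WORDS && fscanf(fp,"%99s",A[j]) == 1)
    {
        j++;
    }
    fclose(fp);
    int q;
    size_t i;
    for(q=0;q<j;q++)
    {
        for (i=0;i<strlen(A[q]);i++)
            printf("%c ",A[q][i]);
        printf("\n");
    }
    return 0;
}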
I can't figure out why my while loop won't work. The code works fine without it... The purpose of the code is to find a secret message in a bin file. So I got the code to find the letters, but now when I try to get it to loop until the end of the file, it doesn't work. I'm new at this. What am I doing wrong?
main(){
FILE* message;
int i, start;
long int size;
char keep[1];
message = fopen("c:\\myFiles\\Message.dat", "rb");
if(message == NULL){
printf("There was a problem reading the file. \n");
exit(-1);
}
//the first 4 bytes contain an int that tells how many subsequent bytes you can throw away
fread(&start, sizeof(int), 1, message);
printf("%i \n", start); //#of first 4 bytes was 280
fseek(message, start, SEEK_CUR); //skip 280 bytes
keep[0] = fgetc(message); //get next character, keep it
printf("%c", keep[0]); //print character
while( (keep[0] = getc(message)) != EOF) {
fread(&start, sizeof(int), 1, message);
fseek(message, start, SEEK_CUR);
keep[0] = fgetc(message);
printf("%c", keep[0]);
}
fclose(message);
system("pause");
}
EDIT:
After looking at my code in the debugger, it looks like having "getc" in the while loop threw everything off. I fixed it by creating a new char called letter, and then replacing my code with this:
fread(&start, sizeof(int), 1, message);
fseek(message, start, SEEK_CUR);
while( (letter = getc(message)) != EOF) {
printf("%c", letter);
fread(&start, sizeof(int), 1, message);
fseek(message, start, SEEK_CUR);
}
It works like a charm now. Any more suggestions are certainly welcome. Thanks everyone.
The return value from getc() and its relatives is an int, not a char.
If you assign the result of getc() to a char, one of two things happens when it returns EOF:
If plain char is unsigned, then EOF is converted to 0xFF, and 0xFF != EOF, so the loop never terminates.
If plain char is signed, then EOF is equivalent to a valid character (in the 8859-1 code set, that's ÿ, y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS), and your loop may terminate early.
Given the problem you face, we can tentatively guess you have plain char as an unsigned type.
The reason that getc() et al return an int is that they have to return every possible value that can fit in a char and also a distinct value, EOF. In the C standard, it says:
ISO/IEC 9899:2011 §7.21.7.1 The fgetc() function
int fgetc(FILE *stream);
If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int ...
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF.
Similar wording applies to the getc() function and the getchar() function: they are defined to behave like the fgetc() function except that if getc() is implemented as a macro, it may take liberties with the file stream argument that are not normally granted to standard macros — specifically, the stream argument expression may be evaluated more than once, so calling getc() with side-effects (getc(fp++)) is very silly (but change to fgetc() and it would be safe, but still eccentric).
In your loop, you could use:
int c;
while ((c = getc(message)) != EOF) {
keep[0] = c;
This preserves the assignment to keep[0]; I'm not sure you truly need it.
You should be checking the other calls to fgetc(), getc(), fread() and fseek() to make sure you are getting what you expect as input. Especially on input, you cannot really afford to skip those checks. Sooner, rather than later, something will go wrong and if you aren't religiously checking the return statuses, your code is likely to crash, or simply 'go wrong'.
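To make that concrete, here is one way the decoding loop could look with every call checked. It is a sketch, assuming the record layout described in the question (an int count, that many bytes to skip, then one letter); decode_message is a hypothetical name:

```c
#include <stdio.h>

/* Hypothetical helper: decodes records of the form
   [int skip][skip junk bytes][1 message byte] until the input runs out.
   Stores at most cap - 1 characters in out and '\0'-terminates it.
   Returns the number of characters stored. */
static size_t decode_message(FILE *in, char *out, size_t cap)
{
    size_t n = 0;
    int skip, c;

    if (cap == 0)
        return 0;
    while (n + 1 < cap) {
        if (fread(&skip, sizeof skip, 1, in) != 1)
            break;                      /* no complete count left */
        if (skip < 0 || fseek(in, skip, SEEK_CUR) != 0)
            break;                      /* bad count or seek failure */
        if ((c = getc(in)) == EOF)
            break;                      /* count present, but no letter */
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

Each of the three calls gets its own exit from the loop, so a truncated file ends the decoding cleanly instead of reusing a stale count.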
There are 256 different char values that might be returned by getc() and stored in a char variable like keep[0] (yes, I'm oversummarising wildly). To detect end-of-file reliably, EOF has to have a value different from all of them. That's why getc() returns int rather than char: because a 257th distinct value for EOF wouldn't fit into a char.
Thus you need to store the value returned by getc() in an int at least until you check it against EOF:
int tmpc;
while( (tmpc = getc(message)) != EOF) {
keep[0] = tmpc;
...
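The 257th-value argument is easy to check with a file that contains a 0xFF byte. A small sketch, with count_bytes as an invented helper name:

```c
#include <stdio.h>

/* Counts every byte in the stream, including 0xFF bytes, by keeping the
   result of getc() in an int until it has been compared with EOF. */
static size_t count_bytes(FILE *fp)
{
    size_t n = 0;
    int c;                       /* int, not char */

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

For a binary stream holding the three bytes 'A', 0xFF, 'B', this returns 3. With c declared as a signed char, the same loop would be expected to stop after the first byte on typical implementations where EOF is -1.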