Unexpected output when using fseek - c

Assuming we have a text file named hi.txt that contains the following string:
AbCdE12345
Let's say we run this code:
int main() {
FILE *fp;
fp = fopen("hi.txt","r");
if(NULL == fp) { return 1; }
fseek(fp,-1, SEEK_END);
while (ftell(fp) > 0) {
printf("%c",fgetc(fp));
fseek(fp,-4, SEEK_CUR);
}
fclose(fp);
return 0;
}
When I ran this code it printed: 3EbCd
When I tried to predict the output, I expected 52d.
Can anyone explain what happened here?

It looks like there is an end-of-line character at the end of your file, and that is what gets printed first. The position then moves in turn to 3, E, and b. After b is read the position is 2, so seeking back by 4 fails, because the resulting position would be -2. The file position stays where it was, i.e. at C, which gets printed next. The following seek fails too (from position 3 to -1), so d gets printed. The next seek succeeds and lands on position 0, which terminates the loop.
To detect situations when fseek is ignored, check its return value, like this:
while (ftell(fp) > 0) {
printf("%c",fgetc(fp));
// Successful calls of fseek return zero
if (fseek(fp,-4, SEEK_CUR)) {
// Exit the loop if you can't jump back by 4 positions
break;
}
}

For files opened in text mode the offset passed to fseek is only meaningful for values returned by ftell. So the offset may not necessarily be in bytes. Try opening the file in binary mode:
fp = fopen("hi.txt", "rb");
and see if the results are different.
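Putting both answers together, here is a minimal sketch of the same backwards-stepping loop on a binary stream, with every seek guarded. The helper name sample_backwards and its parameters are made up for illustration, and it assumes back is greater than 1 and that the stream was opened in binary mode.

```c
#include <stdio.h>

/* Hypothetical helper: mirrors the loop in the question, but stops
   instead of silently continuing when a seek would go before the start.
   Reads one byte, steps back `back` bytes, and repeats. The collected
   bytes are stored in `out` as a NUL-terminated string. */
size_t sample_backwards(FILE *fp, long back, char *out, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return 0;
    out[0] = '\0';

    /* Start on the last byte; this fails for an empty file. */
    if (fseek(fp, -1L, SEEK_END) != 0)
        return 0;

    while (n + 1 < cap) {
        int c = fgetc(fp);
        if (c == EOF)
            break;
        out[n++] = (char)c;

        long pos = ftell(fp);
        if (pos < back)          /* seeking back would pass the start */
            break;
        if (fseek(fp, -back, SEEK_CUR) != 0)
            break;
        if (pos == back)         /* landed on offset 0: the original loop ends here */
            break;
    }
    out[n] = '\0';
    return n;
}
```

With the ten bytes AbCdE12345 and back set to 4 this collects 52d, the output the question expected. With a trailing newline it collects the newline, 3, E and b, then stops at the first seek that cannot succeed.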

Related

What empty binary files contain in C

#include <stdio.h>
#include <stdlib.h>
int main(void) {
FILE *fp;
fp = fopen("clients.dat", "wb");
fclose(fp);
fp = fopen("clients.dat", "rb");
while (1) {
if (fp == EOF)
break;
else
printf("There is something inside a file");
}
fclose(fp);
return 0;
}
Here comes a question: what do empty binary files contain? Should the pointer point to the EOF character? I mean, isn't EOF the first and last thing in the file? Or how can I check whether a file is empty or not?
An empty file contains nothing; it is empty, so it contains 0 bytes. EOF is not a character stored at the end of a file. It is a negative integer constant that some of the standard input functions return to indicate end of file or a read error.
When you open a file you get a pointer to a FILE type back, this is what you can expect even from an empty file.
A file is not terminated the same way a string is, so there is no equivalent of a null character ('\0') in a file that marks where the file contents stop.
To determine whether a file you have opened and have a valid FILE pointer to is empty you can use fseek and ftell:
fseek(fp, 0, SEEK_END);
size = ftell(fp);
if (size == 0) {
// File is empty
}
Function fopen returns a pointer to a file handle of type FILE, not a pointer to any content of the file or a pointer to an EOF-character. The pointer is NULL if the file could not be opened at all, but does not indicate whether the file is empty or not.
To check if a file is empty you either (1) need to make an attempt to read bytes and handle the various results, or (2) to use fseek and ftell to move the read pointer to the end and ask then for the position.
(1)
fp=fopen("clients.dat","rb");
char buffer;
size_t bytesRead = fread(&buffer, 1, 1, fp); // try to read one byte
if(bytesRead == 1) {
printf("file contains at least one byte\n");
} else { // error handling
if (feof(fp))
printf("Attempt to read though end of file has been reached. File is empty.\n");
else if (ferror(fp)) {
perror("Error reading file.");
}
}
(2)
fp=fopen("clients.dat","rb");
fseek(fp, 0L, SEEK_END);
long size = ftell(fp);
if (size==0) {
// file is empty.
}
I'd prefer the second variant.
Here's another approach:
To check if the file is empty, you can simply read the file:
int c = fgetc(fp);
if (c == EOF)
{
// The file is empty or an error occurred
if (feof(fp))
{
// Empty
}
else
{
// Error during file read
}
}
else
{
// non-empty file
}
Here comes a question what empty binary files contain ?
Empty files contain nothing, that is what makes them empty.
Regular files have a size which is not part of their data, but instead is normally a part of the directory entry or inode.
should the pointer point to the EOF character ?
No
First of all the pointer returned by fopen is NOT a pointer to the content of the file, but merely a pointer to a data structure describing the open file.
Secondly EOF is not an actual part of the file, but a special return value from the getc family of functions used to indicate that the end of file has been reached.
Note that feof does not look ahead: it only reports whether a previous read has already hit the end of the file, so it cannot tell you that a file is empty before you have attempted to read from it.
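As a hedged sketch of the read-based check, the helper below (stream_is_empty is a hypothetical name) tries to read one byte and pushes it back with ungetc so the caller can still read the file from the start. It assumes it is called right after fopen, before any other I/O on the stream.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if nothing can be read from `fp`,
   0 if at least one byte is available, -1 on a read error.
   The byte that was read is pushed back, so the stream position
   is unchanged for the caller. */
int stream_is_empty(FILE *fp)
{
    int c = fgetc(fp);

    if (c == EOF)
        return ferror(fp) ? -1 : 1;   /* EOF is a return value, not file data */

    if (ungetc(c, fp) == EOF)
        return -1;

    return 0;
}
```

Returning three distinct values keeps "empty" and "could not read" apart, which a plain size == 0 test does not do when ftell fails.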

y with umlaut in file

I'm working on an example problem where I have to reverse the text in a text file using fseek() and ftell(). I was successful, but printing the same output to a file, I had some weird results.
The text file I input was the following:
redivider
racecar
kayak
civic
level
refer
These are all palindromes
The result in the command line works great. In the text file that I create however, I get the following:
ÿsemordnilap lla era esehTT
referr
levell
civicc
kayakk
racecarr
redivide
I am aware, from the answer to this question, that this corresponds to the text-file version of EOF in C. I'm just confused as to why the command line and text file outputs are different.
#include <stdio.h>
#include <stdlib.h>
/**********************************
This program is designed to read in a text file and then reverse the order
of the text.
The reversed text then gets output to a new file.
The new file is then opened and read.
**********************************/
int main()
{
//Open our files and check for NULL
FILE *fp = NULL;
fp = fopen("mainText.txt","r");
if (!fp)
return -1;
FILE *fnew = NULL;
fnew = fopen("reversedText.txt","w+");
if (!fnew)
return -2;
//Go to the end of the file so we can reverse it
int i = 1;
fseek(fp, 0, SEEK_END);
int endNum = ftell(fp);
while(i < endNum+1)
{
fseek(fp,-i,SEEK_END);
printf("%c",fgetc(fp));
fputc(fgetc(fp),fnew);
i++;
}
fclose(fp);
fclose(fnew);
fp = NULL;
fnew = NULL;
return 0;
}
No errors, I just want identical outputs.
The outputs are different because your loop reads two characters from fp per iteration.
For example, in the first iteration i is 1 and so fseek sets the current file position of fp just before the last byte:
...
These are all palindromes
^
Then printf("%c",fgetc(fp)); reads a byte (s) and prints it to the console. Having read the s, the file position is now
...
These are all palindromes
^
i.e. we're at the end of the file.
Then fputc(fgetc(fp),fnew); attempts to read another byte from fp. This fails and fgetc returns EOF (a negative value, usually -1) instead. However, your code is not prepared for this and blindly treats -1 as a character code. Converted to a byte, -1 corresponds to 255, which is the character code for ÿ in the ISO-8859-1 encoding. This byte is written to your file.
In the next iteration of the loop we seek back to the e:
...
These are all palindromes
^
Again the loop reads two characters: e is written to the console, and s is written to the file.
This continues backwards until we reach the beginning of the input file:
redivider
^
Yet again the loop reads two characters: r is written to the console, and e is written to the file.
This ends the loop. The end result is that your output file contains one character that doesn't exist (from the attempt to read past the end of the input file) and never sees the first character.
The fix is to only call fgetc once per loop:
while(i < endNum+1)
{
fseek(fp,-i,SEEK_END);
int c = fgetc(fp);
if (c == EOF) {
perror("error reading from mainText.txt");
exit(EXIT_FAILURE);
}
printf("%c", c);
fputc(c, fnew);
i++;
}
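The ÿ byte can be reproduced in isolation. The sketch below is only an illustration (byte_written_for_eof is a made-up name): it passes the EOF returned by fgetc straight to fputc and reads back what was stored. It assumes empty_in has nothing left to read and out is open for both writing and reading.

```c
#include <stdio.h>

/* Hypothetical helper: reads from a stream that has no data left,
   writes the returned EOF value to `out` unchecked, and returns the
   byte that actually ended up in `out` (or -1 if the setup failed). */
int byte_written_for_eof(FILE *empty_in, FILE *out)
{
    int c = fgetc(empty_in);
    if (c != EOF)
        return -1;                 /* the input was not empty after all */

    /* fputc converts its argument to unsigned char before writing,
       so a negative EOF becomes a large byte value. */
    if (fputc(c, out) == EOF)
        return -1;

    rewind(out);
    return fgetc(out);
}
```

On an implementation where EOF is -1 and bytes are 8 bits wide, the stored byte is 255, which ISO-8859-1 displays as ÿ.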
In addition to @melpomene's correction about using only one fgetc() per loop iteration, other issues exist.
fseek(questionable_offset)
fopen("mainText.txt","r"); opens the file in text mode, not binary mode. For a text stream, the only portable fseek() offsets are zero or a value previously returned by ftell(), so seeking to computed offsets is prone to trouble. This is usually not a problem on *nix systems, where text and binary streams behave the same.
I do not have a simple alternative.
ftell() return type
ftell() returns long. Use long instead of int for i and endNum. (Not a concern with small files.)
Check return values
ftell() and fseek() can fail. Test for error returns.
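One possible way to apply these three points, offered as a sketch rather than a drop-in fix: open both files in binary mode ("rb" and "wb"), keep offsets in long, and check every call. The function name reverse_stream is hypothetical. Because it works on raw bytes, a Windows "\r\n" line ending would come out as "\n\r".

```c
#include <stdio.h>

/* Hypothetical helper: copies the bytes of `in` to `out` in reverse
   order. Both streams are assumed to be opened in binary mode.
   Returns 0 on success, -1 if any seek, read, or write fails. */
int reverse_stream(FILE *in, FILE *out)
{
    if (fseek(in, 0L, SEEK_END) != 0)
        return -1;

    long size = ftell(in);          /* ftell returns long, -1L on failure */
    if (size < 0)
        return -1;

    for (long pos = size - 1; pos >= 0; pos--) {
        if (fseek(in, pos, SEEK_SET) != 0)
            return -1;

        int c = fgetc(in);          /* read exactly once per iteration */
        if (c == EOF)
            return -1;

        if (fputc(c, out) == EOF)
            return -1;
    }
    return 0;
}
```

Seeking with SEEK_SET to an absolute offset avoids negative offsets altogether, so there is no attempt to position before the start of the file.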

fscanf while-loop never runs

I'm working on a project, and I can't seem to figure out why a piece of my function for finding prime numbers won't run. Essentially, I want the code to first check the text file log for any previously encountered prime numbers, but no matter what I put for the while-loop containing fscanf(), it seems like my code never enters it.
int filePrime(int a) {
int hold = 0;
FILE *fp = fopen("primes.txt", "a+");
if (fp == NULL) {
printf("Error while opening file.");
exit(2);
}
/*
the while loop below this block is the one with the issue.
on first run, it should skip this loop entirely, and proceed
to finding prime numbers the old-fashioned way, while populating the file.
instead, it is skipping this loop and proceeding right into generating a
new set of prime numbers and writing them to the file, even if the previous
numbers are already in the file
*/
while (fscanf(fp, "%d", &hold) == 1){
printf("Inside scan loop.");
if (hold >= a) {
fclose(fp);
return 1;
}
if (a % hold == 0) {
fclose(fp);
return 0;
}
}
printf("Between scan and print.\n");
for (; hold <= a; hold++) {
if (isPrime(hold) == 1) {
printf("Printing %d to file\n", hold);
fprintf(fp, "%d\n", hold);
if (hold == a)
return 1;
}
}
fclose(fp);
return 0;
}
I have tried all sorts of changes to the while-loop test.
Ex. != 0, != EOF, cutting off the == 1 entirely.
I just can't seem to get my code to enter the loop using fscanf.
Any help is very much appreciated, thank you so much for your time.
In a comment, I asked where the "a+" mode leaves the current position?
On Mac OS X 10.11.4, using "a+" mode opens the file and positions the read/write position at the end of file.
Demo code (aplus.c):
#include <stdio.h>
int main(void)
{
const char source[] = "aplus.c";
FILE *fp = fopen(source, "a+");
if (fp == NULL)
{
fprintf(stderr, "Failed to open file %s\n", source);
}
else
{
int n;
char buffer[128];
fseek(fp, 0L, SEEK_SET);
while ((n = fscanf(fp, "%127s", buffer)) == 1)
printf("[%s]\n", buffer);
printf("n = %d\n", n);
fclose(fp);
}
return(0);
}
Without the fseek(), the value of n is -1 (EOF) immediately.
With the fseek(), the data (source code) can be read.
One thing slightly puzzles me: I can't find information in the POSIX fopen() specification (or in the C standard) which mentions the read/write position after opening a file with "a+" mode. It's clear that write operations will always be at the end, regardless of intervening uses of fseek().
POSIX stipulates that the call to open() shall use O_RDWR|O_CREAT|O_APPEND for "a+", and open() specifies:
The file offset used to mark the current position within the file shall be set to the beginning of the file.
However, as chux notes (thanks!), the C standard explicitly says:
Annex J Portability issues
J.3 Implementation-defined behaviour
J.3.12 Library functions
…
Whether the file position indicator of an append-mode stream is initially positioned at
the beginning or end of the file (7.21.3).
…
So the behaviour seen is permissible in the C standard.
The manual page on Mac OS X for fopen() says:
"a+" — Open for reading and writing. The file is created if it does not exist. The stream is positioned at the end of the file. Subsequent writes to the file will always end up at the then current end of file, irrespective of any intervening fseek(3) or similar.
This is allowed by Standard C; it isn't clear it is fully POSIX-compliant.
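Applied to a log file like the one in the question, the pattern could look like the sketch below. The function append_if_missing and its return convention are assumptions made for illustration. It rewinds explicitly after fopen so the result does not depend on where "a+" leaves the initial position.

```c
#include <stdio.h>

/* Hypothetical helper: scans the integers stored in `path` and appends
   `value` if it is not already there.
   Returns 0 if the value was found, 1 if it was appended, -1 on error. */
int append_if_missing(const char *path, int value)
{
    FILE *fp = fopen(path, "a+");
    if (fp == NULL)
        return -1;

    /* The initial position of an append-mode stream is
       implementation-defined, so move to the start before reading. */
    rewind(fp);

    int seen;
    int found = 0;
    while (fscanf(fp, "%d", &seen) == 1) {
        if (seen == value) {
            found = 1;
            break;
        }
    }

    int result = 0;
    if (!found) {
        if (ferror(fp))
            result = -1;
        /* Reposition before switching from reading to writing. */
        else if (fseek(fp, 0L, SEEK_END) != 0 || fprintf(fp, "%d\n", value) < 0)
            result = -1;
        else
            result = 1;
    }

    if (fclose(fp) != 0)
        result = -1;
    return result;
}
```

Writes on an "a+" stream go to the end of the file regardless of the fseek, but the positioning call is still what makes the switch from reading to writing well defined.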

C - Using fscanf to read '-1' from a file

I'm a bit new to C, but basically I have a problem where I need to read '-1' from a file. Sadly this means I run into a premature ending of the file, because the EOF constant is also -1 in my compiler.
What sort of work arounds would there be for this? Is there another function I can use to read it that will change the EOF to something I can work with?
Thanks in advance.
The code since people are asking for it
int read() {
int returnVal; // The value which we return
// Open the file if it isn't already opened
if (file == NULL) {
file = fopen(filename, "r");
}
// Read the number from the file
fscanf(file, "%i", &returnVal);
// Return this number
return returnVal;
}
This number is then later compared to EOF.
Okay this is probably bad practice, but I changed the code to the following
int readValue() {
int returnVal; // The value which we return
// Open the file if it isn't already opened
if (file == NULL) {
file = fopen(filename, "r");
}
// Read the number from the file
fscanf(file, "%i", &returnVal);
if (feof(file)) {
fclose(file);
return -1000;
}
// Return this number
return returnVal;
}
Because I knew I would never read any such number from my file (they range from about [-300, 300]). Thanks for all your help guys!
The return value of fscanf is NOT the value that was read. It is the number of items successfully assigned, or EOF if end of file or a read error occurs before the first conversion.
The problem is that your read function doesn't distinguish between a successful read and an error condition. You should change it to accept an int * parameter that fscanf writes into, and the function should return something like 0 on a successful read and -1 on error. You can use the return value of fscanf as the basis of what your function returns.
Also, there's a system call named read, so you should really name it something else. And don't forget to fclose(file) once you have finished reading, otherwise you're leaking file descriptors.
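A minimal sketch of that interface follows. The name read_value and the 1/0/-1 return convention are choices made here for illustration, not something the answer prescribes. It uses %d rather than the question's %i, on the assumption that the file holds plain decimal numbers.

```c
#include <stdio.h>

/* Hypothetical helper: stores the next integer from `fp` in `*out`.
   Returns 1 if a number was read, 0 at a clean end of file,
   -1 on a read error or if the next token is not a number. */
int read_value(FILE *fp, int *out)
{
    int rc = fscanf(fp, "%d", out);

    if (rc == 1)
        return 1;                  /* *out holds the value, even if it is -1 */
    if (rc == EOF && !ferror(fp))
        return 0;                  /* nothing left to read */
    return -1;
}
```

A caller can then loop with while (read_value(file, &v) == 1), and a stored -1 is an ordinary value because the status travels separately from the data.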

fwrite() does not override text in Windows (C)

I wrote this C code so that I could test whether fwrite could update some values in a text file. I tested it on Linux and it works fine. In Windows (Vista 32-bit), however, it simply does not work. The file remains unchanged after I write a different byte using: cont = fwrite(&newfield, sizeof(char), 1, fp);
The registers are written on the file using a "#" separator, in the format:
Reg1FirstField#Reg1SecondField#Reg2FirstField#Reg2SecondField...
The final file should be: First#1#Second#9#Third#1#
I also tried putc and fprintf, all with no result. Can someone please help me with this?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct test {
char field1[20];
char field2;
} TEST;
int main(void) {
FILE *fp;
TEST reg, regread;
char regwrite[22];
int i, cont, charwritten;
fp=fopen("testupdate.txt","w+");
strcpy(reg.field1,"First");
reg.field2 = '1';
sprintf(regwrite,"%s#%c#", reg.field1, reg.field2);
cont = (int)strlen(regwrite);
charwritten = fwrite(regwrite,cont,1,fp);
fflush(fp);
strcpy(reg.field1,"Second");
reg.field2 = '1';
sprintf(regwrite,"%s#%c#", reg.field1, reg.field2);
cont = (int)strlen(regwrite);
charwritten = fwrite(regwrite,cont,1,fp);
fflush(fp);
strcpy(reg.field1,"Third");
reg.field2 = '1';
sprintf(regwrite,"%s#%c#", reg.field1, reg.field2);
cont = (int)strlen(regwrite);
charwritten = fwrite(regwrite,cont,1,fp);
fflush(fp);
fclose(fp);
// open file to update
fp=fopen("testupdate.txt","r+");
printf("\nUpdate field 2 on the second register:\n");
char aux[22];
// search for second register and update field 2
for (i = 0; i < 3; i ++) {
fscanf(fp,"%22[^#]#", aux);
printf("%d-1: %s\n", i, aux);
if (strcmp(aux, "Second") == 0) {
char newfield = '9';
cont = fwrite(&newfield, sizeof(char), 1, fp);
printf("written: %d bytes, char: %c\n", cont, newfield);
// goes back one byte in order to read properly
// on the next fscanf
fseek(fp,-1,SEEK_CUR);
}
fscanf(fp,"%22[^#]#", aux);
printf("%d-2: %s\n",i, aux);
aux[0] = '\0';
}
fflush(fp);
fclose(fp);
// open file to see if the update was made
fp=fopen("testupdate.txt","r");
for (i = 0; i < 3; i ++) {
fscanf(fp,"%22[^#]#", aux);
printf("%d-1: %s\n", i, aux);
fscanf(fp,"%22[^#]#",aux);
printf("%d-2: %s\n",i, aux);
aux[0] = '\0';
}
fclose(fp);
getchar();
return 0;
}
You're missing a file positioning function between the read and write. The Standard says:
7.19.5.3/6
When a file is opened with update mode, both input and output may be performed on the associated stream. However, ... input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file. ...
for (i = 0; i < 3; i ++) {
fscanf(fp,"%22[^#]#", aux); /* read */
printf("%d-1: %s\n", i, aux);
if (strcmp(aux, "Second") == 0) {
char newfield = '9';
/* added a file positioning function */
fseek(fp, 0, SEEK_CUR); /* don't move */
cont = fwrite(&newfield, sizeof(char), 1, fp); /* write */
I didn't know it but here they explain it:
why fseek or fflush is always required between reading and writing in the read/write "+" modes
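Conclusion: in the "+" (update) modes you must call a file positioning function (fseek, fsetpos, or rewind) when switching from reading to writing, and either fflush or a file positioning function when switching from writing back to reading.
fseek(fp, 0, SEEK_CUR); // required after the read, before the write
cont = fwrite(&newfield, sizeof(char), 1, fp);
fflush(fp); // or a positioning call, before reading again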
Fix verified on Cygwin.
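For reference, the whole update can be sketched as one small function. The name update_field and its return values are hypothetical, and the stream is assumed to be opened in an update mode such as "r+b", with records in the Name#c# layout from the question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: finds the record whose first field equals `name`
   and overwrites its one-character second field with `value`.
   Returns 1 if updated, 0 if the name was not found, -1 on error. */
int update_field(FILE *fp, const char *name, char value)
{
    char key[21];
    char old[21];

    rewind(fp);
    /* %20[^#] leaves room for the terminating NUL in a 21-byte buffer. */
    while (fscanf(fp, "%20[^#]#", key) == 1) {
        if (strcmp(key, name) == 0) {
            /* A positioning call is required between the read and the write. */
            if (fseek(fp, 0L, SEEK_CUR) != 0)
                return -1;
            if (fputc(value, fp) == EOF)
                return -1;
            return fflush(fp) == 0 ? 1 : -1;
        }
        /* Not the record we want: skip its second field. */
        if (fscanf(fp, "%20[^#]#", old) != 1)
            break;
    }
    return ferror(fp) ? -1 : 0;
}
```

The field width in the scan format is one less than the buffer size. The question's "%22[^#]#" with a 22-byte buffer leaves no room for the terminator.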
You're not checking any return values for errors. I'm guessing the file is read-only and is not even opening properly.
At least here on OSX, your value 9 is being appended to the end of the file ... so you're not updating the actual register value for Second at its position in the file. For some reason, after the scan for the appropriate point to modify the values, your stream pointer is actually at the end of the file. For instance, compiling and running your code on OSX produced the following output in the actual text file:
First#1#Second#1#Third#1#9
The reason your initial read-back is working is because the data is being written, but it's at the end of the file. So when you write the value and then back-up the stream and re-read the value, that works, but it's not being written in the location you're assuming.
Update: I've added some calls to ftell to see what's happening to the stream pointer, and it seems that your calls to fscanf are working as you'd assume, but the call to fwrite is jumping to the end of the file. Here's the modified output:
Update field 2 on the second register:
**Stream position: 0
0-1: First
0-2: 1
**Stream position: 8
1-1: Second
**Stream position before write: 15
**Stream position after write: 26
written: 1 bytes, char: 9
1-2: 9
**Stream position after read-back: 26
Update-2: It seems that by simply saving the position of the stream pointer, and then setting the position of the stream pointer, the call to fwrite worked without skipping to the end of the file. So I added:
fpos_t position;
fgetpos(fp, &position);
fsetpos(fp, &position);
right before the call to fwrite. Again, this is on OSX, you may see something different on Windows.
With this:
fp=fopen("testupdate.txt","w+");
^------ Notice the + sign
The + sign does not select append mode (that is what "a" does). It opens the stream for update, meaning both reading and writing: "w+" creates or truncates the file, and "r+" opens an existing file without truncating it, so "r+" is the right mode for the in-place update here.
What update mode does require is a file positioning call (fseek, fsetpos, or rewind) whenever you switch from reading to writing.
This and other issues with fopen() are why I prefer to use the POSIX-defined open().
To fix your particular case, add that positioning call before the fwrite(), and consider that you might need to specify binary format on Windows ("w+b" and "r+b" modes).
