File Pointer Position

Here is the snippet of code
typedef struct
{
double testA;
double testB[500];
bool isProcessed;
} MYSTURCT;
I have a binary file that contains multiple structs of this type.
Now, in another function, I'm trying to read the file and update records in the middle of it.
void test()
{
FILE* fp = fopen (testFile, "r+")
MYSTURCT* myPtr = malloc (sizeof (MYSTRUCT));
while ( fread (myPtr,sizeof(MYSTRUCT),1,fp) )
{
if (!myPtr->isProcessed)
{
//update some thing int he struct
myPtr->testA = 100.00;
fseek (fp, -sizeof(MYSTRUCT), SEEK_CUR);
fwrite (myPtr,sizeof(MYSTRUCT), 1,fp);
}
}
}
Once I find something unprocessed, I update the struct in memory, then try to
write the struct to disk (first by seeking to the current position minus sizeof(struct),
and then fwriting the struct to disk).
What's happening in my application is that after doing the fseek, my
fp->_ptr gets messed up and it loses track of the position in my stream.
Is there anything wrong with what I am doing here?

-sizeof(MYSTRUCT) is potentially dangerous. sizeof yields a value of the unsigned type size_t, and if size_t is at least as wide as an int, its promoted type (the type of the -sizeof(MYSTRUCT) expression) is also unsigned, with a value of UINT_MAX - sizeof(MYSTRUCT) + 1 or possibly ULONG_MAX - sizeof(MYSTRUCT) + 1.
If you're unlucky (e.g. 32-bit size_t, 64-bit long), the value is UINT_MAX - sizeof(MYSTRUCT) + 1, a long int can hold this large positive value, and the seek won't do what you want it to do.
You could consider doing a position save and restore:
fpos_t pos;
if (fgetpos(fp, &pos) != 0)
{
    /* position save failed */
    return;
}
/* read struct */
if (fsetpos(fp, &pos) != 0)
{
    /* position restore failed */
    return;
}
/* write struct */
fgetpos and fsetpos use a fpos_t so can potentially work with very large files in scenarios where fseek and ftell won't.
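To make the wraparound concrete, here is a minimal sketch. The Record type and both function names are hypothetical stand-ins, not part of the original code. Converting the size to long before negating it gives fseek the small negative offset that was intended.

```c
#include <stdio.h>

/* Hypothetical record type standing in for the struct in the question. */
typedef struct {
    double testA;
    double testB[500];
    _Bool  isProcessed;
} Record;

/* Step back over the record that was just read or written.
   The size is converted to long BEFORE negation, so the offset is a
   small negative number instead of a huge unsigned one. */
int seek_back_one_record(FILE *fp)
{
    return fseek(fp, -(long)sizeof(Record), SEEK_CUR);
}

/* Returns 1 if negating the size_t value produced a positive number,
   i.e. the negation wrapped around instead of going negative. */
int negated_size_wraps(void)
{
    return -sizeof(Record) > 0;
}
```

The cast is safe here because the size of one record comfortably fits in a long; for sizes that might not fit, the fgetpos/fsetpos approach avoids the arithmetic entirely.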

The fopen manpage says:
Reads and writes may be intermixed on read/write streams in any order. Note that ANSI C requires that a file positioning function intervene between output and input, unless an input operation encounters end-of-file. (If this condition is not met, then a read is allowed to return the result of writes other than the most recent.) Therefore it is good practice (and indeed sometimes necessary under Linux) to put an fseek(3) or fgetpos(3) operation between write and read operations on such a stream. This operation may be an apparent no-op (as in fseek(..., 0L, SEEK_CUR) called for its synchronizing side effect).
So you might try putting the dummy fseek in right after you fwrite.
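Put together, the read/update loop could look roughly like the sketch below. It assumes a record layout like the one in the question; the Record type and the update_unprocessed name are made up for illustration. The file is opened in binary update mode, and a positioning call sits on both sides of the fwrite.

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical record type mirroring the struct in the question. */
typedef struct {
    double testA;
    double testB[500];
    bool   isProcessed;
} Record;

/* Sets testA to 100.0 in every unprocessed record of the file at `path`.
   Returns the number of records rewritten, or -1 on any I/O error. */
long update_unprocessed(const char *path)
{
    FILE *fp = fopen(path, "rb+");          /* binary update mode */
    if (fp == NULL)
        return -1;

    Record rec;
    long updated = 0;
    while (fread(&rec, sizeof rec, 1, fp) == 1) {
        if (rec.isProcessed)
            continue;
        rec.testA = 100.0;
        /* read -> write: step back over the record just read */
        if (fseek(fp, -(long)sizeof rec, SEEK_CUR) != 0 ||
            fwrite(&rec, sizeof rec, 1, fp) != 1 ||
            /* write -> read: the "apparent no-op" seek from the manpage */
            fseek(fp, 0L, SEEK_CUR) != 0) {
            fclose(fp);
            return -1;
        }
        updated++;
    }
    int failed = ferror(fp);                /* distinguish EOF from a read error */
    if (fclose(fp) != 0 || failed)
        return -1;
    return updated;
}
```

Like the original, this only changes testA; it does not set isProcessed, so running it twice rewrites the same records again.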

You malloc sizeof (MYSTRUCT) bytes to myPtr, but myPtr is of type MYSTURCT.
I don't think that's your problem, though.
Apparently there's nothing wrong with your code; try to add some error-checking ...
void test(){
    FILE* fp = fopen (testFile, "r+"); /* missing semicolon */
    MYSTURCT* myPtr = malloc (sizeof *myPtr);
    while ( fread (myPtr,sizeof *myPtr,1,fp) == 1) /* error checking */
    {
        if (!myPtr->isProcessed)
        {
            //update something in the struct
            myPtr->testA = 100.00;
            /* convert to long before negating; -sizeof is an unsigned value */
            if (fseek (fp, -(long)sizeof *myPtr, SEEK_CUR) == -1)
            {
                perror("fseek");
            }
            if (fwrite (myPtr,sizeof *myPtr, 1,fp) != 1)
            {
                perror("fwrite");
            }
            /* a positioning call is required between a write and the next read */
            if (fseek (fp, 0L, SEEK_CUR) == -1)
            {
                perror("fseek");
            }
        }
    }
}
And the fopen should be in binary mode, even if you're on Linux (where it really doesn't matter). On Windows, in text mode, a 0x0D 0x0A sequence in the middle of one of those doubles is converted to a single 0x0A on input, which will mess everything up.

Try calling fflush after the last fwrite(). Then try making a new test file using your current structure. It could be that you changed your structure and your current test file still has the older, incompatible layout.

I tried your sample code, and it seems to work fine to me (though I am doing it in C - I substituted a "char" for your "boolean")
For debugging, how do you know that the fp is getting corrupted? It is unusual to look at the members of the FILE struct. Each time you do an fseek(), fread() or fwrite(), what is your output when you invoke ftell()?

If you want to write to the file, check the mode: "r+" does allow writing, whereas "w+b" would truncate the existing contents, so the binary update mode for this case is "r+b". (I think.)

Related

Why should I put SEEK_SET twice

I want to replace some vowels in a file with "5". The following code works. However, I do not understand why I need to call fseek twice.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
void print_file_contents(const char *filename)
{
FILE *fp;
char letter;
if((fp=fopen(filename,"r+"))==NULL)
{
printf("error\n");
exit(1);
}
fseek(fp,0,SEEK_END);
int size=ftell(fp);
rewind(fp);
for(int i=0;i<size;i++)
{
fseek(fp,i,SEEK_SET);
letter=fgetc(fp);
if((letter=='a') || (letter=='e') || (letter=='i'))
{
fseek(fp,i,SEEK_SET); // WHY THIS FSEEK ?
fwrite("5",1,sizeof(char),fp);
}
}
fclose(fp);
}
int main(int argc, char *argv[])
{
print_file_contents("myfile");
return 0;
}
In my opinion, the first fseek(fp, i, SEEK_SET) is used to set the file position indicator to the current character being processed, so that the character can be read using fgetc. Hence, the cursor is updated every time so there is no need to add another fseek(fp, i, SEEK_SET);.

The fgetc advanced the file position; if you want to replace the character you just read, you need to rewind back to the same position you were in when you read the character to replace.

Note that the C standard mandates a seek-like operation when you switch between reading and writing (and between writing and reading).
§7.21.5.3 The fopen function ¶7:
When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file.
Also, calling fgetc() moves the file position forward one character; if the write worked (it's undefined behaviour if you omit the seek-like operation), you'd overwrite the next character, not the one you just read.

Your intuition is correct: two of the three fseek calls in this program are unnecessary.
The necessary fseek is the one inside the if((letter=='a') || (letter=='e') || (letter=='i')) conditional. That one is needed to back up the file position so you overwrite the character you just read (i.e. the vowel), not the character after the vowel.
The fseek inside the loop (but outside the if) is unnecessary because both fgetc and fwrite advance the file position, so it will always set the file position to the position it already has. And the fseek before the loop is unnecessary because you do not need to know how big the file is to implement this algorithm.
This code can be tightened up considerably. I'd write it like this:
#include <stdio.h>
#include <stdlib.h>
void replace_aei_with_5_in_place(const char *filename)
{
    FILE *fp = fopen(filename, "r+"); // (1)
    if (!fp) {
        perror(filename); // (2)
        exit(1);
    }
    int letter;
    while ((letter = fgetc(fp)) != EOF) { // (3)
        if (letter == 'a' || letter == 'e' || letter == 'i') { // (4)
            fseek(fp, -1, SEEK_CUR); // (5)
            fputc('5', fp);
            if (fflush(fp)) { // (6)
                perror(filename);
                exit(1);
            }
        }
    }
    if (fclose(fp)) { // (7)
        perror(filename);
        exit(1);
    }
}
int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s filename\n", argv[0]);
        return 1;
    }
    replace_aei_with_5_in_place(argv[1]); // (8)
    return 0;
}
Notes:
1. It is often (but not always) better to write operations with side effects, like fopen, separately from conditionals checking whether they succeeded.
2. When a system-level operation fails, always print both the name of any file involved, and the decoded value of errno. perror(filename) is a convenient way to do this.
3. You don't need to know the size of the file you're crunching because you can use a loop like this, instead. Also, this is an example of an exception to (1).
4. Why not 'o' and 'u' also?
5. Here's the necessary call to fseek, and the other reason you don't need to know the size of the file: you can use SEEK_CUR to back up by one character.
6. This fflush is necessary because we're switching from writing to reading, as stated in Jonathan Leffler's answer. Inconveniently, it also consumes the notification for some (but not all) I/O errors, so you have to check whether it failed.
7. Because you are writing to the file, you must also check for delayed I/O errors, reported only on fclose. (This is a design error in the operating system, but one that we are permanently stuck with.)
8. Best practice is to pass the name of the file to munge on the command line, not to hardcode it into the program.

@Jonathan Leffler states well why the code uses multiple fseek() calls: to cope with changing between reading and writing.
int size=ftell(fp); is weak, as ftell() returns a long, whose range may exceed that of int.
Seeking in a text file (as OP has) also risks undefined behavior (UB).
For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET. C17dr § 7.21.9.1 3.
Better to use a @zwol-like approach with a small change.
Do not assume a smooth linear mapping. Instead, note the location and then return to it as needed.
int replacement = '5';
for (;;) {
    long position = ftell(fp);
    if (position == -1) {
        perror(filename);
        exit(1);
    }
    int letter = fgetc(fp);
    if (letter == EOF) {
        break;
    }
    if (letter == 'a' || letter == 'e' || letter == 'i') {
        fseek(fp, position, SEEK_SET);
        fputc(replacement, fp);
        if (fflush(fp)) {
            perror(filename);
            exit(1);
        }
    }
}
Research fgetpos(), fsetpos() for an even better solution that handles all file sizes, even ones longer than LONG_MAX.
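A sketch of that fgetpos()/fsetpos() variant might look like the following. The function name and its return convention are invented for this example. The position is saved before each read and restored before the write, so no offset arithmetic is done on the text stream.

```c
#include <stdio.h>

/* Replaces every 'a', 'e' and 'i' in the file at `path` with '5', in place.
   Returns 0 on success, -1 on any I/O error. */
int replace_aei_with_5(const char *path)
{
    FILE *fp = fopen(path, "r+");
    if (fp == NULL)
        return -1;

    int status = -1;                        /* assume failure until EOF is reached cleanly */
    for (;;) {
        fpos_t before;
        if (fgetpos(fp, &before) != 0)      /* remember where this character starts */
            break;
        int letter = fgetc(fp);
        if (letter == EOF) {
            if (!ferror(fp))
                status = 0;                 /* genuine end of file */
            break;
        }
        if (letter == 'a' || letter == 'e' || letter == 'i') {
            if (fsetpos(fp, &before) != 0 ||    /* read -> write: reposition */
                fputc('5', fp) == EOF ||
                fflush(fp) != 0)                /* write -> read: flush */
                break;
        }
    }
    if (fclose(fp) != 0)                    /* delayed write errors show up here */
        status = -1;
    return status;
}
```

Because fpos_t is opaque, this works with whatever position encoding the implementation uses, including files larger than LONG_MAX bytes.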

Should I check the return value of fseek() when calculating the length of a file?

I have this idiomatic snippet for getting the length of a binary file:
fseek(my_file, 0, SEEK_END);
const size_t file_size = ftell(my_file);
…I know, to be pedantic fseek(file, 0, SEEK_END) has undefined behavior for a binary stream [1] – but frankly on the platforms where this is a problem I also don't have fstat() and anyway this is a topic for another question…
My question is: Should I check the return value of fseek() in this case?
if (fseek(my_file, 0, SEEK_END)) {
return 1;
}
const size_t file_size = ftell(my_file);
I have never seen fseek() being checked in a case like this, and I also wonder what kind of error fseek() could possibly return here.
EDIT:
After reading Clifford's answer, I also think that the best way to deal with fseek() and ftell() return values while calculating the size of a file is to write a dedicated function. However Clifford's good suggestion could not deal with the size_t data type (we need a size after all!), so I guess that the most practical approach in the end would be to use a pointer for storing the size of the file, and keep the return value of our dedicated function only for failures. Here is my contribution to Clifford's solution for a safe size calculator:
int fsize (FILE * const file, size_t * const size) {
long int ftell_retval;
if (fseek(file, 0, SEEK_END) || (ftell_retval = ftell(file)) < 0) {
/* Error */
*size = 0;
return 1;
}
*size = (size_t) ftell_retval;
return 0;
}
So that when we need to know the length of a file we could simply do:
size_t file_size;
if (fsize(my_file, &file_size)) {
fprintf(stderr, "Error calculating the length of the file\n");
return 1;
}

You need perhaps to ask yourself two questions:
What will ftell() return if fseek() has failed?
Can I handle failure in any meaningful way?
If fseek() fails it returns a non-zero value. If ftell() fails (which it likely will if fseek() has failed), it will return -1L - so is more deterministic, which from an error handling point of view is better.
However there are potentially ways in which fseek() could fail that do not cause ftell() to fail (unlikely perhaps, but the failure modes are implementation defined), so it is better perhaps to test fseek() to be sure you are not getting an erroneous answer from ftell().
Since your aim is to get the file size, and the use of fseek/ftell is just a way of synthesising that, it makes more sense to define a file-size function, so that the caller need only be concerned with handling the failure to obtain a valid file size rather than the failure of implementation details. The point is that if you want the file size, you don't want to have to handle errors from fseek(), since that was a means to an end and not directly related to what you need to achieve. A failure of fseek() is a non-deterministic side effect whose result is an unknown file size, so it is better to behave "as-if" ftell() had failed, without risking misleading behaviour by actually calling ftell():
long fsize( FILE* file )
{
    long size = -1 ; // as-if ftell() had failed
    if( fseek( file, 0, SEEK_END ) == 0 )
    {
        size = ftell( file ) ;
    }
    return size ;
}
Then your code will be:
const long file_size = fsize(my_file);
Then at the application level you only need to handle the error file_size < 0, you have no interest in whether fseek() or ftell() failed, just that you don't know the file size.

It's always good practice to test the return value of a function and handle it right away; otherwise strange behaviour might occur that you won't be able to understand or track down without exhaustive debugging.
See the return value section of the fseek documentation for what it can return.
The cost of this if statement is negligible, and it makes problems easier to handle when they occur.

fseek can return an error in the case where the file handle is a pipe (or a serial stream).
At that point, ftell can't even tell you where it's at, because in those circumstances it's more "wherever you go, there you are".
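The pipe case is easy to reproduce on a POSIX system. The sketch below is only an illustration: the function name is made up, and pipe(), fdopen() and ESPIPE come from POSIX rather than ISO C. It wraps the read end of a pipe in a stream and reports whether fseek() refused to seek on it.

```c
#define _POSIX_C_SOURCE 200809L   /* for fdopen() under strict ISO modes */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>               /* pipe(), close(): POSIX, not ISO C */

/* Returns 1 if fseek() on a pipe-backed stream fails with ESPIPE,
   0 if it succeeds or fails with another error, -1 if setup fails. */
int fseek_fails_on_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    FILE *fp = fdopen(fds[0], "r");     /* wrap the read end in a stream */
    if (fp == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    errno = 0;
    int rc = fseek(fp, 0L, SEEK_END);
    int result = (rc != 0 && errno == ESPIPE);

    fclose(fp);                         /* also closes fds[0] */
    close(fds[1]);
    return result;
}
```

A file-size helper that ignores the fseek() result would go on to call ftell() on such a stream, which is exactly the situation the check guards against.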

Yes, check return value, yet be more careful with type changes.
Note that the range of size_t may be more or less than 0...LONG_MAX.
// function returns an error flag
int fsize (FILE * file, size_t *size) {
if (fseek(file, 0, SEEK_END)) {
return 1; // fseek error
}
long ftell_retval = ftell(file);
if (ftell_retval == -1) {
return 1; // ftell error
}
// Test if the file size fits in a `size_t`.
// Improved type conversions here.
// Portably *no* overflow possible.
if (ftell_retval < 0 || (unsigned long) ftell_retval > SIZE_MAX) {
return 1; // range error
}
*size = (size_t) ftell_retval;
return 0;
}
Portability
Direct conversion of a long to size_t and vice versa is challenging to do portably, because the relationship between LONG_MAX and SIZE_MAX is not defined: LONG_MAX may be <, == or > SIZE_MAX.
Instead first test for < 0, then, if positive, convert to unsigned long. C specifies that LONG_MAX <= ULONG_MAX, so we are OK here. Then compare the unsigned long to SIZE_MAX. Since both types are some unsigned type, the compare simply converts to the wider of the two. Again no range loss.

C Read and replace char

I'm trying to read a file and replace every char with the next char in the ASCII table. The program opens the file properly but keeps on reading the first character.
int main(int argc, char * argv[])
{
FILE *input;
input = fopen(argv[2], "r+");
if (!input)
{
fprintf(stderr, "Unable to open file %s", argv[2]);
return -1;
}
char ch;
fpos_t * pos;
while( (ch = fgetc(input)) != EOF)
{
printf("%c\n",ch);
fgetpos (input, pos);
fsetpos(input, pos-1);
fputc(ch+1, input);
}
fclose(input);
return 1;
}
the text file is
abc
def
ghi
I'm pretty sure it's due to the fgetpos and fsetpos, but if I remove them then it adds the character at the end of the file, and the next fgetc returns EOF and the loop exits.

You have to be careful when dealing with files opened in update mode.
C11 (n1570), § 7.21.5.3 The fopen function
When a file is opened with update mode ('+' as the second or third character in the
above list of mode argument values), both input and output may be performed on the
associated stream.
However, output shall not be directly followed by input without an
intervening call to the fflush function or to a file positioning function (fseek,
fsetpos, or rewind), and input shall not be directly followed by output without an
intervening call to a file positioning function, unless the input operation encounters end-of-file.
So your reading might look something like :
int c;
while ((c = getc(input)) != EOF)
{
fsetpos(/* ... */);
putc(c + 1, input);
fflush(input);
}
By the way, you will have problems with 'z' character.

The procedure for a random-access update is:
1. position at the record
2. read the record
3. position at the record again
4. update (write) the record
5. flush (to finalize the update)
The following code is a rewrite that takes this into account.
#include <stdio.h>
#include <ctype.h>
int main(int argc, char * argv[]){
    if (argc < 2){
        fprintf(stderr, "usage: %s filename\n", argv[0]);
        return 1;
    }
    FILE *input = fopen(argv[1], "rb+");
    if (!input){
        fprintf(stderr, "Unable to open file %s\n", argv[1]);
        return 1;
    }
    int ch;
    /* fpos_t is opaque and does not support arithmetic, so track a byte offset instead */
    long pos = 0;
    while ((ch = fgetc(input)) != EOF){      /* read the record */
        printf("%c", ch);
        if (!iscntrl(ch) && !iscntrl(ch+1)){
            fseek(input, pos, SEEK_SET);     /* position at the record just read */
            fputc(ch+1, input);              /* update it */
            fflush(input);                   /* finalize the update */
        }
        pos += 1;
        fseek(input, pos, SEEK_SET);         /* position at the next record */
    }
    fclose(input);
    return 0;
}

I really suspect the problem is here:
fpos_t * pos;
You are declaring a pointer to an fpos_t, which is fine, but then where is the information stored when you retrieve the position? The pointer is uninitialized, so it does not point at any fpos_t object.
It should be:
fpos_t pos; // No pointer
...
fgetpos (input, &pos);
fsetpos(input, &pos); // You can only come back where you were!
Reading the (draft) standard, the only requirement for fpos_t is that it can represent a position and a parse state for a FILE; there doesn't seem to be a way to move the position around.
Note that the expression pos-1 moves the pointer; it does not affect the value it points to!
What you probably want is the good old ftell() and fseek(), which will allow you to move around. Just remember to open the file with "rb+" and to call fflush() after your fputc().
Once you have solved this basic problem, you will notice another problem with your approach: handling newlines! You should most probably restrict the range of characters to which you apply your "increment", and stipulate that a follows z and A follows Z.
That said, is it a requirement to do it in-place?
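The restricted "increment" with wrap-around could be sketched as a small helper like the one below. The name next_letter is made up, and the code assumes an ASCII-compatible character set in which the letters are contiguous.

```c
/* Returns the letter that follows `c`, wrapping z -> a and Z -> A.
   Anything that is not an ASCII letter (newlines, digits, punctuation)
   is returned unchanged.
   Assumes an ASCII-compatible execution character set. */
int next_letter(int c)
{
    if (c >= 'a' && c <= 'z')
        return 'a' + (c - 'a' + 1) % 26;
    if (c >= 'A' && c <= 'Z')
        return 'A' + (c - 'A' + 1) % 26;
    return c;
}
```

In the update loop, fputc(next_letter(ch), input) would then take the place of fputc(ch+1, input).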

7.21.9.1p2
The fgetpos function stores the current values of the parse state (if
any) and file position indicator for the stream pointed to by stream
in the object pointed to by pos. The values stored contain unspecified
information usable by the fsetpos function for repositioning the
stream to its position at the time of the call to the fgetpos
function.
The words unspecified information don't inspire confidence in that subtraction. Have you considered calling fgetpos prior to reading the character, so that you don't have to do a non-portable subtraction? Additionally, your call to fgetpos should pass a pointer to an existing fpos_t (e.g. using the & address-of operator). Your code currently passes an uninitialized pointer.
fgetc returns an int, so that it can represent every possible unsigned char value as distinct from the negative EOF value.
Suppose your char defaults to an unsigned type. (ch = fgetc(input)) converts the (possibly negative, corresponding to errors or end-of-file) return value straight to your unsigned char type. Can (unsigned char) EOF ever compare equal to EOF? When does your loop end?
Suppose your char defaults, instead, to a signed type. (ch = fgetc(input)) is likely to turn the higher range of any returned unsigned char values into negative numbers (though, technically, the result of this conversion is implementation-defined). Wouldn't your loop end prematurely (e.g. before EOF), in some cases?
The answers to both of these questions indicate that you're handling the return value of fgetc incorrectly. Store it in an int!
Perhaps your loop should look something like:
for (;;) {
    fpos_t p;
    if (fgetpos(input, &p) != 0) {
        perror("fgetpos");
        break;
    }
    int c = fgetc(input);
    if (c == EOF) {
        break; /* end of file or read error; ferror(input) tells which */
    }
    if (fsetpos(input, &p) != 0) {
        perror("fsetpos");
        break;
    }
    if (fputc(c + 1, input) == EOF) {
        perror("fputc");
        break;
    }
    /* Thank Kirilenko for this one */
    if (fflush(input) != 0) {
        perror("fflush");
        break;
    }
}
Make sure you check return values...

The update mode ('+') can be a little bit tricky to handle. Maybe you could just change approach: load the whole file into a char array, iterate over it, and then write the whole thing back to the emptied input file? No stream issues.
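A rough sketch of that load-everything approach follows. The function name is hypothetical, and the choice to shift only the letters 'a' to 'y' is an assumption made here to sidestep the newline and 'z' problems. The file is read completely, modified in memory, then reopened with "wb" (which truncates it) and written back.

```c
#include <stdio.h>
#include <stdlib.h>

/* Adds 1 to every lowercase ASCII letter 'a'..'y' in the file at `path`.
   Returns 0 on success, -1 on failure. */
int shift_letters_via_buffer(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* Read the whole file, growing the buffer as needed. */
    char *data = NULL;
    size_t len = 0, cap = 0;
    for (;;) {
        if (len == cap) {
            size_t new_cap = cap ? cap * 2 : 4096;
            char *grown = realloc(data, new_cap);
            if (grown == NULL) {
                free(data);
                fclose(fp);
                return -1;
            }
            data = grown;
            cap = new_cap;
        }
        size_t got = fread(data + len, 1, cap - len, fp);
        len += got;
        if (got == 0)
            break;                      /* end of file or read error */
    }
    int read_failed = ferror(fp);
    fclose(fp);
    if (read_failed) {
        free(data);
        return -1;
    }

    /* Modify the copy in memory; no stream positioning involved. */
    for (size_t i = 0; i < len; i++)
        if (data[i] >= 'a' && data[i] <= 'y')
            data[i]++;

    /* Reopen for writing, which empties the file, and write everything back. */
    fp = fopen(path, "wb");
    if (fp == NULL) {
        free(data);
        return -1;
    }
    size_t written = fwrite(data, 1, len, fp);
    free(data);
    if (fclose(fp) != 0 || written != len)
        return -1;
    return 0;
}
```

One trade-off: if the program dies between the truncation and the write, the original contents are lost, so writing to a temporary file and renaming it over the original is the more cautious variant.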

how to open a file in C (fopen/fread)

I have an embedded board (BeagleBoard-xM) that runs Ubuntu 12.04, and I would like to read whether one GPIO input is logic 1 or 0. How can I implement cat /sys/class/gpio/gpio139/value in C? (The value file stores 0 or 1.)
I open the file by:
FILE *fp;
fp = fopen("/sys/class/gpio/gpio139/value", "rb");
what do I need to do next?

If you want to read one character, try this:
int value = fgetc(fp);
/* error checking */
value = value - '0';
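With the error checking filled in, the single-character read could be wrapped up as below. This is a sketch; the function name and the -1 error convention are choices made for this example, not anything the sysfs interface prescribes.

```c
#include <stdio.h>

/* Reads a sysfs-style value file that holds a single '0' or '1'.
   Returns 0 or 1, or -1 if the file cannot be opened or read,
   or if it holds anything else. */
int read_gpio_value(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int c = fgetc(fp);          /* int, so EOF stays distinguishable */
    fclose(fp);

    if (c != '0' && c != '1')   /* also covers c == EOF */
        return -1;
    return c - '0';
}
```

Calling read_gpio_value("/sys/class/gpio/gpio139/value") reopens the file on every call, which keeps each read independent of the stream's buffering.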

You can read one byte, or until eof:
char buffer[32]; // Very long number!
if (NULL == (fp = fopen(FILENAME, "rb")))
{
// TODO: return a suitable error/perror
return -1;
}
bytesread = fread(buffer, sizeof(char), sizeof(buffer)-1, fp);
fclose(fp);
if (!bytesread)
{
// Nothing at all was read
// TODO: return error
return -2;
}
// This is in case you want the byte interpreted from ASCII
// otherwise you'd just return buffer[0], or (*(DATATYPE *)buffer)[0].
buffer[bytesread] = 0x0;
return atol(buffer);
This code is actually not that general, in that many hardware devices will implement a blocking data channel - that is, if you try to read more data than is there, the fread will block until data becomes available. In such a case, just dimension the buffer to the maximum number of bytes you need, plus one.
The plus one, and the corresponding -1 in the fread, are only there for the case in which the data you read is rendered as ASCII, i.e., "128" is three ASCII bytes "1", "2", "8" and maybe even a carriage return, instead of a binary 0x80. In this case, the buffer is zero-terminated to make it a C string on which atol may operate to retrieve a decimal number.
If what is needed is a binary value, then no such conversion is needed, and one can read the full buffer without adjustments, avoid setting the last plus one byte to zero, and just return a cast value from the buffer; or buffer[0] if only one byte is needed.

After attempting to open the file, you check that the fopen() succeeded.
Then you can use any of the stdio functions to read the data:
getc()
fgetc()
fgets()
fread()
and probably others too. You might be looking at the scanf() family, but most probably won't be using them, for example. Which is most appropriate depends on the data that is read; is it text or is it binary. If it is a single character, then getc(); if it is text and line-oriented, maybe fgets(); if binary, probably fread().

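If you are writing kernel code (for example a driver module) and have the Linux kernel headers, then I would recommend accessing the GPIO through the in-kernel API. Note that these functions are not available to ordinary user-space programs.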
Include this in your file:
#include <linux/gpio.h>
Now you have access to functions like:
int gpio_is_valid(int number);
int gpio_get_value(unsigned gpio);
void gpio_set_value(unsigned gpio, int value);
In your case you can just write this:
int io_ret = -1;
if (gpio_is_valid(139))
io_ret = gpio_get_value(139);

I think it would be better to use:
system("cat /sys/class/gpio/gpio139/value > temp.txt");
After that it is easy: you can just read the value from temp.txt, which will be either 0 or 1.

Same file, same filesize but the memory comparison returns non zero

#define FILENAME "/local/home/..."
FILE *fp;
short *originalUnPacked;
short *unPacked;
int fileSize;
fp = fopen(FILENAME, "r");
fseek (fp , 0 , SEEK_END);
fileSize = ftell (fp);
rewind (fp);
originalUnPacked = (short*) malloc (sizeof(char)*fileSize);
unPacked = (short*) malloc (sizeof(char)*fileSize);
fread(unPacked, 1, fileSize, fp);
fread(originalUnPacked, 1, fileSize, fp);
if( memcmp( unPacked, originalUnPacked, fileSize) == 0)
{
printf (" unpacked and original unpacked equal "); // Never happens
}
My little knowledge of C says that the print statement in the last if block should be executed, but it isn't. Any ideas why?
Just to add more clarity and show you the complete code, I have added a define statement and two fread statements before the if block.

Few points for your consideration:
1. The return type of ftell is long int, so it is better to declare fileSize as long int (since sizeof(int) <= sizeof(long)).
2. It is a better practice in C not to typecast the return value of malloc. Also you can probably get rid of sizeof(char) when using in malloc.
3. fread advances the file stream thus after the first fread call the file stream pointer has advanced by the size of the file as dictated by fileSize. Thus the second fread immediately after that will fail to read anything (assuming the first one succeeded). This is the reason why you are seeing the behavior mentioned in your program. You need to reset the file stream pointer using rewind before the second call to fread. Also you can check the return value of fread which is the number of bytes successfully read to check how many bytes were actually read successfully. Try something on these lines:
size_t bytes_read;
bytes_read = fread(unPacked, 1, fileSize, fp);
/* some check or print of bytes read successfully if needed */
/* Reset fp before loading the file into the memory pointed to by originalUnPacked */
rewind(fp);
bytes_read = fread(originalUnPacked, 1, fileSize, fp);
/* some check or print of bytes read successfully if needed */
/* memcmp etc */
4. It may be a good idea to check for the return values of fopen, malloc etc against failure i.e. NULL check in case of fopen & malloc.
Hope this helps!
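Pulling the four points together, a self-contained version might look like this sketch. The function name and return convention are invented here. It keeps the size in a long, checks every call, and rewinds between the two reads so that both buffers really hold the file contents.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads the file at `path` twice and compares the two copies.
   Returns 1 if they match, 0 if they differ, -1 on any error. */
int file_matches_itself(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    int result = -1;
    char *first = NULL, *second = NULL;
    long size = 0;

    if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0)
        goto done;
    rewind(fp);

    /* +1 so that an empty file does not turn into a zero-byte request */
    first = malloc((size_t)size + 1);
    second = malloc((size_t)size + 1);
    if (first == NULL || second == NULL)
        goto done;

    if (fread(first, 1, (size_t)size, fp) != (size_t)size)
        goto done;
    rewind(fp);                 /* without this, the second fread reads nothing */
    if (fread(second, 1, (size_t)size, fp) != (size_t)size)
        goto done;

    result = (memcmp(first, second, (size_t)size) == 0);

done:
    free(first);
    free(second);
    fclose(fp);
    return result;
}
```

The buffers are plain char arrays here; the question's short pointers would work the same way as long as the byte count passed to fread and memcmp is unchanged.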

The memory allocated with malloc is not pre-initialized, so its contents are random and thus almost certainly different for the two allocations.
The expected (probabilistically speaking, "certain") result is exactly what happens.
Did you mean to load the file into both of these buffers before testing with memcmp but forgot to do so?
