I'm new to C and am using fseek. My question is, what exactly does it do?
eg: fseek(FILE *stream, -1, SEEK_CUR) will read 1 character backwards, so is it already at the end of the file?
Does that mean fseek(FILE *stream, 0, SEEK_CUR) does nothing?
And how do we apply the functions after fseek to a data structure?
I'm new to C and am using fseek. My question is, what exactly does it do?
fseek sets the file position indicator for the stream pointed to by stream. The new position will be given by the offset in conjunction with the flag:
For SEEK_SET, the offset is relative to the start of the file.
For SEEK_CUR the offset is relative to the current indicator position.
For SEEK_END it's relative to end-of-file.
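To make the three origins concrete, here is a minimal sketch. The helper name byte_at is hypothetical (not part of the C library); it only combines fseek with fgetc.

```c
#include <stdio.h>

/* Hypothetical helper: move the position indicator, then read one byte.
   Returns the byte read, or EOF if the seek or the read fails. */
int byte_at(FILE *fp, long offset, int whence)
{
    if (fseek(fp, offset, whence) != 0)
        return EOF;        /* the request could not be satisfied */
    return fgetc(fp);      /* reading advances the indicator by one byte */
}
```

Assuming a binary stream whose contents are "abcdef", byte_at(fp, 2, SEEK_SET) should give 'c' and leave the indicator at 3, a following byte_at(fp, 1, SEEK_CUR) should skip 'd' and give 'e', and byte_at(fp, -1, SEEK_END) should give 'f'.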
eg: fseek(FILE *stream, -1, SEEK_CUR) will read 1 character backwards, so is it already at the end of the file?
It may be at the beginning, in the middle, or at the end; the current position is wherever the indicator is right now. If the indicator is at the end of the file, it will be moved one byte back.
It does not read characters. As stated, it moves the indicator to the desired position, given by the offset relative to its current position, as explained above.
Does that mean fseek(FILE *stream, 0, SEEK_CUR) does nothing?
In practical terms it does nothing visible: because the offset is 0, the indicator stays at the same position. It does, however, clear the end-of-file indicator (as mentioned by @rici) and undo any effects of ungetc.
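A small sketch of that side effect, assuming a readable stream. The function name seek_clears_eof is made up for illustration.

```c
#include <stdio.h>

/* Reads until end-of-file, then performs a zero-offset seek.
   Returns 1 if the end-of-file indicator was set before the seek
   and clear after it, 0 otherwise. */
int seek_clears_eof(FILE *fp)
{
    while (fgetc(fp) != EOF)
        ;                               /* consume the rest of the stream */
    int eof_before = feof(fp) != 0;     /* set by the read that hit the end */
    if (fseek(fp, 0L, SEEK_CUR) != 0)
        return 0;
    int eof_after = feof(fp) != 0;      /* cleared by a successful fseek */
    return eof_before && !eof_after;
}
```

The position does not change, but feof(fp) reports a different result before and after the call, which is the only observable effect here.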
From C11 N1570 §7.21.9.2
#include <stdio.h>
int fseek(FILE *stream, long int offset, int whence);
Description:
2 The fseek function sets the file position indicator for the stream pointed to by stream. If a read or write error occurs, the error indicator for the stream is set and fseek fails.
3 For a binary stream, the new position, measured in characters from the beginning of the file, is obtained by adding offset to the position specified by whence. The specified position is the beginning of the file if whence is SEEK_SET, the current value of the file position indicator if SEEK_CUR, or end-of-file if SEEK_END. A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
4 For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.
5 After determining the new position, a successful call to the fseek function undoes any effects of the ungetc function on the stream, clears the end-of-file indicator for the stream, and then establishes the new position. After a successful fseek call, the next operation on an update stream may be either input or output.
Returns:
6 The fseek function returns nonzero only for a request that cannot be satisfied.
Related
I am learning to use fread and fwrite right now.
As far as I understand, from the documentation, it just seems to read from or write into a specified number of bytes, but always from the beginning of the file. Is there any way not to have to start at the beginning of the file or am I misunderstanding the functions?
Use fseek:
int fseek(FILE *pointer, long int offset, int position)
pointer: pointer to a FILE object that identifies the stream.
offset: number of bytes to offset from position
position: position from where offset is added.
returns:
zero if successful, or else it returns a non-zero value
SEEK_END : It denotes end of the file.
SEEK_SET : It denotes starting of the file.
SEEK_CUR : It denotes file pointer’s current position.
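As a rough sketch of combining fseek with fread on a data structure: the struct layout and the name read_record below are assumptions for illustration, and the file is assumed to hold fixed-size records written with fwrite on the same platform.

```c
#include <stdio.h>

/* Hypothetical fixed-size record; the layout is an assumption for illustration. */
struct record {
    int id;
    double value;
};

/* Reads the record at zero-based position `index` from a binary stream.
   Returns 1 on success, 0 if the seek or the read fails. */
int read_record(FILE *fp, long index, struct record *out)
{
    /* Jump straight to the start of the wanted record. */
    if (fseek(fp, index * (long)sizeof *out, SEEK_SET) != 0)
        return 0;
    /* fread continues from wherever the indicator currently is. */
    return fread(out, sizeof *out, 1, fp) == 1;
}
```

So fread and fwrite do not always start at the beginning; they start at the current position, and fseek is how you choose that position.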
I'm trying to write a specific program that reads data from a file but I realized that when I read the file with fgetc, if I use fgets later, it doesn't have any output.
For example, this code:
#include <stdio.h>
#include <stdlib.h>
int main() {
    FILE * arq = fopen("arquivo.txt", "r");
    int enter = fgetc(arq); /* int rather than char, so EOF can be detected reliably */
    int line_count = 1;
    while(enter != EOF) {
        if (enter == '\n') line_count++;
        enter = fgetc(arq);
    }
    printf("%d", line_count);
    char str[128];
    while(fgets(str, 128, arq)) printf("%s", str);
}
the second while doesn't print anything but if I delete the first while, the code prints the file content. Why is that happening?
TLDR: rewind(arq); is what you want
When you read from a file, the internal file pointer advances as you read, so that each subsequent read will return the next data in the file. When you get to the end, all subsequent reads will return EOF as there is nothing more to read.
You can manipulate the internal file pointer with the fseek and ftell functions. fseek allows you to set the internal file pointer to any point in the file, relative to the beginning, the end, or the current position. ftell will tell you the current position. This allows you to easily remember any position in the file and go back to it later.
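The "remember a position and go back" idea could look roughly like this. The function name read_line_twice is hypothetical, and both buffers are assumed to hold at least size bytes.

```c
#include <stdio.h>

/* Reads the next line twice: once normally, then again after returning
   to a position saved with ftell. Returns 1 if both reads succeed, else 0. */
int read_line_twice(FILE *fp, char *first, char *second, int size)
{
    long mark = ftell(fp);                  /* remember where we are */
    if (mark == -1L)
        return 0;
    if (fgets(first, size, fp) == NULL)
        return 0;
    if (fseek(fp, mark, SEEK_SET) != 0)     /* go back to the bookmark */
        return 0;
    return fgets(second, size, fp) != NULL;
}
```

Passing the unmodified ftell value back with SEEK_SET is the one form of seeking that is meant to work on text streams as well as binary ones.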
SYNOPSIS
#include <stdio.h>
int fseek(FILE *stream, long offset, int whence);
long ftell(FILE *stream);
void rewind(FILE *stream);
DESCRIPTION
The fseek() function sets the file position indicator for the stream pointed to by stream.
The new position, measured in bytes, is obtained by adding offset bytes to the position
specified by whence. If whence is set to SEEK_SET, SEEK_CUR, or SEEK_END, the offset is
relative to the start of the file, the current position indicator, or end-of-file,
respectively. A successful call to the fseek() function clears the end-of-file indicator for
the stream and undoes any effects of the ungetc(3) function on the same stream.
The ftell() function obtains the current value of the file position indicator for the
stream pointed to by stream.
The rewind() function sets the file position indicator for the stream pointed to by stream
to the beginning of the file. It is equivalent to:
(void) fseek(stream, 0L, SEEK_SET)
except that the error indicator for the stream is also cleared (see clearerr(3)).
One caveat here is that the offsets used by fseek and returned by ftell are byte offsets, not character offsets. So when accessing a non-binary file (anything not opened with a "b" modifier to fopen) the offsets might not correspond to characters exactly. It should always be OK to pass an offset returned by ftell back to fseek unmodified to get to the same spot in the file, but trying to compute offsets otherwise may be tricky.
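Applied to the program in the question, a sketch of the fix might look like the following. The function name count_lines_and_rewind is invented here, and the counting convention (starting at 1) is copied from the question.

```c
#include <stdio.h>

/* Counts lines with fgetc, then rewinds so the caller can read the
   stream again from the start. */
int count_lines_and_rewind(FILE *fp)
{
    int ch;                 /* int, not char, so EOF can be told apart from data */
    int line_count = 1;     /* same convention as the question: starts at 1 */
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n')
            line_count++;
    }
    rewind(fp);             /* back to the start; clears EOF and error indicators */
    return line_count;
}
```

After this call, a loop such as while (fgets(str, 128, arq)) printf("%s", str); should print the file contents, because the position indicator is no longer sitting at end-of-file.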
There are three origin constants you can use in functions like fseek to determine from where your offset is counted: SEEK_SET, SEEK_CUR, and SEEK_END. SEEK_CUR and SEEK_END seem self-explanatory to mean the current position and end of the file stream, but why is SEEK_SET used to mean the beginning? Why not something like SEEK_BEG?
Because you can add an offset. By using SEEK_SET, you can explicitly set an offset. (By adding it to the beginning)
From the manpage of fseek:
The new position, measured in bytes, is
obtained by adding offset bytes to the position specified by whence.
If whence is set to SEEK_SET, SEEK_CUR, or SEEK_END, the offset is
relative to the start of the file, the current position indicator, or
end-of-file, respectively.
From the manpage of lseek:
SEEK_SET
The file offset is set to offset bytes.
SEEK_CUR
The file offset is set to its current location plus offset
bytes.
SEEK_END
The file offset is set to the size of the file plus offset
bytes.
Another answer to the question as stated is "Because fseek has a second argument which isn't always zero".
If you always passed the second argument as zero, then SEEK_CUR would set the file pointer to its current position (which would be a nearly useless no-op), and SEEK_END would set the file pointer to the end of file, and SEEK_SET would set it to the beginning of the file, which might make you wonder why it wasn't called SEEK_BEG.
But of course fseek does have that second argument, and you usually pass it as an interesting, non-zero offset. Much of the time, the second argument is the absolute offset you want to seek to, which is what SEEK_SET means. As a convenience, you can also set a position plus-or-minus the current position, which is what SEEK_CUR is for, or plus-or-minus the end of the file, which is what SEEK_END is for.
In the case that whence is SEEK_SET and the offset is 0, meaning that you're trying to set the file pointer to the beginning of the file, there maybe ought to be a convenient shortcut for that, too. But the shortcut isn't called SEEK_BEG, it's a completely different library function: rewind(fp), which is indeed a shortcut for fseek(fp, 0L, SEEK_SET).
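One common idiom that uses both a zero offset and an absolute offset is measuring a file's size. This is only a sketch: stream_size is a made-up name, and the C standard does not require binary streams to support SEEK_END meaningfully, although common platforms do.

```c
#include <stdio.h>

/* Returns the size of the stream in bytes, or -1L on failure.
   The original position is restored before returning. */
long stream_size(FILE *fp)
{
    long saved = ftell(fp);
    if (saved == -1L)
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)       /* offset 0 from the end */
        return -1L;
    long size = ftell(fp);
    if (fseek(fp, saved, SEEK_SET) != 0)    /* absolute offset from the start */
        return -1L;
    return size;
}
```

Here SEEK_END is used with a zero offset, while SEEK_SET carries the "interesting" offset: the exact position to return to.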
Huh!! How shall I put the whole thing in a clear question? Let me try:
I know that files opened using fopen() are buffered in memory. We use a buffer for efficiency and ease. During a read from the file, the contents of the file are first read into the buffer, and we read from that buffer. Similarly, in a write to the file, the contents are written to the buffer first, and then to the file.
But what about fseek(), fsetpos() and rewind() dropping the effect of previous calls to ungetc()? Can you tell me how it is done? I mean, suppose we have opened a file for reading and it is copied into the buffer. Now, using ungetc(), we've changed some characters in the buffer. Here is what I fail to understand even after much effort:
Here's what is said about ungetc(): "A call to fseek, fsetpos or rewind on stream will discard any characters previously put back into it with this function." How can characters already put into the buffer be discarded? One approach is that the original characters that were removed are "remembered", and each new character that was put in is identified and replaced with the original character. But that seems very inefficient. The other option is to load a copy of the original file into the buffer and place the file pointer at the intended position. Which of these two approaches do fseek, fsetpos or rewind take to discard the characters put back using ungetc()?
For text streams, how does the presence of unread characters in the stream, characters that were put in using ungetc(), affect the return value of ftell()? My confusion arises from the following line about ftell() and ungetc() from this reference on ftell():
"For text streams, the numerical value may not be meaningful but can still be used to restore the position to the same position later using fseek (if there are characters put back using ungetc still pending of being read, the behavior is undefined)."
Focusing on the last part of the above paragraph, what has "pending of being read" got to do with an "ungetc()-obtained" character being discarded? Each time we read a character that was put into the stream using ungetc(), is it discarded after the read?
A good mental model of the put back character is simply that it's some extra little property which hangs off the FILE * object. Imagine you have:
typedef struct {
    /* ... */
    int putback_char;
    /* ... */
} FILE;
Imagine putback_char is initialized to the value EOF which indicates "there is no putback char", and ungetc simply stores the character to this member.
Imagine that every read operation goes through getc, and that getc does something like this:
int getc(FILE *stream)
{
    int ret = stream->putback_char;
    if (ret != EOF) {
        stream->putback_char = EOF;
        /* Assumes ungetc decremented the position for a binary stream;
           reading the character back advances it again. */
        if (__is_binary(stream))
            stream->current_position++;
        return ret;
    }
    return __internal_getc(stream); /* __internal_getc doesn't know about putback_char */
}
The functions which clear the pushback simply assign EOF to putback_char.
In other words, the put back character (and only one needs to be supported) can actually be a miniature buffer which is separate from the regular buffering. (Consider that even an unbuffered stream supports ungetc: such a stream has to put the byte or character somewhere.)
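The discarding can be observed from the outside with a short experiment. This is a sketch with an invented function name, assuming a stream that contains at least one byte and whose first byte is not 'X'.

```c
#include <stdio.h>

/* Pushes back a replacement character, then seeks to the start.
   Returns the first character read after the seek, or EOF on failure. */
int first_char_after_pushback_and_seek(FILE *fp)
{
    int c = fgetc(fp);                  /* consume the real first byte */
    if (c == EOF)
        return EOF;
    if (ungetc('X', fp) == EOF)         /* push back a different character */
        return EOF;
    if (fseek(fp, 0L, SEEK_SET) != 0)   /* a successful seek discards the push-back */
        return EOF;
    return fgetc(fp);                   /* comes from the file, not the push-back */
}
```

For a file starting with "abc" the result should be 'a', not 'X'. Without the fseek call, the next fgetc would return the pushed-back 'X' instead.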
Regarding the position indicator, the C99 standard says this:
For a text stream, the value of its file position indicator after a successful call to the ungetc function is unspecified until all pushed-back characters are read or discarded. For a binary stream, its file position indicator is decremented by each successful call to the ungetc function; if its value was zero before a call, it is indeterminate after the call. [7.19.7.11 The ungetc function]
So, the www.cplusplus.com reference you're using is incorrect; the behavior of ftell is not undefined when there are pending characters pushed back with ungetc.
For text streams, the value is unspecified. Accessing an unspecified value isn't undefined behavior, because an unspecified value cannot be a trap representation.
The undefined behavior exists for binary streams if a push back occurs at position zero, because the position then becomes indeterminate. Indeterminate means that it's an unspecified value which could be a trap representation. Accessing it could halt the program with an error message, or trigger other behaviors.
It's better to get programming language and library specifications from the horse's mouth, rather than from random websites.
Let's start from the beginning,
int ungetc(int c, FILE *stream);
The ungetc() function shall push the byte specified by c (converted to an unsigned char) back onto the input stream pointed to by stream. A character is virtually put back into an input stream, decreasing its internal file position as if a previous getc operation was undone. This only affects further input operations on that stream, and not the content of the physical file associated with it, which is not modified by any calls to this function.
int fseek(FILE *stream, long offset, int whence);
The new position, measured in bytes from the beginning of the file, shall be obtained by adding offset to the position specified by whence. The specified point is the beginning of the file for SEEK_SET, the current value of the file-position indicator for SEEK_CUR, or end-of-file for SEEK_END. fseek either flushes any buffered output before setting the file position or else remembers it so it will be written later in its proper place in the file.
int fsetpos(FILE *stream, const fpos_t *pos);
The fsetpos() function sets the file position and state indicators for the stream pointed to by stream according to the value of the object pointed to by pos, which must be a value obtained from an earlier call to fgetpos() on the same stream.
void rewind(FILE *stream);
The rewind function repositions the file pointer associated with stream to the beginning of the file. A call to rewind is similar to
(void) fseek( stream, 0L, SEEK_SET );
So as you can see, pushing back characters with ungetc() doesn't alter the file; only the internal buffering for the stream is affected. So your second suggestion, "The other option is to load a copy of the original file into buffer and place the file pointer at the intended position", is correct.
Now, answering your second question: a successful intervening call (with the stream pointed to by stream) to a file-positioning function discards any pushed-back characters for the stream. The external storage corresponding to the stream is unchanged.
I'm trying to read the last 50 characters in a file by doing this:
FILE* fptIn;
char sLine[51];
if ((fptIn = fopen("input.txt", "rb")) == NULL) {
    printf("Couldn't access input.txt.\n");
    exit(0);
}
if (fseek(fptIn, 50, SEEK_END) != 0) {
    perror("Failed");
    fclose(fptIn);
    exit(0);
}
fgets(sLine, 50, fptIn);
printf("%s", sLine);
This doesn't return anything that makes sense remotely. Why?
Change 50 to -50. Also note that this will only work with fixed-length character encodings like ASCII. Finding the 50th character from the end is far from trivial with things like UTF-8.
Try setting the offset to -50.
Besides the sign of the offset the following things could make trouble:
A newline character makes fgets stop reading, but it is considered a valid character and therefore it is included in the string copied to str.
Use either ferror or feof to check whether an error happened or the End-of-File was reached.
fseek(fptIn, 50, SEEK_END)
Sets the stream pointer at the end of the file, and then tries to position the cursor 50 bytes ahead thereof. Remember, for binary streams:
3 For a binary stream, the new position, measured in characters from the beginning of the file, is obtained by adding offset to the position specified by whence. The specified
position is the beginning of the file if whence is SEEK_SET, the current value of the file
position indicator if SEEK_CUR, or end-of-file if SEEK_END. A binary stream need not
meaningfully support fseek calls with a whence value of SEEK_END.
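On many implementations this call succeeds rather than fails, leaving the position past the end of the file. The following fgets then reads nothing and returns NULL, so sLine stays uninitialized and printing it is undefined behavior. Use -50 as the offset, and read into your buffer if and only if the seek succeeds.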
Note: emphasis mine
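Putting the answers together, a hedged sketch of reading the tail of a file might look like this. The name read_tail is hypothetical, the stream is assumed to be opened in binary mode ("rb"), and the count is in bytes, so the multi-byte encoding caveat above still applies.

```c
#include <stdio.h>

/* Reads up to `n` bytes from the end of a stream opened in binary mode.
   `buf` must have room for n + 1 bytes; the result is NUL-terminated.
   Returns the number of bytes read, or -1 on failure. */
long read_tail(FILE *fp, char *buf, long n)
{
    if (n < 0)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    long size = ftell(fp);
    if (size == -1L)
        return -1;
    if (n > size)
        n = size;                           /* file shorter than requested */
    if (fseek(fp, -n, SEEK_END) != 0)       /* negative offset: back from the end */
        return -1;
    size_t got = fread(buf, 1, (size_t)n, fp);
    buf[got] = '\0';
    return (long)got;
}
```

fread is used instead of fgets here because fgets stops at the first newline, which would cut the result short if the last 50 bytes span more than one line.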