How to use fgets after using fgetc? - c

I'm trying to write a specific program that reads data from a file but I realized that when I read the file with fgetc, if I use fgets later, it doesn't have any output.
For example, this code:
#include <stdio.h>
#include <stdlib.h>
int main() {
    FILE * arq = fopen("arquivo.txt", "r");
    int enter = fgetc(arq); /* int, not char, so EOF can be told apart from data */
    int line_count = 1;
    while(enter != EOF) {
        if (enter == '\n') line_count++;
        enter = fgetc(arq);
    }
    printf("%d", line_count);
    char str[128];
    while(fgets(str, 128, arq)) printf("%s", str);
}
The second while loop doesn't print anything, but if I delete the first while loop, the code prints the file content. Why is that happening?

TLDR: rewind(arq); is what you want
When you read from a file, the internal file pointer advances as you read, so that each subsequent read will return the next data in the file. When you get to the end, all subsequent reads will return EOF as there is nothing more to read.
You can manipulate the internal file pointer with the fseek and ftell functions. fseek allows you to set the internal file pointer to any point in the file, relative to the beginning, the end, or the current position. ftell will tell you the current position. This allows you to easily remember any position in the file and go back to it later.
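The "remember a position and go back to it" idea can be sketched as follows. This is only an illustration: peek_line is a hypothetical helper name, not a standard function, and it assumes the stream is seekable.

```c
#include <stdio.h>

/* Hypothetical helper: reads the next line into buf without consuming it.
   Returns buf on success, NULL on end-of-file or error. */
char *peek_line(FILE *f, char *buf, int size)
{
    long pos = ftell(f);                 /* remember where we are */
    if (pos < 0)
        return NULL;

    char *line = fgets(buf, size, f);    /* this advances the position */

    if (fseek(f, pos, SEEK_SET) != 0)    /* go back to the remembered spot */
        return NULL;
    return line;
}
```

Because the offset passed to fseek is exactly the value ftell returned, this pattern is also valid for text streams.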
SYNOPSIS
#include <stdio.h>
int fseek(FILE *stream, long offset, int whence);
long ftell(FILE *stream);
void rewind(FILE *stream);
DESCRIPTION
The fseek() function sets the file position indicator for the stream pointed to by stream.
The new position, measured in bytes, is obtained by adding offset bytes to the position
specified by whence. If whence is set to SEEK_SET, SEEK_CUR, or SEEK_END, the offset is
relative to the start of the file, the current position indicator, or end-of-file, respectively.
A successful call to the fseek() function clears the end-of-file indicator for
the stream and undoes any effects of the ungetc(3) function on the same stream.
The ftell() function obtains the current value of the file position indicator for the
stream pointed to by stream.
The rewind() function sets the file position indicator for the stream pointed to by stream
to the beginning of the file. It is equivalent to:
(void) fseek(stream, 0L, SEEK_SET)
except that the error indicator for the stream is also cleared (see clearerr(3)).
One caveat here is that the offsets used by fseek and returned by ftell are byte offsets, not character offsets. So when accessing a non-binary file (anything not opened with a "b" modifier to fopen), the offsets might not correspond to characters exactly. It should always be OK to pass an offset returned by ftell back to fseek unmodified to get to the same spot in the file, but trying to compute offsets any other way may be tricky.
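Applied to the program in the question, the fix might look like the sketch below. The function name count_then_copy and its FILE * parameters are made up for the example, so it can be used with any already-opened stream instead of a hard-coded file name.

```c
#include <stdio.h>

/* Counts lines in `in` (starting at 1, as the question's code does),
   rewinds, then copies every line to `out`.
   Returns the line count, or -1 on a read error. */
long count_then_copy(FILE *in, FILE *out)
{
    long line_count = 1;
    int ch;                      /* int, not char, so EOF is distinguishable */

    while ((ch = fgetc(in)) != EOF)
        if (ch == '\n')
            line_count++;
    if (ferror(in))
        return -1;

    rewind(in);                  /* back to the start; clears EOF and error flags */

    char str[128];
    while (fgets(str, sizeof str, in))
        fputs(str, out);

    return ferror(in) ? -1 : line_count;
}
```

A call such as count_then_copy(arq, stdout) after a successful fopen would then print the file content after counting its lines.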

Related

What does fseek do?

I'm new to C and am using fseek. My question is, what exactly does it do?
eg: fseek(FILE *stream, -1, SEEK_CUR) will read 1 character backwards, so is it already at the end of the file?
Does that mean fseek(FILE *stream, 0, SEEK_CUR) does nothing?
And how do we apply the functions after fseek to a data structure?
I'm new to C and am using fseek. My question is, what exactly does it do?
fseek sets the file position indicator for the stream pointed to by stream. The new position will be given by the offset in conjunction with the flag:
For SEEK_SET, the offset is relative to the start of the file.
For SEEK_CUR the offset is relative to the current indicator position.
For SEEK_END it's relative to end-of-file.
eg: fseek(FILE *stream, -1, SEEK_CUR) will read 1 character backwards, so is it already at the end of the file?
It may be at beginning, in the middle or at the end, the current position is where it is right now. If the indicator is at the end of the file it will be moved one byte back.
It does not read characters. As stated, it moves the indicator to the desired position, by offset, from its current position as explained above.
Does that mean fseek(FILE *stream, 0, SEEK_CUR) does nothing?
In practical terms it does nothing visible: because the offset is 0, it does not move the indicator, it just places it at the same position. It does clear the end-of-file status (resets the EOF flag, as mentioned by @rici).
From C11 N1570 §7.21.9.2
#include <stdio.h>
int fseek(FILE *stream, long int offset, int whence);
Description:
2 The fseek function sets the file position indicator for the stream pointed to by stream. If a read or write error occurs, the error indicator for the stream is set and fseek fails.
3 For a binary stream, the new position, measured in characters from the beginning of the file, is obtained by adding offset to the position specified by whence. The specified position is the beginning of the file if whence is SEEK_SET, the current value of the file position indicator if SEEK_CUR, or end-of-file if SEEK_END. A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END.
4 For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.
5 After determining the new position, a successful call to the fseek function undoes any effects of the ungetc function on the stream, clears the end-of-file indicator for the stream, and then establishes the new position. After a successful fseek call, the next operation on an update stream may be either input or output.
Returns:
6 The fseek function returns nonzero only for a request that cannot be satisfied.
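The effect of a zero-offset seek on the end-of-file indicator can be observed with a small sketch like this one. The helper name eof_flag_around_seek is invented for the illustration, and it assumes a seekable stream.

```c
#include <stdio.h>

/* Reads `stream` to the end, then reports the end-of-file indicator
   before and after fseek(stream, 0L, SEEK_CUR).
   Returns 0 on success, -1 if fseek fails. */
int eof_flag_around_seek(FILE *stream, int *before, int *after)
{
    while (fgetc(stream) != EOF)
        ;                                /* drain the stream */
    *before = feof(stream) != 0;         /* set once a read hit end-of-file */

    if (fseek(stream, 0L, SEEK_CUR) != 0)
        return -1;
    *after = feof(stream) != 0;          /* cleared by the successful fseek */
    return 0;
}
```

The position itself does not change; only the indicator is reset, per paragraph 5 quoted above.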

What does the following line mean?

I was writing a program and had to handle buffers. But when I employed some loops I realized that the buffer was not being flushed after each iteration and withheld its last input value. I searched on the internet and found this code line. It works but I don't know what this means.
fseek(stdin,0,SEEK_END);
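It asks the stream to move its read/write position to the end of the file/stream, so any input that is still pending is skipped. This only works on implementations where stdin can be repositioned; on input that is not seekable, such as a pipe, the call can fail.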
see Tutorialspoint
int fseek(FILE *stream, long int offset, int whence)
Parameters
stream − This is the pointer to a FILE object that identifies the stream.
offset − This is the number of bytes to offset from whence.
whence − This is the position from where offset is added. It is specified by one of the following constants −
SEEK_SET: Beginning of file
SEEK_CUR: Current position of the file pointer
SEEK_END: End of file
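Some implementations also let you call
int fflush(FILE *stream)
on stdin to discard pending input, but the C standard leaves the behavior of fflush on an input stream undefined, so this is not portable.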
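A portable alternative that relies on neither seeking nor flushing an input stream is to read and discard the rest of the current line. The sketch below is one way to do that; discard_line is a hypothetical helper name, not part of the standard library.

```c
#include <stdio.h>

/* Discards input up to and including the next newline (or up to EOF).
   Returns the number of characters discarded. */
long discard_line(FILE *in)
{
    long discarded = 0;
    int ch;

    while ((ch = fgetc(in)) != EOF) {
        discarded++;
        if (ch == '\n')
            break;               /* stop after the end of the current line */
    }
    return discarded;
}
```

Calling discard_line(stdin) after a scanf-style read would drop whatever the user typed after the value that was parsed, including the newline.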

About the FILE * streams and how fputc() works

I wonder about the operation of FILE pointer f and how the function fputc works.
First, when I open a file (before I have done anything with it, like writing or reading), what is the position of f in the file? Is it before the first character?
Second, when I use:
fseek(f, -1, SEEK_CUR);
fputc(' ', f);
what is the position of my pointer f now?
Reading the manuals should help you.
For fopen: the stream is positioned at the beginning of the file, except for modes like "a".
For fseek: that function can fail, so you have to test the return value; for example, you cannot seek to a negative offset (before the beginning of the file).
When you open the file, the current position is 0, at the first character.
If you try to fseek before the beginning of the file, fseek will fail and return -1.
Note that if you seek backwards on a text file, there is no guarantee that it can succeed. On Linux and/or for a binary stream, assuming you are not at the start of the stream and it is opened in write mode on a real file, after the sequence
fseek(f, -1L, SEEK_CUR);
fputc(' ', f);
the position of the stream will be the same as before the fseek.
But consider this seemingly simpler example:
fputc('\n', f);
fseek(f, -1L, SEEK_CUR);
On systems such as Windows, where '\n' will at some point be converted into a sequence of 2 bytes <CR><LF>, what do you think it should do?
Because of all these possibilities for failure (and a few more exotic ones), you should always test the return value of fseek and try to minimize its use.
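Testing the return value could look like the sketch below. step_back is a hypothetical wrapper written for this illustration; it only reports whether the one-byte backward seek worked.

```c
#include <stdio.h>

/* Tries to move the position one byte back.
   Returns 1 on success, 0 if fseek reports failure. */
int step_back(FILE *f)
{
    if (fseek(f, -1L, SEEK_CUR) != 0) {
        perror("fseek");         /* e.g. already at the start of the file */
        return 0;
    }
    return 1;
}
```

On a freshly opened file the call is expected to fail, because the requested position would lie before the beginning of the file.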
When accessing files through C, the first necessity is to have a way to access the files. For C file I/O you need to use a FILE pointer, which lets the program keep track of the file being accessed. For example:
FILE *fp;
To open a file you need to use the fopen function, which returns a FILE pointer. Once you've opened a file, you can use the FILE pointer to let the program perform input and output operations on the file.
FILE *fopen(const char *filename, const char *mode);
Here filename is a string naming the file to open, and mode can have one of the following values:
r - open for reading (file must exist)
w - open for writing (file need not exist)
a - open for appending (file need not exist)
r+ - open for reading and writing, start at beginning
w+ - open for reading and writing (overwrite file)
a+ - open for reading and writing (append if file exists)
Following is the declaration for fseek() function.
int fseek(FILE *stream, long int offset, int whence)
SEEK_SET Beginning of file
SEEK_CUR Current position of the file pointer
SEEK_END End of file
The following is an fputc() example:
/* fputc example: alphabet writer */
#include <stdio.h>
int main ()
{
FILE * pFile;
char c;
pFile = fopen ("alphabet.txt","w");
if (pFile!=NULL) {
for (c = 'A' ; c <= 'Z' ; c++)
fputc ( c , pFile );
fclose (pFile);
}
return 0;
}
It depends on your current position/offset. For example, if your file pointer is at offset 100 and you call fseek(f, -1, SEEK_CUR);, the offset becomes 99. fputc(' ', f); then writes a space at offset 99, after which the file pointer's offset is 100 again.
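The offsets described above can be checked with ftell. The following is a sketch only: overwrite_previous is an invented helper, and it assumes a binary stream opened for writing or update that is not positioned at the start.

```c
#include <stdio.h>

/* Overwrites the byte just before the current position with `ch` and
   records the position before the seek and after the write.
   Returns 0 on success, -1 on failure. */
int overwrite_previous(FILE *f, int ch, long *before, long *after)
{
    *after = -1;
    *before = ftell(f);
    if (*before < 0 || fseek(f, -1L, SEEK_CUR) != 0)
        return -1;               /* e.g. already at the start of the file */
    if (fputc(ch, f) == EOF)
        return -1;
    *after = ftell(f);           /* the write moved the position forward again */
    return *after < 0 ? -1 : 0;
}
```

After a successful call, *before and *after hold the same value, which matches the 100, 99, 100 sequence in the explanation.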

Undoing the effects of ungetc(): "How" do fseek(), rewind() and fsetpos() do it? Is the buffer refilled each time?

Huh! How shall I put the whole thing in a clear question? Let me try:
I know that files opened using fopen() are buffered in memory. We use a buffer for efficiency and ease. During a read from the file, the contents of the file are first read into the buffer, and we read from that buffer. Similarly, in a write to the file, the contents are written to the buffer first, and then to the file.
But what about fseek(), fsetpos() and rewind() dropping the effect of previous calls to ungetc()? Can you tell me how it is done? I mean, say we have opened a file for reading and it has been copied into the buffer. Now, using ungetc(), we've changed some characters in the buffer. Here is what I just fail to understand even after much effort:
Here's what is said about ungetc(): "A call to fseek, fsetpos or rewind on stream will discard any characters previously put back into it with this function." How can characters already put into the buffer be discarded? One approach is that the original characters that were removed are "remembered", and each new character that was put in is identified and replaced with the original character. But that seems very inefficient. The other option is to load a copy of the original file into the buffer and place the file pointer at the intended position. Which of these two approaches do fseek, fsetpos or rewind take to discard the characters put back using ungetc()?
For text streams, how does the presence of unread characters in the stream, characters that were put in using ungetc(), affect the return value of ftell()? My confusion arises from the following line about ftell() and ungetc() from this link about ftell (SOURCE):
"For text streams, the numerical value may not be meaningful but can still be used to restore the position to the same position later using fseek (if there are characters put back using ungetc still pending of being read, the behavior is undefined)."
Focusing on the last part of the above paragraph, what has "pending of being read" got to do with a character obtained through ungetc() being discarded? Each time we read a character that was put into the stream using ungetc(), is it discarded after the read?
A good mental model of the put back character is simply that it's some extra little property which hangs off the FILE * object. Imagine you have:
typedef struct {
    /* ... */
    int putback_char;
    /* ... */
} FILE;
Imagine putback_char is initialized to the value EOF which indicates "there is no putback char", and ungetc simply stores the character to this member.
Imagine that every read operation goes through getc, and that getc does something like this:
int getc(FILE *stream)
{
    int ret = stream->putback_char;
    if (ret != EOF) {
        stream->putback_char = EOF;
        if (__is_binary(stream))
            stream->current_position--;
        return ret;
    }
    return __internal_getc(stream); /* __internal_getc doesn't know about putback_char */
}
The functions which clear the pushback simply assign EOF to putback_char.
In other words, the put back character (and only one needs to be supported) can actually be a miniature buffer which is separate from the regular buffering. (Consider that even an unbuffered stream supports ungetc: such a stream has to put the byte or character somewhere.)
Regarding the position indicator, the C99 standard says this:
For a text stream, the value of its file position indicator after a successful call to the ungetc function is unspecified until all pushed-back characters are read or discarded. For a binary stream, its file position indicator is decremented by each successful call to the ungetc function; if its value was zero before a call, it is indeterminate after the call. [7.19.7.11 The ungetc function]
So, the www.cplusplus.com reference you're using is incorrect; the behavior of ftell is not undefined when there are pending characters pushed back with ungetc.
For text streams, the value is unspecified. Accessing an unspecified value isn't undefined behavior, because an unspecified value cannot be a trap representation.
The undefined behavior exists for binary streams if a push back occurs at position zero, because the position then becomes indeterminate. Indeterminate means that it's an unspecified value which could be a trap representation. Accessing it could halt the program with an error message, or trigger other behaviors.
It's better to get programming language and library specifications from the horse's mouth, rather than from random websites.
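The binary-stream rule quoted above (the position indicator is decremented by each successful ungetc) can be observed with ftell. This is a small sketch with an invented helper name, position_around_ungetc; it reads one byte first so that the push-back does not happen at position zero, where the result would be indeterminate.

```c
#include <stdio.h>

/* Reads one byte from a binary stream, pushes the same byte back, and
   records ftell() before and after the push-back.
   Returns 0 on success, -1 on failure. */
int position_around_ungetc(FILE *bin, long *before, long *after)
{
    int ch = fgetc(bin);
    if (ch == EOF)
        return -1;
    *before = ftell(bin);            /* position after reading one byte */

    if (ungetc(ch, bin) == EOF)
        return -1;
    *after = ftell(bin);             /* one less, for a binary stream */

    return (*before < 0 || *after < 0) ? -1 : 0;
}
```

For a text stream the same call sequence would leave the value unspecified until the pushed-back character is read or discarded.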
Let's start from the beginning,
int ungetc(int c, FILE *stream);
The ungetc() function shall push the byte specified by c (converted to an unsigned char) back onto the input stream pointed to by stream. A character is virtually put back into an input stream, decreasing its internal file position as if a previous getc operation was undone. This only affects further input operations on that stream, and not the content of the physical file associated with it, which is not modified by any calls to this function.
int fseek(FILE *stream, long offset, int whence);
The new position, measured in bytes from the beginning of the file, shall be obtained by adding offset to the position specified by whence. The specified point is the beginning of the file for SEEK_SET, the current value of the file-position indicator for SEEK_CUR, or end-of-file for SEEK_END. fseek either flushes any buffered output before setting the file position or else remembers it so it will be written later in its proper place in the file.
int fsetpos(FILE *stream, const fpos_t *pos);
The fsetpos() function sets the file position and state indicators for the stream pointed to by stream according to the value of the object pointed to by pos, which must be a value obtained from an earlier call to fgetpos() on the same stream.
void rewind(FILE *stream);
The rewind function repositions the file pointer associated with stream to the beginning of the file. A call to rewind is similar to
(void) fseek( stream, 0L, SEEK_SET );
So, as you can see, pushing back characters with ungetc() doesn't alter the file; only the internal buffering for the stream is affected. So your second option, "The other option is to load a copy of the original file into buffer and place the file pointer at the intended position", is correct.
Now, answering your second question: a successful intervening call (with the stream pointed to by stream) to a file-positioning function discards any pushed-back characters for the stream. The external storage corresponding to the stream is unchanged.
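That a file-positioning call discards the pushed-back character, and that the file itself is untouched, can be checked with a sketch like this. The helper name read_after_discarded_pushback is made up for the example.

```c
#include <stdio.h>

/* Reads one character, pushes `pushed` back in its place, then rewinds.
   Returns the first character read after the rewind, or EOF on failure. */
int read_after_discarded_pushback(FILE *in, int pushed)
{
    if (fgetc(in) == EOF)            /* move off position zero first */
        return EOF;
    if (ungetc(pushed, in) == EOF)
        return EOF;

    rewind(in);                      /* positioning call drops the push-back */
    return fgetc(in);                /* comes from the file, not from ungetc */
}
```

Without the rewind, the next fgetc would have returned the pushed-back value instead of the file's own data.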

Reading the last 50 characters of a file with fseek()

I'm trying to read the last 50 characters in a file by doing this:
FILE* fptIn;
char sLine[51];
if ((fptIn = fopen("input.txt", "rb")) == NULL) {
    printf("Couldn't access input.txt.\n");
    exit(0);
}
if (fseek(fptIn, 50, SEEK_END) != 0) {
    perror("Failed");
    fclose(fptIn);
    exit(0);
}
fgets(sLine, 50, fptIn);
printf("%s", sLine);
This doesn't return anything that makes sense remotely. Why?
Change 50 to -50. Also note that this will only work with fixed-length character encodings like ASCII. Finding the 50th character from the end is far from trivial with things like UTF-8.
Try setting the offset to -50.
Besides the sign of the offset the following things could make trouble:
A newline character makes fgets stop reading, but it is considered a valid character and therefore it is included in the string copied to str.
Use either ferror or feof to check whether an error happened or the End-of-File was reached.
fseek(fptIn, 50, SEEK_END)
Sets the stream pointer at the end of the file, and then tries to position the cursor 50 bytes ahead thereof. Remember, for binary streams:
3 For a binary stream, the new position, measured in characters from the beginning of the file, is obtained by adding offset to the position specified by whence. The specified
position is the beginning of the file if whence is SEEK_SET, the current value of the file
position indicator if SEEK_CUR, or end-of-file if SEEK_END. A binary stream need not
meaningfully support fseek calls with a whence value of SEEK_END.
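Depending on the platform, this call either fails or succeeds and leaves the position past the end of the data. In the second case fgets reads nothing, sLine stays uninitialized, and printing it invokes UB. Try -50 as the offset, and read into your buffer only if the call succeeds.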
Note: emphasis mine
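Putting the suggestions together, reading the tail of a file might be sketched as below. read_tail is a hypothetical helper. It assumes a binary stream on a platform where seeking relative to SEEK_END is supported, counts bytes rather than characters, and uses fread instead of fgets so that a newline does not end the read early.

```c
#include <stdio.h>

/* Reads up to n bytes from the end of a binary stream into buf, which
   must have room for n + 1 bytes, and NUL-terminates the result.
   Returns the number of bytes read, or -1 on failure. */
long read_tail(FILE *f, char *buf, long n)
{
    if (n < 0 || fseek(f, 0L, SEEK_END) != 0)
        return -1;

    long size = ftell(f);            /* file size in bytes */
    if (size < 0)
        return -1;
    if (n > size)
        n = size;                    /* file is shorter than requested */

    if (fseek(f, -n, SEEK_END) != 0) /* negative offset: back from the end */
        return -1;

    size_t got = fread(buf, 1, (size_t)n, f);
    buf[got] = '\0';
    return (long)got;
}
```

With the question's variables, the call would be read_tail(fptIn, sLine, 50), which fits the 51-byte sLine buffer.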

Resources