Looking one char ahead when reading from file in C - c

I am aware that in c there are functions getc() and ungetc().
I need the equivalent counterpart for fgetc(), but sadly unfgetc() doesn't exist, so I tried writing it on my own.
This is how it looks:
int getNextChar(FILE* fd)
{
    // get the character
    int nextCharacter = fgetc(fd);
    // fseek it back, so you don't really move the file descriptor
    fseek(fd, -1, SEEK_CUR);
    // returning the char (as int)
    return nextCharacter;
}
But well... that doesn't seem to work.
I call it inside a while loop, like this.
while ( (c = fgetc(fd)) != EOF)
{
    cx = getNextChar(fd);
    printf("%c", c);
}
It gets stuck on the last character of the file (it prints it on every iteration, forever). Now a little explanation of why I need this, in case I'm doing it all wrong and there is a more suitable solution.
I need to check whether the next character is EOF. If it is, I force-send the token that is being built in the while loop (this part is not important for my issue, so I didn't include it).
I go through the loop and whenever I find a character that doesn't match a mask, I assume that I should send a token and start making a new one beginning with that character. Naturally, when I read the last char in the file, no further iteration is done, so the last token is never sent. That is why I need to check whether the next char is EOF and, if it is, force-send the token.
Thank you for your advice!

You need to check that nextCharacter isn't EOF, since if it is, you'll still back off, thus causing the outer reading to never see the end of the file. Also check return values of more functions, like fseek().
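As a rough sketch of that advice: ungetc() works on any stream you read with fgetc(), so the fseek() can be avoided entirely. The helper name peekChar below is made up for illustration, and the sketch assumes only one character of lookahead is ever needed (the standard guarantees just one pushed-back character).

```c
#include <stdio.h>

/* Hypothetical helper: returns the next character of fp without consuming it,
   or EOF if there is no next character. */
int peekChar(FILE *fp)
{
    int c = fgetc(fp);
    if (c != EOF)
        ungetc(c, fp);  /* push back only real characters, never EOF */
    return c;
}
```

In the loop from the question, cx = peekChar(fd) would then be EOF exactly when c is the last character, and the outer fgetc() still sees the end of the file on the following iteration.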

Related

Is there a way to setback ptr inside my file after using fgetc()?

int main(){
    int ms = 0, vs = 0, cif = 0, intzn = 0, i;
    FILE* dat = fopen("file.txt", "r");
    for(i = 0; !feof(dat); i++)
    {
        if(isupper(fgetc(dat)))
            vs++;
        else
        {
            fseek(dat, ftell(dat) - 1, SEEK_SET);
        }
        if(islower(fgetc(dat)))
            ms++;
        else
        {
            fseek(dat, ftell(dat) - 1, SEEK_SET);
        }
        if(isdigit(fgetc(dat)))
            cif++;
        else
        {
            fseek(dat, ftell(dat) - 1, SEEK_SET);
        }
        if(ispunct(fgetc(dat)))
            intzn++;
    }
    printf("\nVS: %d\nMS: %d\nCif: %d\nIntznc: %d", vs, ms, cif, intzn);
    fclose(dat);
    return 0;
}
Every time I use "fgetc(dat)" in my if statement, the pointer to the current character in the file advances, so I am trying to figure out how to set it back in case my if statement is false. I've tried using "fseek()" but it still won't work. Why?
Your loop should not use feof to test for end-of-file because the end-of-file indicator is set only after the end is reached. It cannot tell you in advance there is not another character to get.
Looking at your loop, I suspect you do not need to “go back” in the file. You just need to read one character and examine it in multiple ways. To do that, you can simply read the character and assign its value to a variable. Then you can use the variable multiple times without rereading the character. For future reference, when you do want to reject a character, you can “put it back” into the stream with ungetc(c, dat);.
Further, I suspect you want to read each character in the file and characterize it. So you want a loop to read through the file until the end. To do this, you can use:
while (1)
{
    /* Use int to get a character, because `fgetc` may
       return either an unsigned char value or EOF.
    */
    int c = fgetc(dat);
    // If fgetc failed to get a character, leave the loop.
    if (c == EOF)
        break;
    // Now test the character.
    if (isupper(c))
        ++vs;
    else if (islower(c))
        ++ms;
    else if (isdigit(c))
        ++cif;
    else if (ispunct(c))
        ++intzn;
    // If desired, include a final else for all other cases:
    else
        ++other;
}
There are several issues with your program, among them:
for (...; !feof(f); ...) is wrong. The feof() function determines whether end-of-file has been observed by a previous read from the file. If one has not, then that function does not speak to whether a future read will succeed. feof() is for distinguishing between read failures resulting from EOF and those resulting from I/O errors, not for predicting the future.
When one of your tests succeeds, you do not go back to the beginning of the loop. So suppose that your fseek()s were all working (which may in fact be the case), that your input is "StackOverflow", and that at the beginning of a loop iteration, the 'k' is the next character. Then
the 'k' will be read and rejected by the uppercase test, and the file moved back
the 'k' will be read and accepted by the lowercase test
the 'O' will be read and rejected by the digit test, and the file moved back
the 'O' will be read and rejected by the punctuation test, and lost because the file is not moved back.
I've tried using "fseek()" but it still wont work, why?
It's hard to tell for sure without the input you are using and an explanation of what behavior you actually observe, but likely the problem is related to the second point above.
The thing to do is probably to
avoid any need to push back a character. On each iteration of the loop, read one character from the file and store it in a variable, then perform as many tests as you like on that variable's value.
test the return value of fgetc() (instead of using feof()) to determine when the end of the file has been reached. This dovetails with (1) above.
As an aside, fseek(dat, -1, SEEK_CUR) would be more idiomatic for backing up the file pointer by one byte.

How can I implement an if to detect the last byte when reading a file?

I'm trying to read a file and to work on the last byte a little differently. Here is my code:
FILE * file = fopen(path,"rb");
unsigned char curr;
while (fread(&curr, 1, 1, file) >= 1)
{
    if (Is_Last_Byte) {
        //Does something
    }
    else {
        //Does something else
    }
}
How do I perform this check? Is it possible to set a pointer to the last byte and during each loop iteration perform this check?
Thanks in advance.
Since there is nothing which distinguishes the last byte from any other byte, your best bet is to always read one byte (or buffer) ahead of the byte (buffer) you are processing. When the read ahead returns EOF, you know that the byte (buffer) you are about to process is the last one.
This is essentially the same logical structure as the often-repeated advice that read loops should terminate when the read returns EOF, rather than having the form while(!feof(file)). (The fact that the advice needs to be repeated demonstrates that the message is somehow not getting through :-) ) while(!feof(file)) is wrong precisely because the last byte is not special; even after reading it, it is not known that no more bytes follow.
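A minimal sketch of the read-ahead structure described above, working byte by byte. The function name markLastByte and the bracket output are only illustrative stand-ins for the "does something" branches; note that fgetc() also returns EOF on a read error, which a real program may want to tell apart with ferror().

```c
#include <stdio.h>

/* Hypothetical example: copies every byte of `in` to `out`, wrapping the
   last byte in brackets to show where the special handling would go. */
void markLastByte(FILE *in, FILE *out)
{
    int curr = fgetc(in);               /* the byte about to be processed */
    while (curr != EOF) {
        int next = fgetc(in);           /* always read one byte ahead */
        if (next == EOF)
            fprintf(out, "[%c]", curr); /* nothing follows: curr is the last byte */
        else
            fputc(curr, out);           /* any other byte */
        curr = next;
    }
}
```

The loop never asks "is this the last byte?" in advance; it only learns it when the read-ahead fails, which is the same reason while(!feof(file)) cannot work.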

getchar() keeps reading a '\n'. What is going on?

I have a small program I'm writing to practice programming in C.
I want it to use the getchar(); function to get input from the user.
I use the following function to prompt for user input, then loop using getchar() to store input in an array:
The function is passed a pointer referencing a struct's member.
getInput(p->firstName); //The function is passed an argument like this one
void getInput(char * array)
{
    int c;
    while((c=getchar()) != '\n')
        *array++ = c;
    *array = '\0'; //Null terminate
}
This function is called multiple times, as it is part of a function that creates a structure and populates its array members.
However, when the program executes, the first two calls to it work fine, but on any subsequent calls, every other call to getchar() does not wait for keyboard input.
After some debugging I traced the bug to getchar() reading in a '\n' character instead of waiting for input: the while loop test fails, and the function returns essentially an empty string.
I have done some research and keep finding to use
while(getchar() != '\n');
at the end of the function in order to properly flush stdin. However, this produces undesirable results, as the program prompts again for more input after I press ENTER. Pressing ENTER again continues the program, but every other subsequent call continues to read in this mysterious '\n' character right off the bat, causing the test to fail and resulting in empty strings when it comes time to print the contents of the structure.
Could anyone explain to me what is going on here? Why does getchar() keep fetching a '\n' even though I supposedly cleared the input buffer? I have tried just placing a getchar(); statement at the beginning and end of the function, tried 'do while' loops, and taken other jabs at it, but I can't seem to figure this out.
The code you have written has several drawbacks. I'll try to explain them, as it is unclear where your code is failing (probably outside the function you posted).
First of all, you don't check for EOF in the getchar() result. getchar(3) doesn't return a char precisely so that it can return all possible char values plus an extra one, EOF, to mark the end of file (this can be generated from a terminal with Ctrl-D on Unix, or Ctrl-Z on Windows machines). That case must be handled explicitly in your code, because once you convert the result to a char you lose the extra information you received from the function. Read the getchar(3) man page to solve this issue.
Second, you don't check whether the input has more characters than the array can hold. You pass the function only a pointer to the beginning of the array, but nothing indicates how far it extends, so you can write past the end of its bounds, overwriting memory that was not reserved for input. This results in what the literature calls undefined behaviour (UB), and it is something you must take care of. It can be solved by passing the number of valid positions in the array, decrementing it for each position filled, and not accepting more input once the buffer has filled up.
On the other hand, there is a standard function that does exactly that: fgets(3) reads one line from an input file and stores it in the buffer whose pointer and size you pass to it:
char *fgets(char *buffer, int buffer_size, FILE *file_descriptor);
You can use it as in:
char buffer[80], *line;
...
while ((line = fgets(buffer, sizeof buffer, stdin)) != NULL) {
    /* process one full line of input, with the final \n included */
    ....
}
/* on EOF, fgets(3) returns NULL, so we shall be here after reading the
 * full input file */
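If you would rather keep a character-by-character loop, the two points above (check for EOF, respect the buffer size) might look roughly like this. The name getInputBounded and the extra stream and size parameters are assumptions added for illustration; the original getInput() reads from stdin and takes only the pointer.

```c
#include <stdio.h>

/* Hypothetical bounded variant of getInput: stores at most size-1 characters
   of one input line in array and always null-terminates it (size must be at
   least 1). Returns the number of characters stored, or EOF if the input
   ended before any character was read. */
int getInputBounded(FILE *in, char *array, size_t size)
{
    size_t n = 0;
    int c;  /* int, so that EOF can be told apart from every char value */

    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (n + 1 < size)
            array[n++] = (char)c;  /* characters that do not fit are discarded */
    }
    array[n] = '\0';
    if (c == EOF && n == 0)
        return EOF;
    return (int)n;
}
```

Because the loop consumes the rest of an over-long line, including its '\n', the next call starts on a fresh line, for example getInputBounded(stdin, p->firstName, sizeof p->firstName) when firstName is an array member.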

Using fscanf until EOF without using getc

I have a text file composed of sequences of 2 bytes which I have to store in an array.
I have declared FILE *ptr.
How can I loop until EOF without using the method:
while ((c = getc(ptr)) != EOF)
{
    // do something
}
I want to implement something along the lines of (PSEUDOCODE):
while (ptr is not pointing to the end of the file)
{
    fscanf(...) // I will read in the bytes etc.
}
The getc() method wouldn't work well for me because I am reading in blocks of 2 bytes at a time.
You can use fread to read more than one byte at a time. fread returns the number of items it was able to read.
For example, to read 2-byte chunks you might use:
while (fread(target, 2, 1, ptr) == 1) {
    /* ... */
}
Here 2 is the number of bytes in each "item", and 1 is the number of "items" you want to read on each call.
In general, you shouldn't use feof() to control when to terminate an input loop. Use the value returned by whichever input routine you're using. Different input functions vary in the information they provide; you'll have to read the documentation for the one you're using.
Note that this will treat an end-of-line as a single '\n' character. You say you're reading from a text file; it's not clear how you want to handle line endings. You should also decide what you want to do if the file has an odd number of characters.
Another option is to call getc() twice in the loop, checking its result both times.
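The two-getc() option could be sketched as follows. The helper name readPair and its return convention are invented for this example; the return value of 1 is one possible way to report a file with an odd number of bytes.

```c
#include <stdio.h>

/* Hypothetical helper: reads the next two bytes of fp into pair[0] and pair[1].
   Returns 2 on success, 1 if the input ended after a single byte
   (only pair[0] is set), and 0 if the input had already ended. */
int readPair(FILE *fp, unsigned char pair[2])
{
    int first = getc(fp);
    if (first == EOF)
        return 0;
    pair[0] = (unsigned char)first;

    int second = getc(fp);
    if (second == EOF)
        return 1;  /* odd number of bytes: the caller decides what to do */
    pair[1] = (unsigned char)second;
    return 2;
}
```

A caller would loop with while (readPair(ptr, pair) == 2) and inspect the final return value afterwards to detect a dangling byte.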
The only way to tell when you've reached the end of the file is when you try to read past it, and the read fails. (Yes, there is an feof() function, but it only returns true after you've tried to read past the end of the file.)
This means that, if you're going to use fscanf() to read your input, it's the return value of fscanf() itself that you need to check.
Specifically, fscanf() returns the number of items it has successfully read, or EOF (which is a negative value, typically -1) if the input ended before anything at all could be read. Thus, your input loop might look something like this:
while (1) {
    /* ... */
    int n = fscanf(ptr, "...", ...);
    if (n == EOF && !ferror(ptr)) {
        /* we've reached the end of the input; stop the loop */
        break;
    } else if (n < items_requested) {
        if (ferror(ptr)) perror("Error reading input file");
        else fprintf(stderr, "Parse error or unexpected end of input!\n");
        exit(1); /* or do whatever you want to handle the error */
    }
    /* ... */
}
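To make the pattern concrete, here is a small self-contained variant that reads whitespace-separated integers. The "%d" format, the function name sumInts and the return codes are assumptions chosen for the example, not the format from the question.

```c
#include <stdio.h>

/* Hypothetical example: adds up whitespace-separated integers read from fp.
   Returns 0 when the input ends cleanly, -1 on a read error or malformed input. */
int sumInts(FILE *fp, long *sum)
{
    *sum = 0;
    for (;;) {
        int value;
        int n = fscanf(fp, "%d", &value);  /* one item requested per call */
        if (n == 1) {
            *sum += value;
            continue;
        }
        if (n == EOF && !ferror(fp))
            return 0;   /* end of input before anything could be read */
        return -1;      /* n == 0 (parse failure) or an I/O error */
    }
}
```

The loop is controlled only by the value fscanf() returns; feof() is never consulted.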
That said, there may be other options, too. For example, if your input is structured as lines (which a lot of text input is), you may be better off reading the input line by line with fgets(), and then parsing the lines e.g. with sscanf().
Also, technically, there is a way to peek one byte ahead in the input, using ungetc(). That is, you could do something like this:
int c;
while ((c = getc(ptr)) != EOF) {
    ungetc(c, ptr);
    /* ... now read and parse the input ... */
}
The problem is that this only checks that you can read one more byte before EOF; it doesn't, and can't, actually check that your fscanf() call will have enough data to match all the requested items. Thus, you still need to check the return value of fscanf() anyway — and if you're going to do that, you might as well use it for EOF detection too.

Please Explain this Example C Code

This code comes from K&R. I have read it several times, but it still seems to escape my grasp.
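#include <stdio.h>
#define BUFSIZE 100
char buf[BUFSIZE];   /* buffer for ungetch */
int bufp = 0;        /* next free position in buf */
int getch(void)      /* get a (possibly pushed-back) character */
{
    return (bufp > 0) ? buf[--bufp] : getchar();
}
void ungetch(int c)  /* push a character back on input */
{
    if (bufp >= BUFSIZE)
        printf("too many characters\n");
    else
        buf[bufp++] = c;
}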
The purpose of these two functions, so K&R says, is to prevent a program from reading too much input. i.e. without this code a function might not be able to determine it has read enough data without first reading too much. But I don't understand how it works.
For example, consider getch().
As far as I can see, these are the steps it takes:
check if bufp is greater than 0.
if so then return the char value of buf[--bufp].
else return getchar().
I would like to ask a more specific question, but I literally don't know how this code achieves what it is intended to achieve, so my question is: what is (a) the purpose and (b) the reasoning of this code?
Thanks in advance.
NOTE: For any K&R fans, this code can be found on page 79 (depending on your edition, I suppose)
(a) The purpose of this code is to be able to read a character and then "un-read" it if it turns out you accidentally read a character too many (with a max. of 100 characters to be "un-read"). This is useful in parsers with lookahead.
(b) getch reads from buf if it has contents, indicated by bufp>0. If buf is empty, it calls getchar. Note that it uses buf as a stack: it reads it from right-to-left.
ungetch pushes a character onto the stack buf after doing a check to see if the stack isn't full.
The code is not really for "reading too much input"; instead, it is so you can put back characters already read.
For example, you read one character with getch, see if it is a letter, put it back with ungetch and read all letters in a loop. This is a way of predicting what the next character will be.
This block of code is intended for use by programs that make decisions based on what they read from the stream. Sometimes such programs need to look at a few characters from the stream without actually consuming the input. For example, if your input looks like abcde12xy789 and you must split it into abcde, 12, xy, 789 (i.e. separate groups of consecutive letters from groups of consecutive digits), you do not know that you have reached the end of a group of letters until you see a digit. However, you do not want to consume that digit at the time you see it: all you need is to know that the group of letters is ending; you need a way to "put back" that digit. An ungetch comes in handy in this situation: once you see a digit after a group of letters, you put the digit back by calling ungetch. Your next iteration will pick that digit back up through the same getch mechanism, sparing you the need to preserve the character that you read but did not consume.
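The abcde12xy789 example might be sketched like this. To keep the snippet self-contained it uses the standard fgetc()/ungetc() on a FILE * in place of K&R's getch()/ungetch(); the helper name nextGroup is made up, and the input is assumed to contain only letters and digits.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copies the next run of digits, or the next run of
   non-digits, from fp into out (at most size-1 characters, size >= 1).
   Returns the number of characters stored, or 0 at end of input. */
size_t nextGroup(FILE *fp, char *out, size_t size)
{
    size_t n = 0;
    int c = fgetc(fp);
    int wantDigit;

    if (c == EOF) {
        out[0] = '\0';
        return 0;
    }
    wantDigit = isdigit(c) != 0;  /* the first character decides the group kind */
    while (c != EOF && (isdigit(c) != 0) == wantDigit) {
        if (n + 1 < size)
            out[n++] = (char)c;
        c = fgetc(fp);
    }
    if (c != EOF)
        ungetc(c, fp);  /* read one too many: put it back for the next group */
    out[n] = '\0';
    return n;
}
```

Each call stops only after seeing the first character of the following group, and the push-back is what lets the next call start with that character.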
1. The idea shown here can also be described as a very primitive I/O stack management system, and the code gives the implementation of the functions getch() and ungetch().
2. To go a step further, suppose you want to design an operating system: how can you handle the memory which stores all the keystrokes?
This is solved by the above code snippet. An extension of this concept is used in file handling, especially in editing files. In that case, instead of using getchar(), which takes input from standard input, a file is used as the source of input.
I have a problem with the code given in the question. Using the buffer as a stack is not correct in this code, because pushing more than one extra input onto the stack will have an undesired effect on later processing (getting input from the buffer).
This is because when later processing (getting input) takes place, this buffer (stack) returns the extra inputs in reverse order (the last extra input pushed is returned first).
Because of the LIFO (last in, first out) property of a stack, the buffer in this code should be a queue, which works better when there is more than one extra input.
This mistake in the code confused me; the buffer should be a queue, as shown below.
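#include <stdio.h>
#define BUFSIZE 100
char buf[BUFSIZE];
int bufr = 0;   /* index of the next character to read */
int buff = 0;   /* index of the next free slot */
int bufn = 0;   /* number of characters currently buffered */
int getch(void)
{
    int c;
    if (bufn == 0)
        return getchar();
    c = buf[bufr];
    bufr = (bufr + 1) % BUFSIZE;
    --bufn;
    return c;
}
void ungetch(int c)
{
    if (bufn >= BUFSIZE) {
        printf("too many characters\n");
        return;
    }
    buf[buff] = c;
    buff = (buff + 1) % BUFSIZE;
    ++bufn;
}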

Resources