fgets from stdin with unpredictable input size - c

I'm trying to read a line from stdin but I don't know how to properly handle the cases when the input size is at least equal to the limit. Example code:
void myfun() {
char buf[5];
char somethingElse;
printf("\nInsert string (max 4 characters): ");
fgets (buf, 5, stdin);
...
printf("\nInsert char: ");
somethingElse = getchar();
}
Now, the user can do three things:
1) Input less than 4 characters (plus newline): in this case there's nothing left in stdin and the subsequent getchar() correctly waits for user input;
2) Input exactly 4 characters (plus newline): in this case there's a newline left in stdin and the subsequent getchar() reads it;
3) Input more than 4 characters (plus newline): in this case there's at least another character left in stdin and the subsequent getchar() reads it, leaving at least a newline in.
Cases 2 and 3 would require emptying stdin using something like while(getchar() != '\n'), whereas case 1 doesn't require any additional action. As I understand from reading answers to similar questions and c-faq, there's no standard/portable way to know whether the actual scenario is the one described in 1 or not.
Did I get it well? Or there actually is a portable way to do it? Or maybe a totally different approach?

The fgets function will store the newline in the buffer if there is room for it. So if the last character in the string is not a newline, you know you need to flush the buffer.
fgets (buf, 5, stdin);
if (strrchr(buf, '\n') == NULL) {
// flush buffer
int c;
while ((c = getchar()) != '\n' && c != EOF);
}

If one assumes that a null character '\0' is never read, then @dbush's answer will work.
If a null character is read, then strrchr(buf, '\n') does not find any '\n' that may have been read.
Code could pre-set the buffer to see if a '\n' was read in the end.
buf[sizeof buf - 2] = '\n';
if (fgets (buf, sizeof buf, stdin)) {
if (strrchr(buf, '\n') == NULL) {
// incomplete line read. (the usual detection)
} else if (buf[sizeof buf - 2] != '\n') {
// incomplete line read with prior null character (see below note).
}
}
Yet the C standard does not specify that data past what was read in buf[] is unchanged, so pre-filling the buffer with some pattern is not sufficient to reliably detect whether a null character '\0' was read.
is a portable way to do it?
The most portable way is to use repeated calls to fgetc() or the like instead of fgets().
maybe a totally different approach?
I recommend fgetc() or the common but not C standard getline()
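A minimal sketch of the fgetc() approach, assuming the goal is to keep at most size - 1 characters and throw away the rest of the line. The name read_line_fgetc and its return convention are made up for illustration. Because the length is counted rather than derived from strlen(), embedded null characters do not hide the end of the line.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: reads one line from fp with fgetc(), storing at most
 * size - 1 characters in buf and discarding the rest of the line.
 * Returns the number of characters stored (embedded '\0' bytes included),
 * or -1 if end of input was reached before any character was read. */
long read_line_fgetc(FILE *fp, char *buf, size_t size)
{
    size_t len = 0;
    int any = 0;                    /* set once any character is read */
    int c;
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        any = 1;
        if (len + 1 < size)
            buf[len++] = (char)c;   /* keep it; excess characters are dropped */
    }
    if (size > 0)
        buf[len] = '\0';
    if (c == EOF && !any)
        return -1;
    return (long)len;
}
```

With char buf[5], a call such as read_line_fgetc(stdin, buf, sizeof buf) always leaves stdin positioned just after the newline, so a following getchar() waits for fresh input in all three cases from the question.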
Another alternative: Use scanf("%4[^\n]%n", buf, &n): It is very cumbersome, yet a portable way is possible. It keeps track of the number of characters read before the '\n' even if some are null characters.
int n = 0;
int cnt = scanf("%4[^\n]%n", buf, &n); // Cumbersome to get that 4 here.
// Lots of TBD code now needed:
// Handle if cnt != 1 (\n to be read or EOF condition)
// Handle if n == sizeof buf - 1, (0 or more to read)
// Handle if n < sizeof buf - 1, (done, now get \n)
// A \n may still need to be consumed
// Handle EOF conditions
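One possible way to fill in the TBD parts, as a sketch only. It uses fscanf() on a FILE * instead of scanf() so that it can be exercised on any stream, and the name read_line4 and its return values are assumptions, not part of the answer above.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to 4 characters of one line from fp into
 * buf (which must hold 5 bytes), discards the rest of the line, and
 * returns the number of characters stored, or -1 at end of input. */
int read_line4(FILE *fp, char buf[5])
{
    int n = 0;
    buf[0] = '\0';
    int cnt = fscanf(fp, "%4[^\n]%n", buf, &n);
    if (cnt == EOF)
        return -1;      /* nothing could be read: end of file or error */
    /* cnt == 0 means the line was empty ('\n' came first); n is still 0. */
    int c;
    while ((c = getc(fp)) != '\n' && c != EOF)
        ;               /* drop the remainder of the line and the '\n' */
    return n;
}
```

The 4 in the format string still has to be kept in sync with the buffer size by hand, which is the cumbersome part the answer mentions.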

Related

Will fgets always read from the beginning of the string?

I need to use fgets to read multiple lines of stdin that are separated by newlines (ex.
Hello
world
!
). If I include fgets in a for loop, will it read from the first line during every iteration? If so, how would I go about achieving this?
Will fgets always read from the beginning of the string?
No. fgets() reads based on the current state, which might be partially into a line of user input. It does not somehow magically go back to the beginning - it just starts from where the stream is currently positioned.
fgets() does not read strings. It reads a line of input and converts the input into a string (as @Fe2O3 notes). It does not stop when a null character is read.
fgets() reads until:
An '\n' is read.
size - 1 characters have been read, where size is the count passed to fgets().
An end-of-file or input error is encountered (e.g. wrong stream mode, parity error, phase of the moon input error, ...)
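The first two stopping conditions can be observed with a deliberately small buffer. This is only an illustrative sketch; count_fgets_calls is a made-up name, and the 4-byte buffer is chosen to force lines to be split.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: calls fgets() with a tiny 4-byte buffer until it
 * returns NULL. Stores the number of '\n'-terminated chunks in *lines and
 * returns the total number of successful fgets() calls. */
int count_fgets_calls(FILE *fp, int *lines)
{
    char buf[4];
    int calls = 0;
    *lines = 0;
    while (fgets(buf, sizeof buf, fp) != NULL) {
        calls++;                        /* each call resumes where the last stopped */
        if (strchr(buf, '\n') != NULL)
            (*lines)++;                 /* this chunk finished a line */
    }
    return calls;
}
```

On the three-line input from the question, each call picks up exactly where the previous one stopped, so the number of calls exceeds the number of lines whenever a line does not fit in one chunk.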
If I include fgets in a for loop, will it read from the first line during every iteration? If so, how would I go about achieving this?
First before using fgets(), be sure the entire prior line was read.
2nd: if this line of input after using fgets() was incomplete, finish it.
char buf[N];
if (fgets(buf, sizeof buf, stdin)) {
size_t len = strlen(buf);
if (len + 1 == sizeof buf && buf[sizeof buf - 2] != '\n') {
// User input is not complete, read rest of line.
int ch;
while ((ch = getchar()) != '\n' && ch != EOF) {
;
}
}
}
Pedantic: fgets() is not a great input function if input might include null characters. In that case, more advanced techniques are needed.

Usage of scanf ... getchar

Is the following pattern ok in C to get a string up until a newline?
int n = scanf("%40[^\n]s", title);
getchar();
It seems to work in being a quick way to strip off the trailing newline, but I'm wondering if there are shortcomings I'm not seeing here.
The posted code has multiple problems:
the s in the format string is not what you think it is: the specification is %40[^\n] and the s will try and match an s in the input stream, which may occur after 40 bytes have been stored into title.
scanf() will fail to convert anything if the pending input is a newline, leaving title unchanged and potentially uninitialized
getchar() will not necessarily read the newline: if more than 40 characters are present on the line, it will just read the next character.
If you want to read a line, up to 40 bytes and ignore the rest of the line up to and including the newline, use this:
char title[41];
*title = '\0';
if (scanf("%40[^\n]", title) == EOF) {
// end of file reached before reading anything, handle this case
} else {
scanf("%*[^\n]"); // discard the rest of the line, if any
getchar(); // discard the newline if any (or use scanf("%*1[\n]"))
}
It might be more readable to write:
char title[41];
int c, len = 0;
while ((c = getchar()) != EOF && c != '\n') {
if (len < 40)
title[len++] = c;
}
title[len] = '\0';
if (c == EOF && len == 0) {
// end of file reached before reading a line
} else {
// possibly empty line of length len was read in title
}
You can also use fgets():
char title[41];
if (fgets(title, sizeof title, stdin)) {
char *p = strchr(title, '\n');
if (p != NULL) {
// strip the newline
*p = '\0';
} else {
// no newline found: discard remaining characters and the newline if any
int c;
while ((c = getchar()) != EOF && c != '\n')
continue;
}
} else {
// at end of file: nothing was read in the title array
}
As noted in the previous answer, the s should be removed: it is not part of the specifier. After storing up to 40 characters, scanf will try to match a literal s against the next character of input, which is enough to mess up your read.
To answer your question using a single getchar is not the best approach, you can use this common routine to clear the buffer:
int n = scanf(" %40[^\n]", title);
int c;
while((c = getchar()) != '\n' && c != EOF){}
if(c == EOF){
// in the rare cases this can happen, it may be unrecoverable
// it's best to just abort
return EXIT_FAILURE;
}
//...
Why is this useful? It reads and discards all the characters remaining in the stdin buffer, regardless of what they are.
In a situation where an input string has, let's say, 45 characters, this approach will clear the stdin buffer, whereas a single getchar only clears 1 character.
Note that I added a space before the specifier, this is useful because it discards all white spaces before the first parseable character is found, newlines, spaces, tabs, etc. This is usually the desired behavior, for instance, if you hit Enter, or space Enter it will discard those and keep waiting for the input, but if you want to parse empty lines you should remove it, or alternatively use fgets.
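The effect of the leading space can be checked without any keyboard input by parsing a string with sscanf(), which follows the same format rules. The two function names below are made up for this sketch.

```c
#include <stdio.h>

/* Hypothetical helpers: both parse the same in-memory input with sscanf();
 * the only difference is the leading space in the format string. */
int scan_without_space(const char *input, char title[41])
{
    title[0] = '\0';
    return sscanf(input, "%40[^\n]", title);   /* fails if '\n' comes first */
}

int scan_with_space(const char *input, char title[41])
{
    title[0] = '\0';
    return sscanf(input, " %40[^\n]", title);  /* skips leading whitespace */
}
```

Given an input that starts with a newline followed by spaces, the first version converts nothing and leaves title empty, while the second skips the whitespace and stores the text that follows.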
There are a number of problems with your code like n never being used and wrong specifier for scanf.
The better approach is to use fgets. fgets will also read the newline character (if present before the buffer is full) but it's easy to remove.
See Removing trailing newline character from fgets() input
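One common idiom for that removal uses strcspn(). The wrapper below is a sketch with an invented name, strip_newline; the return value is added only so the caller can tell whether the whole line fit in the buffer.

```c
#include <string.h>

/* Hypothetical helper: removes the first '\n' in s, if there is one.
 * Returns 1 when a newline was found and removed, 0 otherwise. */
int strip_newline(char *s)
{
    size_t i = strcspn(s, "\n");   /* index of '\n', or of the terminator */
    if (s[i] == '\n') {
        s[i] = '\0';
        return 1;
    }
    return 0;
}
```

A return of 0 right after fgets() means either the line was longer than the buffer or the input ended without a newline.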

How to avoid pressing enter twice when using getchar() to clear input buffer?

I have this program:
#include <stdio.h>
#define SIZE 19
int main(){
char string[SIZE];
while (string[0] != 'A'){
printf("\nEnter a new string.\n");
fgets(string,SIZE,stdin);
int storage = 0;
while (storage != '\n')
{
storage = getchar();
}
}
}
The nested while loop with getchar() exists in case the inputted string exceeds the maximum number of characters string[] can hold. If that is not there, inputting a string with, say, 20 characters, would cause the output to be:
Enter a new string.
12345123451234512345
Enter a new string.
Enter a new string.
The problem is that this requires me to press enter twice in order to enter a new string: once for 'fgets' and one for the nested while loop (this is my understanding of what's going on).
Is there a way to change this so I only have to press 'Enter' once, or possibly a way to change the entire while loop into something more elegant?
If the buffer that receives from fgets contains a newline, you know it read everything that was inputted so you don’t need to do anything else. If not, then you use the extra loop to flush the buffer.
You are thinking correctly, you just need to think through how and when you need to empty the input buffer a bit further.
All line-oriented input functions (fgets and POSIX getline) will read, and include, the trailing '\n' in the buffers they fill (fgets only when sufficient space is provided in the buffer).
When using fgets, you have only two possible returns, (1) a pointer to the buffer filled, or (2) "NULL on error or when end of file occurs while no characters have been read."
In case fgets returns a valid pointer, then it is up to you to determine whether a complete line of input was read, or whether the input exceeds the buffer size, leaving characters in the input buffer unread.
To make that determination, you check whether the last character in the buffer is '\n' and if not, whether the buffer contains SIZE-1 characters indicating that characters remain in the input buffer. You can do that a number of ways, you can use strchr (to get a pointer to the '\n'), strcspn (to get an index to it) or good old strlen (to get the length of the string) and then check the character at len-1.
(a note on preference, you can use whatever method you like, but in either case of strcspn or strlen, save the index or length so it can be used to validate whether the input exceeded the buffer size or whether the user ended input by generating a manual EOF. You save the index or length to prevent having to make duplicate function calls to either)
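The determination described above could be sketched as a small classifier that calls strlen() once and reuses the result. The enum and function names here are hypothetical.

```c
#include <string.h>
#include <stddef.h>

enum line_state { LINE_COMPLETE, LINE_TRUNCATED, LINE_NO_NEWLINE };

/* Hypothetical helper: inspects a buffer that fgets() has just filled.
 * size is the buffer size that was passed to fgets(). The string length
 * is stored in *len_out (if not NULL) so the caller need not recompute it. */
enum line_state classify_line(const char *buf, size_t size, size_t *len_out)
{
    size_t len = strlen(buf);          /* computed once, reused below */
    if (len_out != NULL)
        *len_out = len;
    if (len > 0 && buf[len - 1] == '\n')
        return LINE_COMPLETE;          /* whole line fit in the buffer */
    if (len == size - 1)
        return LINE_TRUNCATED;         /* buffer full: more input may be pending */
    return LINE_NO_NEWLINE;            /* short read: input ended without '\n' */
}
```

Only the LINE_TRUNCATED case calls for emptying the input buffer; LINE_NO_NEWLINE usually means the user generated a manual EOF.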
It is also helpful to create a simple helper-function to clear the input buffer to avoid placing loops everywhere you need the check. A simple function will do the trick, e.g.
/* simple function to empty stdin */
void empty_stdin (void)
{
int c = getchar();
while (c != '\n' && c != EOF)
c = getchar();
}
or if you prefer the more-compact, but arguably less readable version, a single for loop will do, e.g.
void empty_stdin (void)
{
for (int c = getchar(); c != '\n' && c != EOF; c = getchar()) {}
}
The remainder of your example can be structured to complete each of the tests described above to provide input handling as you have described (although using the 1st character of the buffer being 'A' to control the loop is a bit strange), e.g.
#include <stdio.h>
#include <string.h>
#define STRSIZE 19
/* simple function to empty stdin */
void empty_stdin (void)
{
int c = getchar();
while (c != '\n' && c != EOF)
c = getchar();
}
int main (void) {
char string[STRSIZE] = ""; /* initialize string all 0 */
while (*string != 'A') { /* up to you, but 'A' is a bit odd */
size_t len = 0; /* variable for strlen return */
printf ("enter a string: "); /* prompt */
if (!fgets (string, STRSIZE, stdin)) { /* validate read */
putchar ('\n'); /* tidy up with POSIX EOF on NULL */
break;
}
len = strlen (string); /* get length of string */
if (len && string[len-1] == '\n') /* test if last char is '\n' */
string[--len] = 0; /* overwrite with nul-character */
else if (len == STRSIZE - 1) /* test input too long */
empty_stdin(); /* empty input buffer */
}
return 0;
}
An arguably more useful approach is to have the loop exit if nothing is input (e.g. when Enter alone is pressed on an empty line). The test would then be while (*string != '\n'). A better approach is simply to control your input loop with while (fgets (string, STRSIZE, stdin)). There, you have validated the read before entering the loop. You can also wrap the whole thing in a for (;;) loop and control the loop exit based on any input condition you choose.
Those are all possibilities to consider. Look things over and let me know if you have further questions.
fgets() does read the newline IF (and only if) the buffer is long enough to reach and contain it, along with a trailing nul terminator.
Your sample input is 20 characters, which will be followed by a newline, and then a nul terminator. That won't go into a buffer of 19 characters.
The simple way is to use fgets() in a loop, until the newline is included in the buffer.
printf("\nEnter a new string.\n");
do
{
fgets(string,SIZE,stdin);
/*
handle the input, noting it may or may not include a trailing '\n'
before the terminating nul
*/
} while (strlen(string) > 0 && string[strlen(string) - 1] != '\n');
This loop will clear input up to and including the first newline, and also allow you to explicitly handle (discard if needed) ALL the input received. It is therefore not necessary to use a second loop with getchar().
You haven't checked if fgets() returns NULL, so neither have I. It is advisable to check, as that can indicate errors on input.
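A variant of the same chunked loop with the NULL check added might look like the sketch below. The function name line_length_in_chunks is invented, and it only measures the line to keep the example short; real code would process each chunk where the comment indicates.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one whole line from fp in chunks of at most
 * 18 characters and returns its total length without the '\n'.
 * Returns -1 if end of input or an error occurred before any character
 * was read. */
long line_length_in_chunks(FILE *fp)
{
    char chunk[19];
    long total = 0;
    int got_any = 0;
    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        got_any = 1;
        size_t len = strlen(chunk);
        /* handle the chunk here; it may or may not end in '\n' */
        if (len > 0 && chunk[len - 1] == '\n')
            return total + (long)(len - 1);   /* newline reached: done */
        total += (long)len;                   /* partial chunk: keep going */
    }
    return got_any ? total : -1;              /* input ended without '\n' */
}
```

Because the loop stops as soon as fgets() returns NULL, it cannot spin forever on a stale buffer when the input ends without a newline.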

Clear input stream after fgets set null char before newline

I have a program where the user enters two inputs. Being that I can't control what the user enters, the user can go past the fixed size of the array. Since fgets() appends a newline to the end before the null character, in the event that a newline cannot fit when the user goes beyond the intended size, the null character truncates the input. Does the newline character when the user hits enter still exist in the input stream? If so, is this the reason why fgets()skips the second time because of the newline from the first input?
#include <stdio.h>
int main(){
char str[5];
fgets(str,5,stdin);
printf("Output:%s",str);
fgets(str,5,stdin);
printf("Output:%s",str);
return 0;
}
Example Input
ABCDE\n
Output
Output:ABCDOutput:E
After reading this SO answer, "fgets() isn't prompting user a second time", the issue seems to be not flushing the input stream via fflush(stdin), but I've heard conflicting information saying as this leads to undefined behavior. My last question is, what would be the appropriate way to clear the input stream if it's the retained newline that's causing issues?
the user can go past the fixed size of the array. Since fgets() appends a newline to the end before the null character
No, it does not. It writes characters read from the input into the provided buffer, up to and including the first newline, or until the specified buffer size is exhausted (less one byte for the string terminator), or until an error occurs or the end of the stream is reached, whichever comes first. The newline is not invented by fgets(); it comes from the input.
, in the event that a newline cannot fit when the user goes beyond the intended size, the null character truncates the input. Does the newline character when the user hits enter still exist in the input stream?
All characters entered by the user and not copied into the buffer remain waiting to be read in the stream. That will include the newline, if the user entered one.
If so, is this the reason why fgets()skips the second time because of the newline from the first input?
fgets() does not skip, but it does pick up where the previous call left off transferring characters from input to the buffer. No characters are lost. That means that the second call returns part of the first input line if the first call did not return the whole thing. You need to account one way or another for the possibility that the input does not conform to your line-length expectations.
the issue seems to be not flushing the input stream via fflush(stdin),
No, it isn't. Flushing is for sending buffered output to the underlying output device. Flushing an input stream produces undefined behavior. In principle, that could manifest as a buffer dump, and a given implementation might even specify such behavior, but you don't want that because there may be more data buffered than you want to get rid of.
but I've heard conflicting information saying as this leads to undefined behavior. My last question is, what would be the appropriate way to clear the input stream if it's the retained newline that's causing issues?
You read from the input until you've read the newline. There are plenty of I/O functions to choose from to accomplish this. fgets() itself might prove convenient, since you're already using it:
char str[5];
if (fgets(str, 5, stdin)) {
printf("Output:%s", str);
// read and consume the tail of the line, if any (overwrites str)
while (!strchr(str, '\n') && fgets(str, 5, stdin)) { /* empty */ }
}
fgets() reads until
1) New-line
2) Buffer is full
3) End-of-file
4) Input error (rare)
This code reads and takes care of #3 & #4
#define N 5
char buf[N];
if (fgets(buf, sizeof buf, stdin) == NULL) {
// Handle EOF or Error
return EOF;
}
To distinguish if a '\n' is present ... (#2 from #1)
// look for a lack of \n
if (strchr(buf, '\n') == NULL) {
And if so, read until it is found or EOF.
int ch;
while ((ch = fgetc(stdin)) != '\n' && ch != EOF);
}
--
Do not use the following code. It can be exploited by reading a null character as the first character.
size_t len = strlen(buf);
if (buf[len - 1] != '\n') { // bad way to detect \n
Could use
if (len > 0 && buf[len - 1] != '\n') { // Good
Being that I can't control what the user enters...
No, you cannot.
the user can go past the fixed size of the array.
Right. This is always a concern. However, in general you'll want to arrange things so that this rarely happens.
For example, if you really want to limit the user to (say) a 4-character-long input string, let him type whatever he wants, then see how much he typed, and if it was more than your limit, print a nice error message or something. But I do not recommend calling fgets(str, 5, stdin) if you're expecting 4 characters of input plus a newline, because it's just way too hard to recover when (not if) the user types too much.
in the event that a newline cannot fit when the user goes beyond the intended size, the null character truncates the input. Does the newline character when the user hits enter still exist in the input stream?
Absolutely yes.
If so, is this the reason why fgets()skips the second time because of the newline from the first input?
Pretty much yes.
I recommend allocating a much bigger buffer, and then proceeding something like this:
char inpbuf[512];
if(fgets(inpbuf, sizeof(inpbuf), stdin) == NULL) {
fprintf(stderr, "end of file\n");
return;
}
char *p = strrchr(inpbuf, '\n');
if(p == NULL) {
fprintf(stderr, "looks like you typed *way* too much\n");
return;
}
*p = '\0'; /* erase the \n */
if(strlen(inpbuf) > 4) {
fprintf(stderr, "you typed too much (max 4)\n");
return;
}
strcpy(str, inpbuf);
printf("Output:%s", str);
One glitch with this code as written, though: if the user hits the end-of-file key (control-D on Unix/Linux) before hitting Return, you'll falsely get the "looks like you typed way too much" message.
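One way to tell those two situations apart is to check feof() when no newline was found. This is a sketch under that assumption; the enum and the name read_checked are hypothetical.

```c
#include <stdio.h>
#include <string.h>

enum read_result { READ_EOF, READ_LINE, READ_TOO_LONG, READ_PARTIAL_AT_EOF };

/* Hypothetical helper: reads one line into buf and reports how it ended.
 * On READ_LINE the trailing '\n' is removed from buf. */
enum read_result read_checked(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return READ_EOF;
    char *p = strchr(buf, '\n');
    if (p != NULL) {
        *p = '\0';
        return READ_LINE;
    }
    /* No newline: either the stream ended, or the line did not fit. */
    return feof(fp) ? READ_PARTIAL_AT_EOF : READ_TOO_LONG;
}
```

One ambiguity remains: if the final unterminated line fills the buffer exactly, fgets() stops before it ever sees end of file, so that case is still reported as READ_TOO_LONG.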
If the string read by fgets doesn't end in a newline, you know it's still in the buffer. In that case, call getchar in a loop until you get a newline.
fgets(str,5,stdin);
printf("Output:%s",str);
if (strchr(str, '\n') == NULL) {
int c;
while ((c = getchar()) != EOF && c != '\n');
}
fgets(str,5,stdin);
printf("Output:%s",str);

How to read input in C

I'm trying to read a line with scanf("%[^\n]"). Right before it I'm reading an integer with "%d". I was told that scanf doesn't erase the '\n' after reading, so I have to call fflush() to avoid it, but even doing that I still have the same problems. Here is my code:
scanf("%d", &n);
fflush(stdin);
lines = (char**)malloc(sizeof(char*)*n);
for(i = 0; i < n; i++){
lines[i] = (char*)malloc(sizeof(char)*1001);
}
for(i = 0;i < n;i++){
scanf("%[^\n]", lines[i]);
}
I read an integer and then the scanf doesn't wait, it starts reading the input — doesn't matter what the integer value is, whether 5 or 10, the scanf reads all the strings to empty. Already tried with fgets and the result is almost the same, except that it reads some of the strings and skips others.
Let us look at this step by step:
"... read a line with scanf("%[^\n]");".
scanf("%[^\n]", buf) does not read a line. It almost does - sometimes. "%[^\n]" directs scanf() to read any number of non-'\n' char until one is encountered (that '\n' is then put back into stdin) or EOF occurs.
This approach has some problems:
If the first char is '\n', scanf() puts it back into stdin without changing buf in any way! buf is left as is - perhaps uninitialized. scanf() then returns 0.
If at least one non-'\n' is read, it is saved into buf and more char until a '\n' occurs. A '\0' is appended to buf and the '\n' is put back into stdin and scanf() returns 1. This unlimited-ness can easily overfill buf. If no char was saved and EOF or input error occurs, scanf() returns EOF.
Always check the return value of scanf()/fgets(), etc. functions. If your code does not check it, the state of buf is unknown.
In any case, a '\n' is still usually left in stdin, thus the line was not fully read. This '\n' often is an issue for the next input function.
... scanf doesn't erase the '\n' after reading
Another common misconception. scanf() reads a '\n', or not, depending on the supplied format. Some formats consume '\n', others do not.
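The difference can be made visible with %n, which reports how many characters have been consumed so far. The sketch below uses sscanf() on a string so no interactive input is needed; both function names are made up.

```c
#include <stdio.h>

/* Hypothetical helpers: return how many characters of input sscanf()
 * consumed, as reported by %n, or -1 if the integer could not be parsed. */
int consumed_by_d(const char *input)
{
    int value, n = 0;
    if (sscanf(input, "%d%n", &value, &n) != 1)
        return -1;
    return n;               /* stops right after the digits */
}

int consumed_by_d_space(const char *input)
{
    int value, n = 0;
    if (sscanf(input, "%d %n", &value, &n) != 1)
        return -1;
    return n;               /* the space directive also eats the '\n' */
}
```

For an input of digits followed by a newline and more text, the first format leaves the newline unread, while the second consumes it along with any other whitespace that follows.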
... call fflush() to avoid it
fflush(stdin) is well defined in some compilers but is not in the C standard. The usual problem is code wants to eliminate any remaining data in stdin. A common alternative, when the end of the line had not yet occurred, is to read and dispose until '\n' is found:
int ch; // Use int
while ((ch = fgetc(stdin)) != '\n' && ch != EOF);
I still have the same problems
The best solution, IMO, is to read a line of user input and then scan it.
char buf[1001]; // same size as each lines[i] allocation (sizeof lines[i] is only the size of a pointer)
if (fgets(buf, sizeof buf, stdin) == NULL) return NoMoreInput();
// If desired, remove a _potential_ trailing \n
buf[strcspn(buf, "\n")] = 0;
strcpy(lines[i], buf);
I recommend that a buffer should be about 2x the size of expected input for typical code. Robust code, not this snippet, would detect if more of the line needs to be read. IMO, such excessively long lines are more often a sign of hackers and not legitimate use.
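Applied to the integer at the start of the question's program, the read-a-line-then-scan-it idea might look like this sketch. The name read_count, the 64-byte buffer, and the use of strtol() are assumptions chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from fp and parses it as a decimal
 * integer. Returns 1 on success, 0 if the line does not start with a
 * number, and EOF at end of input. The whole line, '\n' included, is
 * consumed as long as it fits in the 64-byte buffer. */
int read_count(FILE *fp, long *out)
{
    char line[64];
    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    char *end;
    long v = strtol(line, &end, 10);
    if (end == line)
        return 0;           /* no digits at the start of the line */
    *out = v;
    return 1;
}
```

Since the newline after the number is consumed together with it, the following fgets() call starts at the first text line and no fflush(stdin) is needed.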
BLUEPIXY in the comment answered my question:
try "%[^\n]" change to " %[^\n]"

Resources