getchar() keeps reading a '\n'. What is going on? - c

I have a small program I'm writing to practice programming in C.
I want it to use the getchar(); function to get input from the user.
I use the following function to prompt for user input, then loop using getchar() to store input in an array:
The function is passed a pointer referencing a struct's member.
getInput(p->firstName); //The function is passed an argument like this one
void getInput(char * array)
{
int c;
while((c=getchar()) != '\n')
*array++ = c;
*array = '\0'; //Null terminate
}
This function is called multiple times, as it is a part of a function that creates a structure, and populates it's array members.
However when the program executes, The first two calls to it work fine, but any subsequent calls to this function will cause every-other call to getchar() to not wait for keyboard input.
After some debugging I traced the bug to be that getchar(); was for some reason reading in the '\n' character instead of waiting for input, the while loop test fails, and the function returns essentially an empty string.
I have done some research and keep finding to use
while(getchar() != '\n');
at the end of the function in order to properly flush stdin, however, this produces undesirable results, as the program will prompt again for more input after I type ENTER. Pressing ENTER again continues the program, but every-other subsequent calls continue to read in this mysterious '\n' character right off the bat, causing the test to fail, and resulting in empty strings whenever it comes time to print the contents of the the structure.
Could anyone explain to me what is going on here? Why does getchar() keep fetching a '\n' even though I supposedly cleared the input buffer? I have tried just placing a getchar(); statement at the beginning and end of the function, tried 'do while' loops, and taken other jabs at it, but I can't seem to figure this out.

The code you have written has several drawbacks. I'll try to explain them as it is unclear where your code is failing (probably outside the function you posted)
First of all, you don't check for EOF in getchar() result value. getchar(3) doesn't return a char precisely to allow to return al possible char values plus an extra one, EOF, to mark the end of file (this can be generated from a terminal by input of Ctrl-D in unix, or Ctrl-Z on windows machines) That case must be explicitly contempled in your code, as you'll convert the result to a char and will lose the extra information you received from the function. Read getchar(3) man page to solve this issue.
Second, you don't check for input of enough characters to fill all the array and overflow it. To the function you pass only a pointer to the beginning of the array, but nothing indicates how far it extends, so you can be overfilling past the end of its bounds, just overwritting memory that was not reserved for input purposes. This normally results in something called U.B. in the literature (Undefined Behaviour) and is something you must care of. This can be solved by passing a counter of valid positions to fill in the array and decrementing it for each valid position filled. And not allowing more input once the buffer has filled up.
On other side, you have a standar function that does exactly that, fgets(3) just reads one string array from an input file, and stores it on the pointer (and size) you pass to it:
char *fgets(char *buffer, size_t buffer_size, FILE *file_descriptor);
You can use it as in:
char buffer[80], *line;
...
while (line = fgets(buffer, sizeof buffer, stdin)) {
/* process one full line of input, with the final \n included */
....
}
/* on EOF, fgets(3) returns NULL, so we shall be here after reading the
* full input file */

Related

if my scanf variable is a float and a user inputs a character how can i prompt them to input a number? assuming the scanf is inside a do while loop

i have tried to use k = getchar() but it doesn't work too;
here is my code
#include<stdio.h>
int main()
{
float height;
float k=0;
do
{
printf("please type a value..\n");
scanf("%f",&height);
k=height;
}while(k<0);// i assume letters and non positive numbers are below zero.
//so i want the loop to continue until one types a +ve float.
printf("%f",k);
return 0;
}
i want a if a user types letters or negative numbers or characters he/she should be prompted to type the value again until he types a positive number
Like Govind Parmar already suggested, it is better/easier to use fgets() to read a full line of input, rather than use scanf() et al. for human-interactive input.
The underlying reason is that the interactive standard input is line-buffered by default (and changing that is nontrivial). So, when the user starts typing their input, it is not immediately provided to your program; only when the user presses Enter.
If we do read each line of input using fgets(), we can then scan and convert it using sscanf(), which works much like scanf()/fscanf() do, except that sscanf() works on string input, rather than an input stream.
Here is a practical example:
#include <stdlib.h>
#include <stdio.h>
#define MAX_LINE_LEN 100
int main(void)
{
char buffer[MAX_LINE_LEN + 1];
char *line, dummy;
double value;
while (1) {
printf("Please type a number, or Q to exit:\n");
fflush(stdout);
line = fgets(buffer, sizeof buffer, stdin);
if (!line) {
printf("No more input; exiting.\n");
break;
}
if (sscanf(line, " %lf %c", &value, &dummy) == 1) {
printf("You typed %.6f\n", value);
continue;
}
if (line[0] == 'q' || line[0] == 'Q') {
printf("Thank you; now quitting.\n");
break;
}
printf("Sorry, I couldn't parse that.\n");
}
return EXIT_SUCCESS;
}
The fflush(stdout); is not necessary, but it does no harm either. It basically ensures that everything we have printf()'d or written to stdout, will be flushed to the file or device; in this case, that it will be displayed in the terminal. (It is not necessary here, because standard output is also line buffered by default, so the \n in the printf pattern, printing a newline, also causes the flush.
I do like to sprinkle those fflush() calls, wherever I need to remember that at this point, it is important for all output to be actually flushed to output, and not cached by the C library. In this case, we definitely want the prompt to be visible to the user before we start waiting for their input!
(But, again, because that printf("...\n"); before it ends with a newline, \n, and we haven't changed the standard output buffering, the fflush(stdout); is not needed there.)
The line = fgets(buffer, sizeof buffer, stdin); line contains several important details:
We defined the macro MAX_LINE_LEN earlier on, because fgets() can only read a line as long as the buffer it is given, and will return the rest of that line in following calls.
(You can check if the line read ended with a newline: if it does not, then either it was the final line in an input file that does not have a newline at the end of the last line, or the line was longer than the buffer you have, so you only received the initial part, with the rest of the line still waiting for you in the buffer.)
The +1 in char buffer[MAX_LINE_LEN + 1]; is because strings in C are terminated by a nul char, '\0', at end. So, if we have a buffer of 19 characters, it can hold a string with at most 18 characters.
Note that NUL, or nul with one ell, is the name of the ASCII character with code 0, '\0', and is the end-of-string marker character.
NULL (or sometimes nil), however, is a pointer to the zero address, and in C99 and later is the same as (void *)0. It is the sentinel and error value we use, when we want to set a pointer to a recognizable error/unused/nothing value, instead of pointing to actual data.
sizeof buffer is the number of chars, total (including the end-of-string nul char), used by the variable buffer.
In this case, we could have used MAX_LINE_LEN + 1 instead (the second parameter to fgets() being the number of characters in the buffer given to it, including the reservation for the end-of-string char).
The reason I used sizeof buffer here, is because it is so useful. (Do remember that if buffer was a pointer and not an array, it would evaluate to the size of a pointer; not the amount of data available where that pointer points to. If you use pointers, you will need to track the amount of memory available there yourself, usually in a separate variable. That is just how C works.)
And also because it is important that sizeof is not a function, but an operator: it does not evaluate its argument, it only considers the size (of the type) of the argument. This means that if you do something silly like sizeof (i++), you'll find that i is not incremented, and that it yields the exact same value as sizeof i. Again, this is because sizeof is an operator, not a function, and it just returns the size of its argument.
fgets() returns a pointer to the line it stored in the buffer, or NULL if an error occurred.
This is also why I named the pointer line, and the storage array buffer. They describe my intent as a programmer. (That is very important when writing comments, by the way: do not describe what the code does, because we can read the code; but do describe your intent as to what the code should do, because only the programmer knows that, but it is important to know that intent if one tries to understand, modify, or fix the code.)
The scanf() family of functions returns the number of successful conversions. To detect input where the proper numeric value was followed by garbage, say 1.0 x, I asked sscanf() to ignore any whitespace after the number (whitespace means tabs, spaces, and newlines; '\t', '\n', '\v', '\f', '\r', and ' ' for the default C locale using ASCII character set), and try to convert a single additional character, dummy.
Now, if the line does contain anything besides whitespace after the number, sscanf() will store the first character of that anything in dummy, and return 2. However, because I only want lines that only contain the number and no dummy characters, I expect a return value of 1.
To detect the q or Q (but only as the first character on the line), we simply examine the first character in line, line[0].
If we included <string.h>, we could use e.g. if (strchr(line, 'q') || strchr(line, 'Q')) to see if there is a q or Q anywhere in the line supplied. The strchr(string, char) returns a pointer to the first occurrence of char in string, or NULL if none; and all pointers but NULL are considered logically true. (That is, we could equivalently write if (strchr(line, 'q') != NULL || strchr(line, 'Q') != NULL).)
Another function we could use declared in <string.h> is strstr(). It works like strchr(), but the second parameter is a string. For example, (strstr(line, "exit")) is only true if line has exit in it somewhere. (It could be brexit or exitology, though; it is just a simple substring search.)
In a loop, continue skips the rest of the loop body, and starts the next iteration of the loop body from the beginning.
In a loop, break skips the rest of the loop body, and continues execution after the loop.
EXIT_SUCCESS and EXIT_FAILURE are the standard exit status codes <stdlib.h> defines. Most prefer using 0 for EXIT_SUCCESS (because that is what it is in most operating systems), but I think spelling the success/failure out like that makes it easier to read the code.
I wouldn't use scanf-family functions for reading from stdin in general.
fgets is better since it takes input as a string whose length you specify, avoiding buffer overflows, which you can later parse into the desired type (if any). For the case of float values, strtof works.
However, if the specification for your deliverable or homework assignment requires the use of scanf with %f as the format specifier, what you can do is check its return value, which will contain a count of the number of format specifiers in the format string that were successfully scanned:
§ 7.21.6.2:
The [scanf] function returns the value of the macro EOF if an input failure occurs
before the first conversion (if any) has completed. Otherwise, the function returns the
number of input items assigned, which can be fewer than provided for, or even zero, in
the event of an early matching failure.
From there, you can diagnose whether the input is valid or not. Also, when scanf fails, stdin is not cleared and subsequent calls to scanf (i.e. in a loop) will continue to see whatever is in there. This question has some information about dealing with that.

How can I clear stdin in a C progam?

This is a seemingly simple question that I have not been able to answer for far too long:
I am trying to read input from a user in a C program using fgets(). However, I am running into the problem that if the user enters more characters than fgets() is set to read, the next call to read a string from the user automatically reads the remaining characters in the stdin buffer, and this is NOT behavior I want.
I have tried many ways to clear the stdin stream, and while I know something like
while(getchar()!='\n');
will work, this requires the user to hit enter an additional time which is not something I want.
The structure of the code looks something like this:
void read_string(char *s, int width){
fgets(s,width,stdin);
clear_stdin();
.
.
}
while (1){
read_string()
.
.
}
But I cannot get a clear_stdin() function that works desirably. How on earth can I clear the stdin, without having the user needlessly need to hit enter twice?
To achieve what you want — reading and ignoring extra characters up to a newline if the buffer you supplied is over-filled — you need to conditionally read up to the newline, only doing so if there isn't a newline already in the input buffer that was read.
void read_string(char *s, int width)
{
if (fgets(s, width, stdin) != 0)
{
size_t length = strlen(s);
if (length > 0 && s[length-1] != '\n')
{
int c;
while ((c = getchar()) != '\n' && c != EOF)
;
}
/* Use the input data */
}
else
/* Handle EOF or error */
}
The other part of the technique is to make sure that you use a big enough buffer that it is unlikely that anyone will overflow it. Think in terms of char buffer[4096]; as an opening bid; you can increase it if you prefer. That makes it unlikely that anyone will (be able to) type enough data on a single line to overflow the buffer you provide, thus avoiding the problem.
Another option is to use POSIX getline(). It reads a line into allocated space, allocating more space until the data fits (or you run out of memory). It has at least one other major advantage over fgets() — it reports the number of characters in the line it read, which means it is not confused by null bytes ('\0', usually typed as Control-#) in the input. By contrast, you can't tell whether there was any data entered after the first null byte with fgets(); you have to assume that the input stopped at the first null byte.
Note that the simple loop shown in the question (while (getchar() != '\n');) becomes infinite if it encounters EOF before reading a newline.
You cannot clear stdin in a portable way (because no function from <stdio.h> is specified doing that). BTW, stdin can usually be not only a terminal, but also a redirection or a pipe (or even perhaps some socket). Details matter of course (e.g. your operating system and/or running environment).
You could avoid stdio and use operating system specific ways to deal with standard input (e.g. working at the file descriptor level on POSIX systems).
On Linux (specifically) you might read more about the Tty demystified, and code low level code based on such knowledge. See termios(3). Consider using readline(3).
You could use (on Linux at least) getline(3) to read a heap-allocated line buffer.
while ((getchar()) != '\n');
This will not always work...(but on the bright side, the cases in which it doesn't are just as portable as the cases in which it does). But if stdin has not been redirected, the terminal char of the user's input, unless a manual EOF, will usually be a newline. After you extract what you expect, assuming you don't expect the \n, you can drain what's there up until(and including) the '\n', and then iterate anew. As others have suggested, there are higher level interfaces to deal with this minutia more reliably than manual fringe case handling most of the time.
More Details on Challenge and Solutions
This link contains the cardinal sin of "C\C++" in its heading, which doesn't exist as an entity. Rest assured, separate C examples are given, discrete from alternate C++ ones.

Stack around the variable was corrupted

I am having a problem with gets.
The purpose is to get input from the user until he hits the 'Enter'.
This is the code:
struct LinkedListNode* executeSection2()
{
char inputArr [3] = {'\0'};
struct LinkedListNode* newNode;
struct LinkedListNode* head = NULL;
gets (inputArr);
while (inputArr[0] != 0) // The user didn't press "Enter"
{
newNode=newLinkedListNode();
newNode->tree=newHuffmanNode(inputArr[0],atoi(inputArr+2));
head = addNode(&head, newNode);
gets (inputArr);
}
head = buildHuffmanTree(&head);
return head;
}
It seems OK, the user hits the 'Enter', the code go out from the while, but after the return, I get the error message:
Stack around the variable 'inputArr' was corrupted
I guess I dont read the input from the keyboard properly.
I will be happy for some guidness.
Thanks.
This error is a perfect illustration to the reason why gets has been deprecated: it is prone to buffer overruns, which corrupt stack or whatever memory happens to be near the end of your buffer. When the user enters more than two characters, the first three get placed into the buffer, and the rest go into whatever happens to be in the memory after it, causing undefined behavior.
You need to replace the call of gets with a call of fgets, which accepts the size of the buffer, end prevents user input from overrunning it:
fgets (inputArr, 3, stdin);
on every while iteration, the user hit enter and at the end, when he wants to stop, he hits only enter.
fgets considers '\n' part of the string, so when the user hits enter the only character in the returned string will be '\n':
while (inputArr[0] != '\n') { // The user didn't press "Enter"
...
}
The gets function in C is a classic buffer overflow function. gets is one of the functions which gives C a bad name for security. You are experiencing a buffer overflow. As long as you never intend to distribute this code, I won't object. However, you should never use gets for anything more than a toy program. The man page says as much, and informs you that no check for buffer overrun is performed. On my Mac, the man page say:
It is the caller's responsibility to ensure that the input line, if any, is sufficiently short to fit in the string.
As to why this happens, it happens because the user is inputting more data than your program can handle. Your program can handle two characters. It doesn't look like the newline character(s) should be counted. In a properly coded application, it should be impossible for user input to corrupt memory in this way.

Pthreads, fread(), and printf(): Getting random D4's in my string

The Scoop:
I am creating a method that runs through a lengthy file in chunks: using pthreads. I am calling fread() to read the file in this sort of fashion:
fread( thread_data[i].buffer, 1, 50, f )
/*
thread_data is a data structure for each thread (hence i)
buffer is in thread_data as an array of length 50
*/
I am then directly calling a print statement to see what each thread is doing, as a weird pattern was showing up in some of the parts that I was printing. Namely, my print statement would look something like this:
this is suppose to be 50 characters, but it is only a fewgD4
That D4 directly above is what I have my question on. Every thread that I make, at the end of the string, we are printing D4, and in this case, followed by a g. Other times, it is followed by a d, and most commonly a �. Now, I did read the wikipedia page on this character, which states:
replacement character used to replace an unknown or unrepresentable character
My question:
What kind of an error am I running into? Why is the end of each read statement containing unknown characters, especially the weird gD4 guy?
Aside:
I am trying to make a function in c that utilizes pthreads to find the frequency of each word in a file, in case anyone was wondering. These weird characters were showing up in my list, which is something that I find slightly unpleasent. Finally, don't bother linking me to the Obligaroty Unicode article, I am already aware of it, and the characters are not outside of what I am working with.
The strings you are printing out are not null-terminated — fread() does not null-terminate its output, it simply reads in as many raw bytes as you asked for (or fewer). So when you print out your buffer, your print function is walking past the end of the data and printing out whatever garbage memory comes after the buffer, which in your case just happens to be gD4.
You need to either explicitly null-terminate your buffer; or, if your print function supports it, tell it exactly how many characters to print. Either way, you need to save the return value from fread to know how many characters you read. For example:
int n = fread(thread_data[i].buffer, 1, 50, f);
if (n < 0) /* Handle error */ ;
// Explicitly add a null terminator -- make sure the buffer has room for it!
thread_data[i].buffer[n] = 0;

How does fgets work in this program and how does it tie into the 'stream' concept?

I am having difficulty with a feature of a segment of code that is designed to illustrate the fgets() function for input. Before I proceed, I would like to make sure that my understanding of I/O and streams is correct and that I'm not completely off base:
Input and Output in C has no specific viable function for working with strings. The one function specific for working with strings is the 'gets()' function, which will accept input beyond the limits of the char array to store the input (thus making it effectively illegal for all but backward compatibility), and create buffer overflows.
This brings up the topic of streams, which to the best of my understanding is a model to explain I/O in a program. A stream is considered 'flowing water' on which the data utilized by programs is conveyed. See links: (also as a conveyor belt)
Can you explain the concept of streams?
What is a stream?
In the C language, there are 3 predefined ANSII streams for standard input and output, and 2 additional streams if using windows or DOS which are as follows:
stdin (keyboard)
stdout (screen)
stderr (screen)
stdprn (printer)
stdaux (serial port)
As I understand, to make things manageable it is okay to think of these as rivers that exist in your operating system, and a program uses I/O functions to put data in them, take data out of them, or change the direction of where the streams are flowing (such as reading or writing a file would require). Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system. What you need to be concerned with is where the water takes your data, and that is mediated by use of specific functions (such as printf(), puts(), gets(), fgets(), etc.).
This is where my questions start to take form. Now I am interested in getting a grasp on the fgets() function and how it ties into streams. fgets() uses the 'stdin' stream (naturally) and has the built in fail safe (see below) that will not allow user input to exceed the array used to store the input. Here is the outline of the fgets() function, rather its prototype (which I don't see why one would ever need to declare it?):
char *fgets(char *str , int n , FILE *fp);
Note the three parameters that the fgets function takes:
p1 is the address of where the input is stored (a pointer, which will likely just be the name of the array you use, e.g., 'buffer')
p2 is the maximum length of characters to be input (I think this is where my question is!)
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
Now, the code I have below will allow you to type characters until your heart is content. When you hit return, the input is printed on the screen in rows of the length of the second parameter minus 1 (MAXLEN -1). When you enter a return with no other text, the program terminates.
#include <stdio.h>
#define MAXLEN 10
int main(void)
{
char buffer[MAXLEN];
puts("Enter text a line at a time: enter a blank line to exit");
while(1)
{
fgets(buffer, MAXLEN, stdin); //Read comments below. Note 'buffer' is indeed a pointer: just to array's first element.
if(buffer[0] == '\n')
{
break;
}
puts(buffer);
}
return 0;
}
Now, here are my questions:
1) Does this program allow me to input UNLIMITED characters? I fail to see the mechanism that makes fgets() safer than gets(), because my array that I am storing input in is of a limited size (256 in this case). The only thing that I see happening is my long strings of input being parsed into MAXLEN - 1 slices? What am I not seeing with fgets() that stops buffer overflow that gets() does not? I do not see in the parameters of fgets() where that fail-safe exists.
2) Why does the program print out input in rows of MAXLEN-1 instead of MAXLEN?
3) What is the significance of the second parameter of the fgets() function? When I run the program, I am able to type as many characters as I want. What is MAXLEN doing to guard against buffer overflow? From what I can guess, when the user inputs a big long string, once the user hits return, the MAXLEN chops up the string in to MAXLEN sized bites/bytes (both actually work here lol) and sends them to the array. I'm sure I'm missing something important here.
That was a mouthful, but my lack of grasp on this very important subject is making my code weak.
Question 1
You can actually type as much character as your command line tool will allow you per input. However, you call to fgets() will handle only MAXLEN in your example because you tell him to do so.
Moreover, there is no safe check inside fgets(). The second parameter you gave to fgets is the "safety" argument. Try to give to change your call to fgets to fgets(buffer, MAXLEN + 10, stdin); and then type more than MAXLEN characters. Your program will crash because you are accessing unallocated memory.
Question 2
When you make a call to fgets(), it will read MAXLEN - 1 characters because the last one is reserved to the character code \0 which usually means end of string
The second parameter of fgets() is not the number of character you want to store but the maximum capacity of your buffer. And you always have to think about string termination character \0
Question 3
If you undestood the 2 answer before, you will be able to answer to this one by yourself. Try to play with this value. And use a different value than the one used for you buffer size.
Also, you said
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
You can use fgets to read files stored on your computer. Here is an example :
char buffer[20];
FILE *stream = fopen("myfile.txt", "r"); //Open the file "myfile.txt" in readonly mode
fgets(buffer, 20, stream); //Read the 19 first characters of the file "myfile.txt"
puts(buffer);
When you call fgets(), it lets you type in as much as you want into stdin, so everything stays in stdin. It seems fgets() takes the first 9 characters, attaches a null character, and assigns it to buffer. Then puts() displays buffer then creates a newline.
The key is it's in a while loop -- the code loops again then takes what was remaining in stdin and feeds it into fgets(), which takes the next 9 characters and repeats. Stdin just still had stuff "in queue".
Input and Output in C has no specific viable function for working with strings.
There are several functions for outputting strings, such as printf and puts.
Strings can be input with fgets or scanf; however there is no standard function that both inputs and allocates memory. You need to pre-allocate some memory, and then read some characters into that memory.
Your analogy of a stream as a river is not great. Rivers flow whether or not you are taking items out of them, but streams don't. A better analogy might be a line of people at the gates to a stadium.
C also has the concept of a "line", lines are marked by having a '\n' character at the end. In my analogy let's say the newline character is represented by a short person.
When you do fgets(buf, 20, stdin) it is like "Let the next 19 people in, but if you encounter a short person during this, let him through but not anybody else". Then the fgets function creates a string out of these 0 to 19 characters, by putting the end-of-string marker on the end; and that string is placed in buf.
Note that the second argument to fgets is the buffer size , not the number of characters to read.
When you type in characters, that is like more people joining the queue.
If there were fewer than 19 people and no short people, then fgets waits for more people to arrive. In standard C there's no way to check if people are waiting without blocking to wait for them if they aren't.
By default, C streams are line buffered. In my analogy, this is like there is a "pre-checking" gate earlier on than the main gate, where all people that arrive go into a holding pen until a short person arrives; and then everyone from the holding pen plus that short person get sent onto the main gate. This can be turned off using setvbuf.
Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system.
This is something you do have to worry about. stdin etc. are already begun before you enter main(), but other streams (e.g. if you want to read from a file on your hard drive), you have to begin them.
Streams may end. When a stream is ended, fgets will return NULL. Your program must handle this. In my analogy, the gate is closed.

Resources