C confusion of printf, and gets method - c

I'm new to C and have trouble understanding a piece of code.
There are two things I don't understand. The second argument of fgets is the maximum number of bytes that can be stored in the buffer.
Why is it that if I type more letters in the terminal and hit Enter, the string is still printed back in full? I assumed that if the string typed in the console is longer than the buffer, the buffer will overflow and printf will still work because it stops at the string terminator. But then what is the point of passing a maximum length as the second argument to fgets?
#define buff_size 4
//3.71
void good_echo(){
char buf[buff_size];
while(1) {
char* p = fgets(buf, 8, stdin);
if(p == NULL) {
break;
}
printf("%s", p);
}
return;
}

fgets will stop when it reads the enter key OR when the buffer is full.
If you say the buffer is 8 characters long, and you type abcdef<enter> it will put in the buffer a, b, c, d, e, f, \n and \0 (8 characters).
If you say the buffer is 8 characters long, and you type abcdefgh<enter>, it will put in the buffer a, b, c, d, e, f, g and \0 (8 characters). There is still h<enter> left over, which will be read the next time you call fgets (or gets or getchar or scanf etc)

You still get back everything you typed because stdin is buffered. When a call reads fewer characters than you typed, the rest stay in the stream. Since you are in a while loop and fgets does not report an error (so the if statement does not break out), the next fgets call picks up the remaining characters.

Why if I type more letters in the stdin still the string is printed back. I am assuming that if the length of the string inserted in the console is larger than the buffer will overflow and the printf will work because it stops on the termination of the string,
You get the full input echoed back by the program because the fgets() and printf() calls are inside a loop. Each fgets() call will read and store as many characters as will fit in the buffer, up to and including a newline if one is in fact encountered before the available buffer space is exhausted. If there is more data than will fit in the buffer -- because you typed more characters than can fit at once, or because you typed ahead multiple lines at hyperspeed -- then whatever is not read by fgets() remains in the stream, waiting to be read via some future call to an I/O function.
In your program, the printf() echoes back the characters that were read and stored, and control then loops back to fgets(). If there are more characters available to read, then it will read them, up to, again, the buffer capacity or a newline, whichever comes first. In this way, one long line may be consumed from the standard input and echoed to the standard output over multiple iterations of the loop.
but then what it is the point of setting a max limit as second argument to fgets?
It does exactly what it is advertised to do. On any call, fgets() will write at most that many bytes into the provided buffer, counting the terminating null byte.

I believe the confusion is related to the behavior of the terminal rather than the behavior of the program. When you type a single character into the terminal, it is displayed on the screen by the terminal driver. That behavior has nothing to do with your program. Eventually, the fgets function will read some data from stdin, but that may happen many seconds or minutes (or hours, if you go on a long lunch) after you initially typed the key. Generally, the terminal driver will hold on to the data until you hit 'enter' (or 'return'), at which point a line of text will be sent to the program. It might be easier to see if you actually write some data and see what your program is doing. eg:
$ cat a.c
#include <signal.h>
#include <stdio.h>
#define buff_size 4
void
good_echo(void)
{
char buf[buff_size];
char *p;
while( (p = fgets(buf, sizeof buf, stdin)) != NULL ){
printf("good_echo read: %s\n", p);
}
}
int
main(int argc, char **argv)
{
good_echo();
return 0;
}
$ gcc a.c
$ ./a.out
this is some text that is typed
good_echo read: thi
good_echo read: s i
good_echo read: s s
good_echo read: ome
good_echo read: te
good_echo read: xt
good_echo read: tha
good_echo read: t i
good_echo read: s t
good_echo read: ype
good_echo read: d
In the above, you can see that each call to fgets consumes at most 3 bytes (buff_size minus one for the terminating null byte).

Related

Printing text too often?

I am new to C and have used Python before. This whole buffer overflow stuff is really breaking my mind.
#include <stdio.h>
int main(){
char str1[3];
while(true){
scanf("%2s", str1);
printf("test\n");
}
}
This is a little piece of code I've written to test the syntax and the stdio library. I was really surprised when the program printed "test" multiple times, depending on how many characters I entered. So for example, when I entered two characters, it printed "test" two times. Can anyone please tell me why this is happening and how I can fix it?
You can figure out what happens by modifying your code as follows:
#include <stdio.h>
int main(){
char str1[3];
while( 1 ){
scanf("%2s", str1);
printf("test: %s\n", str1);
}
}
which simply prints the contents of the str1 alongside of the "test" string.
Here is an example output for an input string of 1234567:
1234567
test: 12
test: 34
test: 56
test: 7
The scanf("%2s", str1); statement reads at most two non-whitespace characters from stdin and stores them in str1. The characters that were read are "popped" from the input stream, i.e., they are removed. If stdin happens to contain more characters, the excess ones are left untouched. Therefore, for the given input, when the first scanf returns, str1 contains 12\0 and stdin contains 34567.
Since these are in the infinite loop, the code repeats, scanf gets called again, reading the first two characters from the stdin again, only this time finds 34.
And the process repeats until there are no characters left on stdin; then scanf waits for user input, presumably as you would have expected.
Basically, scanf keeps reading instead of waiting for user input, since the stdin already contains something to read.
So for example, when I entered two characters, it printed "test" two times.
This on the other hand, does not make sense, as it should be printing "test" for N/2 times, rounded up, where N is the number of characters you enter.
There is not much that I can suggest for "fixing this", since it is not really clear what you are expecting. But if you want to get rid of the remaining characters in the stdin, you can check this.
You need to clear your input buffer as per this answer
Otherwise, you'll read from the stdin, print it, jump back to the loop head and continue reading, if there is still something in the buffer.
Each time through the loop, scanf("%2s", str1) reads at most 2 non-whitespace characters from the input stream. If there are more than 2 non-whitespace characters available in the stream, the loop will continuously call scanf (and printf) until scanf blocks waiting for data. If the input stream contains ffff\n and has not yet been closed (eg, a user is entering data interactively from a tty), the first 2 calls to scanf will immediately return and printf will be called twice. The 3rd call to scanf will block until more data is available, or the stream is closed, or there is an error.

using fgets() with stdin as input: ^D fails to signal EOF

I'm writing a program in C on my MacBook which uses Mojave and I'm trying to use fgets() to get a string from stdin.
My code compiles - the only issue is that when I run the program in the terminal, after fgets() is called and I type in the desired input, I can't figure out how to signal the end of the input so that the program can continue running.
I recognise many people have had this issue and that there are many pages on this site addressing it. But none of the solutions (that I have understood) have worked for me. I've read this and this but these aren't helping.
I've checked out the documentation for fgets() which says:
"fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an *EOF* or a newline. If a newline is read, it is stored into the buffer. A terminating null byte (\0) is stored after the last character in the buffer." - from this page.
Entering 'stty all' in the terminal shows that EOF indeed corresponds to ^D. I've tried entering ^D twice, three times, pressing Enter then ^D, ^D then Enter, etc. etc. Nothing seems to work.
What am I doing wrong? Here's the relevant bit of the code (originally from here, under the 'Pointers to Structures Containing Pointers' section):
#include <stdio.h>
typedef struct
{
char name[21];
char city[21];
char phone[21];
char *comment;
} Address;
int main(void)
{
Address s;
char comm[100];
fgets(s.name, 20, stdin);
fgets(s.city, 20, stdin);
fgets(s.phone, 20, stdin);
fgets(comm, 100, stdin);
return 0;
}
You do not test the return value of fgets(): if you indeed signal an end of file from the terminal, the subsequent calls to fgets() will return NULL and the destination arrays will be left uninitialized.
Nothing in your code needs an end of file in order to proceed. Just hit Enter after each piece of input. Why do you think you need to signal end of file?

About the mechanism of using fgets() and stdin together

I would like to have a better understanding of using fgets() and stdin.
The following is my code:
int main()
{
char inputBuff[6];
while(fgets(inputBuff, 6, stdin))
{
printf("%s", inputBuff);
}
return 0;
}
Let's say my input is aaaabbbb and I press Enter. By using a loopcount, I understand that actually the loop will run twice (including the one I input aaaabbbb) before my next input.
Loop 1: After I have typed in the characters, aaaabbbb\n will be stored in the buffer of stdin file stream. And fgets() is going to retrieve a specific number of data from the file stream and put them in inputBuff. In this case, it will retrieve 5 (6 - 1) characters at a time. So that when fgets() has already run once, inputBuff will store aaaab, and then be printed.
Loop 2: Then, since bbb\n are left in the file stream, fgets() will execute for the second time so that inputBuff contains bbb\n, and then be printed.
Loop 3: The program will ask for my input (the 2nd time) as the file stream has reached the end (EOF).
Question: It seems that fgets() will only ask for my keyboard input after stdin stream has no data left in buffer (EOF). I am just wondering why couldn't I use keyboard to input anything in loop 2, and fgets() just keep on retrieving 5 characters from stdin stream and left the excess data in the file stream for next time retrieval. Do I have any misunderstanding about stdin or fgets()? Thank you for your time!
The behavior of your program is somewhat more subtle than you expect:
fgets(inputBuff, 6, stdin) reads at most 5 bytes from stdin and stops reading when it gets a newline character, which is stored into the destination array.
Hence, as you correctly diagnose, the first call reads the 5 bytes aaaab and prints them, the second call reads the 4 bytes bbb\n and prints them, and the third call finds an empty input stream and waits for user input.
The tricky part is how stdin gets input from the user, also known as console input.
Both console input and stdin are usually line buffered by default, so you can type a complete line of input regardless of the size of the buffer passed to fgets(). If you set stdin to unbuffered and the console to raw (uncooked) mode, the first fgets() would indeed read the first 5 bytes as soon as you type them.
Console input is an intricate subject. Here is an in depth article about its inner workings: https://www.linusakesson.net/programming/tty/
Everything you are asking about is in the manual page for fgets(); you just need to read it carefully. It says:
char *fgets(char *s, int size, FILE *stream);
fgets() reads in at most one less than size characters
from stream and stores them into the buffer pointed to by s. Reading
stops after an EOF or a newline. If a newline is read, it is
stored into the buffer. A terminating null byte ('\0') is stored
after the last character in the buffer.
If the input is aaaabbbb and you pass 6 as the second argument of fgets(), it reads one less than that, i.e. 5 characters, and appends the terminating \0. So the first time inputBuff holds aaaab, and since neither EOF nor \n has been reached yet, the next call leaves bbb\n in inputBuff, because the newline is stored as well.
You should also check the return value of fgets(), and you can break out of the loop when a line starts with \n (an empty line). For example:
char *ptr = NULL;
while( (ptr = fgets(inputBuff, 6, stdin))!= NULL){
if(*ptr == '\n')
break;
printf("%s", inputBuff);
}
fgets() only reads until '\n', EOF, or until the buffer is full. Everything after that is left in stdin and will therefore be read when you call fgets() again. You can, however, remove the excess chars from stdin, for example by calling getc() until you reach '\n' or EOF. You might want to look at the manpages for that.

How is this buffer really working?

As a Linux system programming exercise I've written my own version of the tree command, which is to read from stdin and write to stdout using only the basic read() and write() C library functions. I've done it so that when an asterisk (*) is entered, the program is terminated. I have managed to get it to work properly, my problem is that I don't really understand why it works the way it does. What confuses me is the buffer. First of all, here is the code portion in question:
char buf[1];
...
do {
read(STDIN_FILENO, buf, 1);
if( buf[0] == '*') break;
write(STDOUT_FILENO, buf, 1);
} while( buf[0] != '*');
...
My idea was to read from stdin char by char, thereby storing the char in buf, check if it was an asterisk, then write the char from buf to stdout.
The behaviour is the following: I type a string of any number of chars, press ENTER, that string gets output to stdout, at which point I can type a new char string. If the string ends with an asterisk, the string is output up until the asterisk, then the program is terminated.
My problems are:
1) buf is supposed to contain only one char. How is it possible that I enter any number of chars and upon pressing ENTER all of them are output to stdout? I would expect one char at a time to be output, or only the last one. How does a one-char buffer store all of those chars? Or do many one-char buffers get created? By whom?
2) What is so special about the newline character that prompts the string to be output? Why is it not just another char within the string? Is it just a matter of definition within the function read()?
Thank you for any help in understanding the working of the buffer!
This is based upon the way the IO calls - read and write will work on most OS's.
You are reading only 1 byte, so while you are typing, stuff will be held by an io buffer (not yours), until your loop reads it. Since you have no sleeps, it will be reading, or waiting to read faster than you can humanly type.
Also as R Sahu suggests - the input buffer may not be presented to your program until you press enter on the console you are typing at. This depends on the console and its config - but most will buffer lines and wait for enter too. This would be different if you were piping into stdin.
The last parameter to read, the '1', is what instructs it to read one byte here.
The second part: write() itself does no user-space buffering, so your output is not being held back until a newline. The characters show up only after Enter because that is when the terminal hands the line to read(); the loop then copies it out one byte at a time. If you used stdio functions such as putchar() instead, stdout to a terminal would typically be line buffered, and an fflush(stdout) call after each character would force it out immediately.
When you type your input at a console, the characters are not immediately fed to stdin. Only after you press Enter is the entire line you typed, including the newline character, fed to stdin by the runtime environment.

how to read scanf with spaces

I'm having a weird problem.
I'm trying to read a string from the console with scanf()
like this:
scanf("%[^\n]",string1);
but it doesn't read anything. It just skips the entire scanf.
I'm compiling with gcc.
Trying to use scanf to read strings with spaces can bring unnecessary problems of buffer overflow and stray newlines staying in the input buffer to be read later. gets() is often suggested as a solution to this, however,
From the manpage:
Never use gets(). Because it is
impossible to tell without knowing the
data in advance how many characters
gets() will read, and because gets()
will continue to store characters past
the end of the buffer, it is extremely
dangerous to use. It has been used to
break computer security. Use fgets()
instead.
So instead of using gets, use fgets with the STDIN stream to read strings from the keyboard
That should work fine, so something else is going wrong. As hobbs suggests, you might have a newline on the input, in which case this won't match anything. It also won't consume a newline, so if you do this in a loop, the first call will get up to the newline and then the next call will get nothing. If you want to read the newline, you need another call, or use a space in the format string to skip whitespace. It's also a good idea to check the return value of scanf to see if it actually matched any format specifiers.
Also, you probably want to specify a maximum length in order to avoid overflowing the buffer. So you want something like:
char buffer[100];
if (scanf(" %99[^\n]", buffer) == 1) {
/* read something into buffer */
}
This will skip (ignore) any blank lines and whitespace on the beginning of a line and read up to 99 characters of input up to and not including a newline. Trailing or embedded whitespace will not be skipped, only leading whitespace.
I'll bet your scanf call is inside a loop. I'll bet it works the first time you call it. I'll bet it only fails on the second and later times.
The first time, it will read until it reaches a newline character. The newline character will remain unread. (Odds are that the library internally does read it and calls ungetc to unread it, but that doesn't matter, because from your program's point of view the newline is unread.)
The second time, it will read until it reaches a newline character. That newline character is still waiting at the front of the line and scanf will read all 0 of the characters that are waiting ahead of it.
The third time ... the same.
You probably want this:
if (scanf("%99[^\n]%*c", buffer) == 1) {
Edit: I accidentally copied and pasted from another answer instead of from the question, before inserting the %*c as intended. This resulting line of code will behave strangely if you have a line of input longer than 100 bytes, because the %*c will eat an ordinary byte instead of the newline.
However, notice how dangerous it would be to do this:
scanf("%[^\n]%*c", string1);
because there, if you have a line of input longer than your buffer, the input will walk all over your other variables and stack and everything. This is called buffer overflow (even if the overflow goes onto the stack).
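#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *text(int n);
int main(void)
{
char str[10];
char *fmt = text(9); /* builds the format "%9[^\n]" */
if (fmt == NULL) return 1;
printf("enter username : ");
fflush(stdout); /* make sure the prompt appears before reading */
if (scanf(fmt, str) == 1)
printf("username = %s",str);
free(fmt);
return 0;
}
/* Builds a scanf format that reads up to n characters of a line.
The caller frees the returned string. */
char *text(int n)
{
char str[50];
// n == -1 no buffer protection
if(n != -1) snprintf(str, sizeof str, "%%%d[^\n]", n);
else strcpy(str, "%[^\n]");
/* snprintf replaces the nonstandard itoa; fflush(stdin) was removed
because its behaviour is undefined */
return strdup(str);
}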
