Is there any way to peek at the stdin buffer?

We know that stdin is, by default, a buffered input; the proof of that is in usage of any of the mechanisms that "leave data" on stdin, such as scanf():
#include <stdio.h>

int main(void)
{
    char c[10] = {'\0'};
    scanf("%9s", c);
    printf("%s, and left is: %d\n", c, getchar());
    return 0;
}
./a.out
hello
hello, and left is: 10
10 being newline of course...
I've always been curious, is there any way to "peek" at the stdin buffer without removing whatever may reside there?
EDIT
A better example might be:
scanf("%9[^.]", c);
With an input of "at.ct", now I have "data" (ct\n) left on stdin, not just a newline.

Portably, you can get the next character in the input stream with getchar() and then push it back with ungetc(), which results in a state as if the character wasn't removed from the stream.
The ungetc function pushes the character specified by c (converted to an unsigned char) back onto the input stream pointed to by stream. Pushed-back characters will be returned by subsequent reads on that stream in the reverse order of their pushing.
Only one character of pushback is guaranteed by the standard, but usually, you can push back more.
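A minimal sketch of the one-character peek described above. The helper name peek_char is hypothetical (it is not a standard function); it uses only getc and ungetc and relies on the single character of pushback that the standard guarantees.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character of fp without consuming it,
   or EOF if no character is available. */
int peek_char(FILE *fp)
{
    int c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);   /* push it back so the next read sees it again */
    return c;
}
```

Called as peek_char(stdin) right after the scanf in the question, this would be expected to report the pending newline while leaving it in the stream for the next read.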
As mentioned in the other answers and their comments, in practice you can almost certainly peek at the buffer if you provide your own buffer with setvbuf, although that is not without problems:
If buf is not a null pointer, the array it points to may be used instead of a buffer allocated by the setvbuf function
that leaves the possibility that the provided buffer may not be used at all.
The contents of the array at any time are indeterminate.
that means you have no guarantee that the contents of the buffer reflects the actual input (and it makes using the buffer undefined behaviour if it has automatic storage duration, if we're picky).
However, in practice the principal problem would be finding out where in the buffer the not-yet-consumed part of the buffered input begins and where it ends.

If you want to look at the stdin buffer without changing it, you could tell it to use another buffer with setbuf, using an array you can access:
char buffer[BUFSIZ];
setbuf(stdin, buffer); // setbuf returns no value, so there is nothing to check
getchar();
printf("%.15s\n", buffer); // the precision caps the output at 15 characters
This lets you see somewhat more than ungetc does, but I don't think you can go further in a portable way.
Strictly speaking, this is legal but not guaranteed by the standard; quoting its description of setvbuf (setbuf behaves the same way):
The contents of the array at any time are indeterminate.
So this is not what you need if you're looking for complete portability and standard-compliance, but I can't imagine why the buffer should not contain what is expected. However, it seems to work on my computer.
Beware that you have to provide an array of at least BUFSIZ characters to setbuf, and you must not do any I/O operation on the stream before it. If you need more flexibility, take a look at setvbuf.

You could set your own buffer with setvbuf on stdin, and peek there whenever you want.

Related

Does the C standard guarantee buffers are not touched past their null terminator?

In the various cases that a buffer is provided to the standard library's many string functions, is it guaranteed that the buffer will not be modified beyond the null terminator? For example:
char buffer[17] = "abcdefghijklmnop";
sscanf("123", "%16s", buffer);
Is buffer now required to equal "123\0efghijklmnop"?
Another example:
char buffer[10];
fgets(buffer, 10, fp);
If the read line is only 3 characters long, can one be certain that the 6th character is the same as before fgets was called?

The C99 draft standard does not explicitly state what should happen in those cases, but by considering multiple variations, you can show that it must work a certain way so that it meets the specification in all cases.
The standard says:
%s - Matches a sequence of non-white-space characters.252)
If no l length modifier is present, the corresponding argument shall be a
pointer to the initial element of a character array large enough to accept the
sequence and a terminating null character, which will be added automatically.
Here's a pair of examples that show it must work the way you are proposing to meet the standard.
Example A:
char buffer[4] = "abcd";
char buffer2[10]; // Note that this could be placed at what would be buffer+4
sscanf("123 4", "%s %s", buffer, buffer2);
// Result is buffer = "123\0"
// buffer2 = "4\0"
Example B:
char buffer[17] = "abcdefghijklmnop";
char* buffer2 = &buffer[4];
sscanf("123 4", "%s %s", buffer, buffer2);
// Result is buffer = "123\04\0"
Note that the interface of sscanf doesn't provide enough information to really know that these were different. So, if Example B is to work properly, it must not mess with the bytes after the null character in Example A. This is because it must work in both cases according to this bit of spec.
So implicitly it must work as you stated due to the spec.
Similar arguments can be placed for other functions, but I think you can see the idea from this example.
NOTE:
Providing size limits in the format, such as "%16s", could change the behavior. By the specification, it would be functionally acceptable for sscanf to zero out a buffer to its limits before writing the data into the buffer. In practice, most implementations opt for performance, which means they leave the remainder alone.
When the intent of the specification is to do this sort of zeroing out, it is usually explicitly specified. strncpy is an example. If the length of the string is less than the maximum buffer length specified, it will fill the rest of the space with null characters. The fact that this same "string" function could return a non-terminated string as well makes this one of the most common functions for people to roll their own version.
As far as fgets, a similar situation could arise. The only gotcha is that the specification explicitly states that if nothing is read in, the buffer remains untouched. An acceptable functional implementation could sidestep this by checking to see if there is at least one byte to read before zeroing out the buffer.

Each individual byte in the buffer is an object. Unless some part of the function description of sscanf or fgets mentions modifying those bytes, or even implies their values may change e.g. by stating their values become unspecified, then the general rule applies: (emphasis mine)
6.2.4 Storage durations of objects
2 [...] An object exists, has a constant address, and retains its last-stored value throughout its lifetime. [...]
It's this same principle that guarantees that
#include <stdio.h>
int a = 1;
int main() {
printf ("%d\n", a);
printf ("%d\n", a);
}
attempts to print 1 twice. Even though a is global, printf can access global variables, and the description of printf doesn't mention not modifying a.
Neither the description of fgets nor that of sscanf mentions modifying buffers past the bytes that actually were supposed to be written (except in the case of a read error), so those bytes don't get modified.
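The claim can be checked empirically with the question's own example. This is only a sketch: the function name tail_preserved is made up for illustration, and a passing result shows what one implementation does, not what the standard requires.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the bytes after the terminator written by sscanf still hold
   their original values, 0 otherwise. */
int tail_preserved(void)
{
    char buffer[17] = "abcdefghijklmnop";
    const char expected[17] = "123\0efghijklmnop";

    if (sscanf("123", "%16s", buffer) != 1)
        return 0;
    /* Compare all 17 bytes, including those past the new terminator. */
    return memcmp(buffer, expected, sizeof buffer) == 0;
}
```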

The standard is somewhat ambiguous on this, but I think a reasonable reading of it is that the answer is: yes, it's not allowed to write more bytes to the buffer than it read+null. On the other hand, a stricter reading/interpretation of the text could conclude that the answer is no, there's no guarantee. Here's what a publicly available draft says about fgets.
char *fgets(char * restrict s, int n, FILE * restrict stream);
The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.
The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.
There's a guarantee about how much it is supposed to read from the input, i.e. stop reading at newline or EOF and not read more than n-1 bytes. Although nothing is said explicitly about how much it's allowed to write to the buffer, the common knowledge is that fgets's n parameter is used to prevent buffer overflows. It's a little strange that the standard uses the ambiguous term read, which may not necessarily imply that fgets can't write to the buffer more than n bytes, if you want to nitpick on the terminology it uses. But note that the same "read" terminology is used about both issues: the n-limit and the EOF/newline limit. So if you interpret the n-related "read" as a buffer-write limit, then [for consistency] you can/should interpret the other "read" the same way, i.e. not write more than what it read when the string is shorter than the buffer.
On the other hand, if you distinguish between the uses of the phrasal verb "read into" (="write") and just "read", then you can't read the committee's text the same way. You are guaranteed that it won't "read into" (="write to") the array more than n bytes, but if the input string is terminated sooner by newline or EOF you're only guaranteed the rest (of the input) won't be "read", but whether that implies it won't be "read into" (="written to") the buffer is unclear under this stricter reading. The crucial keyword is "into", which is elided, so the problem is whether the completion given by me in brackets in the following modified quote is the intended interpretation:
No additional characters are read [into the array] after a new-line character (which is retained) or after end-of-file.
Frankly a single postcondition stated as a formula (and would be pretty short in this case) would have been a lot more helpful than the verbiage I quoted...
I can't be bothered to try and analyze their writeup about the *scanf family, because I suspect it's going to be even more complicated given all the other things that happen in those functions; their writeup for fscanf is about five pages long... But I suspect a similar logic applies.

is it guaranteed that the buffer will not be modified beyond the null
terminator?
No, there's no guarantee.
Is buffer now required to equal "123\0efghijklmnop"?
Yes. But that's only because you've used correct parameters to your string-related functions. Should you mess up the buffer length, the input modifiers to sscanf and such, your program will still compile, but it will most likely fail at runtime.
If the read line is only 3 characters long, can one be certain that the 6th character is the same as before fgets was called?
Yes. Once fgets() figures out you have a 3-character input string, it stores the input in the provided buffer, and it doesn't care about the rest of the provided space at all.

Is buffer now required to equal "123\0efghijklmnop"?
Here buffer holds just the string 123, guaranteed to be terminated by NUL.
The memory allocated for the array buffer does not get deallocated; the "%16s" only ensures that at most 16 characters can be read into it at any one time. How much is actually written, a single char or the maximum the buffer can take, depends on the input.
For example:
char buffer[4096] = "abc";
actually does something like the following:
memcpy(buffer, "abc", sizeof("abc"));
memset(&buffer[sizeof("abc")], 0, sizeof(buffer)-sizeof("abc"));
The standard requires that if any part of a char array is explicitly initialized, the remaining elements are initialized to zero.

The standard gives no such guarantees, which is why the functions sscanf and fgets are recommended to be used (with respect to the size of the buffer) as you show in your question (and fgets is considered preferable to gets).
However, some standard functions use null-terminator in their work, e.g. strlen (but I suppose you ask about string modification)
EDIT:
In your example
fgets(buffer, 10, fp);
characters after the 10th are guaranteed to be untouched (the content and length of buffer are not considered by fgets)
EDIT2:
Moreover, when using fgets keep in mind that '\n' will be stored in the buffer, e.g.
"123\n\0fghijklmnop"
instead of expected
"123\0efghijklmnop"

Depends on the function in use (and to a lesser degree its implementation). sscanf will start writing when it encounters its first non-whitespace character, and continue writing until its first whitespace character, where it will add a finishing 0 and return. But a function like strncpy (famously) zeroes out the rest of the buffer.
The C standard does specify the padding done by strncpy, but it does not spell out explicitly whether functions such as sscanf may touch the bytes after the terminator.
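A minimal sketch that makes the strncpy padding mentioned above visible. The names count_padding and DEMO_SIZE are invented for this illustration.

```c
#include <stddef.h>
#include <string.h>

enum { DEMO_SIZE = 8 };

/* Copies src with strncpy into a DEMO_SIZE-byte array pre-filled with 'x'
   and returns how many bytes of that array ended up as '\0'. */
size_t count_padding(const char *src)
{
    char dst[DEMO_SIZE];
    size_t zeros = 0;

    memset(dst, 'x', sizeof dst);
    strncpy(dst, src, sizeof dst);   /* pads with '\0' when src is shorter */
    for (size_t i = 0; i < sizeof dst; i++) {
        if (dst[i] == '\0')
            zeros++;
    }
    return zeros;
}
```

With a 2-character source the remaining 6 bytes are zeroed; with a source of 8 or more characters no zero byte is written at all, which is the non-terminated case that makes strncpy awkward to use.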

About gets() in C

I am writing a C program, which has a 5-element array to store a string. And I am using gets() to get input. When I typed in more than 5 characters and then output the string, it just gave me all the characters I typed in. I know the string is terminated by a \0 so even I exceeded my array, it will still output the whole thing.
But what I am curious is where exactly gets() stores input, either buffer or just directly goes to my array?
What if I type in a long long string, will gets() try to store characters in the memories that should not be touched? Would it gives me a segment fault?

That's why gets is evil. It does not check array bounds and often invokes undefined behavior. Never use gets; use fgets instead.
By the way, gets is no longer part of C. It was removed in the C11 standard in favor of a new safe alternative, gets_s [1] (see the wiki). So it is better to forget about gets.
[1] C11: K.3.5.4.1 The gets_s function
Synopsis
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
char *gets_s(char *s, rsize_t n);

gets() will store the characters in the 5-element buffer. If you type in more than 4 characters, the terminating null character (and any excess characters) will be written past the end of the array, which is undefined behavior, so string operations in your program may misbehave.

excerpt from man page on Ubuntu Linux
gets() reads a line from stdin into the buffer pointed to by s until
either a terminating newline or EOF, which it replaces with a null byte
('\0'). No check for buffer overrun is performed
The string is stored in the buffer and if it is too long it is stored in contiguous memory after the buffer. This can lead to unintended writing over of data or a SEGV fault or other problems. It is a security issue as it can be used to inject code into programs.

gets() stores the characters you type directly into your array and you can safely use/modify them. But indeed, as haccks and unxnut correctly state, gets doesn't care about the size of the array you give it to store its chars in, and when you type more characters than the array has space for you might eventually get a segmentation fault or some other weird results.
Just for the sake of completeness, gets() reads from a buffered stream called stdin, which contains the chars you typed. More specifically, it takes the chars until it reaches a newline. That newline is discarded rather than stored, and a '\0' terminator is placed after the last char. You should, as haccks says, use fgets, which is very similar except that it keeps the newline:
char buf[100]; // the input buffer
fgets(buf, 100, stdin); // reads until it finds a newline (your enter) but never
// more than 99 chars, using the last char for the '\0'
// you can now use and modify buf
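Because fgets keeps the newline, a common follow-up step is to strip it. The sketch below wraps both steps in a helper; the name read_line is hypothetical, and strcspn is used only to locate the newline, if there is one.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf (capacity size),
   remove the trailing newline if present, and return 1 on success or
   0 on end-of-file or error. */
int read_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline that fgets keeps */
    return 1;
}
```

A call such as read_line(buf, sizeof buf, stdin) would then leave buf holding the typed text without the newline.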

How does fgets work in this program and how does it tie into the 'stream' concept?

I am having difficulty with a feature of a segment of code that is designed to illustrate the fgets() function for input. Before I proceed, I would like to make sure that my understanding of I/O and streams is correct and that I'm not completely off base:
Input and Output in C has no specific viable function for working with strings. The one function specific for working with strings is the 'gets()' function, which will accept input beyond the limits of the char array to store the input (thus making it effectively illegal for all but backward compatibility), and create buffer overflows.
This brings up the topic of streams, which to the best of my understanding is a model to explain I/O in a program. A stream is considered 'flowing water' on which the data utilized by programs is conveyed. See links: (also as a conveyor belt)
Can you explain the concept of streams?
What is a stream?
In the C language, there are 3 predefined ANSI streams for standard input and output, and 2 additional streams if using Windows or DOS, which are as follows:
stdin (keyboard)
stdout (screen)
stderr (screen)
stdprn (printer)
stdaux (serial port)
As I understand, to make things manageable it is okay to think of these as rivers that exist in your operating system, and a program uses I/O functions to put data in them, take data out of them, or change the direction of where the streams are flowing (such as reading or writing a file would require). Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system. What you need to be concerned with is where the water takes your data, and that is mediated by use of specific functions (such as printf(), puts(), gets(), fgets(), etc.).
This is where my questions start to take form. Now I am interested in getting a grasp on the fgets() function and how it ties into streams. fgets() uses the 'stdin' stream (naturally) and has the built in fail safe (see below) that will not allow user input to exceed the array used to store the input. Here is the outline of the fgets() function, rather its prototype (which I don't see why one would ever need to declare it?):
char *fgets(char *str , int n , FILE *fp);
Note the three parameters that the fgets function takes:
p1 is the address of where the input is stored (a pointer, which will likely just be the name of the array you use, e.g., 'buffer')
p2 is the maximum length of characters to be input (I think this is where my question is!)
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
Now, the code I have below will allow you to type characters until your heart is content. When you hit return, the input is printed on the screen in rows of the length of the second parameter minus 1 (MAXLEN -1). When you enter a return with no other text, the program terminates.
#include <stdio.h>
#define MAXLEN 10
int main(void)
{
char buffer[MAXLEN];
puts("Enter text a line at a time: enter a blank line to exit");
while(1)
{
fgets(buffer, MAXLEN, stdin); //Read comments below. Note 'buffer' is indeed a pointer: just to array's first element.
if(buffer[0] == '\n')
{
break;
}
puts(buffer);
}
return 0;
}
Now, here are my questions:
1) Does this program allow me to input UNLIMITED characters? I fail to see the mechanism that makes fgets() safer than gets(), because my array that I am storing input in is of a limited size (10 in this case). The only thing that I see happening is my long strings of input being parsed into MAXLEN - 1 slices? What am I not seeing with fgets() that stops buffer overflow that gets() does not? I do not see in the parameters of fgets() where that fail-safe exists.
2) Why does the program print out input in rows of MAXLEN-1 instead of MAXLEN?
3) What is the significance of the second parameter of the fgets() function? When I run the program, I am able to type as many characters as I want. What is MAXLEN doing to guard against buffer overflow? From what I can guess, when the user inputs a big long string, once the user hits return, the MAXLEN chops up the string in to MAXLEN sized bites/bytes (both actually work here lol) and sends them to the array. I'm sure I'm missing something important here.
That was a mouthful, but my lack of grasp on this very important subject is making my code weak.

Question 1
You can actually type as many characters as your command-line tool will allow per input. However, your call to fgets() will handle at most MAXLEN - 1 of them at a time in your example, because you told it to.
Moreover, there is no safety check inside fgets() beyond that. The second parameter you give to fgets is the "safety" argument. Try changing your call to fgets(buffer, MAXLEN + 10, stdin); and then type more than MAXLEN characters. Your program may crash because fgets writes past the end of the array, which is undefined behavior.
Question 2
When you call fgets(), it reads at most MAXLEN - 1 characters because the last position is reserved for the character code \0, which marks the end of the string.
The second parameter of fgets() is not the number of characters you want to store but the capacity of your buffer. And you always have to account for the string termination character \0.
Question 3
If you understood the two answers above, you will be able to answer this one yourself. Try playing with this value, and use a value different from the one used for your buffer size.
Also, you said
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
You can use fgets to read files stored on your computer. Here is an example :
char buffer[20];
FILE *stream = fopen("myfile.txt", "r"); // Open the file "myfile.txt" in read-only mode
if (stream != NULL && fgets(buffer, sizeof buffer, stream) != NULL) // Read at most the first 19 characters of the first line
    puts(buffer);
if (stream != NULL)
    fclose(stream);

When you call fgets(), it lets you type in as much as you want into stdin, so everything stays in stdin. It seems fgets() takes the first 9 characters, attaches a null character, and assigns it to buffer. Then puts() displays buffer then creates a newline.
The key is it's in a while loop -- the code loops again then takes what was remaining in stdin and feeds it into fgets(), which takes the next 9 characters and repeats. Stdin just still had stuff "in queue".
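The chunking described above can be made explicit by counting the fgets calls one line takes. This is a sketch: the function name chunks_for_line is made up, and MAXLEN is 10 as in the question.

```c
#include <stdio.h>
#include <string.h>

#define MAXLEN 10

/* Counts how many fgets calls with a MAXLEN-byte buffer are needed to
   consume one line from fp (including its newline, if any).
   Returns 0 if nothing could be read. */
int chunks_for_line(FILE *fp)
{
    char buffer[MAXLEN];
    int chunks = 0;

    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        chunks++;
        if (strchr(buffer, '\n') != NULL)   /* reached the end of the line */
            break;
    }
    return chunks;
}
```

A 25-character line followed by a newline takes three calls: two chunks of 9 characters, then the remaining 7 characters plus the newline.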

Input and Output in C has no specific viable function for working with strings.
There are several functions for outputting strings, such as printf and puts.
Strings can be input with fgets or scanf; however there is no standard function that both inputs and allocates memory. You need to pre-allocate some memory, and then read some characters into that memory.
Your analogy of a stream as a river is not great. Rivers flow whether or not you are taking items out of them, but streams don't. A better analogy might be a line of people at the gates to a stadium.
C also has the concept of a "line", lines are marked by having a '\n' character at the end. In my analogy let's say the newline character is represented by a short person.
When you do fgets(buf, 20, stdin) it is like "Let the next 19 people in, but if you encounter a short person during this, let him through but not anybody else". Then the fgets function creates a string out of these 0 to 19 characters, by putting the end-of-string marker on the end; and that string is placed in buf.
Note that the second argument to fgets is the buffer size , not the number of characters to read.
When you type in characters, that is like more people joining the queue.
If there were fewer than 19 people and no short people, then fgets waits for more people to arrive. In standard C there's no way to check if people are waiting without blocking to wait for them if they aren't.
By default, stdin is typically line buffered when it is connected to a terminal. In my analogy, this is like there is a "pre-checking" gate earlier on than the main gate, where all people that arrive go into a holding pen until a short person arrives; and then everyone from the holding pen plus that short person get sent on to the main gate. This can be changed using setvbuf.
Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system.
This is something you do have to worry about. stdin etc. are already begun before you enter main(), but other streams (e.g. if you want to read from a file on your hard drive), you have to begin them.
Streams may end. When a stream is ended, fgets will return NULL. Your program must handle this. In my analogy, the gate is closed.
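A sketch of handling the end of a stream: the loop stops when fgets returns NULL, and ferror is then used to tell a read error apart from a normal end-of-file. The function name count_newlines and the 64-byte buffer size are arbitrary choices for this example.

```c
#include <stdio.h>

/* Reads fp to its end with fgets and returns the number of newline
   characters seen, or -1 if the stream reported a read error. */
long count_newlines(FILE *fp)
{
    char buf[64];
    long count = 0;

    while (fgets(buf, sizeof buf, fp) != NULL) {   /* NULL: end-of-file or error */
        for (const char *p = buf; *p != '\0'; p++) {
            if (*p == '\n')
                count++;
        }
    }
    return ferror(fp) ? -1 : count;
}
```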

need for the last '\0' in fgets

I've seen several usage of fgets (for example, here) that go like this:
char buff[7]="";
(...)
fgets(buff, sizeof(buff), stdin);
The interest being that, if I supply a long input like "aaaaaaaaaaa", fgets will truncate it to "aaaaaa" here, because the 7th character will be used to store '\0'.
However, when doing this:
int i=0;
for (i=0;i<7;i++)
{
buff[i]='a';
}
printf("%s\n",buff);
I will always get 7 'a's, and the program will not crash. But if I try to write 8 'a's, it will.
As I saw it later, the reason for this is that, at least on my system, when I allocate char buff[7] (with or without =""), the 8th byte (counting from 1, not from 0) gets set to 0. From what I guess, things are done like this precisely so that a for loop with 7 writes, followed by a string formatted read, could succeed, whether the last character to be written was '\0' or not, and thus avoiding the need for the programmer to set the last '\0' himself, when writing chars individually.
From this, it follows that in the case of
fgets(buff, sizeof(buff), stdin);
and then providing a too long input, the resulting buffstring will automatically have two '\0' characters, one inside the array, and one right after it that was written by the system.
I have also observed that doing
fgets(buff,(sizeof(buff)+17),stdin);
will still work, and output a very long string, without crashing. From what I guessed, this is because fgets will keep writing until sizeof(buff)+17, and the last char to be written will precisely be a '\0', ensuring that any forthcoming string reading process would terminate properly (although the memory is messed up anyway).
But then, what about fgets(buff, (sizeof(buff)+1),stdin);? this would use up all the space that was rightfully allocated in buff, and then write a '\0' right after it, thus overwriting...the '\0' previously written by the system. In other words, yes, fgets would go out of bounds, but it can be proven that when adding only one to the length of the write, the program will never crash.
So in the end, here comes the question: why does fgets always terminate its write with a '\0', when another '\0', placed by the system right after the array, already exists? Why not do as in the one-by-one for-loop write, which can access the whole of the array and write anything the programmer wants, without endangering anything?
Thank you very much for your answer!
EDIT: indeed, there is no proof possible, as long as I do not know whether this 8th '\0' that mysteriously appears upon allocation of buff[7], is part of the C standard or not, specifically for string arrays. If not, then...it's just luck that it works :-)

but it can be proven that when adding only one to the length of the write, the program will never crash.
No! You can't prove that! Not in the sense of a mathematical proof. You have only shown that on your system, with your compiler, with those particular compiler settings you used, with particular environment configuration, it might not crash. This is far from a mathematical proof!
In fact, although the C standard guarantees that you can take the address of "one place after the last element of an array", it also states that dereferencing that address (i.e. trying to read or write through it) is undefined behaviour.
That means that an implementation can do anything in this case. It can even do what you expect with naive reasoning (i.e. work, but by sheer luck), but it may also crash or format your HD (if you are very, very unlucky). This is especially true when writing system software (e.g. a device driver or a program running on the bare metal), i.e. when there is no OS to shield you from the nastiest consequences of writing bad code!
Edit This should answer the question made in a comment (C99 draft standard):
7.19.7.2 The fgets function
Synopsis
#include <stdio.h>
char *fgets(char * restrict s, int n,
FILE * restrict stream);
Description
The fgets function reads at most one less than the number of characters specified by n
from the stream pointed to by stream into the array pointed to by s. No additional
characters are read after a new-line character (which is retained) or after end-of-file. A
null character is written immediately after the last character read into the array.
Returns
The fgets function returns s if successful. If end-of-file is encountered and no
characters have been read into the array, the contents of the array remain unchanged and a
null pointer is returned. If a read error occurs during the operation, the array contents are
indeterminate and a null pointer is returned.
Edit: Since it seems that the problem lies in a misunderstanding of what a string is, this is the relevant excerpt from the standard (emphasis mine):
7.1.1 Definitions of terms
A string is a contiguous sequence of characters terminated by and including the first null
character. The term multibyte string is sometimes used instead to emphasize special
processing given to multibyte characters contained in the string or to avoid confusion
with a wide string. A pointer to a string is a pointer to its initial (lowest addressed)
character. The length of a string is the number of bytes preceding the null character and
the value of a string is the sequence of the values of the contained characters, in order.
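Given that definition, a character-by-character fill has to place the terminator inside the array itself. A minimal sketch follows; the helper name fill_string is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: fill buf (capacity size) with ch and terminate it
   inside the array; the last element is reserved for '\0'. */
void fill_string(char *buf, size_t size, char ch)
{
    if (size == 0)
        return;
    for (size_t i = 0; i + 1 < size; i++)
        buf[i] = ch;
    buf[size - 1] = '\0';   /* the terminator lives in the array, not after it */
}
```

For char buff[7], fill_string(buff, sizeof buff, 'a') stores six 'a' characters and the terminator, so printing it with %s never reads past the array.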

From C11 standard draft:
The fgets function reads at most one less than the number of characters specified by n
from the stream pointed to by stream into the array pointed to by s. No additional
characters are read after a new-line character (which is retained) or after end-of-file. A
null character is written immediately after the last character read into the array.
The fgets function returns s if successful. If end-of-file is encountered and no
characters have been read into the array, the contents of the array remain unchanged and a
null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.
The behaviour you describe is undefined.

using read system call after a scanf

I am having a confusion regarding the following code,
#include<stdio.h>
int main()
{
char buf[100]={'\0'};
int data=0;
scanf("%d",&data);
read(stdin,buf,4); //attaching to stdin
printf("buffer is %s\n",buf);
return 1;
}
suppose on runtime I provided with the input 10abcd so as per my understanding following should happen:
scanf should place 10 in data
and abcd will still be on the stdin buffer
when read tries to read the stdin (already abcd is there) it should place the abcd into the buf
so printf should print abcd
but it is not happening ,printf showing no o/p
am I missing something here?

First of all read (stdin, ...) should give warnings (if you have them enabled) which you would be wise to heed. read() takes an integer as the first parameter specifying which channel to read from. stdin is of type FILE *.
Even if you changed it to read(0, ...), this is not recommended practice. scanf reads from FILE *stdin, which is a buffered layer on top of file descriptor 0. read(0, ...) reads directly from the underlying file descriptor and ignores any characters that are already buffered. This will cause strange results unless stdin is set unbuffered.

Ignoring mechanical issues related to the syntax of the read() function call, there are two cases to consider:
Input is from a terminal.
Input is from a file.
Terminal
No data will be available for reading until the user hits return. At that point, the standard I/O library will read all the available data into the buffer associated with stdin (that would be "10abcd\n"). It will then parse the number, leaving the a in the buffer to be read later by other standard I/O functions.
When the read() occurs, it will also wait for the user to provide some input. It has no clue about the data in the stdin buffer. It will hang until the user hits return, and will then read the next lot of data, returning up to 4 bytes in the buffer (no null termination unless it so happens that the fourth character is an ASCII NUL '\0').
File
Actually, this isn't all that much different, except that instead of reading a line of data into the buffer, the standard I/O library will probably read an entire buffer full, (BUFSIZ bytes, which might be 512 or larger). It will then convert the 10 and leave the a for later use. (If the file is shorter than the buffer size, it will all be read into the stdin buffer.)
The read will then collect the next 4 bytes from the file. If the whole file was read already, then it will return nothing — 0 bytes read.
You need to record and check the return value from read(). You should also check the return value from scanf() to ensure it did actually read a number.
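One way to follow that advice while staying inside stdio, so that the characters buffered by the scan are not lost, is sketched below. The function name read_number_then_bytes is invented for this example, and it takes a FILE * parameter so that it works with stdin or any other stream.

```c
#include <stdio.h>

/* Reads an integer and then up to 4 following bytes from the same stdio
   stream, checking both results. Returns the number of bytes stored in
   out (0 to 4), or -1 if no integer could be read. out must have room
   for 5 bytes; it is null-terminated on success. */
int read_number_then_bytes(FILE *fp, int *number, char out[5])
{
    if (fscanf(fp, "%d", number) != 1)
        return -1;                      /* input did not start with a number */

    size_t got = fread(out, 1, 4, fp);  /* goes through the same buffer as fscanf */
    out[got] = '\0';
    return (int)got;
}
```

With the input 10abcd, this stores 10 in *number and "abcd" in out.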

Try man read first.
read is declared as ssize_t read(int fd, void *buf, size_t count);
and stdin is declared as FILE *. That's the issue. Use fread() instead and you will be sorted.
#include <stdio.h>

int main(void)
{
    char buf[100] = {'\0'};
    int data = 0;
    scanf("%d", &data);
    fread(buf, 1, 4, stdin); // reads through the same stdio buffer that scanf used
    printf("buffer is %s\n", buf);
    return 0;
}
EDIT: Your understanding is almost correct, but not totally.
To address your question properly, I agree with Jonathen Laffer.
How your code works:
1) scanf should place 10 in data.
2) abcd will still be in the stdin buffer when you press ENTER.
3) Then read() will wait for input again, and you have to press ENTER again for the program to continue.
4) Now, if you entered anything before pressing ENTER the second time, the printf should print it; otherwise you will get no output other than the fixed text of your printf statement.
That's why I asked you to use fread instead. Hope it helps.
