using read system call after a scanf - c

I am confused about the following code:
#include<stdio.h>
int main()
{
char buf[100]={'\0'};
int data=0;
scanf("%d",&data);
read(stdin,buf,4); //attaching to stdin
printf("buffer is %s\n",buf);
return 1;
}
Suppose that at runtime I provide the input 10abcd. As I understand it, the following should happen:
scanf should place 10 in data,
and abcd will still be in the stdin buffer.
When read tries to read stdin (abcd is already there), it should place abcd into buf,
so printf should print abcd.
But that is not happening; printf shows no output.
Am I missing something here?

First of all, read(stdin, ...) should give warnings (if you have them enabled), which you would be wise to heed. read() takes an integer file descriptor as its first parameter, specifying which channel to read from; stdin is of type FILE *.
Even if you changed it to read(0, ...), this is not recommended practice. scanf reads from FILE *stdin, which is a buffered layer on top of file descriptor 0. read(0, ...) reads directly from the underlying file descriptor and ignores any characters that stdio has already buffered. This will cause strange results unless stdin is set unbuffered.

Ignoring mechanical issues related to the syntax of the read() function call, there are two cases to consider:
Input is from a terminal.
Input is from a file.
Terminal
No data will be available for reading until the user hits return. At that point, the standard I/O library will read all the available data into the buffer associated with stdin (that would be "10abcd\n"). It will then parse the number, leaving everything from the a onward (abcd and the newline) in the buffer to be read later by other standard I/O functions.
When the read() occurs, it will also wait for the user to provide some input. It has no clue about the data in the stdin buffer. It will hang until the user hits return, and will then read the next lot of data, returning up to 4 bytes in the buffer (no null termination unless it so happens that the fourth character is an ASCII NUL '\0').
File
Actually, this isn't all that much different, except that instead of reading a line of data into the buffer, the standard I/O library will probably read an entire buffer full (BUFSIZ bytes, which might be 512 or larger). It will then convert the 10 and leave the rest, starting at the a, for later use. (If the file is shorter than the buffer size, it will all be read into the stdin buffer.)
The read will then collect the next 4 bytes from the file. If the whole file was read already, then it will return nothing — 0 bytes read.
You need to record and check the return value from read(). You should also check the return value from scanf() to ensure it did actually read a number.
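The advice about return values can be sketched roughly as follows. This is only a minimal illustration, assuming a POSIX system; read_text and read_int are hypothetical helper names introduced here, not part of any library.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: read up to cap - 1 bytes from fd and
 * null-terminate the result, because read() never does that itself.
 * Returns the byte count, 0 at end of input, or -1 on error. */
ssize_t read_text(int fd, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;
    ssize_t n = read(fd, buf, cap - 1);
    if (n < 0) {
        perror("read");
        buf[0] = '\0';
        return -1;
    }
    buf[n] = '\0';
    return n;
}

/* Hypothetical helper: returns 1 only if a number was actually parsed. */
int read_int(FILE *in, int *out)
{
    return fscanf(in, "%d", out) == 1;
}
```

A caller would test both results before using data or buf, for example by treating a read_text result of 0 as "no more input" instead of printing an empty buffer.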

Try man read first.
read is declared as ssize_t read(int fd, void *buf, size_t count);
and stdin is declared as FILE *. That's the issue. Use fread() instead and you will be sorted.
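#include <stdio.h>
int main()
{
char buf[100]={'\0'};
int data=0;
if (scanf("%d",&data) != 1) //make sure a number was actually read
return 1;
fread(buf, 1, 4, stdin); //reads through the same stdio buffer that scanf used
printf("buffer is %s\n",buf);
return 0;
}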
EDIT: Your understanding is almost correct, but not entirely.
To address your question properly, I agree with Jonathan Leffler.
How your code works:
1) scanf places 10 in data.
2) abcd will still be in the stdin buffer after you press ENTER.
3) read() then waits for new input, and you have to press ENTER again for the program to continue.
4) If you typed anything before pressing ENTER the second time, printf prints it; otherwise you get nothing beyond the fixed text of the printf statement.
That's why I suggested fread instead. Hope it helps.
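As a rough sketch of why staying inside stdio works, the helper below parses the number and then fetches the following characters through the same FILE * buffer. The name number_then_tail is hypothetical, and the function takes any FILE * so that it can be tried on a file as well as on stdin.

```c
#include <stdio.h>

/* Hypothetical helper: parse a leading integer from `in`, then fetch up to
 * cap - 1 of the characters that follow it through the same stdio buffer.
 * Returns the number of trailing characters stored, or -1 if no number
 * could be parsed. */
int number_then_tail(FILE *in, int *value, char *tail, size_t cap)
{
    if (cap == 0 || fscanf(in, "%d", value) != 1)
        return -1;
    size_t n = fread(tail, 1, cap - 1, in);
    tail[n] = '\0';
    return (int)n;
}
```

Called as number_then_tail(stdin, &data, buf, 5) with the input 10abcd, this should leave 10 in data and abcd in buf, because fread sees what scanf left behind.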

Related

How to control the output of fileno?

I'm facing a piece of code that I don't understand:
read(fileno(stdin),&i,1);
switch(i)
{
case '\n':
printf("\a");
break;
....
I know that fileno returns the file descriptor associated with stdin here, then read puts this value in the variable i.
So, what should be the value of stdin to allow i to match with the first "case", i.e \n ?
Thank you
But what should be the value of stdin to match with the first "case", i.e \n ?
The case statement doesn't look at the "value" of stdin.
read(fileno(stdin),&i,1);
reads in a single byte into i (assuming read() call is successful) and if that byte is \n (newline character) then it'll match the case. You probably need to read the man page of read(2) to understand what it does.
I know that fileno returns the file descriptor associated with stdin here,
Yes, though I suspect you don't know what that means.
then read puts this value in the variable i.
No. No no no no no. read() does not put the value of the file descriptor, or any part of it, into the provided buffer (in your case, the bytes of i). As its name suggests, read() attempts to read from the file represented by the file descriptor passed as its first argument. The bytes read, if any, are stored in the provided buffer.
stdin represents the program's standard input. If you run the program from an interactive shell, that will correspond to your keyboard. The program attempts to read user input, and to compare it with a newline.
The program is likely flawed, and maybe outright wrong, though it's impossible to tell from just the fragment presented. If i is a variable of type int then its representation is larger than one byte, but you're only reading one byte into it. That will replace only one byte of the representation, with results depending on C implementation and the data read.
What the program seems to be trying to do can be made to work with read(), but I would recommend using getchar() instead:
#include <stdio.h>
/*
...
int i;
...
*/
i = getchar();
/* ... */
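If read() has to be kept, one way to avoid the partially overwritten int is to read into a single unsigned char and convert afterwards. The following is a minimal sketch assuming a POSIX system; read_byte is a hypothetical helper name.

```c
#include <unistd.h>

/* Hypothetical helper: read exactly one byte from fd.
 * Returns the byte as a value in 0..255, or -1 at end of input or on error.
 * The byte is stored in an unsigned char, so no part of an int is left
 * holding stale data. */
int read_byte(int fd)
{
    unsigned char c;
    ssize_t n = read(fd, &c, 1);
    if (n != 1)
        return -1;
    return c;
}
```

The result of read_byte(fileno(stdin)) can then be used in the switch, where it matches case '\n' only when the byte read really was a newline.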

fully-buffered stream get flushed when it is not full

I'm really confused about how exactly a buffer works, so I wrote a little snippet to verify:
#include<stdio.h>
#define BUF_SIZE 1024
char buf[BUF_SIZE];
char arr[20];
int main()
{
FILE* fs=fopen("test.txt","r");
setvbuf(fs,buf,_IOFBF,1024);
fread(arr,1,1,fs);
printf("%s",arr);
getchar();
return 0;
}
As you can see, I set the file stream fs to be fully buffered (I know it would usually default to fully buffered; I'm just making sure). I also set its buffer size to 1024, which means that the stream would not be flushed until it contains 1024 bytes of data (right?).
In my understanding, fread() reads data from the file stream, stores it in its buffer buf, and then sends the data in buf to arr as soon as buf is full with 1024 bytes of data (right?).
But now I read only one character from the stream! Also, there are only four characters in the file test.txt. Why do I find something in arr when only one char was read (I can print that one character out)?
The distinctions between fully-buffered, line-buffered, and unbuffered really only matter for output streams. I'm pretty sure that input streams pretty much always act like they're fully buffered.
But even for fully-buffered input streams, there's at least one case where the buffer won't be fully full, and as you've discovered, that's where there aren't enough characters left in the input to fill the buffer. If there are only 4 characters in the file, then when the system goes to fill the buffer, it gets those 4 characters and puts them in the buffer, and then you can start taking them out, as usual.
(The same situation would arise any time the file contains a number of characters that's not an exact multiple of the buffer size. For example, if the input file contained 1028 characters, then after filling the buffer with the first 1024 characters and letting you read them, the next time it filled the buffer, it'd end up with 4 again.)
What were you expecting it to do in this case? Block waiting to read 1,020 more characters from the file (that were never going to come)?
P.S. You said "the stream would not be flushed until it contained 1024 bytes of stuff, right?" But flushing is only defined for output streams; it doesn't mean anything for input streams.
From what I understand, an input buffer works differently from what you suggested: if you request one byte to be read, the system reads up to 1023 more bytes into the buffer, so the next 1023 one-byte reads can be served directly from the buffer instead of having to read from the file.
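The short-file case can be observed from the return value of fread. The sketch below is illustrative only; read_up_to is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: ask for `want` bytes and report how many arrived.
 * A short count means end of file or an error; it does not mean that the
 * stream is waiting for its buffer to fill. */
size_t read_up_to(FILE *in, char *dst, size_t want)
{
    size_t got = fread(dst, 1, want, in);
    if (got < want && ferror(in))
        perror("fread");
    return got;
}
```

On a four-character file, asking for 1 byte returns 1, and a following request for 1024 bytes returns the remaining 3 without blocking.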

fopen() in read-only mode and its buffer

Consider the following, albeit very messy, code in C:
#include<stdio.h>
int main() {
char buf[3]; //a new, small buffer
FILE *fp = fopen("test.txt", "r"); //our test file, with the contents "123abc"
setvbuf(fp, buf, _IOFBF, 2); //we assign our small buffer as fp's buffer \
//in fully buffered mode
char character = fgetc(fp); // get the first character...
character = fgetc(fp); // and the next...
character = fgetc(fp); // and the next... (third character, '3')
buf[2] = '\0'; //add a terminating line for display
fputs(buf, stderr); //write our buffer to stderr, should show up immediately
}
Compiling and running the code prints '3a' as the contents of our self-designated buffer, buf. My question is: how does this buffer get filled? Does a call to fgetc() trigger several reads until the buffer is full and then stop (we only made three calls to fgetc, which should not include the 'a')? The first buffer was "12", so when another fgetc() call is made and the pointer references something outside the buffer, is the buffer purged and then filled with the next block of data, or simply overwritten? I understand buffer sizes are platform dependent, so I'm more concerned with how, in general, an fopen()ed stream in read mode pulls characters into its buffer.
The buffer, and exactly how and when it is filled, is an implementation detail inside the stdio package. But the way it is likely to be implemented is that fgetc gets one character from the buffer, if there are characters available in the buffer. If the buffer is empty, it fills it by reading (in your case) two more characters from the file.
So your first fgetc will read 12 from the file and put it in the buffer, and then return '1'. Your second fgetc will not read from the file, since a character is available in the buffer, and return '2'. Your third fgetc will find that the buffer is empty, so it will read 3a from the file and put it in the buffer, and then return '3'. Therefore, when you print the content of the buffer, it will be 3a.
Note that there are two levels of "reading" happening here. First you have your fgetc calls, and then, below that level, code inside the stdio package which is reading from the file. If we assume this is on a Unix or Linux system, the second type of reading is done using the system call read(2).
The lower-level reading fills the entire buffer at once, so you don't need as many calls to read as calls to fgetc. (Which is the entire point of having the buffer.)
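The refill behaviour described above can be modelled with a toy stream. This is a hypothetical model for illustration only, not how any particular stdio implementation is written: mini_stream, mini_open and mini_getc are invented names, and an in-memory string stands in for the file.

```c
#include <stddef.h>
#include <string.h>

#define MINI_BUF_SIZE 2

/* Hypothetical model of a buffered input stream; not real stdio internals. */
struct mini_stream {
    const char *src;          /* stands in for the file contents */
    size_t src_len;
    size_t src_pos;           /* how far the "file" has been read */
    char buf[MINI_BUF_SIZE];  /* the stream buffer */
    size_t buf_len;           /* valid bytes in buf */
    size_t buf_pos;           /* next byte to hand out */
    int refills;              /* number of simulated read() calls */
};

void mini_open(struct mini_stream *s, const char *contents)
{
    memset(s, 0, sizeof *s);
    s->src = contents;
    s->src_len = strlen(contents);
}

/* Returns the next character, or -1 when the source is exhausted. */
int mini_getc(struct mini_stream *s)
{
    if (s->buf_pos == s->buf_len) {           /* buffer empty: refill it */
        size_t left = s->src_len - s->src_pos;
        size_t n = left < MINI_BUF_SIZE ? left : MINI_BUF_SIZE;
        if (n == 0)
            return -1;
        memcpy(s->buf, s->src + s->src_pos, n);
        s->src_pos += n;
        s->buf_len = n;
        s->buf_pos = 0;
        s->refills++;
    }
    return (unsigned char)s->buf[s->buf_pos++];
}
```

With the contents "123abc", three mini_getc calls cause two refills, and the buffer then holds '3' and 'a', which mirrors the output seen in the question.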

How does fgets work in this program and how does it tie into the 'stream' concept?

I am having difficulty with a feature of a segment of code that is designed to illustrate the fgets() function for input. Before I proceed, I would like to make sure that my understanding of I/O and streams is correct and that I'm not completely off base:
Input and Output in C has no specific viable function for working with strings. The one function specific for working with strings is the 'gets()' function, which will accept input beyond the limits of the char array to store the input (thus making it effectively illegal for all but backward compatibility), and create buffer overflows.
This brings up the topic of streams, which to the best of my understanding is a model to explain I/O in a program. A stream is considered 'flowing water' on which the data utilized by programs is conveyed. See links: (also as a conveyor belt)
Can you explain the concept of streams?
What is a stream?
In the C language, there are 3 predefined ANSI streams for standard input and output, and 2 additional streams if using Windows or DOS, which are as follows:
stdin (keyboard)
stdout (screen)
stderr (screen)
stdprn (printer)
stdaux (serial port)
As I understand, to make things manageable it is okay to think of these as rivers that exist in your operating system, and a program uses I/O functions to put data in them, take data out of them, or change the direction of where the streams are flowing (such as reading or writing a file would require). Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system. What you need to be concerned with is where the water takes your data, and that is mediated by use of specific functions (such as printf(), puts(), gets(), fgets(), etc.).
This is where my questions start to take form. Now I am interested in getting a grasp on the fgets() function and how it ties into streams. fgets() uses the 'stdin' stream (naturally) and has the built in fail safe (see below) that will not allow user input to exceed the array used to store the input. Here is the outline of the fgets() function, rather its prototype (which I don't see why one would ever need to declare it?):
char *fgets(char *str , int n , FILE *fp);
Note the three parameters that the fgets function takes:
p1 is the address of where the input is stored (a pointer, which will likely just be the name of the array you use, e.g., 'buffer')
p2 is the maximum length of characters to be input (I think this is where my question is!)
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
Now, the code I have below will allow you to type characters until your heart is content. When you hit return, the input is printed on the screen in rows of the length of the second parameter minus 1 (MAXLEN -1). When you enter a return with no other text, the program terminates.
#include <stdio.h>
#define MAXLEN 10
int main(void)
{
char buffer[MAXLEN];
puts("Enter text a line at a time: enter a blank line to exit");
while(1)
{
fgets(buffer, MAXLEN, stdin); //Read comments below. Note 'buffer' is indeed a pointer: just to array's first element.
if(buffer[0] == '\n')
{
break;
}
puts(buffer);
}
return 0;
}
Now, here are my questions:
1) Does this program allow me to input UNLIMITED characters? I fail to see the mechanism that makes fgets() safer than gets(), because my array that I am storing input in is of a limited size (MAXLEN, which is 10 in this case). The only thing that I see happening is my long strings of input being parsed into MAXLEN - 1 slices? What am I not seeing with fgets() that stops buffer overflow that gets() does not? I do not see in the parameters of fgets() where that fail-safe exists.
2) Why does the program print out input in rows of MAXLEN-1 instead of MAXLEN?
3) What is the significance of the second parameter of the fgets() function? When I run the program, I am able to type as many characters as I want. What is MAXLEN doing to guard against buffer overflow? From what I can guess, when the user inputs a big long string, once the user hits return, the MAXLEN chops up the string in to MAXLEN sized bites/bytes (both actually work here lol) and sends them to the array. I'm sure I'm missing something important here.
That was a mouthful, but my lack of grasp on this very important subject is making my code weak.
Question 1
You can actually type as many characters as your command-line tool allows per input. However, your call to fgets() will store at most MAXLEN - 1 of them at a time, because that is the limit you passed to it.
Moreover, fgets() has no way to verify the real size of your array. The second parameter you pass to fgets is the "safety" argument. Try changing your call to fgets(buffer, MAXLEN + 10, stdin); and then type more than MAXLEN characters. fgets will write past the end of buffer, which is undefined behavior; your program may crash because it is writing to memory outside the array.
Question 2
When you make a call to fgets(), it will read at most MAXLEN - 1 characters, because the last position is reserved for the null character \0, which marks the end of the string.
The second parameter of fgets() is not the number of character you want to store but the maximum capacity of your buffer. And you always have to think about string termination character \0
Question 3
If you understood the two answers above, you will be able to answer this one by yourself. Try to play with this value, and use a value different from your buffer size to see what happens.
Also, you said
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
You can use fgets to read files stored on your computer. Here is an example :
char buffer[20];
FILE *stream = fopen("myfile.txt", "r"); //Open the file "myfile.txt" in read-only mode
if (stream != NULL && fgets(buffer, sizeof buffer, stream) != NULL) //Read at most 19 characters, stopping after a newline
puts(buffer);
if (stream != NULL)
fclose(stream);
When you call fgets(), it lets you type in as much as you want into stdin, so everything stays in stdin. It seems fgets() takes the first 9 characters, attaches a null character, and assigns it to buffer. Then puts() displays buffer then creates a newline.
The key is it's in a while loop -- the code loops again then takes what was remaining in stdin and feeds it into fgets(), which takes the next 9 characters and repeats. Stdin just still had stuff "in queue".
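The chunking can be made visible by measuring what each fgets call stores. This is a small sketch with the same MAXLEN as the question; next_chunk_len is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

#define MAXLEN 10

/* Hypothetical helper: perform one fgets() call with a MAXLEN-byte buffer
 * and return how many characters it stored, or -1 at end of input. */
int next_chunk_len(FILE *in)
{
    char buffer[MAXLEN];
    if (fgets(buffer, sizeof buffer, in) == NULL)
        return -1;
    return (int)strlen(buffer);
}
```

For a 16-letter line followed by a newline, the first call stores 9 characters and the second stores the remaining 7 letters plus the newline, so the input is never cut off, only delivered in pieces.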
Input and Output in C has no specific viable function for working with strings.
There are several functions for outputting strings, such as printf and puts.
Strings can be input with fgets or scanf; however there is no standard function that both inputs and allocates memory. You need to pre-allocate some memory, and then read some characters into that memory.
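Since no standard function both reads and allocates, programs usually combine fgets with realloc themselves. The following is one possible sketch, not a canonical implementation; read_line_alloc is a hypothetical name, and input containing embedded null characters is not handled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length into heap memory.
 * Returns NULL at end of input or on allocation failure; the caller frees
 * the result. The trailing newline, if any, is kept. */
char *read_line_alloc(FILE *in)
{
    size_t cap = 16, len = 0;
    char *line = malloc(cap);
    if (line == NULL)
        return NULL;

    while (fgets(line + len, (int)(cap - len), in) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n')
            return line;                 /* got a complete line */
        char *bigger = realloc(line, cap * 2);
        if (bigger == NULL) {
            free(line);
            return NULL;
        }
        line = bigger;
        cap *= 2;
    }
    if (len == 0) {                      /* nothing read before end of input */
        free(line);
        return NULL;
    }
    return line;                         /* last line had no newline */
}
```

Each fgets call is still bounded by the space that is actually available, so the buffer grows to fit the line without ever being overrun.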
Your analogy of a stream as a river is not great. Rivers flow whether or not you are taking items out of them, but streams don't. A better analogy might be a line of people at the gates to a stadium.
C also has the concept of a "line", lines are marked by having a '\n' character at the end. In my analogy let's say the newline character is represented by a short person.
When you do fgets(buf, 20, stdin) it is like "Let the next 19 people in, but if you encounter a short person during this, let him through but not anybody else". Then the fgets function creates a string out of these 0 to 19 characters, by putting the end-of-string marker on the end; and that string is placed in buf.
Note that the second argument to fgets is the buffer size , not the number of characters to read.
When you type in characters, that is like more people joining the queue.
If there were fewer than 19 people and no short people, then fgets waits for more people to arrive. In standard C there's no way to check if people are waiting without blocking to wait for them if they aren't.
By default, stdin is usually line buffered when it is connected to a terminal. In my analogy, this is like there is a "pre-checking" gate earlier on than the main gate, where all people that arrive go into a holding pen until a short person arrives; and then everyone from the holding pen plus that short person get sent onto the main gate. This can be turned off using setvbuf.
Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system.
This is something you do have to worry about. stdin etc. are already begun before you enter main(), but other streams (e.g. if you want to read from a file on your hard drive), you have to begin them.
Streams may end. When a stream is ended, fgets will return NULL. Your program must handle this. In my analogy, the gate is closed.
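A loop that handles the end of the stream might look like the sketch below. It is a hedged variant of the question's loop, not the original author's code: echo_until_blank is a hypothetical name, and it also remembers whether a chunk starts a new line, so that a newline left over from a line of exactly MAXLEN - 1 characters is not mistaken for a blank line.

```c
#include <stdio.h>
#include <string.h>

#define MAXLEN 10

/* Hypothetical helper: copy chunks from `in` to `out` until a blank line
 * or the end of the stream. Returns the number of chunks copied. */
int echo_until_blank(FILE *in, FILE *out)
{
    char buffer[MAXLEN];
    int chunks = 0;
    int at_line_start = 1;

    while (fgets(buffer, sizeof buffer, in) != NULL) {  /* NULL: stream ended */
        if (at_line_start && buffer[0] == '\n')
            break;                                      /* a truly blank line */
        fputs(buffer, out);
        chunks++;
        size_t len = strlen(buffer);
        at_line_start = len > 0 && buffer[len - 1] == '\n';
    }
    return chunks;
}
```

Calling echo_until_blank(stdin, stdout) behaves like the original program for ordinary input, but it returns instead of looping forever when the input ends without a blank line.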

Is there any way to peek at the stdin buffer?

We know that stdin is, by default, a buffered input; the proof of that is in usage of any of the mechanisms that "leave data" on stdin, such as scanf():
int main()
{
char c[10] = {'\0'};
scanf("%9s", c);
printf("%s, and left is: %d\n", c, getchar());
return 0;
}
./a.out
hello
hello, and left is: 10
10 being newline of course...
I've always been curious, is there any way to "peek" at the stdin buffer without removing whatever may reside there?
EDIT
A better example might be:
scanf("%9[^.]", c);
With an input of "at.ct", now I have "data" (ct\n) left on stdin, not just a newline.
Portably, you can get the next character in the input stream with getchar() and then push it back with ungetc(), which results in a state as if the character wasn't removed from the stream.
The ungetc function pushes the character specified by c (converted to an unsigned char) back onto the input stream pointed to by stream. Pushed-back characters will be returned by subsequent reads on that stream in the reverse order of their pushing.
Only one character of pushback is guaranteed by the standard, but usually, you can push back more.
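The portable one-character peek can be wrapped in a small function. This is a minimal sketch; peek_char is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character without consuming it,
 * or EOF if the stream has ended. */
int peek_char(FILE *in)
{
    int c = fgetc(in);
    if (c != EOF)
        ungetc(c, in);
    return c;
}
```

After scanf("%9[^.]", c) with the input at.ct, peek_char(stdin) would report the '.' that scanf left behind, and the next getchar() would still return that same character.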
As mentioned in the other answers and in the comments there, in practice you can almost certainly peek at the buffer if you provide your own buffer with setvbuf, although that is not without problems:
If buf is not a null pointer, the array it points to may be used instead of a buffer allocated by the setvbuf function
that leaves the possibility that the provided buffer may not be used at all.
The contents of the array at any time are indeterminate.
that means you have no guarantee that the contents of the buffer reflects the actual input (and it makes using the buffer undefined behaviour if it has automatic storage duration, if we're picky).
However, in practice the principal problem would be finding out where in the buffer the not-yet-consumed part of the buffered input begins and where it ends.
If you want to look at the stdin buffer without changing it, you could tell it to use another buffer with setbuf, using an array you can access:
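char buffer[BUFSIZ] = {0};
setbuf(stdin, buffer); // setbuf returns no value, so there is nothing to check here
getchar();
printf("%.15s\n", buffer); // print at most 15 bytes; stdio does not null-terminate the buffer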
This lets you see something more than ungetc does, but I don't think you can go further in a portable way.
Actually, this compiles, but the standard does not guarantee that it works. Quoting the standard on setvbuf (setbuf behaves the same way):
The contents of the array at any time are indeterminate.
So this is not what you need if you're looking for complete portability and standard-compliance, but I can't imagine why the buffer should not contain what is expected. However, it seems to work on my computer.
Beware that you have to provide an array of at least BUFSIZ characters to setbuf, and you must not do any I/O operation on the stream before it. If you need more flexibility, take a look at setvbuf.
You could set your own buffer with setvbuf on stdin, and peek there whenever you want.
