I'm facing a piece of code that I don't understand:
read(fileno(stdin),&i,1);
switch(i)
{
case '\n':
printf("\a");
break;
....
I know that fileno returns the file descriptor associated with stdin here, then read puts this value in the i variable.
So, what should be the value of stdin to allow i to match with the first "case", i.e \n ?
Thank you
But what should be the value of stdin to match with the first "case", i.e \n ?
The case statement doesn't look at the "value" of stdin.
read(fileno(stdin),&i,1);
reads a single byte into i (assuming the read() call is successful), and if that byte is \n (the newline character) then it will match the case. You probably need to read the man page of read(2) to understand what it does.
I know that fileno returns the file descriptor associated with stdin here,
Yes, though I suspect you don't know what that means.
then read puts this value in the i variable.
No. No no no no no. read() does not put the value of the file descriptor, or any part of it, into the provided buffer (in your case, the bytes of i). As its name suggests, read() attempts to read from the file represented by the file descriptor passed as its first argument. The bytes read, if any, are stored in the provided buffer.
stdin represents the program's standard input. If you run the program from an interactive shell, that will correspond to your keyboard. The program attempts to read user input, and to compare it with a newline.
The program is likely flawed, and maybe outright wrong, though it's impossible to tell from just the fragment presented. If i is a variable of type int then its representation is larger than one byte, but you're only reading one byte into it. That will replace only one byte of the representation, with results depending on C implementation and the data read.
What the program seems to be trying to do can be made to work with read(), but I would recommend using getchar() instead:
#include <stdio.h>
/*
...
int i;
...
*/
i = getchar();
/* ... */
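For completeness, here is a minimal sketch of how the read() variant could be made to work, assuming a POSIX system. The helper name read_byte is hypothetical and not part of the original program. It reads into a single unsigned char instead of an int, and it checks the return value of read():

```c
#include <unistd.h>

/* Read one byte from the file descriptor fd.
 * Returns the byte as a value in 0..255, or -1 on end of input or error.
 * (read_byte is an illustrative name, not from the original code.) */
int read_byte(int fd)
{
    unsigned char byte;                 /* exactly one byte of storage */
    ssize_t n = read(fd, &byte, 1);

    if (n != 1)
        return -1;                      /* 0 means end of input, -1 means error */
    return byte;
}
```

With such a helper, the fragment from the question could switch on read_byte(fileno(stdin)) and treat -1 as "no more input".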
#include <stdio.h>
#include <unistd.h>
char buff[1];
int main() {
int c;
c = getchar();
printf("%d\n", c); //output -1
c = getchar();
printf("%d\n", c); // output -1
int res;
//here I get a prompt for input. What happened to EOF ?
while ((res = read(0, buff, 1)) > 0) {
printf("Hello\n");
}
while ((res = read(0, buff, 1)) > 0) {
printf("Hello\n");
}
return 0;
}
The output shown in the code comments is the result of simply typing Ctrl-D (EOF on macOS).
I'm a bit confused about the behaviour of getchar(), especially when compared to read.
Shouldn't the read system calls inside the while loop also return EOF? Why do they prompt the user? Has some sort of stdin clear occurred?
Considering that getchar() uses the read system call under the hood how come they behave differently? Shouldn't the stdin be "unique" and the EOF condition shared?
How come in the following code the two read system calls return both EOF when a Ctrl-D input is given?
int res;
while ((res = read(0, buff, 1)) > 0) {
printf("Hello\n");
}
while ((res = read(0, buff, 1)) > 0) {
printf("Hello\n");
}
I'm trying to find the logic behind all this. I hope someone can make it clear what EOF really is and how it really behaves.
P.S I'm using a Mac OS machine
Once the end-of-file indicator is set for stdin, getchar() does not attempt to read.
Clear the end-of-file indicator (e.g. clearerr() or others) to re-try reading.
The getchar function is equivalent to getc with the argument stdin.
The getc function is equivalent to fgetc ...
If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).
read() still tries to read each time.
Note: Reading via a FILE *, like stdin, does not attempt to read if the end-of-file indicator is set. Yet even if the error indicator is set, a read attempt still occurs.
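A small sketch of the clearerr() approach, under the assumption that you want to retry a read after end-of-file was reported. The helper name getc_retry is made up for illustration:

```c
#include <stdio.h>

/* Return the next character from fp, first clearing a pending
 * end-of-file indicator so that the stream really tries to read again.
 * (getc_retry is an illustrative name.) */
int getc_retry(FILE *fp)
{
    if (feof(fp))
        clearerr(fp);   /* clears both the EOF and the error indicator */
    return fgetc(fp);
}
```

On a terminal, calling this after Ctrl-D would prompt for input again; on a regular file that is still at its end, it simply returns EOF once more.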
macOS is derived from BSD Unix. Its stdio implementation does not come from GNU software, so it is a different implementation from glibc. When getchar(3) issues a read(2) system call that returns 0 characters, the end-of-file indicator is set on the FILE stream (the file descriptor itself is not marked in any way), and stdio does not call read(2) again until that indicator is cleared. This produces the behaviour you observe. Call clearerr(stream); on the FILE * before issuing the next getchar(3) call, and everything will be fine. You can do the same with glibc, and then your program will run the same with either implementation of stdio (glibc vs. BSD).
I'm trying to find the logic behind all this. I hope someone can make it clear what EOF really is and how it really behaves.
EOF is simply a constant (normally -1) that differs from every char value getchar(3) can return. getchar() returns an int rather than a char for exactly this purpose: a successful call yields a value in the interval 0..255, and the wider type leaves room for one more value, EOF, which is not a char. The getchar family of functions (getchar, fgetc, etc.) returns EOF when the underlying read(2) returns 0 (zero characters read), because that result does not map to any character. This scheme is compatible with files that contain Ctrl-D characters (ASCII EOT, decimal value 4) that do not represent an end-of-file condition: when you read an ASCII EOT from a file, it appears as a normal character with decimal value 4.
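To illustrate why the result must be kept in an int, here is a minimal sketch that counts the bytes of a stream. The function name count_bytes is hypothetical. Because ch is an int, a data byte with value 0xFF is not mistaken for EOF:

```c
#include <stdio.h>

/* Count the bytes remaining in fp.
 * (count_bytes is an illustrative name.) */
long count_bytes(FILE *fp)
{
    long count = 0;
    int ch;                 /* int, not char: must hold 0..255 and EOF */

    while ((ch = fgetc(fp)) != EOF)
        count++;
    return count;
}
```

If ch were declared as a char, the loop could stop early at a 0xFF byte or, where char is unsigned, never stop at all.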
The Unix tty implementation, on the other hand, lets line input mode use a special character (Ctrl-D, ASCII EOT/END OF TRANSMISSION, decimal value 4) to indicate the end of the stream to the driver. It is a special character, like ASCII CR or ASCII DEL (which perform line editing on the input before it is fed to the program). In that case the terminal just hands over all the pending input characters for the application to read (if there are none, none are read, and you get the end of file). So Ctrl-D is only special in the Unix tty driver, and only when it is working in canonical mode (line input mode). So, finally, there are only two ways to input data to the program in line mode:
pressing the RETURN key (this is mapped by the terminal into ASCII CR, which the terminal translates into ASCII LF, the famous '\n' character), so that the ASCII LF character is input to the program
pressing Ctrl-D. This makes the terminal take everything that was keyed in up to this moment and send it to the program (without adding the Ctrl-D itself). No character is added to the input buffer, which means that if the input buffer was empty, nothing is sent to the program and the read(2) call effectively reads zero characters.
To unify, in every scenario, the read(2) system call normally blocks into the kernel until one or more characters are available.... only at end of file, it unblocks and returns zero characters to the program. THIS SHOULD BE YOUR END OF FILE INDICATION. Many programs read an incomplete buffer (less than the number of characters you passed as parameter) before a true END OF FILE is signalled, and so, almost every program does another read to check if that was an incomplete read or indeed it was an end of file indication.
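The "read again after a short read" pattern can be sketched as follows, assuming a POSIX system. The helper name read_full is hypothetical. It keeps calling read() until the buffer is full or read() returns 0, which is the end-of-file indication described above:

```c
#include <errno.h>
#include <unistd.h>

/* Read up to count bytes from fd, retrying after short reads.
 * Returns the number of bytes stored in buf (less than count only at
 * end of file), or -1 on error. (read_full is an illustrative name.) */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -1;
        }
        total += (size_t)n;             /* short read: go around again */
    }
    return (ssize_t)total;
}
```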
Finally, what if I want to input a Ctrl-D character as itself into a file? There is another special character in the tty implementation that escapes the special behaviour of the character it precedes. On today's systems that character is Ctrl-V by default, so to enter a special character (even Ctrl-V itself) you precede it with Ctrl-V; entering a literal Ctrl-D into the file means typing Ctrl-V followed by Ctrl-D.
I encrypted a text file using an offset cipher in C. For this, I simply added 128 to each character and got the file size decreased by 3 bytes. I tried the same on some other files too just to get the same result, i.e. decrease in file size by 3 bytes. I got the original size after decryption.
Could you please tell me why does it so happen?
Code for the main logic is given below:
while((ch=fgetc(fs))!=EOF){
fputc(ch+128, ft);
Could you please tell me why does it so happen?
Your ch probably has the wrong declaration. The fgetc() function returns an int, not a char, and if you cast to char you will lose the distinction between (char) 0xff and EOF.
// WRONG WRONG WRONG
// char ch = fgetc(fs);
The right declaration:
int ch = fgetc(fs);
Otherwise, it shouldn't happen. Is your process exiting cleanly? If you abort(), then there might be data still in FILE * buffers. Show more code. Run with Valgrind. Check the exit status of your process.
I think the file size should have doubled as two bytes were taken for one character after encryption as something greater than 127 can not be stored in 1 byte.
No, fputc() does not work that way. The fputc() man page (run man fputc in a terminal, unless on Windows):
fputc() writes the character c, cast to an unsigned char, to stream.
Conversion to unsigned char is done by taking the value modulo 256*. So fputc() always writes exactly one byte of data (unless it fails).
* This is true on all but exceedingly rare systems.
If you talk about Windows, I could imagine that you have opened the file in text mode, not in binary mode.
That leads to the following:
Writing \n leads to a \r\n written to the file.
Reading \r\n from the file gives only \n to the user.
Reading stops at the first \x1A, which is treated as an EOF character.
If you add 128 to each byte, the data to be written rolls over at 256. fputc() converts its argument to unsigned char, so passing a value > 255 wraps modulo 256 (writing (ch+128)%256 or (ch+128) & 0xFF makes that explicit), and thus you may get \n or \x1A by accident.
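Putting the points above together, a minimal sketch of the offset loop might look like this. The function name offset_copy is hypothetical, and it assumes the caller opened both streams in binary mode (for example with "rb" and "wb"), so that no newline translation or \x1A handling takes place:

```c
#include <stdio.h>

/* Copy in to out, adding 128 to every byte (wrapping at 256).
 * Returns 0 on success, -1 on a read or write error.
 * Both streams are assumed to be opened in binary mode.
 * (offset_copy is an illustrative name.) */
int offset_copy(FILE *in, FILE *out)
{
    int ch;                                 /* int, so EOF stays distinct */

    while ((ch = fgetc(in)) != EOF) {
        if (fputc((ch + 128) & 0xFF, out) == EOF)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Since every input byte produces exactly one output byte, the output has the same size as the input; applying the function a second time restores the original data.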
I am having difficulty with a feature of a segment of code that is designed to illustrate the fgets() function for input. Before I proceed, I would like to make sure that my understanding of I/O and streams is correct and that I'm not completely off base:
Input and Output in C has no specific viable function for working with strings. The one function specific for working with strings is the 'gets()' function, which will accept input beyond the limits of the char array to store the input (thus making it effectively illegal for all but backward compatibility), and create buffer overflows.
This brings up the topic of streams, which to the best of my understanding is a model to explain I/O in a program. A stream is considered 'flowing water' on which the data utilized by programs is conveyed. See links: (also as a conveyor belt)
Can you explain the concept of streams?
What is a stream?
In the C language, there are 3 predefined ANSI streams for standard input and output, and 2 additional streams if using Windows or DOS, which are as follows:
stdin (keyboard)
stdout (screen)
stderr (screen)
stdprn (printer)
stdaux (serial port)
As I understand, to make things manageable it is okay to think of these as rivers that exist in your operating system, and a program uses I/O functions to put data in them, take data out of them, or change the direction of where the streams are flowing (such as reading or writing a file would require). Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system. What you need to be concerned with is where the water takes your data, and that is mediated by use of specific functions (such as printf(), puts(), gets(), fgets(), etc.).
This is where my questions start to take form. Now I am interested in getting a grasp on the fgets() function and how it ties into streams. fgets() uses the 'stdin' stream (naturally) and has the built in fail safe (see below) that will not allow user input to exceed the array used to store the input. Here is the outline of the fgets() function, rather its prototype (which I don't see why one would ever need to declare it?):
char *fgets(char *str , int n , FILE *fp);
Note the three parameters that the fgets function takes:
p1 is the address of where the input is stored (a pointer, which will likely just be the name of the array you use, e.g., 'buffer')
p2 is the maximum length of characters to be input (I think this is where my question is!)
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
Now, the code I have below will allow you to type characters until your heart is content. When you hit return, the input is printed on the screen in rows of the length of the second parameter minus 1 (MAXLEN -1). When you enter a return with no other text, the program terminates.
#include <stdio.h>
#define MAXLEN 10
int main(void)
{
char buffer[MAXLEN];
puts("Enter text a line at a time: enter a blank line to exit");
while(1)
{
fgets(buffer, MAXLEN, stdin); //Read comments below. Note 'buffer' is indeed a pointer: just to array's first element.
if(buffer[0] == '\n')
{
break;
}
puts(buffer);
}
return 0;
}
Now, here are my questions:
1) Does this program allow me to input UNLIMITED characters? I fail to see the mechanism that makes fgets() safer than gets(), because my array that I am storing input in is of a limited size (MAXLEN, i.e. 10, in this case). The only thing that I see happening is my long strings of input being parsed into MAXLEN - 1 slices? What am I not seeing with fgets() that stops buffer overflow that gets() does not? I do not see in the parameters of fgets() where that fail-safe exists.
2) Why does the program print out input in rows of MAXLEN-1 instead of MAXLEN?
3) What is the significance of the second parameter of the fgets() function? When I run the program, I am able to type as many characters as I want. What is MAXLEN doing to guard against buffer overflow? From what I can guess, when the user inputs a big long string, once the user hits return, the MAXLEN chops up the string in to MAXLEN sized bites/bytes (both actually work here lol) and sends them to the array. I'm sure I'm missing something important here.
That was a mouthful, but my lack of grasp on this very important subject is making my code weak.
Question 1
You can type as many characters as your command-line tool allows per input. However, your call to fgets() will handle at most MAXLEN - 1 of them at a time in your example, because you told it to do so.
Moreover, fgets() has no way to check the real size of your buffer. The second parameter you give to fgets is the "safety" argument. Try changing your call to fgets(buffer, MAXLEN + 10, stdin); and then type more than MAXLEN characters. Your program will write past the end of the array, which is undefined behaviour and will likely crash.
Question 2
When you call fgets(), it reads at most MAXLEN - 1 characters, because the last position is reserved for the character code \0, which marks the end of the string.
The second parameter of fgets() is not the number of characters you want to store but the capacity of your buffer. You always have to account for the string termination character \0.
Question 3
If you understood the two answers above, you will be able to answer this one by yourself. Try playing with this value, using a value different from your buffer size.
Also, you said
p3 specifies the input stream, which in this code is 'stdin' (when would it ever be different?)
You can use fgets to read files stored on your computer. Here is an example :
char buffer[20];
FILE *stream = fopen("myfile.txt", "r"); //Open the file "myfile.txt" in read-only mode
if (stream != NULL && fgets(buffer, sizeof buffer, stream) != NULL) //Read up to 19 characters, stopping after a newline
    puts(buffer);
if (stream != NULL)
    fclose(stream);
When you call fgets(), it lets you type in as much as you want into stdin, so everything stays in stdin. It seems fgets() takes the first 9 characters, attaches a null character, and assigns it to buffer. Then puts() displays buffer then creates a newline.
The key is it's in a while loop -- the code loops again then takes what was remaining in stdin and feeds it into fgets(), which takes the next 9 characters and repeats. Stdin just still had stuff "in queue".
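The chunking described above can be made visible with a small sketch. The function name count_chunks is hypothetical; MAXLEN is 10 as in the question. It counts how many fgets() calls are needed to consume one line and how many characters were read in total:

```c
#include <stdio.h>
#include <string.h>

#define MAXLEN 10

/* Read one line from fp in pieces of at most MAXLEN - 1 characters.
 * Returns the number of fgets() calls that delivered data and stores
 * the total number of characters read in *total.
 * (count_chunks is an illustrative name.) */
size_t count_chunks(FILE *fp, size_t *total)
{
    char buffer[MAXLEN];
    size_t chunks = 0;

    *total = 0;
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        size_t len = strlen(buffer);    /* never more than MAXLEN - 1 */
        *total += len;
        chunks++;
        if (len > 0 && buffer[len - 1] == '\n')
            break;                      /* reached the end of the line */
    }
    return chunks;
}
```

For a line of 25 letters followed by a newline, this needs three calls: two pieces of 9 characters and a final piece of 8 that ends with the newline.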
Input and Output in C has no specific viable function for working with strings.
There are several functions for outputting strings, such as printf and puts.
Strings can be input with fgets or scanf; however there is no standard function that both inputs and allocates memory. You need to pre-allocate some memory, and then read some characters into that memory.
Your analogy of a stream as a river is not great. Rivers flow whether or not you are taking items out of them, but streams don't. A better analogy might be a line of people at the gates to a stadium.
C also has the concept of a "line", lines are marked by having a '\n' character at the end. In my analogy let's say the newline character is represented by a short person.
When you do fgets(buf, 20, stdin) it is like "Let the next 19 people in, but if you encounter a short person during this, let him through but not anybody else". Then the fgets function creates a string out of these 0 to 19 characters, by putting the end-of-string marker on the end; and that string is placed in buf.
Note that the second argument to fgets is the buffer size , not the number of characters to read.
When you type in characters, that is like more people joining the queue.
If there were fewer than 19 people and no short people, then fgets waits for more people to arrive. In standard C there's no way to check if people are waiting without blocking to wait for them if they aren't.
By default, stdin is line buffered when it refers to an interactive device such as a terminal (otherwise it is typically fully buffered). In my analogy, this is like there is a "pre-checking" gate earlier on than the main gate, where all people that arrive go into a holding pen until a short person arrives; and then everyone from the holding pen plus that short person get sent on to the main gate. This can be turned off using setvbuf.
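As a rough sketch of the setvbuf call mentioned above (the helper name make_unbuffered is made up for illustration), buffering can be switched off like this. It has to be done before any other operation on the stream:

```c
#include <stdio.h>

/* Ask the C library not to buffer fp at all.
 * Returns 0 on success, nonzero if the request could not be honoured.
 * (make_unbuffered is an illustrative name.) */
int make_unbuffered(FILE *fp)
{
    return setvbuf(fp, NULL, _IONBF, 0);
}
```

Note that this only affects the buffer inside the C library. A terminal in canonical mode still hands input to the program one line at a time.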
Never think of the 'beginning' or 'end' of the streams: this is handled by the operating system.
This is something you do have to worry about. stdin etc. are already begun before you enter main(), but other streams (e.g. if you want to read from a file on your hard drive), you have to begin them.
Streams may end. When a stream is ended, fgets will return NULL. Your program must handle this. In my analogy, the gate is closed.
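A minimal sketch of handling the "gate is closed" case. The helper name next_line is hypothetical. It checks the return value of fgets and strips the trailing newline when there is one:

```c
#include <stdio.h>
#include <string.h>

/* Read the next line (or the next size - 1 characters of it) into buf.
 * Returns 1 if something was read, 0 when the stream has ended or
 * a read error occurred. (next_line is an illustrative name.) */
int next_line(FILE *fp, char *buf, int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;                       /* end of stream or read error */
    buf[strcspn(buf, "\n")] = '\0';     /* drop the trailing newline, if any */
    return 1;
}
```

A loop such as while (next_line(stdin, buf, sizeof buf)) then terminates cleanly at end of input instead of reusing stale buffer contents.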
I am having a confusion regarding the following code,
#include<stdio.h>
int main()
{
char buf[100]={'\0'};
int data=0;
scanf("%d",&data);
read(stdin,buf,4); //attaching to stdin
printf("buffer is %s\n",buf);
return 1;
}
suppose on runtime I provided with the input 10abcd so as per my understanding following should happen:
scanf should place 10 in data
and abcd will still be on the stdin buffer
when read tries to read the stdin (already abcd is there) it should place the abcd into the buf
so printf should print abcd
but it is not happening ,printf showing no o/p
am I missing something here?
First of all read (stdin, ...) should give warnings (if you have them enabled) which you would be wise to heed. read() takes an integer as the first parameter specifying which channel to read from. stdin is of type FILE *.
Even if you changed it to read(0, ...), this is not recommended practice. scanf reads from FILE *stdin, which is buffered on top of file handle 0. read(0, ...) reads directly from the underlying file handle and ignores any characters that were already buffered. This will cause strange results unless stdin is set unbuffered.
Ignoring mechanical issues related to the syntax of the read() function call, there are two cases to consider:
Input is from a terminal.
Input is from a file.
Terminal
No data will be available for reading until the user hits return. At that point, the standard I/O library will read all the available data into the buffer associated with stdin (that would be "10abcd\n"). It will then parse the number, leaving the a in the buffer to be read later by other standard I/O functions.
When the read() occurs, it will also wait for the user to provide some input. It has no clue about the data in the stdin buffer. It will hang until the user hits return, and will then read the next lot of data, returning up to 4 bytes in the buffer (no null termination unless it so happens that the fourth character is an ASCII NUL '\0').
File
Actually, this isn't all that much different, except that instead of reading a line of data into the buffer, the standard I/O library will probably read an entire buffer full, (BUFSIZ bytes, which might be 512 or larger). It will then convert the 10 and leave the a for later use. (If the file is shorter than the buffer size, it will all be read into the stdin buffer.)
The read will then collect the next 4 bytes from the file. If the whole file was read already, then it will return nothing — 0 bytes read.
You need to record and check the return value from read(). You should also check the return value from scanf() to ensure it did actually read a number.
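A minimal sketch of the same task done entirely through the stdio layer, with both return values checked. The function name read_number_and_text is hypothetical. Because fscanf and fread share the stream's buffer, the characters left over after the number are not lost:

```c
#include <stdio.h>

/* Parse a decimal number from fp, then read up to size - 1 of the
 * characters that follow it into buf and null-terminate them.
 * Returns the number of characters stored, or -1 if no number was found.
 * (read_number_and_text is an illustrative name.) */
int read_number_and_text(FILE *fp, int *number, char *buf, size_t size)
{
    size_t n;

    if (size == 0 || fscanf(fp, "%d", number) != 1)
        return -1;                      /* no number could be converted */
    n = fread(buf, 1, size - 1, fp);    /* same stream, same buffer */
    buf[n] = '\0';
    return (int)n;
}
```

With the input 10abcd and a 5-byte buffer, the number is 10 and the text is abcd.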
Try man read first.
read is declared as ssize_t read(int fd, void *buf, size_t count);
and stdin is declared as FILE *. That's the issue. Use fread() instead and you will be sorted.
#include <stdio.h>
int main()
{
char buf[100]={'\0'};
int data=0;
scanf("%d",&data);
fread(buf, 1, 4, stdin); // reads from the same buffered stream that scanf used
printf("buffer is %s\n",buf);
return 0;
}
EDIT: Your understanding is almost correct, but not totally.
To address your question properly, I agree with Jonathen Laffer.
How your code works:
1) scanf places 10 in data.
2) abcd is still in the stdin buffer when you press ENTER.
3) read() then waits for input again, and you have to press ENTER again for the program to continue.
4) If you typed anything before pressing ENTER the second time, printf prints it; otherwise you get nothing on output beyond the text of your printf statement.
That's why I asked you to use fread instead. Hope it helps.
We know that stdin is, by default, a buffered input; the proof of that is in usage of any of the mechanisms that "leave data" on stdin, such as scanf():
#include <stdio.h>
int main()
{
char c[10] = {'\0'};
scanf("%9s", c);
printf("%s, and left is: %d\n", c, getchar());
return 0;
}
./a.out
hello
hello, and left is: 10
10 being newline of course...
I've always been curious, is there any way to "peek" at the stdin buffer without removing whatever may reside there?
EDIT
A better example might be:
scanf("%9[^.]", c);
With an input of "at.ct", now I have "data" (ct\n) left on stdin, not just a newline.
Portably, you can get the next character in the input stream with getchar() and then push it back with ungetc(), which results in a state as if the character wasn't removed from the stream.
The ungetc function pushes the character specified by c (converted to an unsigned char) back onto the input stream pointed to by stream. Pushed-back characters will be returned by subsequent reads on that stream in the reverse order of their pushing.
Only one character of pushback is guaranteed by the standard, but usually, you can push back more.
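The getc/ungetc approach can be sketched as a one-character peek. The function name peek_char is hypothetical. It relies only on the single character of pushback that the standard guarantees:

```c
#include <stdio.h>

/* Return the next character of fp without consuming it,
 * or EOF if nothing more can be read.
 * (peek_char is an illustrative name.) */
int peek_char(FILE *fp)
{
    int ch = fgetc(fp);

    if (ch != EOF)
        ungetc(ch, fp);     /* one character of pushback is guaranteed */
    return ch;
}
```

Like getchar(), this blocks on a terminal until input is available, so it peeks at the next character rather than testing whether one is waiting.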
As mentioned in the other answers and in the comments there, in practice you can almost certainly peek at the buffer if you provide your own buffer with setvbuf, although that is not without problems:
If buf is not a null pointer, the array it points to may be used instead of a buffer allocated by the setvbuf function
that leaves the possibility that the provided buffer may not be used at all.
The contents of the array at any time are indeterminate.
that means you have no guarantee that the contents of the buffer reflects the actual input (and it makes using the buffer undefined behaviour if it has automatic storage duration, if we're picky).
However, in practice the principal problem would be finding out where in the buffer the not-yet-consumed part of the buffered input begins and where it ends.
If you want to look at the stdin buffer without changing it, you could tell it to use another buffer with setbuf, using an array you can access:
char buffer[BUFSIZ];
setbuf(stdin, buffer); // setbuf returns void, so there is no result to check
getchar();
printf("%.15s\n", buffer); // print at most 15 bytes; the buffer is not null-terminated
This lets you see somewhat more than ungetc does, but I don't think you can go further in a portable way.
Actually, this compiles and runs, but the standard does not guarantee the result; quoting what it says about setvbuf (setbuf has the same behavior):
The contents of the array at any time are indeterminate.
So this is not what you need if you're looking for complete portability and standard-compliance, but I can't imagine why the buffer should not contain what is expected. However, it seems to work on my computer.
Beware that you have to provide an array of at least BUFSIZ characters to setbuf, and you must not do any I/O operation on the stream before it. If you need more flexibility, take a look at setvbuf.
You could set your own buffer with setvbuf on stdin, and peek there whenever you want.