How can I tell if the next call to fgets(..., stdin) will block or not? In other words, how can I tell if the stdin buffer has a newline waiting to be read?
On Unix, I know I can use select() like this:
fd_set reads;
FD_ZERO(&reads);
FD_SET(fileno(stdin), &reads);
int s = select(fileno(stdin)+1, &reads, 0, 0, 0);
if (s > 0) {
    // fgets is ready
}
However, select() on Windows only works with sockets, not with 'stdin', so I cannot use it.
I also know on Unix that I can use poll(), ioctl(0, I_NREAD...), and probably a lot of other solutions. None of these work on Windows.
I have tried kbhit() and WaitForSingleObject(GetStdHandle(STD_INPUT_HANDLE), ...). The problem is that both of these indicate that input is available as soon as the first key is struck. I need to know if a whole line is available, because fgets() blocks for an entire line.
Perhaps my issue is that Unix shells tend to buffer entire input lines, while Windows doesn't?
Should I just use fgetc() to build up a buffer until I see a newline?
I've researched other answers, but none of them work for me. They either use C++, whereas I need a C solution, or they focus on using fgets() with sockets, whereas I need to use it with stdin.
Any help is greatly appreciated. Thanks!
How can I tell if the next call to fgets(..., stdin) will block or not? In other words, how can I tell if the stdin buffer has a newline waiting to be read?
Generally speaking, you cannot tell. Not on POSIX systems, either, without making some assumptions. Both POSIX and the Windows API define mechanisms for determining whether input is available, but that's not enough for you. You want to determine whether specific data (i.e. a line terminator) are available to be read, and the only way C defines for doing that is to read the data.
Therefore, if you really need to read a line at a time without blocking your main thread, then I suggest performing your reads asynchronously. You could roll your own, with a reader thread separate from your main one, but you might find that Microsoft's existing asynchronous I/O API supports your needs.
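Whichever mechanism delivers the bytes (a reader thread, or single characters fetched when kbhit() reports a keystroke), the "is a whole line available?" test can be done on your own buffer instead of on stdin. Below is a minimal sketch of that idea in portable C. The linebuf type and the linebuf_feed / linebuf_take names are made up for illustration, and the fixed 1024-byte capacity is an arbitrary assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: collects bytes until a full line is present. */
typedef struct {
    char   data[1024];   /* arbitrary capacity chosen for this sketch */
    size_t len;          /* number of valid bytes in data */
} linebuf;

/* Append up to n bytes; returns how many were actually stored. */
size_t linebuf_feed(linebuf *lb, const char *src, size_t n)
{
    size_t room = sizeof lb->data - lb->len;
    if (n > room)
        n = room;
    memcpy(lb->data + lb->len, src, n);
    lb->len += n;
    return n;
}

/* If a complete line is buffered, copy it (including the '\n') to out as a
   C string, remove it from the buffer and return 1. Otherwise return 0.
   This never blocks, because it only looks at bytes already received. */
int linebuf_take(linebuf *lb, char *out, size_t outsz)
{
    char *nl = memchr(lb->data, '\n', lb->len);
    if (nl == NULL)
        return 0;                      /* no newline yet */
    size_t n = (size_t)(nl - lb->data) + 1;
    if (n + 1 > outsz)
        return 0;                      /* caller's buffer too small; line stays queued */
    memcpy(out, lb->data, n);
    out[n] = '\0';
    memmove(lb->data, lb->data + n, lb->len - n);
    lb->len -= n;
    return 1;
}
```

The main loop would call linebuf_feed with whatever input has arrived and then poll linebuf_take; only when it returns 1 is there a line to process.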
Related
This seems like a simple question, but I have had a really hard time finding an answer. I am writing a program in C where stdin filling up seems possible (though remotely so) on some systems, as it appears there are situations where stdin has a buffer of only 4k.
So, my question is: is there a standard way an OS deals with stdin filling up (i.e., a de facto standard, a POSIX requirement, etc.)? How predictable is the outcome, if there is in fact some sort of standard way to deal with the situation?
The OS will have a buffer that stores the unread stdin input. In general, things writing to stdin will be using blocking calls, so if the buffer fills up they will simply stall until room is available, and no data will be lost. If this behaviour is undesirable (you don't want to block the writer), then you need to make sure you read from the buffer quickly enough that it doesn't fill up.
One thing you could do is create a worker thread that simply sits in a tight loop reading stdin as fast as it can and puts the data somewhere else (in a much larger buffer, for example); the main program then accesses the data from your new buffer rather than reading from stdin itself.
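To see the "writer stalls when the buffer is full" behaviour concretely, here is a small POSIX-only sketch. It assumes stdin is fed through a pipe, so it creates a pipe that nobody reads, switches the write end to non-blocking mode (so that "would stall" shows up as EAGAIN instead of hanging), and counts how many bytes the kernel accepts. The function name pipe_capacity is invented for this example, and the number it returns is system-dependent.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Fill a fresh pipe that nobody reads and return how many bytes the kernel
   accepted before a further write would have blocked (-1 on error). */
long pipe_capacity(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Non-blocking mode turns "the writer would stall" into EAGAIN. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    char chunk[512] = {0};
    long total = 0;
    int full = 0;
    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;                   /* interrupted by a signal: retry */
        full = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
        break;
    }
    close(fds[0]);
    close(fds[1]);
    return full ? total : -1;
}
```

A blocking writer would simply sleep at the point where this sketch sees EAGAIN, and resume once the reader drains some data.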
It's often said that one shouldn't use C standard I/O functions (like fprintf(), fscanf()) when working with sockets.
I can't understand why. If the reason were just their buffered nature, one could simply flush the output buffer after each write, right?
Why does everyone use UNIX I/O functions instead? Are there any situations when the use of standard C functions is appropriate and correct?
You can certainly use stdio with sockets. You can even write a program that uses nothing but stdin and stdout, run it from inetd (which provides a socket on STDIN_FILENO and STDOUT_FILENO), and it works even though it doesn't contain any socket code at all.
What you can't do is mix buffered I/O with select or poll because there is no fselect or fpoll working on FILE *'s and you can't even implement one yourself because there's no standard way of querying a FILE * to find out whether its input buffer is empty.
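The mismatch between select and a FILE * buffer can be demonstrated with a pipe standing in for a socket. The sketch below is POSIX-only, and the function names are invented for illustration. It writes two lines, reads the first one with fgets, and then asks select about the descriptor. On typical implementations the first fgets has already pulled both lines into the FILE buffer, so select reports "not readable" even though a second fgets would return immediately.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now according to select(), 0 if not,
   -1 on error. A zero timeout makes select() poll instead of wait. */
int fd_readable_now(int fd)
{
    fd_set reads;
    struct timeval no_wait = {0, 0};
    FD_ZERO(&reads);
    FD_SET(fd, &reads);
    return select(fd + 1, &reads, NULL, NULL, &no_wait);
}

/* Puts two lines into a pipe, reads the first through stdio, then asks
   select() about the descriptor. Returns select()'s answer and stores the
   second line, which stdio can still deliver, in `second`. */
int stdio_hides_data(char *second, size_t n)
{
    int fds[2];
    char first[32];
    if (pipe(fds) != 0)
        return -1;
    FILE *in = fdopen(fds[0], "r");
    if (in == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (write(fds[1], "one\ntwo\n", 8) != 8 ||
        fgets(first, sizeof first, in) == NULL) {
        fclose(in);
        close(fds[1]);
        return -1;
    }
    /* Both lines now sit in the FILE buffer; the pipe itself is empty. */
    int ready = fd_readable_now(fds[0]);
    if (fgets(second, (int)n, in) == NULL)
        second[0] = '\0';
    fclose(in);
    close(fds[1]);
    return ready;
}
```

An event loop driven by select would therefore go to sleep with an unprocessed line still sitting in user space.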
As soon as you need to handle multiple connections, stdio is not good enough.
It's totally fine when you have a simple scenario with one socket in blocking mode and your application protocol is text-based.
It quickly becomes a huge pain with more than one socket or with non-blocking sockets, with any sort of binary encoding, and with any real performance requirements.
I don't know of any direct objection. Most likely this will work fine.
At the same time, I can imagine a platform where fprintf() and fscanf() have their own buffers sitting above the file descriptor layer. You may not be able to flush these buffers.
It is hard to speak about all possible platforms. This means that it is better to avoid this with sockets.
At the end of the day the app program should solve the app problem. It should not be a compiler/library test.
It's because sockets (TCP sockets, for example) are readable and writable as if they were files or pipes, but this is just an abstraction. The inner workings of a network connection are much more complicated than a local file or pipe.
To start with, reading a file is always "fast": either you get the data or you hit end-of-file. On the other hand, if you expect 500 bytes from a TCP connection and the peer sends 499 (and the connection is not closed), you may be waiting forever. Writing is the same: it will block once the TCP output buffer is full.
Even the most basic program needs to handle timeouts and disconnection, and all these things interact with FILE's own buffered I/O; not even textbook examples could be expected to work well.
I'm working on a little program that needs to pipe binary streams very closely (unbuffered). It has to rely on select() multiplexing and is never allowed to "hold existing input unless more input has arrived, because it's not worth it yet".
It's possible using system calls, but I would like to use stdio for convenience (string formatting is involved, too).
Can I safely use select() on a stream's underlying file descriptor as long as I'm using unbuffered stdio? If not, how can I determine a FILE stream that will not block from a set?
Is there any call that transfers all input from libc to the application, besides the char-by-char functions (getchar() and friends)?
While I'm not entirely clear on whether it's sanctioned by the standards, using select on fileno(f) should in practice work when f is unbuffered. Keep in mind, however, that unbuffered stdio can perform pathologically badly, and that you are not allowed to change the buffering except as the very first operation before you use the stream at all.
If your only concern is being able to do formatted output, the newly-standardized-in-POSIX-2008 dprintf (and vdprintf) function might be a better solution to your problem.
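As a rough illustration of the dprintf route: the formatted text goes to the descriptor before the call returns, so there is no FILE buffer for select to be unaware of. The helper name send_status and its "name=value" record format are assumptions made for this sketch only.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: format one "name=value" record straight onto a
   file descriptor. Returns the number of bytes written, or a negative
   value on error, exactly as dprintf() reports it. */
int send_status(int fd, const char *name, int value)
{
    return dprintf(fd, "%s=%d\n", name, value);
}
```

The same descriptor can then be handed to select for writability checks, with stdio used for nothing but the formatting.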
I'm reading from /proc/pid/task/stat to keep track of cpu usage in a thread.
fopen on /proc/pid/task/stat
fgets a string from the stream
sscanf on the string
However, I am having trouble getting the stream's buffer to update.
If I fgets 1024 characters it refreshes, but if I fgets 128 characters it never updates and I always get the same stats.
I rewind the stream before the read and have tried fsync.
I do this very frequently, so I'd rather not reopen the file each time.
What is the right way to do this?
Not every program benefits from the use of buffered I/O.
In your case, I think I would just use read(2) [1]. This way, you:
eliminate all stale buffer [2] issues
probably run faster via the elimination of double buffering
probably use less memory
definitely simplify the implementation
For a case like the one you describe, the efficiency gain may not matter on today's remarkably powerful CPUs. But I will point out that programs like cp(1) and other heavy-duty data movers don't use buffered I/O packages.
[1] That is, open(2), read(2), lseek(2), and close(2).
[2] And perhaps to pre-empt an argument: on questions related to this one, someone usually offers a "helpful" suggestion along the lines of fflush(stdin), and then someone else comes along to accurately point out that fflush() is defined by C99 on output streams only, and that it's usually unwise to depend on implementation-specific behavior.
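A minimal sketch of the unbuffered approach, assuming the descriptor is opened once with open(2) and kept open between samples. The names read_fresh and sample_leading_int are invented here, and the parse step only extracts the leading integer as a stand-in for the real sscanf format. It also assumes one read(2) call returns the whole file, which is reasonable for a short stat line but not guaranteed in general.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <unistd.h>

/* Re-read a file from its start through an already-open descriptor.
   Returns the number of bytes stored in buf (NUL-terminated), or -1. */
ssize_t read_fresh(int fd, char *buf, size_t cap)
{
    if (cap == 0 || lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;
    ssize_t n = read(fd, buf, cap - 1);   /* no user-space cache involved */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}

/* Example consumer: take a fresh sample and parse its leading integer.
   Returns 0 on success, -1 if nothing could be read or parsed. */
int sample_leading_int(int fd, long *out)
{
    char buf[1024];
    if (read_fresh(fd, buf, sizeof buf) <= 0)
        return -1;
    return sscanf(buf, "%ld", out) == 1 ? 0 : -1;
}
```

Because every sample starts with lseek to offset 0 followed by a real read, there is no stale copy left over from the previous sample.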
Is there a libc function that does the same thing as getline, but works with a connected socket instead of a FILE * stream?
A workaround would be to call fdopen on the socket. What should I take care of when doing so? What are the reasons to do it, or not to do it?
One obvious reason to do it is to be able to call getline and co, but maybe it is a better idea to write a custom getline?
When you call read on a socket, it can return fewer bytes than you asked for.
For example,
read(fd, buf, bufsize)
can return a value less than bufsize if fewer than bufsize bytes have arrived so far.
In that case you may need to call read again, until it returns zero (end of stream) or a negative result (error).
Thus it is best to avoid stdio functions. You need to create wrappers around read that call it repeatedly in order to get bufsize bytes reliably. A wrapper should return zero only when no more bytes can be read from the socket, as if the file were being read from the local disk.
you can find wrappers in the book Computer Systems: A Programmer's Perspective by Randal Bryant.
The source code is available at this site. look for functions beginning with rio_.
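The looping wrapper this answer describes might look roughly like the following. This is not the book's code: read_fully is a made-up name, and the sketch only shows the retry-until-complete idea for a blocking descriptor.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <unistd.h>

/* Keep calling read() until n bytes have arrived, end-of-file is reached,
   or a real error occurs. Returns the number of bytes read (less than n
   only at end-of-file) or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;
    while (left > 0) {
        ssize_t got = read(fd, p, left);
        if (got < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (got == 0)
            break;              /* peer closed: no more data will come */
        p += got;
        left -= (size_t)got;
    }
    return (ssize_t)(n - left);
}
```

Note that this still blocks for as long as the peer keeps the connection open without sending, so it does not by itself address slow or malicious senders.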
If the socket is connected to untrusted input, be prepared for arbitrary input within an arbitrary time frame:
a \0 character before \r\n
waiting forever for either \r or \n
any other potentially ugly thing
One way to address the arbitrary timing and arbitrary data would be to provide timeouts on the reads e.g. via select(2) and feed the data you actually receive to some well-written state machine byte by byte.
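A minimal sketch of that approach, under stated assumptions: the function name read_line_deadline, the result codes, and the policy choices (one overall deadline per line, reject NUL bytes, reject lines longer than the caller's buffer) are all invented for illustration. It reads one byte per read(2) call, which is simple but slow; a real implementation would probably read in chunks and keep the remainder.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stddef.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

enum { LINE_TIMEOUT = -2, LINE_BAD = -3 };   /* hypothetical result codes */

static long ms_now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Read one '\n'-terminated line from fd, allowing timeout_ms for the whole
   line. On success returns the line length with the terminator stripped
   and buf NUL-terminated. Returns -1 on error or EOF before a newline,
   LINE_TIMEOUT if the deadline passes, LINE_BAD for a NUL byte or a line
   that does not fit in buf. */
long read_line_deadline(int fd, char *buf, size_t cap, long timeout_ms)
{
    long deadline = ms_now() + timeout_ms;
    size_t len = 0;
    if (cap == 0)
        return -1;
    for (;;) {
        long left = deadline - ms_now();
        if (left <= 0)
            return LINE_TIMEOUT;
        fd_set reads;
        struct timeval tv = { left / 1000, (left % 1000) * 1000 };
        FD_ZERO(&reads);
        FD_SET(fd, &reads);
        int rc = select(fd + 1, &reads, NULL, NULL, &tv);
        if (rc == 0)
            return LINE_TIMEOUT;       /* nothing arrived in time */
        if (rc < 0)
            return -1;
        char c;
        ssize_t n = read(fd, &c, 1);
        if (n <= 0)
            return -1;                 /* error, or peer closed mid-line */
        if (c == '\n') {
            if (len > 0 && buf[len - 1] == '\r')
                len--;                 /* accept \r\n as well as \n */
            buf[len] = '\0';
            return (long)len;
        }
        if (c == '\0' || len + 1 >= cap)
            return LINE_BAD;           /* embedded NUL or line too long */
        buf[len++] = c;
    }
}
```

Using a single deadline for the whole line, rather than a fresh timeout per byte, means a sender cannot keep the call alive indefinitely by trickling one byte at a time.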
The problem would be if you don't receive the new line (\n or \r\n, depends on your implementation) the program would hang. I'd write your own version that also makes calls to select() to check if the socket is still read/writable and doesn't have any errors. Really there would be no way to tell if another "\n" or "\r\n" is coming so make sure you know that the data from the client/server will be consistent.
Imagine you coded a webserver that reads the headers using getline(). If an attacker simply sent
GET / HTTP/1.1\r\n
This line isn't terminated: bla
The call to getline would never return and the program would hang, probably costing you resources and eventually making a DoS possible.