forcing fgets to block (i.e. faking an "interactive device") - c

I have a C application which provides a "shell" for entering commands. I'm trying to write some automated test-code for the application (Using CUnit). The "shell" input is read from stdin like so:
fgets(buf, sizeof(buf), stdin);
I can "write" commands automatically to the application by calling freopen() on stdin and hooking it to an intermediate file. When the application is executed "normally", the fgets() call blocks until characters are available because stdin is "an interactive device", but it does not block on the intermediate file. So how can I fake fgets into thinking the intermediate file is an "interactive device"?
The C program is for Windows (XP) compiled using MinGW.
Regards!

fgets does not block when you are reading from a file because it reaches the end of the file, which sets the end-of-file indicator on the stream, so subsequent calls to fgets return immediately. When you are running with interactive input, EOF is never set unless you type Ctrl-Z (or Ctrl-D on a UNIX system).
If you really want to use an intermediate file, I think you'll need to enhance your shell so that when it hits EOF it clears the indicator and retests after a suitable wait. A function like this should work:
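#include <stdio.h>
#include <unistd.h>   /* sleep() is POSIX; with MinGW, Sleep(1000) from <windows.h> is the usual equivalent */

/* If the stream is at EOF, clear the indicator and pause so the next read can retry. */
void waitForEofClear(FILE *f)
{
    while (feof(f)) {
        clearerr(f);   /* resets both the EOF and error indicators */
        sleep(1);      /* give the writer time to append more data */
    }
}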
You could then call this before the fgets:-
waitForEofClear(stdin);
fgets(buf, sizeof(buf), stdin);

Simply using a file is not going to work, as the other answers have indicated. So, you need to decide what you are going to do instead. A FIFO (named pipe) or plain (anonymous) pipe could be used to feed the interactive program under test - or, on Unix, you could use a pseudo-tty. The advantage of all these is that a program blocks when there is no data to read, waiting for the next information to arrive, rather than immediately deciding 'no data to read, must be EOF'.
You will then need a semi-intelligent (or even intelligent) program periodically writing data to the channel for the program under test to read. This program will need to know how long to pause between the messages it writes. This might be as simplistic as 'wait one second; write the next line of data'. Or you might do something more complex.
One scheme that I know of has two programs. The first is a capture program that records what a user types and the timing of it, so the 'data' file is structured: each record consists of a delay (in seconds and fractions of a second) plus a set of characters to send (a count and a list of bytes). It is run to capture what the user types and record it, as well as to send the data to the program. The second is a replay program that reads the file and interprets the delays and character sequences.
This scheme works adequately if the input sequence is stable; if the same sequence of key strokes is always needed to get the required result. If the data sent to the program needs to adapt to what the program under test is doing and its responses, and may do different things at different times, then you are probably better off going with 'expect'. This has the capacity to do whatever you need - at least for non-GUI programs.
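As a rough illustration of the pipe approach, here is a minimal sketch. It assumes a POSIX system, so it does not apply directly to the MinGW/Windows setup in the question. The name `open_delayed_feeder` and its parameters are hypothetical. A forked child plays the "semi-intelligent" writer and pauses between lines; the parent gets a `FILE *` on which fgets blocks until the next line arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that writes each line into a pipe,
 * sleeping delay_sec seconds before each one, and return the read end of
 * the pipe wrapped in a FILE*. Returns NULL on failure.
 * The child is not reaped here; a real harness would call waitpid(). */
FILE *open_delayed_feeder(const char *const *lines, size_t count,
                          unsigned delay_sec)
{
    int fds[2];
    if (pipe(fds) != 0)
        return NULL;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return NULL;
    }

    if (pid == 0) {
        /* Child: the writer side. */
        close(fds[0]);
        for (size_t i = 0; i < count; i++) {
            sleep(delay_sec);
            if (write(fds[1], lines[i], strlen(lines[i])) < 0)
                break;
        }
        close(fds[1]);   /* closing the write end is what produces EOF */
        _exit(0);
    }

    /* Parent: the reader side (the program under test). */
    close(fds[1]);
    FILE *in = fdopen(fds[0], "r");
    if (in == NULL)
        close(fds[0]);
    return in;
}
```

Reading from the returned stream with fgets(buf, sizeof(buf), in) blocks while the child is sleeping and returns NULL only once the child has closed its end of the pipe.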

I'm not sure what the Windows equivalent is, but on Linux I would make the intermediate file a FIFO. If I were going to do any real, non-trivial autopiloting, I would wrap it in an expect script.

Related

Guarantee that getchar receives newline or EOF (eventually)?

I would like to read characters from stdin until one of the following occurs:
an end-of-line marker is encountered (the normal case, in my thinking),
the EOF condition occurs, or
an error occurs.
How can I guarantee that one of the above events will happen eventually? In other words, how do I guarantee that getchar will eventually return either \n or EOF, provided that no error (in terms of ferror(stdin)) occurs?
// (How) can we guarantee that the LABEL'ed statement will be reached?
int c;
int done = 0;
while (!0)
    if ((c = getchar()) == EOF || ferror(stdin) || c == '\n')
        break;
LABEL: done = !0;
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur. It seems like the answer will have to do with the properties of the device. Where can those details be found (in the documentation for compiler, device firmware, or device hardware perhaps)?
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition. Similarly for files stored on disc / SSD.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I am using C89, but I am curious if the answer depends on which C standard is used.
You can't.
Let's say I run your program, then I put a weight on my keyboard's "X" key and go on vacation to Hawaii. On the way there, I get struck by lightning and die.
There will never be any input other than 'x'.
Or, I may decide to type the complete story of Moby Dick, without pressing enter. It will probably take a few days. How long should your program wait before it decides that maybe I won't ever finish typing?
What do you want it to do?
Looking at all the discussion in the comments, it seems you are looking in the wrong place:
It is not a matter of keyboard drivers or wrapping stdin.
It is also not a matter of what programming language you are using.
It is a matter of the purpose of the input in your software.
Basically, it is up to you as a programmer to know how much input you want or need, and then decide when to stop reading input, even if valid input is still available.
Note, that not only are there devices that can send input forever without triggering EOF or end of line condition, but there are also programs that will happily read input forever.
This is by design.
Common examples can be found in POSIX style OS (like Linux) command line tools.
Here is a simple example:
cat /dev/urandom | hexdump
This will print random numbers for as long as your computer is running, or until you hit Ctrl+C
Though cat will stop working when there is nothing more to print (EOF or any read error), it does not expect such an end, so unless there is a bug in the implementation you are using it should happily run forever.
So the real question is:
When does your program need to stop reading characters and why?
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur.
A device such as /dev/zero, for example. Yes, stdin can be connected to a device that never provides a newline or reaches EOF, and that is not expected ever to report an error condition.
It seems like the answer will have to do with the properties of the device.
Indeed so.
Where can those details be found (in the documentation for compiler, device firmware, or device hardware perhaps)?
Generally, it's a question of the device driver. And in some cases (such as the /dev/zero example) that's all there is anyway. Generally drivers do things that are sensible for the underlying hardware, but in principle, they don't have to.
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition.
No. Generally speaking, an end-of-line marker is sent by a terminal device if and only if the <enter> key is pressed. An end-of-file condition might be signaled if the terminal disconnects (but the program continues), or if the user explicitly causes one to be sent (by typing <Ctrl>-<D> on Linux or Mac, for example, or <Ctrl>-<Z> on Windows). Neither of those events need actually happen on any given run of a program, and it is very common for the latter not to.
Similarly for files stored on disc / SSD.
You can generally rely on data read from an ordinary file to contain newlines where they are present in the file itself. If the file is open in text mode, then the system-specific text line terminator will also be translated to a newline, if it differs. It is not necessary for a file to contain any of those, so a program reading from a regular file might never see a newline.
You can rely on EOF being signaled when a read is attempted while the file position is at or past the end of the file's data.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I think you're trying too hard.
Reading to end-of-line might be a reasonable thing to do in some cases. Expecting a newline to eventually be reached is reasonable if the program is intended to support interactive use. But trying to ensure that invalid data cannot be fed to your program is a losing cause. Your objective should be to accept the widest array of inputs you reasonably can, and to fail gracefully when other inputs are presented.
If you need to read input in a line-by-line mode then by all means do that, and document that you do it. If only the first n characters of each line are significant to the program then document that, too. Then, if your program never terminates when a user connects its input to /dev/zero that's on them, not on you.
On the other hand, try to avoid placing arbitrary constraints, especially on sizes of things. If there is not a natural limit on the size of something, then no artificial limit you introduce will ever be enough.
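The use case from the question (keep the first few characters of a line, discard the rest) could be sketched as follows. This is only an illustration, and `read_line_prefix` is a hypothetical name. As discussed above, it documents its line-by-line behaviour instead of trying to guarantee that a newline ever arrives: if the input never delivers '\n' or EOF, it keeps reading.

```c
#include <stdio.h>

/* Hypothetical helper: store up to size-1 characters of the current line in
 * buf (always NUL-terminated when size > 0) and discard the remainder of the
 * line, up to and including '\n' or up to EOF.
 * Returns the number of characters stored, or -1 if EOF or a read error
 * occurred before any character was read. */
int read_line_prefix(FILE *in, char *buf, size_t size)
{
    size_t n = 0;
    int got_any = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        got_any = 1;
        if (c == '\n')
            break;
        if (n + 1 < size)
            buf[n++] = (char)c;   /* significant prefix */
        /* otherwise: character is read and dropped */
    }
    if (size > 0)
        buf[n] = '\0';
    return got_any ? (int)n : -1;
}
```

A caller that gets -1 can then use feof(in) and ferror(in) to tell end-of-file apart from an error.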

Can I determine how many bytes are in the stdio userspace read buffer associated with a FILE?

I'm writing a C program that connects to another machine over a TCP socket and reads newline-delimited text over that TCP connection.
I use poll to check whether data is available on the file descriptor associated with the socket, and then I read characters into a buffer until I get a newline. However, to make that character-by-character read efficient, I'm using a stdio FILE instead of using the read system call.
When more than one short line of input arrives over the socket quickly, my current approach has a bug. When I start reading characters, stdio buffers several lines of data in userspace. Once I've read one line and processed it, I then poll the socket file descriptor again to determine whether there is more data to read.
Unfortunately, that poll (and fstat, and every other method I know to get the number of bytes in a file) don't know about any leftover data that is buffered in userspace as part of the FILE. This results in my program blocking on that poll when it should be consuming data that has been buffered into userspace.
How can I check how much data is buffered in userspace? The specs specifically tell you not to rely on setvbuf for this purpose (the representation format is undefined), so I'm hoping for another option.
Right now, it seems like my best option is to implement my own userspace buffering where I have control over this, but I wanted to check before going down that road.
EDIT:
Some comments did provide a way to test if there is at least one character available by setting the file to be nonblocking and trying to fgetc/ungetc a single character, but this can't tell you how many bytes are available.
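For comparison, the "own userspace buffering" option mentioned above might look roughly like the sketch below. It assumes a POSIX file descriptor, and every name in it (`struct linebuf`, `lb_fill`, `lb_pop_line`, `lb_pending`) is hypothetical. Because the buffer belongs to the program, the number of leftover bytes is always known, so the caller can drain complete lines before going back to poll.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical line buffer layered directly on a file descriptor. */
struct linebuf {
    int fd;
    size_t len;          /* bytes currently buffered in userspace */
    char data[4096];
};

/* Bytes read from the fd but not yet consumed as lines. */
size_t lb_pending(const struct linebuf *lb)
{
    return lb->len;
}

/* Do one read() into the free part of the buffer.
 * Returns read()'s result, or -1 if the buffer is already full. */
ssize_t lb_fill(struct linebuf *lb)
{
    if (lb->len == sizeof lb->data)
        return -1;
    ssize_t n = read(lb->fd, lb->data + lb->len, sizeof lb->data - lb->len);
    if (n > 0)
        lb->len += (size_t)n;
    return n;
}

/* Copy the next complete line (without its '\n') into out.
 * Returns 1 if a line was extracted, 0 if no complete line is buffered.
 * Lines longer than outsz-1 characters are truncated. */
int lb_pop_line(struct linebuf *lb, char *out, size_t outsz)
{
    char *nl = memchr(lb->data, '\n', lb->len);
    if (nl == NULL || outsz == 0)
        return 0;

    size_t linelen = (size_t)(nl - lb->data);
    size_t copy = linelen < outsz - 1 ? linelen : outsz - 1;
    memcpy(out, lb->data, copy);
    out[copy] = '\0';

    lb->len -= linelen + 1;                 /* drop the line and its '\n' */
    memmove(lb->data, nl + 1, lb->len);     /* shift leftovers to the front */
    return 1;
}
```

The intended loop is: call lb_pop_line until it returns 0, and only then poll the descriptor and call lb_fill again.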

How to make fprintf() writes immediately

One way to write into a file is by using fprintf(). However, this function does not write the results into a file immediately. It rather seems to write everything at once when the program is terminated or finished.
My question is the following: I have a program that takes very long time to run (4-5 hours for big dataset). During this time, I want to see the intermediate results so that I don't have to wait for 5 hours. My university uses Sun Grid Engine for job submission. As most of you know, you have to wait until your job finishes to see your final results. Thus, I want to be able to write the intermediate results into a text file and see the updated results as the program is processing (Similarly if I am using printf).
How can I modify fprintf() to write anything I want immediately to the target file?
You can use the fflush function after each write to flush the output buffer to disk.
fprintf(fileptr, "writing to file\n");
fflush(fileptr);
If you're on a POSIX system (i.e. Linux, BSD, etc), and you really want to be sure the file is written to disk, i.e. you want to flush the kernel buffers as well as the userspace buffers, also use fsync:
fsync(fileno(fileptr));
But fflush should be sufficient. Don't bother with fsync unless you find that you need to.
Maybe you can set the FILE pointer to _IONBF (unbuffered) mode. Then you would not need to call fflush after each write.
FILE *pFilePointor = fopen(...);
setvbuf(pFilePointor, NULL, _IONBF, 0);
fprintf(...)
fprintf(...)
fflush
This works on a FILE *, which looks more appropriate for your case. Please note that fflush(NULL) flushes all open output streams and may be CPU intensive, so you may want to avoid fflush(NULL) for performance reasons.
fsync
This works on an int file descriptor. It flushes not only the file data but also its metadata to the storage device, so the data can survive a system crash or reboot. You can check the man page for more details.
Personally I use fflush, and it works fine for me (in Ubuntu / Linux).

How to exit Fread() after encountering a delimiter or after some time?

I am trying to achieve something here: I get data from a Linux system through a named pipe.
The data is sporadic and does not arrive at any fixed frequency. I have a server program in C which reads from the named pipe. My requirement is that I have to send the data out to another program as soon as I receive it from the client, but the fread() function just sits there until:
a)The buffer is full and it cannot read anymore (or)
b)The client closes the pipe.
The client sends every message with a "\0" delimiter, and the size of the messages can vary. My biggest question is how to break out of fread after a message has been read and a couple of seconds have passed. As it is, the call just sits in fread waiting for more data.
amountRead = fread(buffer+remaining, (size_t)1, (size_t)(BUFFER_SIZE-remaining), file);
Basically I am trying to understand if there is any way to break the FREAD after a certain amount of time (OR) based on a delimiter?
The most straightforward approach would be to implement your own buffering solution and use select and read, with a timeout passed to select. This would allow you to break off the read operation based on some time-based criteria.
As for exiting early on a delimiter character, that's not going to happen. fread is buffered and blocking, so it's going to wait until all the data it's requesting is available.
With select, you can have a dedicated thread waiting on data to be ready, and act on it, or wait for a timeout, etc.
Please refer to the references below for working examples.
References
How to implement a timeout in read function call?, Accessed 2014-06-25, <https://stackoverflow.com/questions/2917881/how-to-implement-a-timeout-in-read-function-call>
What is the difference between read() and fread()?, Accessed 2014-06-25, <https://stackoverflow.com/questions/584142/what-is-the-difference-between-read-and-fread>
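A minimal sketch of the select-plus-read idea, assuming a POSIX system and a raw file descriptor (for example one obtained with open() on the named pipe). The function name `read_with_timeout` and the `READ_TIMED_OUT` sentinel are hypothetical.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_TIMED_OUT (-2)   /* hypothetical sentinel, not a POSIX value */

/* Wait up to timeout_sec seconds for fd to become readable, then read
 * whatever is available (at most count bytes) with a single read().
 * Returns the number of bytes read, 0 on EOF, -1 on error,
 * or READ_TIMED_OUT if nothing arrived in time. */
ssize_t read_with_timeout(int fd, void *buf, size_t count, long timeout_sec)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return READ_TIMED_OUT;

    /* Unlike fread, read() returns as soon as some data is available;
     * it does not wait for count bytes. */
    return read(fd, buf, count);
}
```

Since read() may return a partial message or several messages at once, the caller still has to scan the accumulated bytes for the "\0" delimiter itself.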
Assuming the buffer size is 100 bytes and the delimiter is a comma (,):
fscanf(f, "%99[^,]", buffer);
If the delimiter or buffer size is not something you can easily hard-code, you can use snprintf to construct the format string programmatically, as in:
char fmt[10+3*sizeof(size_t)];
snprintf(fmt, sizeof fmt, "%%%zu[^%c]", sizeof buffer - 1, delim); /* width excludes the terminating '\0' */
fscanf(f, fmt, buffer);
Alternatively, you can loop calling getc until you get the delimiter. Depending on the implementation and the expected run length, this could be slower or faster than the fscanf method.
On POSIX 2008 conforming systems, you could alternatively use the getdelim function. This allows arbitrary input length (automatically allocated) which may be an advantage (ease of use and flexibility) or a disadvantage (bad input can exhaust all memory).
Edit: Sorry, I missed the part about needing a timeout. In that case using stdio is difficult, and you might be better off writing your own buffering system.
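For the delimiter part alone (not the timeout), the getdelim variant mentioned above could be sketched like this, assuming a POSIX.1-2008 system. `read_message` is a hypothetical wrapper name, and the call still blocks until the delimiter or EOF is seen.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical wrapper: read one message terminated by delim (or by EOF)
 * and return it as a malloc'd, NUL-terminated string without the trailing
 * delimiter. Returns NULL at EOF or on error. The caller frees the result. */
char *read_message(FILE *f, int delim)
{
    char *line = NULL;
    size_t cap = 0;

    ssize_t n = getdelim(&line, &cap, delim, f);
    if (n < 0) {
        free(line);   /* getdelim may have allocated even on failure */
        return NULL;
    }
    /* Strip the delimiter if the message ended with one. */
    if (n > 0 && (unsigned char)line[n - 1] == (unsigned char)delim)
        line[n - 1] = '\0';
    return line;
}
```

Passing '\0' as delim matches the framing described in the question; the buffer grows as needed, with the memory-exhaustion caveat noted above.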

How are files written? Why do I not see my data written immediately?

I understand the general process of writing and reading from a file, but I was curious as to what is happening under the hood during file writing. For instance, I have written a program that writes a series of numbers, line by line, to a .txt file. One thing that bothers me however is that I don't see the information written until after my c program is finished running. Is there a way to see the information written while the program is running rather than after? Is this even possible to do? This is a hard question to phrase in one line, so please forgive me if it's already been answered elsewhere.
The reason I ask this is because I'm writing to a file and was hoping that I could scan the file for the highest and lowest values (the program would optimally be able to run for hours).
Research buffering and caching.
There are a number of layers of optimisation performed by:
your application,
your OS, and
your disk driver,
in order to extend the life of your disk and increase performance.
With the careful use of flushing commands, you can generally make things happen "quite quickly" when you really need them to, though you should generally do so sparingly.
Flushing can be particularly useful when debugging.
The GNU C Library documentation has a good page on the subject of file flushing, listing functions such as fflush which may do what you want.
You observe an effect solely caused by the C standard I/O (stdio) buffers. I claim that any OS or disk driver buffering has nothing to do with it.
In stdio, I/O happens in one of three modes:
Fully buffered: data is written once BUFSIZ (from <stdio.h>) characters have accumulated. This is the default when I/O is redirected to a file or pipe. This is what you observe. Typically BUFSIZ is anywhere from 1k to several kBytes.
Line buffered: data is written once a newline is seen (or BUFSIZ is reached). This is the default when I/O is to a terminal.
Unbuffered, data is written immediately.
You can use the setvbuf() (<stdio.h>) function to change the default, using the _IOFBF, _IOLBF or _IONBF macros, respectively. See your friendly setvbuf man page.
In your case, you can set your output stream (stdout or the FILE * returned by fopen) to line buffered.
Alternatively, you can call fflush() on the output stream whenever you want I/O to happen, regardless of buffering.
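A minimal sketch of the line-buffered option; `open_line_buffered` is a hypothetical helper name. The one constraint to keep in mind is that setvbuf has to be called after the stream is opened and before any other operation on it.

```c
#include <stdio.h>

/* Hypothetical helper: open path for writing and switch the stream to
 * line buffering, so each completed line is handed to the OS right away.
 * Returns NULL on failure. */
FILE *open_line_buffered(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return NULL;

    /* Passing NULL lets the library allocate the buffer itself. */
    if (setvbuf(fp, NULL, _IOLBF, BUFSIZ) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

With a stream opened this way, output written with fprintf becomes visible to other processes each time a '\n' is written, without an explicit fflush call.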
Indeed, there are several layers between the writing functions and the actual file.
First, you open the file for writing. This causes the file to be either created or emptied. If you write then, the write doesn't actually occur immediately, but the data are cached until the buffer is full or the file is flushed or closed.
You can call fflush() for writing each portion of data, or you can actually wait until the file is closed.
Yes, it is possible to see what's written in the file(s). If you program under Linux, you can open a new terminal and watch the progress with, for example, "less Filename".
