Reading data from a child process one by one - C

I have code that creates two pipes for writing data to and reading data from 2 (to 4) child processes, each of which calls another program. That program is simply two printf calls, one printing - and another printing Done, both to stdout, which is connected to the read end of the pipe in the parent process.
In the parent process I read the data using
read(pipes[i][1][0], buffer, sizeof(buffer)-1);
The problem is that if I set the size of buffer to 4 (for example), the read() call reads -Do, which is not what I want, because I will call read() again afterwards.
If the size is 2 everything works fine, but only because I know the size of what I'm going to read; in the rest of the code I don't have that information.
I tried fflush(stdout) after each printf() in the child process but it doesn't work. I think this should be easy to solve but I cannot figure it out: is there a way to read the prints made by the child process one by one?

A sane way might be to use newline '\n' characters as separators.
Setting the buffer size to your exact message size is a brittle hack, in that it will break as soon as you add a new message with a different length.
In general, anything you expect to send over a stream-oriented connection (pipes, streams or TCP sockets) needs either a message header with a length field or a delimiter, to be reasonably easy to parse.
If you desperately want to treat each write as a discrete message, you could alternatively use a datagram socket, which actually behaves like this. You'd be looking for an AF_UNIX/SOCK_DGRAM socketpair in that case.
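If you do go the datagram route, here is a minimal sketch (assumptions: POSIX, message sizes smaller than the buffer). Each write() in the child arrives as exactly one datagram in the parent, so message boundaries are preserved:
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1) {
        perror("socketpair");
        return 1;
    }
    if (fork() == 0) {                 /* child: two separate writes */
        close(sv[0]);
        write(sv[1], "-", 1);
        write(sv[1], "Done", 4);
        _exit(0);
    }
    close(sv[1]);                      /* parent: one datagram per read */
    char buffer[64];
    for (int i = 0; i < 2; i++) {
        ssize_t n = read(sv[0], buffer, sizeof(buffer) - 1);
        if (n <= 0) break;
        buffer[n] = '\0';              /* read() does not NUL-terminate */
        printf("message %d: \"%s\"\n", i + 1, buffer);
    }
    return 0;
}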

You can set the size of the buffer to 5. For the first read(), read the "-" on its own by requesting exactly 1 byte:
read(pipes[i][1][0], buffer, 1);
And for the second read(), read the 4 characters of "Done":
read(pipes[i][1][0], buffer, 4);
The size of the buffer is set to 5 so there is room for a '\0' character at the end (read() does not add one; use its return value to place it yourself) if you want to do string operations to check what you read.

Related

Capturing stdout/stderr separately and simultaneously from child process results in wrong total order (libc/unix)

I'm writing a library that should execute a program in a child process, capture the output, and make the output available in a line-by-line (string vector) fashion. There is one vector for STDOUT, one for STDERR, and one for "STDCOMBINED", i.e. all output in the order it was printed by the program. The child process is connected via two pipes to the parent process: one pipe for STDOUT and one for STDERR. In the parent process I read from the read ends of the pipes; in the child process I dup2()'ed STDOUT/STDERR to the write ends of the pipes.
My problem:
I'd like to capture STDOUT, STDERR, and "STDCOMBINED" (=both in the order they appeared). But the order in the combined vector differs from the original order.
My approach:
I iterate until both pipes show EOF and the child process has exited. At each iteration I read exactly one line (or EOF) from STDOUT and exactly one line (or EOF) from STDERR. This works so far. But when I capture the lines as they arrive in the parent process, the order of STDOUT and STDERR is not the same as if I execute the program in a shell and look at the output.
Why is this so and how can I fix this? Is this possible at all? I know in the child process I could redirect STDOUT and STDERR both to a single pipe but I need STDOUT and STDERR separately, and "STDCOMBINED".
PS: I'm familiar with libc/unix system calls, like dup2(), pipe(), etc. Therefore I didn't post code. My question is about the general approach and not a coding problem in a specific language. I'm doing it in Rust against the raw libc bindings.
PPS: I made a simple test program, that has a mixup of 5 stdout and 5 stderr messages. That's enough to reproduce the problem.
At each iteration I read exactly one line (or EOF) from STDOUT and exactly one line (or EOF) from STDERR.
This is the problem. This will only capture the correct order if that was exactly the order of output in the child process.
You need to capture the asynchronous nature of the beast: make your pipe endpoints nonblocking, select* on the pipes, and read whatever data is present as soon as select returns. Then you'll capture the correct order of the output. Of course now you can't read "exactly one line": you'll have to read whatever data is available and no more, so that you won't block, and maintain a per-pipe buffer where you append new data, extract any complete lines, shove the unprocessed remainder to the beginning, and repeat. You could also use a circular buffer to save a little bit of memcpy-ing, but that's probably not very important.
Since you're doing this in Rust, I presume there's already a good asynchronous reaction pattern that you could leverage (I'm spoiled with Go, I guess, and project my hopes onto the unsuspecting).
*Always prefer platform-specific higher-performance primitives like epoll on Linux, /dev/poll on Solaris, pollset &c. on AIX
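A minimal sketch of that loop, assuming POSIX and that out_fd and err_fd are the read ends of your two pipes. It uses poll() rather than select(), reads whatever is available, and keeps a per-pipe buffer from which complete lines are extracted (error handling and the full-buffer-without-newline case are left out):
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BUFSZ 4096

struct linebuf { char data[BUFSZ]; size_t len; };

/* Extract every complete line from b and move the remainder to the front. */
static void drain_lines(struct linebuf *b, const char *tag) {
    char *nl;
    while ((nl = memchr(b->data, '\n', b->len)) != NULL) {
        size_t n = (size_t)(nl - b->data) + 1;
        printf("[%s] %.*s", tag, (int)n, b->data); /* or append to vectors */
        memmove(b->data, b->data + n, b->len - n);
        b->len -= n;
    }
}

void capture(int out_fd, int err_fd) {
    struct linebuf bufs[2];
    memset(bufs, 0, sizeof bufs);
    struct pollfd fds[2] = { { out_fd, POLLIN, 0 }, { err_fd, POLLIN, 0 } };
    int open_fds = 2;
    while (open_fds > 0) {
        if (poll(fds, 2, -1) < 0)
            break;
        for (int i = 0; i < 2; i++) {
            if (fds[i].fd < 0 || !(fds[i].revents & (POLLIN | POLLHUP)))
                continue;
            ssize_t n = read(fds[i].fd, bufs[i].data + bufs[i].len,
                             BUFSZ - bufs[i].len);
            if (n <= 0) {              /* EOF (or error): stop watching */
                fds[i].fd = -1;
                open_fds--;
            } else {
                bufs[i].len += (size_t)n;
                drain_lines(&bufs[i], i == 0 ? "stdout" : "stderr");
            }
        }
    }
}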
Another possibility is to launch the target process with LD_PRELOAD and a dedicated library that takes over glibc's POSIX write, detects writes to the pipes, and encapsulates such writes (and only those) in a packet by prepending a header that carries an (atomically updated) process-wide incrementing counter as well as the size of the write. Such headers can easily be decoded on the other end of the pipe to reorder the writes with a higher chance of success.
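A hypothetical sketch of such an interposer; the header layout and the decision to wrap only fds 1 and 2 are assumptions you would adapt. Build it with something like gcc -shared -fPIC preload.c -o preload.so -ldl and run the target under LD_PRELOAD=./preload.so:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

static atomic_uint_fast64_t seq;       /* process-wide write counter */

struct hdr { uint64_t seq; uint64_t size; };

ssize_t write(int fd, const void *buf, size_t count) {
    static ssize_t (*real_write)(int, const void *, size_t);
    if (!real_write)
        real_write = (ssize_t (*)(int, const void *, size_t))
                     dlsym(RTLD_NEXT, "write");
    if (fd == 1 || fd == 2) {           /* assumption: wrap only stdout/stderr */
        struct hdr h = { atomic_fetch_add(&seq, 1), count };
        char pkt[sizeof h + count];     /* one buffer so header+payload stay together */
        memcpy(pkt, &h, sizeof h);
        memcpy(pkt + sizeof h, buf, count);
        if (real_write(fd, pkt, sizeof h + count) < 0)
            return -1;
        return (ssize_t)count;          /* report what the caller asked for */
    }
    return real_write(fd, buf, count);
}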
I think it's not possible to strictly do what you want to do.
If you think about how it's done when running a command in an interactive shell, what happens is that both stdout and stderr point to the same file descriptor (the TTY), so the total ordering is correct by means of synchronization against the same file.
To illustrate, imagine what happens if the child process has 2 completely independent threads, one only writing to stderr, and the other only writing to stdout. The total ordering would depend on however the scheduler decided to schedule these threads, and if you wanted to capture that, you'd need to synchronize those threads against something.
And of course, something can write thousands of lines to stdout before writing anything to stderr.
There are 2 ways to relax your requirements into something workable:
Have the user pass a flag waiving separate stdout and stderr streams in favor of a correct stdcombined, and then redirect both to a single file descriptor (see the sketch after this list). You might need to change the buffering settings (like stdbuf does) before you execute the process.
Assume that stdout and stderr are "reasonably interleaved", an assumption pointed out by @Nate Eldredge, in which case you can use @Unslander Monica's answer.
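A sketch of the first option, assuming fd is the write end of a single pipe inherited by the child: pointing both standard streams at the same descriptor makes the kernel serialize the writes, so the combined order is authoritative.
#include <unistd.h>

/* called in the child, before exec'ing the target */
void redirect_combined(int fd) {
    dup2(fd, STDOUT_FILENO);   /* stdout -> pipe */
    dup2(fd, STDERR_FILENO);   /* stderr -> the same pipe */
    if (fd != STDOUT_FILENO && fd != STDERR_FILENO)
        close(fd);
}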

print to screen from fifo stdout failed

I have a program which has 2 children (each running another program via execl) and one FIFO.
I can't use printf, and I want both children to write to and read from the FIFO.
The problem is, I want only the first child to make sure that everything it writes to my FIFO is printed out to the screen. "fifoCommunication" is the name of the FIFO created by the father.
Here is the code inside the first child's process only:
int main() {
    int fd_write = open("fifoCommunication", O_WRONLY);
    dup(fd_write, 0);
    write(fd_write, "to be printed to screen!", 18);
}
I know it's not the right syntax, but I don't know how to make sure the message is printed to the screen properly, while also preventing the other child from printing messages to the screen, only to the FIFO.
I am afraid your requirements conflict with each other.
I want only first child to make sure that everything he writes to my FIFO will be printed out to the screen.
Therefore the FIFO must print to the console whatever it gets. But a FIFO doesn't distinguish between the processes that have written to it; it doesn't know whether it was the first or the second child that called write at this time¹.
preventing the other child to print messages to the screen, only to the FIFO
This contradicts the above: printing "only to the FIFO" must also print to the screen if the former requirement is to be met.
You can achieve what you want by printing separately to the FIFO and to stdout, as in the sketch below.
¹ (unless you change the kernel code to, for example, check the first byte of the message to be printed, so that you would prefix each write with '1' or '2' or whatever you choose and take the appropriate action in the kernel based on that; but then whatever happens to all the other uses of FIFOs on your machine is most likely nothing good, so don't do it)
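A minimal sketch of the print-separately approach (file name taken from the question; error handling omitted): the first child writes each message once to the FIFO and once to stdout.
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fd_write = open("fifoCommunication", O_WRONLY);
    const char *msg = "to be printed to screen!";
    write(fd_write, msg, strlen(msg));          /* once to the FIFO */
    write(STDOUT_FILENO, msg, strlen(msg));     /* once to the screen */
    close(fd_write);
    return 0;
}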

Passing multiple chunks of data using pipe in C

I need to send 3 char buffers to a child process, and I want to treat them as 3 separate chunks of data. I thought of using the read() and write() system calls, but after reading the man pages I can't see a way to separate the data: if I understand correctly, if I write 3 buffers one by one in the parent process, then one call to read() may read all the data at once. Of course I could put separators like '\0' in the input buffers and split the data in the child, but I'm looking for a more elegant way to do this. So, is there some kind of system call that makes it possible to pass the data sequentially?
One possibility is to use what stdio.h already gives you: fdopen() the respective ends of the pipes and use fgets()/fputs() with the FILE pointers. This assumes your data doesn't contain newlines.
Some alternatives could be to use fixed sizes with read()/write() or to use some other delimiter and parse the received data with strtok(). You could also send the size first so the child knows how many bytes to read in the next read() call. There are really lots of options.
If you have it, you can use O_DIRECT to get a "packet-oriented" pipe, but there are limitations of course.
In general, for a text-based streamed protocol, having separators is cleaner in my opinion.
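To illustrate the O_DIRECT route mentioned above (Linux-specific, kernel 3.4+): each write() of up to PIPE_BUF bytes becomes one packet, and each read() returns exactly one packet, so the three chunks stay three chunks. A minimal sketch:
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fds[2];
    if (pipe2(fds, O_DIRECT) == -1) {   /* fails on kernels without support */
        perror("pipe2");
        return 1;
    }
    if (fork() == 0) {                  /* child: one read per packet */
        close(fds[1]);
        char buf[512];
        ssize_t n;
        int i = 0;
        while ((n = read(fds[0], buf, sizeof buf)) > 0)
            printf("chunk %d: %.*s\n", ++i, (int)n, buf);
        _exit(0);
    }
    close(fds[0]);                      /* parent: three discrete writes */
    write(fds[1], "first", 5);
    write(fds[1], "second", 6);
    write(fds[1], "third", 5);
    close(fds[1]);
    wait(NULL);
    return 0;
}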
You have two choices:
Put delimiters in the data (as you mentioned in the question).
Provide feedback from the child. In other words, after writing a chunk of data to the pipe, the parent waits for a response from the child, e.g. on a second pipe, or using a semaphore.
You could precede each chunk of data with a header, including a length field if chunks can be of variable length. The reader reads the header first and then the chunk contents.
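A sketch of that header scheme, with illustrative helper names: a fixed-size length field precedes each chunk, and the reader loops because read() on a pipe may return short counts.
#include <stdint.h>
#include <unistd.h>

static int read_full(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;          /* EOF or error mid-chunk */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

int send_chunk(int fd, const void *data, uint32_t len) {
    if (write(fd, &len, sizeof len) != sizeof len) return -1;
    return write(fd, data, len) == (ssize_t)len ? 0 : -1;
}

int recv_chunk(int fd, void *buf, uint32_t bufsz, uint32_t *out_len) {
    uint32_t len;
    if (read_full(fd, &len, sizeof len) < 0) return -1;
    if (len > bufsz) return -1;         /* caller's buffer too small */
    if (read_full(fd, buf, len) < 0) return -1;
    *out_len = len;
    return 0;
}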

C: Why is a fprintf(stdout,....) so slow?

I still often use console output to get an idea of what's going on in my code. I know this may be a bit old-fashioned, but I also use it to "pipe" stdout into log files etc.
However, it turns out that output to the console is slowed down for some reason. I was wondering if someone can explain why an fprintf() to a console window appears to be sort of blocking.
What I've done/diagnosed so far:
I measured the time taken by a simple
fprintf(stdout,"quick fprintf\n");
It takes 0.82 ms on average. This is far too long, since a vsprintf_s(...) writes the same output into a string in just a few microseconds. Therefore there must be some blocking specific to the console.
In order to escape from the blocking I used vsprintf_s(...) to copy my output into a FIFO-like data structure. The data structure is protected by a critical section object. A separate thread then dequeues the data structure, putting the queued output to the console.
I obtained one further improvement by introducing pipe services. The output of my program (supposed to end up in a console window) now goes the following way:
A vsprintf_s(...) formats the output into simple strings.
The strings are queued into a FIFO-like data structure, a linked list for example. This data structure is protected by a critical section object.
A second thread dequeues the data structure by sending the output strings to a named pipe.
A second process reads the named pipe and puts the strings again into a FIFO-like data structure. This is needed to keep the reading away from the blocking output to the console.
The reading process is fast at reading the named pipe and continuously monitors the fill level of the pipe's buffer.
A second thread in that second process finally dequeues the data structure with fprintf(stdout,...) to the console.
So I have two processes with at least two threads each, a named pipe between them, and FIFO-like data structures on both sides of the pipe to avoid blocking when the pipe buffer is full.
That is a lot of stuff just to make sure that console output is "non-blocking", but the result is not too bad: my main program can complete a complex fprintf(stdout,...) within just a few microseconds.
Maybe I should have asked earlier: Is there some other (easier!) way to have nonblocking console output?
I think the timing problem has to do with the fact that the console is line-buffered by default. This means that every time you write a '\n' character, your entire output buffer is sent to the console, which is a rather costly operation. This is the price you pay for the line appearing in the output immediately.
You can change this default behavior by switching the buffering strategy to full buffering. The consequence is that output will be sent to the console in chunks equal to the size of your buffer, but individual operations will complete faster.
Make this call before you first write to console:
static char buf[10000];  /* must stay valid as long as stdout is used, hence static */
setvbuf(stdout, buf, _IOFBF, sizeof(buf));
The timing of individual writes should improve, but the output will not appear on the console immediately. This is not too useful for debugging, but the timing will improve. If you set up a thread that calls fflush(stdout) at regular intervals, say once every second, you should get a reasonable balance between the performance of individual writes and the delay between your program writing the output and the time when you can actually see it on the console.
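A sketch of that flusher thread, assuming POSIX threads (link with -lpthread); the one-second interval is arbitrary:
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static void *flusher(void *arg) {
    (void)arg;
    for (;;) {
        sleep(1);          /* flush roughly once per second */
        fflush(stdout);
    }
    return NULL;
}

/* in main(), after the setvbuf() call: */
/* pthread_t t; pthread_create(&t, NULL, flusher, NULL); */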

Is it possible to know how many bytes have been printed to a file stream such as standard output?

Is it possible for a caller program in C to know how many bytes it has printed to a file stream such as stdout, without counting and adding up the return values of printf?
I am trying to control the quantity of output of a C program which uses libraries to print, but the libraries don't report the amount of data they have printed.
I am interested in either a general solution or a Unix-specific one.
POSIX-specific: redirect stdout to a file, flush after all writing is done, then stat the file and look at st_size (or use the ls command).
Update: You say you're trying to control the quantity of output of a program. The POSIX head command will do that. If that's not satisfactory, then state your requirements clearly.
It's rather a heavyweight solution, but the following will work:
Create a pipe by calling pipe()
Spawn a child process
In the parent: redirect stdout to the write-side of the pipe, and close the read side (and the old stdout)
In the child: keep reading from the read side of the pipe, and copying the data to the inherited stdout (which is the original stdout) - counting it as it goes past
In the parent, keep writing to stdout (which is now the pipe) as usual
Use some form of IPC to communicate the result of the count from the child to the parent.
Basically the idea is to spawn a child process, and pipe all output through it, and have the child process count all the data as it goes through.
The precise form of IPC to use may vary - for example, shared memory (with atomic reads/writes on each side) would work well for fast transfer of data, but other methods (such as sockets, more pipes etc) are possible, and offer better scope for synchronisation.
The trickiest part is the synchronisation, i.e. ensuring that, by the time the child tells the parent how much data has been written, it has already processed all the data that the parent sent (and none is left in the pipe, for example). How important this is depends on your exact aim: if an approximate indication is enough, you may get away with using shared memory for IPC and no explicit synchronisation; if the total is only required at the end, you can close stdout in the parent and have the child indicate in the shared memory when it has received the EOF notification.
If you require more frequent readouts, which must be exact, then something more complex will be required, but this can be achieved by designing some sort of protocol using sockets, pipes, or even condvars/semaphores/etc. in the shared memory.
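A rough sketch of the basic plumbing, assuming POSIX; for simplicity the child just reports the total on stderr when it sees EOF, which is the simplest of the IPC options above:
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int p[2];
    pipe(p);
    pid_t pid = fork();
    if (pid == 0) {                     /* child: tee and count */
        close(p[1]);
        char buf[4096];
        long long total = 0;
        ssize_t n;
        while ((n = read(p[0], buf, sizeof buf)) > 0) {
            write(STDOUT_FILENO, buf, (size_t)n);  /* inherited original stdout */
            total += n;
        }
        fprintf(stderr, "bytes written: %lld\n", total);
        _exit(0);
    }
    close(p[0]);
    dup2(p[1], STDOUT_FILENO);          /* parent's stdout now feeds the child */
    close(p[1]);

    printf("hello from the parent\n");  /* goes through the counter */

    fflush(stdout);
    close(STDOUT_FILENO);               /* let the child see EOF */
    waitpid(pid, NULL, 0);
    return 0;
}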
printf returns the number of bytes written.
Add them up.
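If the calls are under your control, a thin variadic wrapper (name is illustrative) makes the adding-up mechanical:
#include <stdarg.h>
#include <stdio.h>

static long long printed_bytes;        /* running total */

int counted_printf(const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vprintf(fmt, ap);           /* returns bytes written, or negative */
    va_end(ap);
    if (n > 0)
        printed_bytes += n;
    return n;
}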
No idea how reliable this is, but you can use ftell on stdout:
long int start = ftell(stdout);
printf("abcdef\n");
printf("%ld\n", ftell(stdout) - start); // >> 7
EDIT
Checked this on Ubuntu Precise: it does not work if the output goes to the console, but does work if it is redirected to a file.
$ ./a.out
abcdef
0
$ ./a.out >tt
$ cat tt
abcdef
7
$ echo `./a.out`
abcdef 0
$ echo `cat tt`
abcdef 7
