I still often use console output to get an idea of what's going on in my code.
I know this may be a bit old-fashioned, but I also use it to "pipe" stdout
into log files etc.
However, it turns out that output to the console is slowed down for some
reason. I was wondering if someone can explain why an fprintf() to a console
window appears to be, in effect, blocking.
What I've done/diagnosed so far:
I measured the time taken by a simple
fprintf(stdout,"quick fprintf\n");
It takes 0.82 ms on average. That is far too long, considering that a vsprintf_s(...) writes the same output into a string in just a few microseconds. So there must be some blocking specific to the console.
In order to escape the blocking, I used vsprintf_s(...) to copy my output into a FIFO-like data structure, protected by a critical section object. A separate thread then dequeues the data structure and puts the queued output on the console.
One further improvement I obtained by introducing pipe services.
The output of my program (supposed to end up in a console window) goes the following way:
A vsprintf_s(...) formats the output into simple strings.
The strings are queued into a FIFO-like data structure, a linked-list structure for example. This data structure is protected by a critical section object.
A second thread dequeues the data structure by sending the output strings to a named pipe.
A second process reads the named pipe and puts the strings again into a FIFO-like data structure. This is needed to keep the reading away from the blocking output to the console.
The reading process is fast at reading the named pipe and continuously monitors the fill level of the pipe's buffer.
A second thread in that second process finally dequeues the data structure by fprintf(stdout,...) to the console.
So I have two processes with at least two threads each, a named pipe between them, and FIFO-like data structures on both sides of the pipe to avoid blocking when the pipe buffer is full.
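For illustration, here is a minimal sketch of that first stage, assuming the Windows API (CRITICAL_SECTION, Sleep) since the question uses critical section objects and vsprintf_s; the names log_printf/log_consumer, the 512-byte node size, and the polling idle wait are my own choices, and critical-section initialization plus thread creation are omitted:

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>

/* Hypothetical node type for the FIFO of formatted strings. */
typedef struct LogNode {
    struct LogNode *next;
    char text[512];
} LogNode;

static LogNode *head, *tail;        /* FIFO of pending lines */
static CRITICAL_SECTION queue_lock; /* protects head/tail    */

/* Producer side: format into a string and enqueue; returns in
 * microseconds because no console I/O happens here. */
void log_printf(const char *fmt, ...)
{
    LogNode *n = malloc(sizeof *n);
    va_list ap;
    va_start(ap, fmt);
    vsprintf_s(n->text, sizeof n->text, fmt, ap);
    va_end(ap);
    n->next = NULL;

    EnterCriticalSection(&queue_lock);
    if (tail) tail->next = n; else head = n;
    tail = n;
    LeaveCriticalSection(&queue_lock);
}

/* Consumer thread: dequeues and performs the slow console write. */
DWORD WINAPI log_consumer(LPVOID arg)
{
    (void)arg;
    for (;;) {
        EnterCriticalSection(&queue_lock);
        LogNode *n = head;
        if (n) { head = n->next; if (!head) tail = NULL; }
        LeaveCriticalSection(&queue_lock);
        if (n) { fputs(n->text, stdout); free(n); }
        else Sleep(1); /* a real queue would block on an event instead */
    }
}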
That is a lot of stuff to just make sure that console output is "non-blocking". But the result is
not too bad. My main program can write complex fprintf(stdout,...) within just a few microseconds.
Maybe I should have asked earlier: Is there some other (easier!) way to have nonblocking console output?
I think the timing problem has to do with the fact that console is line buffered by default. This means that every time you write a '\n' character to it, your entire output buffer is sent to the console, which is a rather costly operation. This is the price that you pay for the line to appear in the output immediately.
You can change this default behavior by changing the buffering strategy to full buffering. The consequence is that the output will be sent to console in chunks that are equal to the size of your buffer, but individual operations will complete faster.
Make this call before you first write to console:
static char buf[10000]; /* static: the buffer must outlive all use of stdout */
setvbuf(stdout, buf, _IOFBF, sizeof(buf));
The timing of individual writes should improve, but the output will not appear in the console immediately. This is not too useful for debugging, but the timing will improve. If you set up a thread that calls fflush(stdout) on regular time intervals, say, once every second, you should get a reasonable balance between the performance of individual writes and the delay between your program writing the output and the time when you can actually see it on the console.
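For the periodic flush, a minimal sketch with a POSIX thread (the one-second interval and the function name flusher are arbitrary choices):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Flush stdout once per second so fully buffered output still
 * appears with a bounded delay. */
static void *flusher(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(1);
        fflush(stdout);
    }
    return NULL;
}

/* In main(), after the setvbuf() call:
 *     pthread_t t;
 *     pthread_create(&t, NULL, flusher, NULL);
 */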
Related
I'm just wondering if it is possible to create two child processes and, in both of these processes, ask the user for input? Or would it be stuck waiting forever?
It depends on the precise implementation of “asking the user for input”. If this is readline, which implements shell-like input with a prompt and editing, it won’t work. The reason is that the library messes with the terminal configuration; if two processes do the same thing simultaneously, they will step on each other’s toes.
If we are talking about simply reading from standard input, that will work, but with a few quirks. First, without external synchronization it’s not known in which order the processes will consume the input. It is even possible that process A grabs a few chunks of an input line while process B grabs the rest.
Second, standard streams are buffered, so a process might consume more input than is immediately obvious. E.g. the program reads a single line of input, but internally more data is read from the OS, since there’s no way to ask for bytes only up to the next newline. Another process reading simultaneously won’t see the input the first one consumed, even if the first one pulled it in only because of buffering and never used it.
To conclude, it's probably better to avoid having multiple processes consume input simultaneously.
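To make the ordering quirk concrete, here is a tiny demo under the assumption of plain read() calls on fd 0; which child gets which chunk of a typed line varies from run to run:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    for (int i = 0; i < 2; i++) {
        if (fork() == 0) {
            char buf[64];
            /* Each child grabs whatever chunk of stdin the
             * scheduler hands it first. */
            ssize_t n = read(0, buf, sizeof buf);
            if (n > 0)
                printf("child %d got %zd bytes\n", i, n);
            _exit(0);
        }
    }
    wait(NULL);
    wait(NULL);
    return 0;
}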
I'm writing a library that executes a program in a child process, captures the output, and makes the output available line by line (as string vectors). There is one vector for STDOUT, one for STDERR, and one for "STDCOMBINED", i.e. all output in the order it was printed by the program. The child process is connected to the parent process via two pipes, one for STDOUT and one for STDERR. In the parent process I read from the read ends of the pipes; in the child process I dup2()'ed STDOUT/STDERR to the write ends of the pipes.
My problem:
I'd like to capture STDOUT, STDERR, and "STDCOMBINED" (= both, in the order they appeared). But the order in the combined vector is different from the original order.
My approach:
I iterate until both pipes show EOF and the child process has exited. At each iteration I read exactly one line (or EOF) from STDOUT and exactly one line (or EOF) from STDERR. This works so far. But when I capture the lines as they come in the parent process, the order of STDOUT and STDERR is not the same as when I execute the program in a shell and look at the output.
Why is this so and how can I fix this? Is this possible at all? I know in the child process I could redirect STDOUT and STDERR both to a single pipe but I need STDOUT and STDERR separately, and "STDCOMBINED".
PS: I'm familiar with libc/unix system calls, like dup2(), pipe(), etc. Therefore I didn't post code. My question is about the general approach and not a coding problem in a specific language. I'm doing it in Rust against the raw libc bindings.
PPS: I made a simple test program that emits a mix of 5 stdout and 5 stderr messages. That's enough to reproduce the problem.
At each iteration I read exactly one line (or EOF) from STDOUT and exactly one line (or EOF) from STDERR.
This is the problem. This will only capture the correct order if that was exactly the order of output in the child process.
You need to capture the asynchronous nature of the beast: make your pipe endpoints nonblocking, select* on the pipes, and read whatever data is present, as soon as select returns. Then you'll capture the correct order of the output. Of course now you can't be reading "exactly one line": you'll have to read whatever data is available and no more, so that you won't block, and maintain a per-pipe buffer where you append new data, extract any lines that are present, shove the unprocessed output to the beginning, and repeat. You could also use a circular buffer to save a little bit of memcpy-ing, but that's probably not very important.
Since you're doing this in Rust, I presume there's already a good asynchronous reaction pattern that you could leverage (I'm spoiled with go, I guess, and project the hopes on the unsuspecting).
*Always prefer platform-specific higher-performance primitives like epoll on Linux, /dev/poll on Solaris, pollset &c. on AIX
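A rough sketch of that read-whatever-is-ready loop, here with poll() (the fd arguments and buffer size are placeholders, and the line-extraction step is only indicated by a comment):

#include <poll.h>
#include <unistd.h>

/* Read from the stdout/stderr pipes as soon as data is present,
 * in arrival order; line splitting is left out for brevity. */
void capture(int out_fd, int err_fd)
{
    struct pollfd pfd[2] = {
        { .fd = out_fd, .events = POLLIN },
        { .fd = err_fd, .events = POLLIN },
    };
    char buf[4096];
    int open_fds = 2;

    while (open_fds > 0) {
        if (poll(pfd, 2, -1) < 0)
            break;
        for (int i = 0; i < 2; i++) {
            if (pfd[i].revents & (POLLIN | POLLHUP)) {
                ssize_t n = read(pfd[i].fd, buf, sizeof buf);
                if (n <= 0) {      /* EOF or error: stop polling this fd */
                    pfd[i].fd = -1;
                    open_fds--;
                } else {
                    /* append buf[0..n) to this pipe's buffer,
                     * extract complete lines, keep the remainder */
                }
            }
        }
    }
}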
Another possibility is to launch the target process with LD_PRELOAD pointing at a dedicated library that takes over glibc's write(), detects writes to the pipes, and encapsulates such writes (and only those) in a packet by prepending a header that carries an (atomically updated) process-wide incrementing counter as well as the size of the write. Such headers can easily be decoded on the other end of the pipe to reorder the writes with a higher chance of success.
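A minimal sketch of such an interposer, assuming it is built as a shared library and loaded via LD_PRELOAD; the header layout and the decision to tag only fds 1 and 2 are my own, and a real version would emit header and payload in one writev() so they cannot be split apart:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdatomic.h>
#include <stdint.h>
#include <unistd.h>

static atomic_uint_fast64_t seq;  /* process-wide write counter */

ssize_t write(int fd, const void *buf, size_t count)
{
    static ssize_t (*real_write)(int, const void *, size_t);
    if (!real_write)
        real_write = dlsym(RTLD_NEXT, "write");

    if (fd == 1 || fd == 2) {              /* only tag stdout/stderr */
        struct { uint64_t seq, size; } hdr = {
            atomic_fetch_add(&seq, 1), count
        };
        real_write(fd, &hdr, sizeof hdr);  /* header first... */
    }
    return real_write(fd, buf, count);     /* ...then the payload */
}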
I think it's not possible to strictly do what you want to do.
If you think about how it's done when running a command in an interactive shell, what happens is that both stdout and stderr point to the same file descriptor (the TTY), so the total ordering is correct by means of synchronization against the same file.
To illustrate, imagine what happens if the child process has 2 completely independent threads, one only writing to stderr, and the other only writing to stdout. The total ordering would depend on however the scheduler decided to schedule these threads, and if you wanted to capture that, you'd need to synchronize those threads against something.
And of course, something can write thousands of lines to stdout before writing anything to stderr.
There are 2 ways to relax your requirements into something workable:
Have the user pass a flag waiving separate stdout and stderr streams in favor of a correct stdcombined, and then redirect both to a single file descriptor, as sketched after this list. You might need to change the buffering settings (like stdbuf does) before you execute the process.
Assume that stdout and stderr are "reasonably interleaved", an assumption pointed out by @Nate Eldredge, in which case you can use @Unslander Monica's answer.
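For the first option, the child-side redirection is just two dup2() calls onto the same pipe end (pipe creation and the exec are omitted; combined_fd is a placeholder name):

#include <unistd.h>

/* In the child after fork(): point both streams at one descriptor,
 * so the kernel serializes the writes and the combined order is exact. */
void redirect_combined(int combined_fd)
{
    dup2(combined_fd, STDOUT_FILENO);
    dup2(combined_fd, STDERR_FILENO);
    close(combined_fd);
}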
I'm writing a chat in C that can be used in a terminal...
For receiving text messages, I have a thread that prints the message to STDOUT.
Another thread is reading from stdin...
The problem is that if a new message is printed to stdout while I'm typing, it gets printed into the middle of what I typed.
I researched for several hours and experimented with GNU readline to prevent this problem. I thought the "redisplay" functions would help me here, but I could not compile my program on Mac OS X when I used certain redisplay functions (ld: undefined symbols), whereas other functions worked properly. I compiled the same program on an Ubuntu machine and there it worked... I really have no idea why.
Nevertheless, how can I achieve that everything written to stdout appears above the text I'm currently writing?
You have basically two solutions.
The first is to use something that helps you divide your screen into different pieces, and as @Banthar said, ncurses is the standard solution.
The second is to synchronize your writes and reads. The thread that is reading from the network and writing to the console may simply postpone the messages until you have entered something from the keyboard; at that point you can flush your message buffer by writing all the messages at once. Caveat: this may cause your buffer to overflow, so you must either drop the oldest messages or flush the buffer when it is full.
If your requirement is to use only stdin and stdout (i.e. a dumb terminal), you will first have to configure your console input to not be line buffered, which is the default (stty -icanon on Unix-like systems). Unfortunately I could not find a portable way to do that programmatically, but you will find more on that in this other question on SO: How to avoid press enter with any getchar().
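On POSIX systems the non-portable way is termios; a minimal sketch that turns off canonical mode and echo (saving and restoring the original settings on exit is left out):

#include <termios.h>
#include <unistd.h>

/* Rough equivalent of `stty -icanon -echo`: bytes become available
 * to read() immediately instead of line by line. */
void raw_input(void)
{
    struct termios t;
    tcgetattr(STDIN_FILENO, &t);
    t.c_lflag &= ~(ICANON | ECHO);
    t.c_cc[VMIN]  = 1;  /* read() returns after one byte */
    t.c_cc[VTIME] = 0;  /* no inter-byte timeout */
    tcsetattr(STDIN_FILENO, TCSANOW, &t);
}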
Then you will have to collect the next outgoing message character by character. When an incoming message is ready to be delivered while the user is in the middle of typing an outgoing one, you jump to a new line, write the incoming message (possibly jump another line, or do whatever is normally done for prompting), and then rewrite the input buffer so the user sees exactly the characters he has already typed.
You will need some kind of mutual exclusion so that the input thread never touches the input buffer while the output thread is doing all that work.
I'm working on a feature that requires writing a single log file (identified by its path) from multiple processes. Previously, each process called printf to stream the log to the terminal (standard output). Now I need to change the output destination to a file, so I tried using freopen to redirect stdout to the file in each process:
freopen(file_path, "a", stdout);
But it doesn't seem to work well; some log lines go missing.
What's the common practice to achieve this?
By the way, in our requirements the user must be allowed to switch the logging destination between a file and standard output, so the first argument file_path could be a tty when switching back to the terminal. Is it OK to call freopen(tty, "a", stdout)?
You have many options:
1) The simplest approach would be for every process to simply write to the same log independently. The problem, of course, is that the file gets scrambled if two processes write different messages at the same time.
2) You could instead have the processes send messages to one "master logger", which would then output the messages one at a time, in the order received. The "master logger" might use sockets; if all the processes are on the same host, you could instead use a message queue or a named pipe.
3) Even simpler, you could have a system-wide semaphore to ensure only one message is written at a time (a sketch follows after this list).
4) Yet another approach is to use an open-source logger such as log4j or syslog-ng.
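For option 3, a minimal sketch using a POSIX named semaphore; the name "/loglock" and the log path are placeholders, and error handling is omitted:

#include <semaphore.h>
#include <fcntl.h>
#include <stdio.h>

/* One system-wide lock shared by every process writing the log. */
void log_line(const char *msg)
{
    sem_t *lock = sem_open("/loglock", O_CREAT, 0644, 1);
    sem_wait(lock);
    FILE *f = fopen("/tmp/app.log", "a");
    fprintf(f, "%s\n", msg);
    fclose(f);
    sem_post(lock);
    sem_close(lock);
}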
Writes in O_APPEND mode will do what you want as long as they are less than PIPE_BUF bytes, which is usually plenty of room (about 4k).
So, set the newly freopen()ed file to line buffered (_IOLBF below), and then make sure your writes contain a newline to flush the buffer:
freopen(file_path, "a", stdout);
setvbuf(stdout, (char *)NULL, _IOLBF, 0); // a.k.a. setlinebuf(stdout) under BSD
...
printf("Some log line\n"); // Note the newline!
Pipe your output to a file handle called, say, output, using fprintf. Since file handles are just pointers (FILE *), simply set output = stdout or output = yourFile. Then every fprintf(output, "some text") ends up wherever that handle points at the moment. You can even have a function that redirects the output on the fly based on user input.
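A minimal sketch of that pattern (the names output and set_log_destination are invented for the example):

#include <stdio.h>

static FILE *output;  /* current log destination */

/* Switch the destination at runtime: console or file. */
void set_log_destination(FILE *dest) { output = dest; }

int main(void)
{
    set_log_destination(stdout);
    fprintf(output, "to the console\n");

    FILE *f = fopen("app.log", "a");  /* placeholder file name */
    if (f) {
        set_log_destination(f);
        fprintf(output, "to the file\n");
        fclose(f);
    }
    return 0;
}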
I want to create a multithreaded application in C using pthreads. I want to have a number of worker threads doing stuff in the background, but every once in a while, they will have to print something to the terminal so I suppose they will have to
"acquire the output device" (in this case stdout)
write to it
release the output device
rinse and repeat.
Also, I want the user to be able to "reply" to the output. For the sake of simplicity, I'm going to assume that nothing new will be written to the terminal until the user gives an answer to a thread's output, so that new lines are only written after the user replies, etc. I have read up on waiting for user input on the terminal, and it seems that ncurses is the way to go for this.
However, now I have read that ncurses is not thread-safe, and I'm unsure how to proceed. I suppose I could wrap everything terminal-related with mutexes, but before I do that I'd like to know if there's a smarter and possibly more convenient way of going about this, maybe a solution with condition variables? I'm somewhat lost here, so any help is welcome.
Why not just have a thread whose job is to interact with the terminal?
If other threads want to send messages to or get replies from the terminal, they can create a structure reflecting that request, acquire a mutex, and add that structure to a linked list of structures. The terminal thread walks the linked list, outputting data as needed and getting replies as needed.
You can use a condition variable to signal the terminal thread that there's now data that needs to be output. The structure in the linked list can include a response condition variable that the terminal thread can signal when it has the reply, if any.
For output that gets no reply, the terminal thread can delete the structure after it outputs its contents. For output that gets a reply, the terminal thread can signal the thread that's interested in the output and then let that thread delete the structure once it has copied the output.
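A condensed sketch of that design with pthreads; the Request type, its field names, and the simple LIFO list are invented for the example, and the caller is assumed to initialize the per-request condition variable before submitting:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Request {
    struct Request *next;
    char msg[256];
    int wants_reply;
    char reply[256];
    int done;                /* set by the terminal thread */
    pthread_cond_t replied;  /* signaled when the reply is ready */
} Request;

static Request *queue_head;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t have_work = PTHREAD_COND_INITIALIZER;

/* Worker side: enqueue a request and, if asked, wait for the reply. */
void submit(Request *r)
{
    pthread_mutex_lock(&lock);
    r->next = queue_head;
    queue_head = r;
    pthread_cond_signal(&have_work);
    while (r->wants_reply && !r->done)
        pthread_cond_wait(&r->replied, &lock);
    pthread_mutex_unlock(&lock);
}

/* The only thread that ever touches the terminal. */
void *terminal_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (!queue_head)
            pthread_cond_wait(&have_work, &lock);
        Request *r = queue_head;
        queue_head = r->next;
        pthread_mutex_unlock(&lock);

        printf("%s\n", r->msg);
        if (r->wants_reply) {
            fgets(r->reply, sizeof r->reply, stdin);
            pthread_mutex_lock(&lock);
            r->done = 1;
            pthread_cond_signal(&r->replied);
            pthread_mutex_unlock(&lock);
        } else {
            free(r);  /* no reply wanted: terminal thread cleans up */
        }
    }
    return NULL;
}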
You can use fprintf on the terminal. fprintf takes care of the concurrency issues: implementations lock the stream internally, so each individual call to fprintf on stdout is atomic with respect to other threads.
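That per-call atomicity does not extend across several calls; a small sketch of grouping related writes with POSIX flockfile:

#include <stdio.h>

/* Each fprintf is atomic on its own, but a multi-line report needs
 * an explicit stream lock to stay together. */
void report(int id, int value)
{
    flockfile(stdout);
    fprintf(stdout, "worker %d:\n", id);
    fprintf(stdout, "  value = %d\n", value);
    funlockfile(stdout);
}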