Capturing program output - c

I am making a small library that captures the standard output of a program (such as printf() calls) into a separate process/thread... this process should then perform certain tasks (let's say write these captured outputs to a file)... I am just beginning to do serious C programming, so I am still learning.
I wanted to know the best way to do this: should I use a process or a thread, and how do I capture these printf() statements? The library must also handle any child processes spawned by the program. The general assumption is that the program using it is a threaded one, so what sort of approach should I take?

If you want your program or library to launch the program and capture its output, look at popen(3). It will give you a FILE pointer where you can read the output from the program.

The easiest way to capture the STDOUT from another program is to simply pipe it into the STDIN of your program (via the command-line ">" or "|" operator). So basically, in your C library, you should just read from STDIN with fgets, scanf, or whatever STDIN function you prefer (avoid gets, which is unsafe and was removed from the C standard).
This is a pretty standard convention in the Unix/Linux world - programs read from STDIN and write to STDOUT in some well-formatted way, so that you can pipeline different programs together by simply adding pipes to the command line, e.g.:
grep "somestring" file1 file2 file3 | cut -d, -f1 | sort | uniq

Related

Why redirection to stdout and stdin so common in PIPE programming

I am very new to IPC programming in C. I have a basic question: why do so many C programs use dup2 to make stdout the write end and stdin the read end of the pipe? Is there any benefit, compared to just keeping the array of integers passed to the pipe call and reading/writing those descriptors directly?
Many C programs are written as filters, which (by default) read from standard input and write to standard output. The plumbing with pipes exploits, and supports, idioms of sending the output from one program to the input of another, as with:
ls | wc -l
That's why you very often end up with code connecting the pipe file descriptors to standard input or standard output. If you needed to make programs read from, or write to, arbitrary file descriptors, you'd have to supply control arguments to tell them what to do. Granted, these days on systems such as Linux with the /dev/fd file system, it would be doable, but that's a recent innovation that was not available when many programs were first written. You could get nearly the same result as above using:
ls | wc -l /dev/fd/0
but wc would echo the file name in this case, whereas it does not echo the file name when no name is given as in the first example.

Replace pipe-shellscript with C-program

I have the following Bash script:
cat | command1 | command2 | command3
The commands never change.
For performance reasons, I want to replace it with a small C program that runs the commands and creates and assigns the pipes accordingly.
Is there a way to do that in C?
As others said, you probably won't get a significant performance benefit.
It's reasonable to assume that the commands you run take most of the time, not the shell script gluing them together, so even if the glue becomes faster, it will change almost nothing.
Having said that, if you want to do it, you should use the fork(), pipe(), dup2() and exec() functions.
fork will give you multiple processes.
pipe will give you a pair of file descriptors - what you write into one, you can read from the other.
dup2 can be used to change file descriptor numbers. You can take one side of a pipe and make it become file descriptor 1 (stdout) in one process, and the other side you'll make file descriptor 0 (stdin) in another (don't forget to close the normal stdin, stdout first).
exec (or one of its variants) will be used to execute the programs.
There are lots of details to fill in. Have fun.
Here is an example that does pretty much this.
There is no performance benefit for the processing itself, just a couple of milliseconds in initialization. Obviously we don't know the context in which you're doing this, but just using dash instead of bash would probably have gotten you 80% of those milliseconds from a single character change in your #!

Executing bash command and getting the output in C

Hello, I have seen some solutions on the internet, but all of them basically create a file; however, I want to store the output in an array of char. Speed is really important for me and I don't want to spend any time working on the hard drive, so popen() is not a real solution for me.
Here is a working code snippet:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char bash_cmd[256] = "ls -l";
    char buffer[1000];
    FILE *pipe;
    size_t len;

    pipe = popen(bash_cmd, "r");
    if (NULL == pipe) {
        perror("popen");
        exit(1);
    }
    if (fgets(buffer, sizeof(buffer), pipe) != NULL) {
        len = strlen(buffer);
        if (len > 0 && buffer[len - 1] == '\n')
            buffer[len - 1] = '\0';   /* strip the trailing newline */
    }
    pclose(pipe);
    return 0;
}
If you would read the manpage of popen, you would notice the following:
The popen() function opens a process by creating a pipe, forking,
and invoking the shell. [...] The return value from popen() is a
normal standard I/O stream in all respects save that it must be
closed with pclose() rather than fclose(3). [...] reading from a
"popened" stream reads the command's standard output, and the
command's standard input is the same as that of the process that
called popen().
(emphasis mine)
As you can see, a call to popen results in the stdout of the command being piped into your program through an I/O stream, which has nothing to do with disk I/O at all, but rather with interprocess communication managed by the operating system.
(As a sidenote: It's generally a good idea to rely on the basic functionality of the operating system, within reason, to solve common problems. And since popen is part of POSIX.1-2001 you can rely on it being available on all standards-compliant operating systems, even Windows)
EDIT: if you want to know more, read this: http://linux.die.net/man/3/popen
Never forget Knuth's saying that "premature optimization is the root of all evil". Don't worry about performance until it matters, and then measure before doing anything. Except for very rare situations, the value of your time is much higher than the cost of the program runs.
Jon Bentley's "Writing Efficient Programs" (sadly out of print; one chapter of his "Programming Pearls" summarizes it) is a detailed discussion of how to make programs run faster (when it is worthwhile); only as the very last measure, to squeeze out the last possible 2% of performance (after cutting run time in half), does it recommend changes like the one you propose. The cited book includes some very entertaining war stories of "performance optimizations" that were a complete waste (optimizing code that is never used, optimizing code that runs while the operating system twiddles its thumbs, ...).
If speed is important to you, you can write your own version of popen.
It may make sense, since popen()
- creates a pipe
- forks
- executes the shell (very expensive!)
- the shell then creates a pipe, forks, and executes your program
Your customized version could reduce the procedure to:
- creates a pipe
- forks
- executes your program
You could even extend popen to control the command's STDOUT, STDERR and STDIN separately.
I wrote such a routine, see https://github.com/rockdaboot/mget/blob/master/libmget/pipe.c
It is GPL'ed.
You call mget_popen3() with FILE pointers or mget_fd_popen3() with file descriptors.
At least, it should give you an idea on how to do it.
Do you mind having more than one C program? If you don't, you can make use of command-line arguments. In the first C program you can do the following:
system("YourCommand | xargs SecondProgram");
SecondProgram will be the executable of the second C program you write. Note that a plain pipe delivers YourCommand's output on SecondProgram's standard input, not in its argument list; xargs is what turns the piped output into command-line arguments. To receive them, begin the main() of the second C program as below:
int main(int argc, char *argv[])
The array argv will then hold the words of YourCommand's output, and argc will contain the number of elements in argv.

Launching a console app from gui front-end

Could you please tell me the best way to do it? I can use popen, but it is necessary to create a large buffer for arguments every time I need to launch my application. I can use fork + execv, but then the program writes to stdout and I can't read the output (to display it in the text field). Is there any other solution?
Could you please tell me the best way to do it? I can use popen, but it is necessary to create a large buffer for arguments every time I need to launch my application.
popen() is one good standard way if you only need one way communication with the child application, like writing to its stdin or reading from stdout, but not both.
When using C one needs to be comfortable with strings. It helps a lot to use a string library for C to ease string operations, such as string concatenation in your case, because the standard C library provides only basic low-level functions for that.
I can use fork + execv, but then the program writes to stdout and I can't read the output (to display it in the text field)
popen() gives you a FILE* pointer to the child program's stdout from which you can read its output using the standard C I/O function fread() or fscanf(). Again, the standard C library has this functionality and it pays to familiarize yourself with it.
Is there any other solution?
You can make the child program write to a file and then read that file, but in any case you need to be able to construct the command line string and read the file.

How does a pipe work in Linux?

How does piping work? If I run a program via CLI and redirect output to a file will I be able to pipe that file into another program as it is being written?
Basically when one line is written to the file I would like it to be piped immediately to my second application (I am trying to dynamically draw a graph off an existing program). Just unsure if piping completes the first command before moving on to the next command.
Any feed back would be greatly appreciated!
If you want to redirect the output of one program into the input of another, just use a simple pipeline:
program1 arg arg | program2 arg arg
If you want to save the output of program1 into a file and pipe it into program2, you can use tee(1):
program1 arg arg | tee output-file | program2 arg arg
All programs in a pipeline run simultaneously. Most programs use blocking I/O: when they try to read their input and nothing is there, they block, that is, they stop and the operating system de-schedules them until more input becomes available (to avoid eating up the CPU). Similarly, if a program earlier in the pipeline writes data faster than a later program can read it, the pipe's buffer eventually fills up and the writer blocks: the OS de-schedules it until the reader drains the pipe's buffer, and then it can continue writing.
EDIT
If you want to use the output of program1 as the command-line parameters, you can use the backquotes or the $() syntax:
# Runs "program1 arg", and uses the output as the command-line arguments for
# program2
program2 `program1 arg`
# Same as above
program2 $(program1 arg)
The $() syntax should be preferred, since it is clearer and can be nested.
Piping does not complete the first command before running the second. Unix (and Linux) piping run all commands concurrently. A command will be suspended if
It is starved for input.
It has produced significantly more output than its successor is ready to consume.
For most programs, output is buffered, which means that the C library accumulates a substantial amount of output (often 4096 or 8192 bytes) before passing it on to the next stage of the pipeline. This buffering avoids excessive switching back and forth between processes and the kernel.
If you want output on a pipeline to be sent right away, you can flush it explicitly, which in C means calling fflush() to be sure that any buffered output is immediately sent on to the next process. Unbuffered input is also possible but generally unnecessary, because a process that is starved for input typically does not wait for a full buffer but will process any input it can get.
For typical applications unbuffered output is not recommended; you generally get the best performance with the defaults. In your case, however, where you want to graph the data the moment the first process makes it available, you definitely want to flush your output. If you're using C, calling fflush(stdout) whenever you want output sent will be sufficient.
If your programs are communicating using stdin and stdout, then make sure that you are either calling fflush(stdout) after you write or find some way to disable standard I/O buffering. The best references I can think of that really describe how to best implement pipelines in C/C++ are Advanced Programming in the UNIX Environment and UNIX Network Programming: Volume 2. You could probably start with this article as well.
If your two programs insist on reading and writing to files and do not use stdin/stdout, you may find you can use a named pipe instead of a file.
Create a named pipe with the mknod(1) command (mkfifo(1) is a more portable way to do the same thing):
$ mknod /tmp/named-pipe p
Then configure your programs to read and write to /tmp/named-pipe (use whatever path/name you feel is appropriate).
In this case, both programs will run in parallel, blocking as necessary when the pipe becomes full/empty as described in the other answers.
