Redirecting the output of a child process - c

There are several ways of redirecting the output of a child process:
using freopen(3)
using dup(3)
using popen(3)
...
What should one pick if all is wanted is to execute a child process and have it output saved in a given file, pretty much like the ls > files.txt works?
What is normally used by shells?

You can discover what your favorite shell uses by strace(1)ing your shell.
In one terminal:
echo $$
In another terminal:
strace -o /tmp/shell -f -p [PID from the first shell]
In the first terminal again:
ls > files.txt
In the second terminal, ^C your strace(1) command and then edit the /tmp/shell output file to see what system calls it made to do the redirection.
freopen(3) manipulates the C standard IO FILE* pointers. All this will be thrown away on the other side of the execve(2) call, because it is maintained in user memory. You could use this after the execve(2) call, but that would be awkward to use generically.
popen(3) opens a single unidirectional pipe(7). This is useful, but extremely limited -- you get either the standard output descriptor or the standard input descriptor. This would fail for something like ls | grep foo | sort where both input and output must be redirected. So this is a poor choice.
dup2(2) will manage file descriptors -- a kernel-implemented resource -- so it will persist across execve(2) calls and you can set up as many file descriptors as you need, which is nice for ls > /tmp/output 2> /tmp/error or handling both input and output: ls | sort | uniq.
There is another mechanism: pty(7) handling. The forkpty(3), openpty(3), functions can manage a new pseudo-terminal device created specifically to handle another program. The Advanced Programming in the Unix Environment, 2nd edition book has a very nice pty example program in its source code, though if you're having trouble understanding why this would be useful, take a look at the script(1) program -- it creates a new pseudo-terminal and uses it to record all input and output to and from programs and stores the transcript to a file for later playback or documentation. You can also use it to script actions in interactive programs, similar to expect(1).

I would expect to find dup2() used mainly.
Neither popen() nor freopen() is designed to handle redirections such as 3>&7. Up to a point, dup() could be used, but the 3>&7 example shows where dup() starts to creak; you'd have to ensure that file descriptors 4, 5, and 6 are open (and 7 is not) before it would handle what dup2() would do without fuss.

Related

Disallowing printf in child process

I've got a cmd line app in C under Linux that has to run another process, the problem is that the child process prints a lot in a comand line and the whole app gets messy.
Is it possible to disallow child process to print anything in cmd line from parent process? It would be very helpful to for example being able to define a command that allows or disallows printing by a child process.
There's the time-honoured tradition of just redirecting the output to the bit bucket(a), along the lines of:
system("runChild >/dev/null 2>&1");
Or, if you're doing it via fork/exec, simply redirect the file handles using dup2 between the fork and exec.
It won't stop a determined child from outputting to your standard output but it will have to be very tricky to do that.
(a) I'm not usually a big fan of that, just in case something goes wrong. I'd prefer to redirect it to a real file which can be examined later if need be (and deleted eventually if not).
Read Advanced Linux Programming then syscalls(2).
On recent Linux, every executable is in ELF format (except init or systemd; play with pstree(1) or proc(5)) is running in a process started by fork(2) (or clone(2)...) and execve(2).
You might use cleverly dup2(2) with open(2) to redirect STDOUT_FILENO to /dev/null (see null(4), stdout(3), fileno(3))
I've got a cmd line app in C under Linux that has to run another process, the problem is that the child process prints a lot in a comand line
I would instead provide a way to selectively redirect the child process' output. You could use program arguments or environment variables (see getenv(3) and/or environ(7)) to provide such an option to your user.
An example of such a command program starting and redirecting subprocesses and redirecting them is your GCC compiler (see gcc(1); it runs cc1 and as(1) and ld(1)...). Consider downloading and studying its source code.
Study also -for inspiration- the source code of some shell (e.g. sash), or write your own one.

Another Linux command output (Piped) as input to my C program

I'm now working on a small C program in Linux. Let me explain you what I want to do with a sample Linux command below
ls | grep hello
The above command is executed in the below passion (Let me know if I've got this wrong)
ls command will be executed first
Output will be given to grep command which will again generate output by matching "hello"
Now I would like to write a C program which takes the piped output of one command as input. Means, In the similar passion of how "grep" program was able to get the input from ls command (in my example above).
Similar question has been asked by another user here, but for some reason this thread has been marked as "Not a valid question"
I initially thought we can get this as a command line argument to C program. But this is not the case.
If you pipe the output from one command into another, that output will be available on the receiving process's standard input (stdin).
You can access it using the usual scanf or fread functions. scanf and the like operate on stdin by default (in the same way that printf operates on stdout by default; in the absence of a pipe, stdin is attached to the terminal), and the C standard library provides a FILE *stdin for functions like fread that read from a FILE stream.
POSIX also provides a STDIN_FILENO macro in unistd.h, for functions that operate one file descriptors instead. This will essentially always be 0, but it's bad form to rely on that being the case.
If fact, ls and grep starts at the same time.
ls | grep hello means, use ls's standard output as grep's standard input. ls write results to standard output, grep waits and reads any output from standard input at once.
Still have doubts? Do an experiment. run
find / | grep usr
find / will list all files on the computer, it should take a lot of time.
If ls runs first, then OS gives the output to grep, we should wait a long time with blank screen until find finished and grep started. But, we can see the results at once, that's a proof for that.

Writing a shell - how to execute commands

I'm trying to write a shell that will eventually take advantage of concurrency. Right now I've got a working shell parser but I'm having trouble figuring out how to execute commands. I've looked a bit at exec (execvp etc.) and it looks promising but I have some questions.
Can exec handle file input/output redirection? Can I set up pipes using exec?
I'm also wondering about subshells. What should subshells return; the exit status of its last statement? Can subshells be a part of a pipe?
These might seem like really dumb questions but please bear with my inexperience.
Can exec handle file input/output redirection?
No, you do that with open() and dup() or dup2() (and close()).
Can I set up pipes using exec?
No, you do that with pipe(), dup() or dup2() and lots of close() calls.
I'm also wondering about subshells. What should subshells return, the exit status of its last statement?
That's the normal convention, yes.
Can subshells be a part of a pipe?
Yes. In a normal shell, you can write something like:
(cd /some/where; find . -name '*.png') | sed 's/xyz/prq/' > mapped.namelist
If you want to get scared, you could investigate posix_spawn() and its support functions. Search for 'spawn' at the POSIX 2008 site, and be prepared to be scared. I think it is actually easier to do the mapping work ad hoc than to codify it using posix_spawn() and its supporters.
The standard technique for a shell is to use fork-exec. In this model, to execute an application the shell uses fork to create a new process that is a copy of itself, and then uses one of the exec variants to replace its own code, data, etc. with the information specified by an executable file on disk.
The nice thing about this model is that the shell can do a little extra work with file descriptors (which are not thrown away by exec) before it changes out its address space. So to implement redirection, it changes out the file descriptors 0, 1, and 2 (stdin, stdout, and stderr, respectively) to point to another open file, instead of console I/O. You use dup2 to change out the meaning of one of these file descriptors, and then exec to start the new process.

How to redirect the output of a c program to a file?

I am trying to redirect the output of a c program to file, even when it generates some errors because of problems with the input data. I can send the output but the error messages to a file.
Does somebody know how to do it?
From within C source code, you can redirect outputs using freopen():
General outputs:
freopen("myfile.txt", "w", stdout);
Errors:
freopen("myfile_err.txt", "w", stderr);
(This answer applies to bash shell, and similar flavors. You didn't specify your environment and this sort of question needs that detail.)
I assume you know about basic redirection with ">". To also capture STDERR in addition to STDOUT, use the following syntax:
command > file-name 2>&1
For some more background on standard streams and numbers:
http://en.wikipedia.org/wiki/Standard_streams#Standard_input_.28stdin.29
This depends on what you mean and what platform you are using. Very often you can accomplish this from the command line, which has been covered in another answer. If you use this method to accomplish this you should be aware that FILE * stderr is typically written immediately (unbuffered) while FILE * stdout may be buffered (usually line buffered) so you could end up with some of your error messages appearing to have been printed earlier than some other messages, but actually the other messages are just being printed late.
From within a C program you can also do something similar within the stdio system using freopen, which will effect the FILE *, so you could make fprintf(stderr, "fungus"); print to something besides what stderr normally would print to.
But if you want to know how to make a program redirect the actual file descriptors under a unix like system you need to learn about the dup and dup2 system calls. They allow you to duplicate a file descriptor.
int fd = open("some_file", O_WRONLY);
dup2(2,fd);
close(fd);
This code will make "some_file" the new stderr at the OS level. The dup2 call will close and replace file descriptor 2 (stderr, which is usually used by FILE * stderr but not necessarily if you call freopen(x,y,stderr) since that may make FILE *stderr use a different file descriptor).
This is how shell programs redirect input and output of programs. The open all of the files that the new program will need, fork, then the child uses dup2 to set up the files descriptors for the new program, then it closes any files that the new program won't need (usually just leaving 0, 1, and 2 open), and then uses one of the exec functions to become the program that the shell was told to run. (some of this isn't entirely accurate because some shells may rely on close on exe flags)
Using a simple linux command you can save the output into the file. here is a simple linux terminal command.
ls > file.txt
The output of this command will be stored into the file.
same as you can store the output of the program like this suppose, object file name is a, run the following command to save output in a file:
./a > file.txt

can a process create extra shell-redirectable file descriptors?

Can a process 'foo' write to file descriptor 3, for example, in such a way that inside a bash shell one can do
foo 1>f1 2>f2 3>f3
and if so how would you write it (in C)?
You can start your command with:
./foo 2>/dev/null 3>file1 4>file2
Then if you ls -l /proc/_pid_of_foo_/fd you will see that file descriptors are created, and you can write to them via eg.:
write(3,"abc\n",4);
It would be less hacky perhaps if you checked the file descriptor first (with fcntl?).
The shell opens the file descriptors for your program before executing it. Simply use them like you would any other file descriptor, e.g. write(3, buf, len); etc. You may want to do error checking to make sure they were actually opened (attempting to dup them then closing the duplicate would be one easy check).
No.
The file descriptors are opened by the shell and the child process inherits them. It is not the child process which opens these command-line accessible file descriptors, it is the bash process.
There might be a way to convince bash to open additional file descriptors on behalf of the process. That wouldn't be portable to other shells, and I'm not sure if a mechanism even exists -- I am just speculating.
The point is that you can't do this from coding the child process in a special way. The shell would have to abide your desires.
Can a process 'foo' write to file descriptor 3, for example, in such a way that inside a bash shell one can do [...] and if so how would you write it (in C)?
I'm not sure what you are precisely after, but whatever it is, starting point going to be the man dup/man dup2 - this is how the shells make out of a random file descriptor a file descriptor with given number.
But obviously, the process foo has to know somehow that it can write to the file descriptor 3. POSIX only specifies 0, 1 and 2: shell ensures that whatever is started gets the file descriptors and libc in application's context also expects them to be the stdin/stdout/stderr. Starting from 3 and beyond - is up to application developer.

Resources