Is it possible to pipe() within a program in C?

Let's say that there is an existing program that listens on stdin for its input. I want to create a pthread within the same program that becomes the one listening on stdin and, depending on what comes through, passes it along to the original program.
For this, I would create a pipe(), have the pthread write to the pipe's write end, and have the original program read from the read end. Is this a correct way to do it? I understand piping between processes, but is it possible to pipe like this within a single process?

Sure, you can use pipe(), but the data has to pass through the kernel even though both endpoints are within the same process.
If you have the source code for this (which I assume you do), you don't mind making non-trivial changes, and performance is a priority for you, I would suggest using shared memory to send the data to the original program. It will be much faster than using pipe().

Related

Capturing stdout/stderr separately and simultaneously from child process results in wrong total order (libc/unix)

I'm writing a library that should execute a program in a child process, capture the output, and make the output available line by line (as string vectors). There is one vector for STDOUT, one for STDERR, and one for "STDCOMBINED", i.e. all output in the order it was printed by the program. The child process is connected to the parent process via two pipes: one for STDOUT and one for STDERR. In the parent process I read from the read ends of the pipes; in the child process I dup2()'ed STDOUT/STDERR to the write ends of the pipes.
My problem:
I'd like to capture STDOUT, STDERR, and "STDCOMBINED" (=both in the order they appeared). But the order in the combined vector is different to the original order.
My approach:
I iterate until both pipes show EOF and the child process has exited. At each iteration I read exactly one line (or EOF) from STDOUT and exactly one line (or EOF) from STDERR. This works so far. But when I capture the lines as they come in the parent process, the order of STDOUT and STDERR is not the same as when I execute the program in a shell and look at the output.
Why is this so and how can I fix this? Is this possible at all? I know in the child process I could redirect STDOUT and STDERR both to a single pipe but I need STDOUT and STDERR separately, and "STDCOMBINED".
PS: I'm familiar with libc/unix system calls, like dup2(), pipe(), etc. Therefore I didn't post code. My question is about the general approach and not a coding problem in a specific language. I'm doing it in Rust against the raw libc bindings.
PPS: I made a simple test program, that has a mixup of 5 stdout and 5 stderr messages. That's enough to reproduce the problem.
At each iteration I read exactly one line (or EOF) from STDOUT and exactly one line (or EOF) from STDERR.
This is the problem. This will only capture the correct order if that was exactly the order of output in the child process.
You need to capture the asynchronous nature of the beast: make your pipe endpoints nonblocking, select* on the pipes, and read whatever data is present, as soon as select returns. Then you'll capture the correct order of the output. Of course now you can't be reading "exactly one line": you'll have to read whatever data is available and no more, so that you won't block, and maintain a per-pipe buffer where you append new data, extract any lines that are present, shove the unprocessed output to the beginning, and repeat. You could also use a circular buffer to save a little bit of memcpy-ing, but that's probably not very important.
Since you're doing this in Rust, I presume there's already a good asynchronous reaction pattern that you could leverage (I'm spoiled with go, I guess, and project the hopes on the unsuspecting).
*Always prefer platform-specific higher-performance primitives like epoll on Linux, /dev/poll on Solaris, pollset &c. on AIX
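A minimal sketch of this approach, assuming a hypothetical child that interleaves stdout and stderr writes; the per-pipe line buffering described above is omitted for brevity, and raw chunks are simply appended as they arrive:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int out[2], err[2];
    assert(pipe(out) == 0 && pipe(err) == 0);

    pid_t pid = fork();
    if (pid == 0) {                        /* child: stands in for the program */
        dup2(out[1], 1); dup2(err[1], 2);
        close(out[0]); close(out[1]); close(err[0]); close(err[1]);
        fprintf(stdout, "o1\n"); fflush(stdout);
        fprintf(stderr, "e1\n");
        fprintf(stdout, "o2\n"); fflush(stdout);
        _exit(0);
    }
    close(out[1]); close(err[1]);
    fcntl(out[0], F_SETFL, O_NONBLOCK);    /* never block on a partial line */
    fcntl(err[0], F_SETFL, O_NONBLOCK);

    char combined[256] = "";
    int open_fds = 2;
    while (open_fds > 0) {
        fd_set rfds; FD_ZERO(&rfds);
        if (out[0] >= 0) FD_SET(out[0], &rfds);
        if (err[0] >= 0) FD_SET(err[0], &rfds);
        int maxfd = (out[0] > err[0] ? out[0] : err[0]);
        assert(select(maxfd + 1, &rfds, NULL, NULL, NULL) > 0);

        int *fds[2] = { &out[0], &err[0] };
        for (int i = 0; i < 2; i++) {
            if (*fds[i] < 0 || !FD_ISSET(*fds[i], &rfds)) continue;
            char buf[64];
            ssize_t n = read(*fds[i], buf, sizeof buf - 1);
            if (n == 0) { close(*fds[i]); *fds[i] = -1; open_fds--; }
            else if (n > 0) { buf[n] = '\0'; strcat(combined, buf); }
        }
    }
    waitpid(pid, NULL, 0);
    /* all data arrives; the interleaving reflects arrival timing */
    printf("combined:\n%s", combined);
    return 0;
}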
Another possibility is to launch the target process with LD_PRELOAD, using a dedicated library that takes over glibc's POSIX write(), detects writes to the pipes, and encapsulates such writes (and only those) in a packet, prepending a header that carries an (atomically updated) process-wide incrementing counter as well as the size of the write. Such headers can easily be decoded on the other end of the pipe to reorder the writes with a higher chance of success.
I think it's not possible to strictly do what you want to do.
If you think about how it's done when running a command in an interactive shell, what happens is that both stdout and stderr point to the same file descriptor (the TTY), so the total ordering is correct by means of synchronization against the same file.
To illustrate, imagine what happens if the child process has 2 completely independent threads, one only writing to stderr, and to other only writing to stdout. The total ordering would depend on however the scheduler decided to schedule these threads, and if you wanted to capture that, you'd need to synchronize those threads against something.
And of course, something can write thousands of lines to stdout before writing anything to stderr.
There are 2 ways to relax your requirements into something workable:
1) Have the user pass a flag waiving separate stdout and stderr streams in favor of a correct stdcombined, and then redirect both to a single file descriptor. You might need to change the buffering settings (like stdbuf does) before you execute the process.
2) Assume that stdout and stderr are "reasonably interleaved", an assumption pointed out by @Nate Eldredge, in which case you can use @Unslander Monica's answer.

Pass pointer to integer on an execv call in C

I am writing a rudimentary shell program in C which uses a parent process to handle shell events and fork() to create child processes that call execv on another executable (also C).
I am trying to keep a process counter on the parent process. And as such I thought of the possibility of creating a pointer to a variable that keeps track of how many processes are running.
However, that seems to be impossible, since the arguments execv takes (and the program executed by it receives) are of type char * const argv[].
I have tried to keep track of the amount of processes using mmap for shared memory between processes, but couldn't get that to work since after the execv call the process simply dies and doesn't let me update the process counter.
In summary, my question is: Is there a way for me to pass a pointer to an integer on an execv call to another program?
Thank you in advance.
You cannot meaningfully pass a pointer from one process to another because the pointer is meaningless in the other process. Each process has its own memory, and the address is relative to that memory space. In other words, the virtual memory manager lets every process pretend it has the entire machine's memory; other processes are simply invisible.
However, you do have a few options for setting up communications between related processes. The most obvious one is a pipe, which you've presumably already encountered. That's more work, though, because you need to make sure that some process is always listening for pipe communications.
Another simple possibility is to just leave a file descriptor open across fork and exec (descriptors are inherited by the exec'd program unless the close-on-exec flag is set on them); although an mmap mapping is not preserved by exec, you can remap the memory from the still-open fd in the child process. If you don't want to pass the fd, you can mmap the memory to a temporary file and use an environment variable to record the name of the temporary file.
Another possibility is POSIX shared memory. Again, you might want to communicate the shm name through an environment variable rather than hard-coding it into the application.
Note that neither shared mmaps nor shared memory are atomic. If you're incrementing a counter, you'll need to use some locking mechanism to avoid race conditions.
For possibly a lot more information than you really wanted, you can read ESR's overview of interprocess communication techniques in Chapter 7 of The Art of Unix Programming.

Running multiple forked processes and constantly reading their standard out, while printing to their standard in

I'm looking to run X number of processes that I'm able to iterate through, in order to run programs where there's a master and 'slaves' that take the master's orders and return a string.
I'm writing in C. I'm wondering how I'd be able to set up pipes and forking between these processes to read from standard in and out. I'm currently able to have them work one at a time until they are killed, but I would like to simply read one line then move on to the next process. Any help?
Generally, the common strategy for this sort of programming is to set up an event loop.
You would set up pipes and connect them to stdin and stdout of your program.
You don't specify what language you're using.
In C, you would create two pipes, one for reading, and one for writing.
Then you would fork. After the fork, in the child, you close stdin and stdout, and you use the dup2 system call to copy the pipe file descriptors onto them.
In the parent, you connect each process to an event loop, which lets you know when one of your FDs is ready for reading or writing.
Take a look at these class notes for discussion of using pipes and dup2.
Here's an introduction to libevent, one of the common event loops for C.
For other languages you'll do something similar. For example for Python, take a look at the asyncio support for subprocesses.

Can we pass variables from one c program to another c program?

So I want to pass a variable from one c program to another c program.
For example:
#include <stdlib.h>

int main(void)
{
    char str[] = "Hello,there!";
    system("program2.exe");
    return 0;
}
And I want to use str[] in program2.exe. Is there a way to pass a variable to another program?
I used files to write data from first program and read data from second program but I want to know is there any other way to do this?
Is it good to use files for passing data from program to another?
You can't literally pass a variable between two processes because each process on a system will generally have its own memory space; each variable belongs to a process and therefore can't be accessed from another process (or so I believe). But you can pass data between processes using pipes.
Pipes are buffers implemented by the OS and are a much more effective method of sharing data between processes than files (yes, you can use files for inter-process communication). This is because files have to go through the filesystem, and potentially the disk, which makes them slow for inter-process communication. You'd also have to implement some kind of method for ensuring the two processes don't corrupt the file when reading and writing to it.
Also, pipes can be used for continuous communication between two processes, making them useful in many situations. When using half-duplex pipes, you can have a pipe for each process to establish a communication channel between them (i.e. a one-way communication channel for each).
You can also:
1) pass arguments to a program.
2) use sockets to communicate between processes.

how to reboot a Linux system when a fatal error occurs (C programming)

I am writing a C program for an embedded Linux (debian-arm) device. In some cases, e.g. if a fatal error occurs on the system/program, I want the program to reboot the system via system("reboot"); after logging the error(s) via syslog(). My program includes multiple threads, UDP sockets, several fwrite()/fopen() and malloc() calls, ..
I would like to ask a few questions about what the program should do just before rebooting the system, apart from the syslog call. I would appreciate knowing how these things are done by experienced programmers.
Is it necessary to close the open (UDP) sockets and threads just before rebooting? If so, is there a function/system call that closes all open sockets and threads? If the threads need to be closed and there is no such global function/call to end them, how am I supposed to execute pthread_exit(NULL); for each specific thread? Do I need to use something like goto to end each thread?
How should the program close files that fopen and fwrite use? Is there a global call to close the files in use, or do I need to find out which files are in use manually and then use fclose on each one? I see some examples on the forums where fflush(), flush(), sync(), .. are used; which one(s) would you recommend? In the generic case, would it cause any problem if all of these functions were used (even where unnecessary)?
It is not necessary to free the memory that malloc allocated, is it?
Do you suggest any other tasks to be performed?
The system automatically issues SIGTERM signals to all processes as one of the steps in rebooting. As long as you correctly handle SIGTERM, you need not do anything special after invoking the reboot command. The normal idiom for "correctly handling SIGTERM" is:
Create a pipe to yourself.
The signal handler for SIGTERM writes one byte (any value will do) to that pipe.
Your main select loop includes the read end of that pipe in the set of file descriptors of interest. If that pipe ever becomes readable, it's time to exit.
Furthermore, when a process exits, the kernel automatically closes all its open file descriptors, terminates all of its threads, and deallocates all of its memory. And if you exit cleanly, i.e. by returning from main or calling exit, all stdio FILEs that are still open are automatically flushed and closed. Therefore, you probably don't have to do very much cleanup on the way out -- the most important thing is to make sure you finish generating any output files and remove any temporary files.
You may find the concept of crash-only software useful in figuring out what does and does not need cleaning up.
The only cleanup you need to do is whatever your program needs in order to start up again in a consistent state. For example, if you collect some data internally and then write it to a file, you will need to ensure this is done before exiting. Other than that, you do not need to close sockets, close files, or free all memory: the operating system is designed to release these resources on process exit.
