I have a program in C that launches 100 child processes with fork() and then waits for them to finish using wait() in a loop. I would like to wait at most a fixed amount of time for them all to finish, so the parent process doesn't stay blocked if one of them hangs, and once that time is up, kill the ones that haven't finished.
Which would be the best way to do that?
Set an alarm for the desired time. If the alarm fires, work out which of the children on your original list have not yet died, and send them appropriate 'go away' signals.
I recommend sending SIGTERM or SIGHUP first; then collect the bodies. If there are any left over after another short delay, then send the SIGKILL signal. If you get too dramatic (SIGKILL) too quickly, the programs do not get an opportunity to clean up any mess they've made.
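The alarm-then-escalate approach could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the names wait_all_with_timeout, reap_until_alarm and signal_remaining are made up for illustration, the caller keeps the child PIDs in an array, both timeouts are at least 1 second, and SIGALRM is not used for anything else in the program.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;
static volatile sig_atomic_t waiting = 0;

/* SIGALRM handler: record the timeout. While we are still waiting, re-arm
 * the alarm so a wait() entered just after this one is interrupted too. */
static void on_alarm(int sig)
{
    (void)sig;
    alarm_fired = 1;
    if (waiting)
        alarm(1);
}

/* Reap children until none of the listed ones remain or the alarm fires.
 * Reaped entries in pids[] are set to 0. Returns how many are still running. */
static int reap_until_alarm(pid_t *pids, int n, int remaining, unsigned seconds)
{
    alarm_fired = 0;
    waiting = 1;
    alarm(seconds);                 /* seconds must be >= 1 */
    while (remaining > 0 && !alarm_fired) {
        pid_t pid = wait(NULL);
        if (pid == -1) {
            if (errno == EINTR)
                continue;           /* interrupted, probably by SIGALRM */
            remaining = 0;          /* ECHILD: no children left at all */
            break;
        }
        for (int i = 0; i < n; i++) {
            if (pids[i] == pid) {
                pids[i] = 0;
                remaining--;
                break;
            }
        }
    }
    waiting = 0;
    alarm(0);                       /* cancel any pending alarm */
    return remaining;
}

/* Send sig to every child that has not been reaped yet. */
static void signal_remaining(const pid_t *pids, int n, int sig)
{
    for (int i = 0; i < n; i++)
        if (pids[i] > 0)
            kill(pids[i], sig);
}

/* Returns how many children had to be signalled, or -1 on setup failure. */
int wait_all_with_timeout(pid_t *pids, int n, unsigned timeout_s, unsigned grace_s)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;       /* no SA_RESTART: wait() must fail with EINTR */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    int stuck = reap_until_alarm(pids, n, n, timeout_s);
    if (stuck == 0)
        return 0;

    signal_remaining(pids, n, SIGTERM);           /* polite request first */
    if (reap_until_alarm(pids, n, stuck, grace_s) > 0) {
        signal_remaining(pids, n, SIGKILL);       /* then the hard way */
        for (int i = 0; i < n; i++) {
            if (pids[i] > 0) {
                while (waitpid(pids[i], NULL, 0) == -1 && errno == EINTR)
                    ;
                pids[i] = 0;
            }
        }
    }
    return stuck;
}
```

The parent would fill pids[] in its fork() loop and then call something like wait_all_with_timeout(pids, 100, 30, 2). The handler is installed without SA_RESTART on purpose, since the whole point is for the blocking wait() to be interrupted when the alarm goes off.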
The child processes need to tell the main process that they have finished in some way (they can pass a message back to the main process, create a file stating that they are finished, or use whatever other mechanism is easiest for you). Once that mechanism is in place, have the main process check whether the children have reported in. If it has heard from all of them, continue on; otherwise wait some amount of time before checking again. In this loop, add a check to see whether your maximum timeout has been reached, and if so, continue.
Related
I am working with signals in C. I have a parent process and 5 child processes, and I am trying to send SIGUSR2 from the child processes to the parent when they have finished some calculations, while the parent waits for them with sigsuspend(). When all 5 child processes have sent SIGUSR2, the parent process continues its work. I increment a global variable in the signal handler function to keep count. Sometimes it runs fine, but sometimes the parent process gets stuck waiting.
Can one of you guys help me with a better solution approach rather than counting the signals received (I must use signals for synchronization)?
To the best of my knowledge, you can't use signals for that. If two signals of the same kind are sent to a process before it gets scheduled to handle the first one, it will only see one signal. Think of it as a bit mask: there is one bit for each pending signal, and when the process gets scheduled it will receive them all. But if the process has not run yet (for example because it is waiting behind some other process) and another signal arrives whose bit in the mask is already set, then nothing more happens and that signal is effectively lost.
A better solution would probably be to open a pipe to each subprocess, and each of them writes a message when done. When the parent has read the message from all children, it can continue. There are other synchronisation methods, but this would probably be the simplest.
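As a rough illustration of the pipe idea, here is a minimal sketch. The names run_and_wait_for_done, example_work and MAX_CHILDREN are hypothetical, and the sketch assumes each child simply exits after reporting in; in a real program the children might keep running after writing their message.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 64

/* Placeholder for the real per-child calculation (hypothetical). */
void example_work(int idx)
{
    volatile long sum = 0;
    for (long i = 0; i < 100000L * (idx + 1); i++)
        sum += i;
}

/* Fork n children, give each its own pipe, and block until every child
 * has written its one-byte "done" message. Returns the number of children
 * that reported done, or -1 on error. */
int run_and_wait_for_done(int n, void (*work)(int))
{
    int read_fds[MAX_CHILDREN];
    int done = 0;

    if (n > MAX_CHILDREN)
        return -1;

    for (int i = 0; i < n; i++) {
        int fds[2];
        if (pipe(fds) == -1)
            return -1;
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            /* Child: keep only the write end of its own pipe. */
            close(fds[0]);
            for (int j = 0; j < i; j++)
                close(read_fds[j]);
            work(i);
            char msg = 'D';
            if (write(fds[1], &msg, 1) != 1)
                _exit(1);
            _exit(0);
        }
        /* Parent: keep only the read end. */
        close(fds[1]);
        read_fds[i] = fds[0];
    }

    for (int i = 0; i < n; i++) {
        char msg;
        ssize_t r;
        do {
            r = read(read_fds[i], &msg, 1);   /* blocks until message or EOF */
        } while (r == -1 && errno == EINTR);
        if (r == 1 && msg == 'D')
            done++;                           /* r == 0 means the child died silently */
        close(read_fds[i]);
    }

    /* Cleanup for this sketch only: reap the children, which exit right away. */
    while (wait(NULL) > 0)
        ;
    return done;
}
```

Unlike signals, bytes written to a pipe are never merged, so the parent cannot miss a notification; a read that returns 0 (end of file) also tells it that a child died without reporting.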
I want to write a program that uses only SIGUSR1/SIGUSR2 signals for pausing and resuming multiple child processes that work on the same problem simultaneously. If I use a signal handler to report that a child process has paused, then of course when several of these signals are sent at once they will merge into one. Since I am using sigsuspend, is there a way to know when at least the last process has finished, so that I don't signal the parent before the last child finishes? Also, if that is not possible, is it possible to somehow find out that a child process is suspended by checking some of those 3 files made when a process is created? Thanks in advance!
When I call kill() on a process, it returns immediately, because it just sends a signal. I have code that checks some processes (foreign, neither written nor modifiable by me) in an infinite loop, and if they exceed some limits (too much RAM eaten, etc.) it kills them (and writes to syslog, etc.).
The problem is that when processes are heavily swapped, it takes many seconds to kill them, and because of that, my process runs the same check against the same processes multiple times, attempts to send the signal many times to the same process, and writes this to syslog each time as well. (This is not done on purpose; it's just a side effect which I am trying to fix.)
I don't care how many times it sends a signal to a process, but I do care how many times it writes to syslog. I could keep a list of PIDs that were already sent the kill signal, but in theory, even if the probability is low, another process could be spawned with the same PID as a previously killed one, and it might also need to be killed, in which case the log entry would be missing.
I don't know if there is a unique identifier for any process, but I doubt it. How could I either kill a process synchronously, or keep track of processes that have already been signalled and don't need to be logged again?
Even if you could do a "synchronous kill", you still have the race condition where you could kill the wrong process. It can happen whenever the process you want to kill exits by its own volition, or by third-party action, after you see it but before you kill it. During this interval, the PID could be assigned to a new process. There is basically no solution to this problem. PIDs are inherently a local resource that belongs to the parent of the identified process; use of the PID by any other process is a race condition.
If you have more control over the system (for example, controlling the parent of the processes you want to kill) then there may be special-case solutions. There might also be (Linux-specific) solutions based on using some mechanisms in /proc to avoid the race, though I'm not aware of any.
One other workaround may be to use ptrace on the target process as if you're going to debug it. This allows you to partially "steal" the parent role, avoiding invalidation of the PID while you're still using it and allowing you to get notification when the process terminates. You'd do something like:
Check the process info (e.g. from /proc) to determine that you want to kill it.
ptrace it, temporarily stopping it.
Re-check the process info to make sure you got the process you wanted to kill.
Resume the traced process.
kill it.
Wait (via waitpid) for notification that the process exited.
This will make the script wait for process termination.
kill $PID
while kill -0 "$PID" 2>/dev/null
do
sleep 1
done
kill -0 [pid] tests the existence of a process
The following solution works for most processes that aren't debuggers or processes being debugged in a debugger.
Use ptrace with argument PTRACE_ATTACH to attach to the process. This stops the process you want to kill. At this point, you should probably verify that you've attached to the right process.
Kill the target with SIGKILL. It's now gone.
I can't remember whether the process is now a zombie that you need to reap or whether you need to PTRACE_CONT it first. In either case, you'll eventually have to call waitpid to reap it, at which point you know it's dead.
If you are writing this in C, you are sending the signal with the kill system call. Rather than repeatedly sending the terminating signal, send it once and then loop (or otherwise periodically check) with kill(pid, 0). A signal value of zero sends nothing; it only tells you whether the process is still alive, so you can act appropriately. Once the process is gone, kill will return -1 with errno set to ESRCH.
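A minimal sketch of that "signal once, then poll" idea might look like this. The function names process_exists and kill_and_wait and the 100 ms polling interval are assumptions made for illustration. Note that the check is still subject to the PID-reuse race described above, and that a zombie (a child of yours that has exited but has not been reaped yet) still counts as existing.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>

/* Returns 1 if a process with this PID exists, 0 if not, -1 on other errors. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    if (errno == EPERM)
        return 1;               /* exists, but we are not allowed to signal it */
    if (errno == ESRCH)
        return 0;               /* no such process */
    return -1;
}

/* Send sig once, then poll up to max_polls times, 100 ms apart.
 * Returns 0 if the process is gone, 1 if it is still there, -1 on error. */
int kill_and_wait(pid_t pid, int sig, int max_polls)
{
    if (kill(pid, sig) == -1)
        return errno == ESRCH ? 0 : -1;

    for (int i = 0; i < max_polls; i++) {
        if (process_exists(pid) == 0)
            return 0;
        struct timespec ts = { 0, 100 * 1000 * 1000 };
        nanosleep(&ts, NULL);
    }
    return process_exists(pid) == 0 ? 0 : 1;
}
```

With this shape, the monitoring loop would write its syslog entry only once, at the point where it calls kill_and_wait, instead of once per check.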
When you spawn these processes yourself, the classical waitpid(2) family can be used.
When cgroups are not used for anything else, you can move the processes that are going to be killed into their own cgroup; there can be notifiers on these cgroups which get triggered when a process is exiting.
To find out whether a process has been killed, you can chdir(2) into /proc/<pid> or open(2) this directory. After process termination, the status files there can no longer be accessed. This method is racy (between your check and the action, the process can terminate and a new one with the same PID can be spawned).
I have a fork occurring in a loop, and above the fork I prompt for a user's input. In my forked process, there's also some printing. Because there's no guarantee to the order the processes will run in, I often (or always) get lines from the child process printing between my prompt to the user and the place where they can enter information.
I.e., I get something like this:
Enter info: <OUTPUT FROM CHILD>
_
(where the _ indicates that the user is free to enter an input.)
Since I'm trying to allow my parent process to fork many child processes (each based on a piece of information given by the user) that run simultaneously, I can't wait for the child to end before letting the parent continue. Is there a way to make the parent wait for part of the child to complete before moving on?
A lot depends on what you're really trying to do, but you can't use waitpid() or wait() to wait for part of a process to finish. The wait family of functions wait on moribund processes, or processes that have been stopped due to a signal (SIGSTOP, SIGTTIN, SIGTTOU, etc).
Some questions:
Should the output from the child processes be sent to the screen, which leads to this confusion, or should it be sent to a file?
Or, should the program have a pipe from each child so that it can read the output from the child and display it on an appropriate portion of the screen when it is convenient?
Or, in a windowing environment, should the children's messages be sent to a different window (like the console window)?
Or should the children write to the syslog daemon?
Or should the children be made to hang on a SIGTTOU signal?
A lot depends on the purpose of the messages, and the importance of immediate display of the messages.
The other answers are definitely more general, and the proper way to solve this problem would involve some kind of pipe, but my case was actually very simple, and just needed the parent to wait for a while, so I added a usleep() line, to make the parent wait a few milliseconds for the child to finish printing. It's definitely not perfect, but it worked.
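For comparison, the pipe-based way of making the parent wait for just the first part of the child could be sketched like this. The names fork_until_ready, signal_ready and example_child are hypothetical; the idea is that the child writes one byte once its printing is done, and the parent blocks on read() until that byte (or end of file) arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>   /* for the caller's later waitpid() */
#include <unistd.h>

/* Called by the child once the part the parent must wait for is done. */
void signal_ready(int ready_fd)
{
    char b = 'R';
    if (write(ready_fd, &b, 1) != 1)
        perror("write");
    close(ready_fd);
}

/* Hypothetical child body: print first, report readiness, then carry on. */
void example_child(int ready_fd)
{
    printf("child %ld: starting up\n", (long)getpid());
    fflush(stdout);             /* make sure the text is out before reporting */
    signal_ready(ready_fd);
    /* ... the long-running part of the child would go here ... */
}

/* Fork a child and block until it calls signal_ready() or exits.
 * Returns the child's PID, or -1 on error. */
pid_t fork_until_ready(void (*child_body)(int))
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    fflush(stdout);             /* avoid duplicating buffered output in the child */
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        child_body(fds[1]);
        _exit(0);
    }

    close(fds[1]);
    char b;
    while (read(fds[0], &b, 1) == -1 && errno == EINTR)
        ;                       /* returns 1 on the ready byte, 0 if the child died */
    close(fds[0]);
    return pid;
}
```

The parent would call fork_until_ready() and only then print its next "Enter info:" prompt; the child keeps running in the background, so children can still overlap with each other.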
I want to be able to handle many signals of the same type (SIGCHLD), but, I want to make sure that if a signal is arriving while I'm still handling the previous one, I will finish handling the first to arrive, and only after I finish handling it, I'll handle the next ones.
There may be more than one signal waiting to be handled.
Also, is SIGCHLD sent when a child is terminated or killed (using SIGTERM/SIGKILL) by the parent process?
As long as you use sigaction and not the problematic signal function to set up your signal handler, you can be sure (unless you specify otherwise) that your signal handler will not be interrupted by another occurrence of the signal it's handling. However, if many child processes all die at once, it's possible that you will not receive a separate signal for each. On each SIGCHLD, the normal procedure is to keep waiting for children until your wait-family function says there are no more terminated children to collect. At this point, you can be sure that any further child termination will give you a new SIGCHLD.
Also, since you're very restricted as to what functions you can use from a signal handler, you'd probably be better off just setting some sort of flag or otherwise notifying your main program loop that it should check for terminated children via one of the wait interfaces.
And finally, yes, a SIGCHLD is delivered regardless of the reason the child terminated - including if it was killed by the parent.
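Putting those points together, a minimal sketch could look like the following. The names install_sigchld_handler and reap_if_signalled are made up for illustration, and the sketch assumes the main loop calls reap_if_signalled() regularly.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

static volatile sig_atomic_t got_sigchld = 0;

/* The handler only sets a flag; all real work happens in the main loop. */
static void on_sigchld(int sig)
{
    (void)sig;
    got_sigchld = 1;
}

/* Install the handler with sigaction(). Returns 0 on success, -1 on error. */
int install_sigchld_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;  /* ignore stop/continue events */
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Call from the main loop. If a SIGCHLD was seen, reap every child that has
 * terminated so far and return how many were reaped; otherwise return 0. */
int reap_if_signalled(void)
{
    int reaped = 0;

    if (!got_sigchld)
        return 0;
    got_sigchld = 0;    /* clear first: a SIGCHLD arriving below sets it again */

    for (;;) {
        pid_t pid = waitpid(-1, NULL, WNOHANG);
        if (pid > 0)
            reaped++;               /* one SIGCHLD may stand for several children */
        else if (pid == -1 && errno == EINTR)
            continue;
        else
            break;                  /* 0: none ready yet; -1: no children at all */
    }
    return reaped;
}
```

Clearing the flag before the waitpid loop rather than after it is deliberate: a child that dies while the loop is running is either collected by the loop or sets the flag again, so it is picked up on the next call.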