I have some code that forks/waits, but it also might end up using some third-party code that may also fork/wait. To limit the number of processes I fork, I want to wait for a process to exit if too many have been forked already. If I wait for any process, though, I might reap a process that the third-party code then expects to be able to wait on, leaving that third-party code with a failure result and no information on exit status. My own code will also not work right, since I'll end up with a negative count of active processes if I wait on more processes than I forked.
I was going to try to keep my forking limited to a process group, so I could wait on that, but where do I get a special "my code" process group to use in my blocking version of fork? I can't get third-party code to set a special process group themselves, and I can't use any process group except for the pid of the process doing all these forks, which third-party code will also use. I could use one of the child processes as the process group leader, but then when that child exits I'm hosed, since I'll have to wait on two process groups, then three, and so on. Should I just realloc a growing array of process groups that still have child processes in them? I could fork a process that immediately exits, then use that "zombie" process as the process group leader, but then when I wait on any process in that group, it'll clean up the zombie process, leaving me once again with no process group leader. I'd use setrlimit (RLIMIT_NPROC) to limit subprocesses, but then when fork fails from too many subprocesses, I have no way to wait for any of those subprocesses to exit before trying to fork again.
My best idea so far is a heap-allocated growing list of lists of subprocesses, each with a possibly dead process group leader. Can you still wait on a process group if the leader has exited, though? If the PIDs overflow and cycle around, and a new process happens to get that PID, will it just magically become the process group leader? Should I be using something with semaphores? Two processes with every fork, one to wait on the other and then increment the semaphore? A heap-allocated growing list of PIDs to wait for individually, just randomly guessing which PID will exit first? I have to keep my own custom "zombie process" table, right? So that I can "wait" for a process that's already been waited for and still get the exit status? Am I just forbidden from using third-party code in any process that forks, and need to always run that code in child processes so the parent can't inadvertently wait on any of its internal forks?
What I ended up doing seems to have been effective. "No good solutions" etc., but what I did was:
process A forks process B, then process A just waits on B
process B sets its own process group to itself (B)
process B can then have a special fork function that sets the child's process group to A after forking (A being the "grandparent" process)
any naive fork will just use process group B
if this system uses itself, then B will fork C, and C's subprocesses will use B as their process group, so not even that will interfere with process group A
if B has counted too many processes, it just waits on group A, reaping any of (and only) the child processes that have been counted (sketched below)
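A minimal C sketch of that scheme, under my own assumptions: counted_fork, grandparent_pgid, and the limit of 8 are names I made up, grandparent_pgid is set to getppid() at B's startup, and A is assumed to be a process group leader (e.g. launched as its own shell job):

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t grandparent_pgid;       /* B records A's pid at startup */
static int active_children;
#define MAX_CHILDREN 8               /* assumed limit */

/* Counting fork: every child joins A's process group, so waiting
   on group A reaps only the children created through this function. */
static pid_t counted_fork(void) {
    while (active_children >= MAX_CHILDREN) {
        /* wait only on group A: cannot steal third-party children */
        if (waitpid(-grandparent_pgid, NULL, 0) > 0)
            active_children--;
    }
    pid_t pid = fork();
    if (pid == 0) {
        setpgid(0, grandparent_pgid);     /* child: join A's group */
    } else if (pid > 0) {
        setpgid(pid, grandparent_pgid);   /* parent too, to close the race */
        active_children++;
    }
    return pid;
}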
One problem is that shells rely on process groups to kill a process tree; they won't kill any subprocesses that have set a different process group. So I had to use the Linux-specific prctl(PR_SET_PDEATHSIG, ...) to have subprocesses kill themselves when the parent process dies. And furthermore, because PDEATHSIG is tied to the parent thread rather than the parent process, I had to use PR_SET_CHILD_SUBREAPER on process B, so that anything getting a PDEATHSIG would get one when B dies, but could ignore the case where just a thread within B exits.
A platform-independent way to do this might be to just poll kill(getppid(), 0) before every fork, to see whether you should die rather than fork. Checking the return value of setpgid might work too, but I don't know whether it forbids you from using process A as a process group if A has died, the PID number has cycled around, and a totally unrelated process happened to get A's old PID.
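For reference, a minimal sketch of the PDEATHSIG part (Linux-only; the getppid() check closes the race where the parent dies before the prctl takes effect; pause() is a stand-in of mine for real child work):

#include <signal.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        /* ask the kernel to SIGKILL this child when its parent dies */
        prctl(PR_SET_PDEATHSIG, SIGKILL);
        /* the parent may already have died before the prctl; check */
        if (getppid() == 1)
            _exit(EXIT_FAILURE);
        pause();    /* stand-in for real work; killed when the parent exits */
    }
    return 0;       /* parent exits; the child receives SIGKILL */
}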
I am just going to post pseudocode, but my question is that I have a loop like this:
for (i < n) {
    createfork();
    if (child)
        /*
         * Exit so I can control the exact amount of forks
         * without children creating more children.
         */
        exit;
}

void createfork() {
    fork;
    // execute other methods
}
Does my fork create a process, have it do what it is supposed to do and exit, then create another process and repeat? And if so, what are some ways around this to get the processes running concurrently?
Your pseudocode is correct as written and does not need to be modified.
The processes are already executing in parallel, all six of them or however many you spawn. As written, the parent process does not wait for the children to finish before spawning more children. It calls fork(), checks if (child) (which is skipped), then immediately proceeds to the next for loop iteration and forks again.
Notably, there's no wait() call. If the parent were to call wait() or waitpid() to wait for each child to finish, that would introduce the serial execution you're trying to avoid. But there is no such call, so you're good.
When a process successfully performs a POSIX fork(), that process and the new child process are initially both eligible to run. In that sense, they will run concurrently until one or the other blocks. Whether there will be any periods of time when both are executing machine instructions (on different processing units) depends at least on details of hardware capabilities, OS scheduling, the work each process is performing, and what other processes there are in the system and what they are doing.
The parent certainly does not, in general, automatically wait for the child to terminate before it proceeds with its own work (there is a family of functions to make it wait when you want that), nor does the child process automatically wait for any kind of signal from the parent. If the next thing the parent does is fork another child, then that will under many circumstances result in the parent running concurrently with both (all) children, in the sense described above.
I cannot speak to specifics of the behavior of your pseudocode, because it's pseudocode.
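To make that concrete, here is a runnable C version of the pattern the pseudocode sketches (n = 6 and the sleep are placeholders of mine):

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int n = 6;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pid == 0) {                 /* child */
            printf("child %d (pid %ld) working\n", i, (long)getpid());
            sleep(1);                   /* stand-in for real work */
            _exit(EXIT_SUCCESS);        /* exit so children don't fork more */
        }
        /* parent: no wait() here, so it loops and forks again immediately */
    }
    while (wait(NULL) > 0)              /* reap all children at the end */
        ;
    return 0;
}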
I have the PID of a process that may have children. How can I get the PIDs of all the child processes? I am making my own PTY handler, so when a user runs a shell in this handler they may run more programs (directly from the shell), and every program they run becomes a child of the shell. So, when I press Ctrl+C, I need to send a signal to the newest process, which means I need to know the PID of that last one.
You should explicitly keep all the pids (the results of fork(2)...) of your child processes (and remove a pid once you have successfully waited on it with wait(2) etc...)
It is up to you to choose the data structures to keep these pids.
Any other approach (e.g. using proc(5)..., which is what ps and pstree do) is less portable and inefficient.
So the basic rule is that every time you call fork you should explicitly keep its result (and test for the 3 cases: 0 if in child process, >0 if in parent process, <0 on error) and use that at wait time.
Read Advanced Linux Programming; it has many pages relevant to that subject.
You might also be interested by process groups and sessions. See setpgrp(2), setsid(2), daemon(3), credentials(7) etc. Notice that with a negative or zero pid kill(2) can send a signal to a process group, and that you could also use killpg(2) for that purpose.
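A minimal sketch of that bookkeeping, assuming a fixed-size table and a placeholder sleep command (spawn_child, reap_one, and MAX_CHILDREN are my names, not from the answer):

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 64

static pid_t children[MAX_CHILDREN];   /* pids we have not yet waited on */
static int nchildren;

static pid_t spawn_child(void) {
    pid_t pid = fork();
    if (pid < 0) {                     /* error */
        perror("fork");
        return -1;
    }
    if (pid == 0) {                    /* in the child */
        execlp("sleep", "sleep", "2", (char *)NULL);
        _exit(127);                    /* exec failed */
    }
    children[nchildren++] = pid;       /* in the parent: remember the pid */
    return pid;
}

static void reap_one(void) {
    int status;
    pid_t pid = wait(&status);         /* blocks until some child exits */
    if (pid < 0)
        return;
    for (int i = 0; i < nchildren; i++) {
        if (children[i] == pid) {      /* drop it from the table */
            children[i] = children[--nchildren];
            break;
        }
    }
}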
In C, I have a child thread (using pthreads).
Is there any way to restrict this child so that fork cannot be called inside the thread?
If we write fork inside it, the program should not compile.
I can also have a child process instead of a child thread, as long as it cannot fork further.
Basically, how can I have a child process or child thread which cannot fork a process any further?
You can always try to play games with pthread_atfork: http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html
Basically, you can use pthread_atfork() to install a "child" callback which always calls exit(). This way, your threads may still fork, but the forked process will exit immediately, so no harm will be done (and only a minimal overhead incurred).
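A minimal sketch of that trick (die_in_child is my name; _exit is used rather than exit to skip atexit handlers in the doomed child):

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* "child" handler: runs in the new process immediately after fork() */
static void die_in_child(void) {
    _exit(EXIT_FAILURE);
}

int main(void) {
    /* install once; applies to every later fork() in any thread */
    pthread_atfork(NULL, NULL, die_in_child);

    pid_t pid = fork();   /* returns normally only in the parent... */
    (void)pid;            /* ...the child has already exited */
    return 0;
}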
With processes it may be somewhat more complicated. Linux allows you to limit the number of processes per user (the so-called RLIMIT_NPROC, set with setrlimit()). When this limit is reached, no further forks are possible for the given user id. Thus, you can create a parent process with the CAP_SETUID capability and a dummy user whose RLIMIT_NPROC is set to 1. This way, you can fork from the parent, change the child's uid to that of the "limited" user you created in advance, and drop the CAP_SETUID capability. At that point, the child has no possible way to fork itself.
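A sketch of the child-side half of that idea, assuming the dedicated user already exists (limited_uid is hypothetical, and the setuid call only succeeds if the parent held CAP_SETUID):

#include <sys/resource.h>
#include <unistd.h>

/* Run in the freshly forked child: cap the dummy user's process
   count at 1, then become that user. Any later fork() in this
   process fails with EAGAIN, because the limited user already
   has its one process. */
static int lock_down_child(uid_t limited_uid) {
    struct rlimit rl = { .rlim_cur = 1, .rlim_max = 1 };
    if (setrlimit(RLIMIT_NPROC, &rl) != 0)
        return -1;
    if (setuid(limited_uid) != 0)   /* requires CAP_SETUID */
        return -1;
    return 0;
}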
When I call kill() on a process, it returns immediately, because it just sends a signal. I have code that checks some (foreign, not written nor modifiable by me) processes in an infinite loop, and if they exceed some limits (too much RAM eaten etc.) it kills them (and writes to syslog etc.).
The problem is that when the processes are heavily swapped, it takes many seconds to kill them, and because of that, my process executes the same check against the same processes multiple times, attempts to send the signal many times to the same process, and writes this to syslog each time. (This is not done on purpose; it's just a side effect I am trying to fix.)
I don't care how many times it sends a signal to a process, but I do care how many times it writes to syslog. I could keep a list of PIDs that were already sent the kill signal, but in theory, even if the probability is low, another process could be spawned with the same PID as a previously killed one, which might also be supposed to be killed, and in that case the log entry would be missing.
I don't know if there is a unique identifier for a process, but I doubt it. How can I kill a process synchronously, or keep track of processes that already got the signal and don't need to be logged again?
Even if you could do a "synchronous kill", you still have the race condition where you could kill the wrong process. It can happen whenever the process you want to kill exits of its own volition, or by third-party action, after you see it but before you kill it. During this interval, the PID could be assigned to a new process. There is basically no solution to this problem. PIDs are inherently a local resource that belongs to the parent of the identified process; use of the PID by any other process is a race condition.
If you have more control over the system (for example, controlling the parent of the processes you want to kill) then there may be special-case solutions. There might also be (Linux-specific) solutions based on using some mechanisms in /proc to avoid the race, though I'm not aware of any.
One other workaround may be to use ptrace on the target process as if you're going to debug it. This allows you to partially "steal" the parent role, avoiding invalidation of the PID while you're still using it and allowing you to get notification when the process terminates. You'd do something like:
1. Check the process info (e.g. from /proc) to determine that you want to kill it.
2. ptrace it, temporarily stopping it.
3. Re-check the process info to make sure you got the process you wanted to kill.
4. Resume the traced process.
5. kill it.
6. Wait (via waitpid) for notification that the process exited.
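A sketch of those steps in C, assuming Linux and leaving the /proc re-check as a comment (kill_traced is my name, and here the target is simply killed rather than resumed first):

#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <unistd.h>

/* attach, (re-check elided), kill, and reap the target synchronously */
static int kill_traced(pid_t pid) {
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, NULL, 0) == -1)       /* wait for the attach stop */
        return -1;
    /* ... re-check /proc/<pid> here to verify it is the right process ... */
    kill(pid, SIGKILL);
    ptrace(PTRACE_CONT, pid, NULL, NULL);  /* ignore failure if already dead */
    int status;
    if (waitpid(pid, &status, 0) == -1)    /* returns once it is really gone */
        return -1;
    return 0;
}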
This will make the script wait for process termination.
kill $PID
while kill -0 $PID 2>/dev/null
do
    sleep 1
done
kill -0 [pid] tests for the existence of a process.
The following solution works for most processes that aren't debuggers or processes being debugged in a debugger.
Use ptrace with argument PTRACE_ATTACH to attach to the process. This stops the process you want to kill. At this point, you should probably verify that you've attached to the right process.
Kill the target with SIGKILL. It's now gone.
I can't remember whether the process is now a zombie that you need to reap or whether you need to PTRACE_CONT it first. In either case, you'll eventually have to call waitpid to reap it, at which point you know it's dead.
If you are writing this in C, you are sending the signal with the kill system call. Rather than repeatedly sending the terminating signal, just send it once and then loop (or otherwise periodically check) with kill(pid, 0); a signal value of zero will just tell you whether the process is still alive, and you can act appropriately. When it dies, kill will fail, returning -1 with errno set to ESRCH.
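A sketch of that polling loop (kill_and_wait and the one-second interval are my choices):

#include <errno.h>
#include <signal.h>
#include <syslog.h>
#include <unistd.h>

/* send SIGTERM exactly once, log once, then poll until the pid is gone */
static void kill_and_wait(pid_t pid) {
    kill(pid, SIGTERM);
    syslog(LOG_INFO, "sent SIGTERM to %ld", (long)pid);
    while (kill(pid, 0) == 0 || errno != ESRCH)
        sleep(1);   /* still alive (or EPERM); check again in a second */
}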
When you spawn these processes yourself, the classic waitpid(2) family can be used.
If not used for anything else, you can move the processes to be killed into their own cgroup; notifiers can be attached to these cgroups which trigger when a process exits.
To find out whether a process has been killed, you can chdir(2) into /proc/<pid> or open(2) that directory. After process termination, the status files there cannot be accessed anymore. This method is racy (between your check and the action, the process can terminate and a new one with the same pid be spawned).
I have a shell script that launches 4 other binaries. I am sending SIGSTOP to the shell script. Does this stop the other 4 processes as well? If not, what should I do to forward the SIGSTOP to these processes? The same applies to SIGCONT.
I have the C source code for all the 4 binaries.
You can call setpgid() in the forked child process that will execute the shell script. That will give any processes spawned from that shell script the same group ID as the child process. You can then use killpg() to send a signal to the entire group, which all processes in that group will receive.
For instance, if inside the child process you call setpgid(0, 0), the child process's group ID will be set to the same value as the child's PID. Any process overlaid on the child using one of the exec family of functions will then have the same group-ID value that the child had. In addition, any processes that the newly overlaid process may spawn (i.e., your shell script's binaries) will also have the same group ID. You can then, using killpg(), send a signal to all processes sharing that group ID using just the child's PID that fork() returned, since after the setpgid(0, 0) call the group ID of the child process is the same value as the child's PID.
If you are using fork(), then depending on how quickly you need to send signals to the group from the parent process, there may be some synchronization issues ... for example, you may want to send a signal to the process group immediately after forking the child process. There are two work-arounds for this: either 1) use vfork() instead of fork(), so that the parent is suspended until the child has changed its group ID and successfully called exec, or 2) call setpgid() in the parent process as well as in the child process, but in the parent, rather than using setpgid(0, 0) like you would in the child, use setpgid(CHILD_PID, CHILD_PID). Then it won't matter which call wins the race (one of them will succeed, and the other will fail with EACCES), and any subsequent signals sent from the parent will go to a valid group ID.
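Putting that together, a minimal sketch (launch4.sh stands in for your script that starts the 4 binaries, and the sleeps are placeholders for whenever you decide to stop and resume the group):

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    pid_t child = fork();
    if (child < 0) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (child == 0) {
        setpgid(0, 0);    /* child: new group with ID == child's PID */
        execl("/bin/sh", "sh", "launch4.sh", (char *)NULL);
        _exit(127);       /* exec failed */
    }
    setpgid(child, child);        /* parent too; one of the two calls wins */

    sleep(1);                     /* stand-in: decide when to stop them */
    killpg(child, SIGSTOP);       /* stops the shell and all 4 binaries */
    sleep(5);
    killpg(child, SIGCONT);       /* resumes the whole group */
    return 0;
}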
If your processes form a group, you can use standard kill(1). man kill has the following info:
pid...
    Specify the list of processes that kill should signal. Each pid can be one of five things:
    n
        where n is larger than 0. The process with pid n will be signaled.
    0
        All processes in the current process group are signaled.
    -1
        All processes with pid larger than 1 will be signaled.
    -n
        where n is larger than 1. All processes in process group n are signaled. When an argument of the form '-n' is given, and it is meant to denote a process group, either the signal must be specified first, or the argument must be preceded by a '--' option, otherwise it will be taken as the signal to send.
    commandname
        All processes invoked using this name will be signaled.
It seems to me that the '-n' specification might help you:
kill -STOP -- "-$(pgrep myparentproc)"