c/unix: abort process that runs for too long

I need to kill user processes that run for longer than a given expected interval on a UNIX (Solaris) operating system. This needs to be done from inside the process that is currently being executed.
Please suggest how this can be achieved in C or with standard UNIX facilities.

See the alarm() system call. It delivers a SIGALRM signal after the requested number of seconds, which your process can handle and use to quit.
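A minimal sketch of that approach (the 30-second limit and the exit-on-timeout policy are just illustrative choices):

#include <signal.h>
#include <string.h>
#include <unistd.h>

static void on_alarm(int sig)
{
    (void)sig;
    _exit(1);                     /* time is up: terminate the process */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigaction(SIGALRM, &sa, NULL);

    alarm(30);                    /* deliver SIGALRM after 30 seconds */
    /* ... do the real work here ... */
    alarm(0);                     /* finished in time: cancel the alarm */
    return 0;
}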

As long as killing the overtime process without warning is acceptable, one alternative is to use ulimit -t <time> when launching the process.

With setrlimit, you can limit the amount of CPU time (note: CPU time, not wall-clock time) used by the process. Your process will receive a SIGXCPU once the limit is exceeded.
#include <sys/resource.h>

/* 42 s soft limit on CPU time (SIGXCPU when exceeded); no hard limit. */
struct rlimit limits = {42, RLIM_INFINITY};
setrlimit(RLIMIT_CPU, &limits);

At one time I had to solve this exact same problem.
My solution was as follows:
Write a controller program that does the following:
Fork a child process that starts the process you want to control.
Back in the parent, fork a second child process that sleeps for the maximum time allowed and then exits.
In the parent, wait for the children to complete; whichever finishes first causes the parent to kill the other.
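A rough sketch of that controller; the controlled command ("some_command") and the 60-second limit are hard-coded purely for illustration, and error handling is omitted for brevity:

#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t worker = fork();
    if (worker == 0) {                 /* child 1: runs the real work */
        execlp("some_command", "some_command", (char *)NULL);
        _exit(127);                    /* exec failed */
    }

    pid_t timer = fork();
    if (timer == 0) {                  /* child 2: the timeout */
        sleep(60);                     /* maximum time allowed */
        _exit(0);
    }

    pid_t first = wait(NULL);          /* whichever child exits first */
    if (first == worker)
        kill(timer, SIGKILL);          /* work finished in time: cancel the timer */
    else
        kill(worker, SIGKILL);         /* timed out: kill the worker */

    wait(NULL);                        /* reap the remaining child */
    return 0;
}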

There's an easier way. Launch a worker thread to do the work, then call workerThread.join(timeoutInMS) in the main thread. That waits at most that long; if the call returns and the worker thread is still running, you can kill it and exit.

Related

Fork multiple processes simultaneously without blocking the main thread

This is more of a general question than a coding question, and I would appreciate some direction or a general approach.
My programming task is to implement a simple job scheduler that executes non-interactive jobs. At any given time only 4 jobs should be executing; if more than 4 jobs are submitted, the additional jobs must wait until one of the 4 executing jobs completes. A prompt keeps asking the user to enter a command to be executed.
This means that the main thread, or to be precise the main function that runs the infinite loop asking the user for a command, should never be blocked waiting on a process to finish. Using fork(), exec() and wait() would cause my main process to wait, which is not the desired behavior. Therefore, I thought of omitting the wait() in the parent process and installing a signal handler for SIGCHLD to catch the moment a forked process terminates, with a global variable holding how many processes are running at any given time (see the sketch below).
Is this the right approach, or is there a better/more elegant solution?
Thanks a lot in advance!
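
A minimal sketch of the SIGCHLD-based approach described in the question; the MAX_JOBS constant, the counter name, and the /bin/sh invocation are purely illustrative, and the handler restricts itself to async-signal-safe work:

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_JOBS 4                            /* illustrative limit */

static volatile sig_atomic_t running = 0;     /* jobs currently executing */

static void on_sigchld(int sig)
{
    int saved = errno;
    (void)sig;
    /* Reap every finished child; WNOHANG keeps the handler non-blocking. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        running--;
    errno = saved;
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sa.sa_flags = SA_RESTART;                 /* don't let SIGCHLD interrupt fgets() */
    sigaction(SIGCHLD, &sa, NULL);

    sigset_t chld, old;
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);

    char cmd[256];
    for (;;) {
        printf("> ");
        fflush(stdout);
        if (!fgets(cmd, sizeof cmd, stdin))
            break;
        cmd[strcspn(cmd, "\n")] = '\0';

        if (running >= MAX_JOBS) {
            /* A real scheduler would queue the command instead of refusing it. */
            fprintf(stderr, "4 jobs already running, try again later\n");
            continue;
        }

        /* Block SIGCHLD so the handler cannot race with the counter update. */
        sigprocmask(SIG_BLOCK, &chld, &old);
        pid_t pid = fork();
        if (pid == 0) {
            sigprocmask(SIG_SETMASK, &old, NULL);
            execlp("/bin/sh", "sh", "-c", cmd, (char *)NULL);
            _exit(127);
        }
        if (pid > 0)
            running++;
        sigprocmask(SIG_SETMASK, &old, NULL);
    }
    return 0;
}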

How to kill a process synchronously on Linux?

When I call kill() on a process, it returns immediately, because it just sends a signal. I have code where I check some (foreign, neither written nor modifiable by me) processes in an infinite loop, and if they exceed some limits (too much RAM consumed, etc.) it kills them (and writes to syslog, etc.).
The problem is that when processes are heavily swapped, it takes many seconds for them to die, and because of that, my process runs the same check against the same processes multiple times, sends the signal many times to the same process, and writes this to syslog each time. (This is not done on purpose; it's just a side effect which I am trying to fix.)
I don't care how many times it sends a signal to a process, but I do care how many times it writes to syslog. I could keep a list of PIDs that were already sent the kill signal, but in theory, even if the probability is low, another process could be spawned with the same PID as a previously killed one, which might also need to be killed, and in that case the log entry would be missing.
I don't know if there is a unique identifier for a process, but I doubt it. How could I either kill a process synchronously, or keep track of processes that already got the signal and don't need to be logged again?
Even if you could do a "synchronous kill", you still have the race condition where you could kill the wrong process. It can happen whenever the process you want to kill exits of its own volition, or by third-party action, after you see it but before you kill it. During this interval, the PID could be assigned to a new process. There is basically no solution to this problem. PIDs are inherently a local resource that belongs to the parent of the identified process; use of the PID by any other process is a race condition.
If you have more control over the system (for example, controlling the parent of the processes you want to kill) then there may be special-case solutions. There might also be (Linux-specific) solutions based on using some mechanisms in /proc to avoid the race, though I'm not aware of any.
One other workaround may be to use ptrace on the target process as if you're going to debug it. This allows you to partially "steal" the parent role, avoiding invalidation of the PID while you're still using it and allowing you to get notification when the process terminates. You'd do something like the following (a rough C sketch appears after the steps):
Check the process info (e.g. from /proc) to determine that you want to kill it.
ptrace it, temporarily stopping it.
Re-check the process info to make sure you got the process you wanted to kill.
Resume the traced process.
kill it.
Wait (via waitpid) for notification that the process exited.
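Here is that rough sketch; ptrace() with PTRACE_ATTACH/PTRACE_CONT is Linux-specific, and looks_like_target() is a hypothetical stand-in for re-reading the /proc/<pid> information:

#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for re-checking /proc/<pid>/cmdline, resource usage, etc. */
static int looks_like_target(pid_t pid) { (void)pid; return 1; }

static int ptrace_kill_sync(pid_t pid)
{
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;                              /* already gone, or no permission */

    int status;
    waitpid(pid, &status, 0);                   /* wait until the attach stops it */

    if (!looks_like_target(pid)) {              /* re-check the process info */
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        return 0;                               /* PID was reused: leave it alone */
    }

    ptrace(PTRACE_CONT, pid, NULL, NULL);       /* resume the traced process */
    kill(pid, SIGKILL);                         /* kill it */
    waitpid(pid, &status, 0);                   /* blocks until it has really exited */
    return 1;
}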
This will make the script wait for process termination.
kill $PID
while kill -0 $PID 2>/dev/null
do
sleep 1
done
kill -0 [pid] tests the existence of a process
The following solution works for most processes that aren't debuggers or processes being debugged in a debugger.
Use ptrace with argument PTRACE_ATTACH to attach to the process. This stops the process you want to kill. At this point, you should probably verify that you've attached to the right process.
Kill the target with SIGKILL. It's now gone.
I can't remember whether the process is now a zombie that you need to reap or whether you need to PTRACE_CONT it first. In either case, you'll eventually have to call waitpid to reap it, at which point you know it's dead.
If you are writing this in C, you are sending the signal with the kill() system call. Rather than repeatedly sending the terminating signal, just send it once and then loop (or otherwise periodically check) with kill(pid, 0); a signal value of zero just tells you whether the process is still alive, so you can act appropriately. Once it dies, kill() will fail with errno set to ESRCH.
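A sketch of that pattern, with an arbitrary one-second polling interval:

#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Send the terminating signal once, then poll until the PID no longer exists. */
static void kill_and_poll(pid_t pid)
{
    kill(pid, SIGTERM);
    while (kill(pid, 0) == 0)        /* signal 0: existence check only, nothing is sent */
        sleep(1);
    /* kill() failed: errno == ESRCH means the process is gone (EPERM would
       mean it still exists but is no longer ours to signal). */
}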
When you spawn these processes yourself, the classical waitpid(2) family can be used.
If cgroups are not being used for anything else, you can move the processes that are going to be killed into their own cgroup; notifiers can be placed on such cgroups which get triggered when a process exits.
To find out whether a process has been killed, you can chdir(2) into /proc/<pid> or open(2) that directory. After the process terminates, the status files in there can no longer be accessed. This method is racy (between your check and the action, the process can terminate and a new one with the same PID can be spawned).
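A sketch of the /proc probe mentioned in the last point; it keeps the /proc/<pid> directory open so the existence check keeps referring to that particular process:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Poll /proc/<pid> through an already-opened directory handle; once the
 * process is gone, files inside that handle can no longer be opened. */
static void wait_for_exit(pid_t pid)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld", (long)pid);

    int dir = open(path, O_RDONLY);
    if (dir == -1)
        return;                        /* already gone */

    int fd;
    while ((fd = openat(dir, "status", O_RDONLY)) != -1) {
        close(fd);
        sleep(1);                      /* still there, check again later */
    }
    close(dir);
}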

Running/pausing child processes in C?

I'm running child processes in C, and I want to pause and then run the same child process. I'm not really sure how to describe my problem in a better way since I'm new at this, but here's a shot.
I know that you can run a process after another process exits by using waitpid. But what if the process I'm waiting on doesn't exist yet when the process that does the waiting is created? In that case, I'm thinking of pausing the process that does the waiting; when the process being waited on is created and then finishes, it would tell the waiting process to run again. How would you do this? Again, I'm not familiar with this, so I don't know if it is the proper way to go about it.
edit: What I'm trying to do
I'm using child processes to run commands via execvp() in parallel, so if I have a sequence sleep 1; sleep 1;, the total sleep time will be 1 second. However, there are cases where I try to parallelize echo blah > file; cat < file;, in which case I'm assuming cat should read the file after echo writes blah into it. Therefore, I have to wait for echo to finish before doing cat. There are more specifics to this, but generally assume that any command that writes its output to a file must be waited on by any command that reads that file later in the script.
On Linux: you can set an alarm() before you call waitpid(), so you wake up after a certain number of seconds; waitpid() will then fail with errno set to EINTR (assuming a handler is installed for SIGALRM), so you would know the situation and can kill the misbehaving process. Another way would be to use a mutex (for separate processes it has to be a process-shared mutex in shared memory) and have a block like this in the waiting process:
if (pthread_mutex_trylock(&mutex) != 0) {      /* still locked: the monitored side is busy */
    sleep(timeout_seconds);                    /* timeout_seconds and child_pid are illustrative */
    if (pthread_mutex_trylock(&mutex) != 0) {
        kill(child_pid, SIGKILL);              /* still busy after the grace period: kill it */
    }
}
and the process that is monitored:
ENTRY-POINT:
pthread_mutex_lock(&mutex);
do_stuff();
pthread_mutex_unlock(&mutex);
Any application (process) can only wait with waitpid() on its own direct children. It can't wait on grandchildren or more distant descendants, and it can wait on neither siblings nor ancestors nor on unrelated processes.
If your application is single-threaded, you can't wait on a process that will be created after the waitpid() call starts because there is nothing to do the necessary fork() to create the child.
In a multi-threaded process, you could have one thread waiting for dying children and another thread could be creating the children. For example, you could then have the waitpid() call in thread 1 start at time T0, then have thread 2 create a child at T1 (T1 > T0), and then the child dies at T2, and the waitpid() would pick up the corpse of the child at T3, even though the child was created after the waitpid() started.
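A rough sketch of that timeline; the extra long-lived child exists only so the waitpid() call has a child to block on when it starts at T0 (waitpid() fails immediately with ECHILD if there are no children at all). Build with -pthread.

#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static void *reaper(void *arg)
{
    (void)arg;
    int status;
    pid_t pid = waitpid(-1, &status, 0);      /* T0: starts waiting on any child */
    printf("reaped child %ld\n", (long)pid);  /* T3: picked up the short-lived one */
    return NULL;
}

int main(void)
{
    if (fork() == 0) {            /* long-lived child, so waitpid() has work at T0 */
        sleep(10);
        _exit(0);
    }

    pthread_t t;
    pthread_create(&t, NULL, reaper, NULL);   /* thread 1 begins waiting (T0) */
    sleep(1);

    if (fork() == 0)              /* T1: child created after the waitpid() started */
        _exit(0);                 /* T2: it dies almost immediately */

    pthread_join(t, NULL);        /* the corpse was collected by the other thread */
    return 0;
}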
Your higher level problem is probably not completely tractable. You can't tell which processes are accessing a given file just by inspecting the command lines in a 'shell script'. You can see those that probably are using it (because the file name appears on the command line); but there may be other processes that have the name hardwired into them and you can't see that by inspecting the command line.

Linux: Whether calling wait() from one thread will cause all other threads also to go to sleep?

"The wait() system call suspends execution of the current process until one of its children terminates" . Waitpid also is similar.
My Question is whether calling wait() from one thread will cause all other threads (in the same process) also to go to sleep ? Do the behavior is same for detached threads also?
This is just a bug in the manual. wait suspends the calling thread, not the process. There is absolutely no way to suspend the whole process short of sending it SIGSTOP or manually suspending each thread one at a time.
As far as I know, calling wait from any thread will cause all threads which are associated with that process to halt.
But don't hold me to that. Best thing to do would be to test it.
It should only stop the current thread. If you want to make people ill when they look at your code and cause yourself a lot of problems, you can use this for jury-rigged thread synchronization. I wouldn't recommend it, though.
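A quick test along those lines (a sketch; build with -pthread): the main thread keeps printing while the other thread sits in wait(), which shows that only the calling thread is suspended:

#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static void *waiter(void *arg)
{
    (void)arg;
    wait(NULL);                        /* blocks only this thread */
    printf("child reaped\n");
    return NULL;
}

int main(void)
{
    if (fork() == 0) {                 /* child that lives for a few seconds */
        sleep(3);
        _exit(0);
    }

    pthread_t t;
    pthread_create(&t, NULL, waiter, NULL);

    for (int i = 0; i < 3; i++) {
        printf("main thread still running\n");   /* keeps printing while wait() blocks */
        sleep(1);
    }
    pthread_join(t, NULL);
    return 0;
}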

Yielding a process in Linux

I want to yield a multithreaded process in Linux. I know a thread can be yielded by calling sched_yield. I guess, on the other hand, the whole process can be yielded by calling sleep(0), since sleep works at process level. Am I right?
sched_yield will yield the thread that is currently running, relinquishing the rest of its timeslice. The processor then context-switches to the next thread; whether that thread is another one belonging to your process is unknown. It could be, it might not be.
To yield the whole process you would therefore need to yield each thread that exists in that process. sleep works similarly: it sleeps that particular thread, not the whole process.
Wrong.
sleep(3)
sleep() makes the calling thread sleep until seconds seconds have
elapsed or a signal arrives which is not ignored.
EDIT
From the comments I see people are using an outdated site for the manual pages. Stop using that site; use the kernel.org pages, which should be up to date.
