I'm writing a Linux shell-like program in C.
Among others, I'm implementing two built-in commands: jobs and history.
In jobs, I print the list of commands currently running in the background.
In history, I print the list of all commands executed so far, specifying for each one whether it is RUNNING or DONE.
To implement the two, my idea was to keep a list of commands, mapping each command name to its PID. When jobs/history is called, I run through the list, check which commands are running or done, and print accordingly.
I read online that waitpid(pid, &status, WNOHANG) can report, for a given PID, whether the process is still running or done, without blocking.
It works well, except for this:
While a program is alive, the call reports it as still running (it returns 0 with WNOHANG).
Once a program is done, the first call reports it as done (it returns the PID); from then on, every further call with the same PID returns -1 (ERROR).
For example, it would look like this (the & symbolizes a background command):
$ sleep 3 &
$ jobs
sleep ALIVE
$ jobs (within the 3 seconds)
sleep ALIVE
$ jobs (after 3 seconds)
sleep DONE
$ jobs
sleep ERROR
$ jobs
sleep ERROR
....
This behavior is not influenced by any other commands I run before or after; it seems to be independent of them.
I read online about various reasons why waitpid might return -1, but I couldn't identify which one applies in my case. I also tried to find out how to determine what kind of error waitpid is reporting, again without success.
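For reference, the per-job check I perform looks roughly like this (job is my own struct holding the saved name and PID):

int status;
pid_t r = waitpid(job->pid, &status, WNOHANG);
if (r == 0)
    printf("%s ALIVE\n", job->name);   /* still running */
else if (r == job->pid)
    printf("%s DONE\n", job->name);    /* just finished */
else
    printf("%s ERROR\n", job->name);   /* r == -1, the case I don't understand */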
My questions are:
Why do you think this behavior is happening?
Do you have a solution? (Ideally, it would keep returning DONE.)
If you have a better idea of how to implement the jobs/history commands, that is welcome too.
One workaround is to mark the command as DONE as soon as I first get DONE, and never call waitpid on it again before printing. This would solve the issue, but it would leave me in the dark as to WHY this is happening.
You should familiarize yourself with how child processes are handled in Unix environments. In particular, read about zombie processes.
When a process dies, it enters a 'zombie' state, so that its PID stays reserved and uniquely identifies the now-dead process. A successful wait on a zombie process frees the process descriptor and its PID. Consequently, subsequent calls to wait on the same PID fail because there is no longer any process with that PID (unless a new process is later allocated the same PID, in which case waiting on it would be a logical error). This is the -1 you are seeing: waitpid fails with errno set to ECHILD, "no child processes".
You should restructure your program so that once a wait succeeds and reports a process as DONE, you record that information in your own data structure and never call wait on that PID again.
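A minimal sketch of that restructuring (struct job and its fields stand in for whatever bookkeeping you already have):

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

struct job {
    char name[64];
    pid_t pid;
    int done;            /* set exactly once; never wait on this PID again */
};

void print_job(struct job *j)
{
    if (!j->done && waitpid(j->pid, NULL, WNOHANG) == j->pid)
        j->done = 1;     /* the zombie is reaped here, exactly once */
    printf("%s %s\n", j->name, j->done ? "DONE" : "RUNNING");
}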
For comparison, once a process is done, the Bourne shell reports it one last time and then removes it from the list of jobs:
$ sleep 10 &
$ jobs
[1] + Running sleep 10
$ jobs
[1] + Running sleep 10
$ jobs
[1] Done sleep 10
$ jobs
$
Related
There is a somewhat famous Unix brain-teaser: write an if expression that makes the following program print Hello, world! on the screen. The expr in the if must be a legal C expression and must not contain other program structures.
if (expr)
printf("Hello, ");
else
printf("world!\n");
The answer is fork().
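For reference, the complete program under test: fork() returns the child's nonzero PID in the parent and 0 in the child, so the parent takes the if branch and the child takes the else branch.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    if (fork())
        printf("Hello, ");   /* parent: fork() returned the child's PID */
    else
        printf("world!\n");  /* child: fork() returned 0 */
    return 0;
}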
When I was younger, I just had a laugh and forgot about it. But rethinking it, I find I can't understand why this program is so much more reliable than it should be. The order of execution after fork() is not guaranteed and a race condition exists, but in practice you almost always see Hello, world!\n, never world!\nHello,.
To demonstrate it, I ran the program for 100,000 rounds.
for i in {0..100000}; do
./fork >> log
done
On Linux 5.9 (Fedora 32, gcc 10.2.1, -O2), after 100001 executions the child won only 146 times; the parent won about 99.85% of the runs.
$ uname -a
Linux openwork 5.9.14-1.qubes.x86_64 #1 SMP Tue Dec 15 17:29:47 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ wc -l log
100001 log
$ grep ^world log | wc -l
146
The result is similar on FreeBSD 12.2 (clang 10.0.1, -O2). The child won only 68 times, about 0.07% of the runs, while the parent won about 99.93% of all executions.
An interesting side note is that ktrace ./fork instantly flips the dominant result to world!\nHello, (because only the parent is traced), demonstrating the Heisenbug nature of the problem. Nevertheless, tracing both processes via ktrace -i ./fork restores the original behavior, because then both processes are traced and equally slowed.
$ uname -a
FreeBSD freebsd 12.2-RELEASE-p1 FreeBSD 12.2-RELEASE-p1 GENERIC amd64
$ wc -l log
100001 log
$ grep ^world log | wc -l
68
Independence from Buffering?
An answer suggests that buffering can influence the outcome of this race condition. But the behavior persists even after removing the \n from printf():
if (expr)
printf("Hello");
else
printf("World");
and after turning off stdout's buffering via stdbuf on FreeBSD:
for i in {0..10000}; do
    stdbuf -i0 -o0 -e0 ./fork >> log
    echo >> log   # append a separator newline, since the output itself no longer has one
done
$ wc -l log
10001 log
$ grep -v "^HelloWorld" log | wc -l
30
Why does printf() in the parent almost always win the race condition after fork() in practice? Is it related to the internal implementation details of printf() in the C standard library? The write() system call? Or process scheduling in the Unix kernels?
When fork is executed, the process executing it (the new parent) is already running (of course), and the newly created child is not. For the child to run, either the parent must be stopped and the processor given to the child, or the child must be started on another processor, which takes time. Meanwhile, the parent continues executing.
Unless some unrelated event occurs, such as the parent exhausting the time slice it was given, it wins the race.
When you use printf(3) to output a string to the terminal (to any tty device; this is checked inside the stdio package by means of an isatty(3) call), stdio operates in line-buffered mode. The internal buffer that accumulates output before writing it to the terminal is flushed:
if the buffer fills up completely (not going to happen here, as the string is too short, while the buffer is typically sized for best performance, around 16 KB; this is the value for ufs2 filesystems on BSD Unix), or...
if the output contains a \n line separator (which here happens only in the child's string, see below); the flush occurs at the position of the \n.
As the child's branch is the one whose printf(3) includes the \n character, the child's buffer is flushed during the printf() call itself, while the parent's buffer ("Hello, ", with no \n) is flushed only at exit(3) time, as part of the atexit(3) processing. You can test this by calling _exit(2) (the variant of exit(3) that doesn't run the at-exit handlers) in both the parent and the child, and you will see that only the child's already-flushed output appears on the screen.
There is, as you say, a race condition, so if the child runs through to its write before the parent reaches its exit-time flush, you get the reversed order (just put a sleep(3) call in the parent before its printf(3) and you will see it). The most important point is that the first process to start its write(2) system call wins: the tty's inode is locked for the duration of the write(2) syscall, so the two outputs are serialized rather than interleaved. And the parent runs straight from fork() through printf() to its exit-time flush with essentially nothing else in between, while the child first has to be dispatched onto a CPU at all, which can take comparatively long and can even block the child for a while.
Even with the \n on the child's side (so that its printf(3) triggers an immediate write()), the parent still usually wins, because of how fork(2) itself is split in two halves: the parent's half only checks permissions and creates the new process table entry (which yields the child's pid that fork must return), so the parent's fork(2) returns as soon as the child's process id is known, while allocating memory segments for the new process and preparing it to execute is done in the second half, on the child's side. This means that, most probably, by the time the child returns from fork(), the parent is already running at top speed toward its printf() call. But you cannot control this.
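A quick way to observe the buffering difference described above (a sketch): replacing the normal return with _exit(2) skips the at-exit flush, so on a terminal only the child's line, already flushed by its \n, shows up.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    if (fork())
        printf("Hello, ");   /* no \n: stays in the stdio buffer */
    else
        printf("world!\n");  /* \n: flushed immediately on a tty */
    _exit(0);                /* skip atexit handlers: pending buffers are NOT flushed */
}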
I want to check the exit code of a foreground process using C code running on Linux. As I understand it, wait() and waitpid() are useful for child processes, but in my case it is a foreign process. I am able to read information from /proc/<pid>/stat while the process is active, but as the process exits, reading from /proc/<pid>/ becomes problematic, and I didn't find anything there relating to the exit code.
Other things I've tried:
popen() with some bash commands: echo $? always returned 0, even when the process of interest exited with an error code, and I am not sure it targeted the process of interest at all. Another bash command I tried was wait <pid>, but it returned immediately while the process was still running.
If you have access to the foreground process' code, you can send a message via a message queue or even a socket (e.g. udp multicast) - and that will make the solution more general (your c program can run on a different machine).
Another option is to use a logging service (syslog or something like that); it has some useful interfaces that enable processes to log their exit codes.
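For example, with syslog, the foreground process could report its own exit status from an atexit handler (a sketch, assuming you can modify that process; report_exit and g_status are made-up names):

#include <stdlib.h>
#include <syslog.h>

static int g_status;                 /* set before exit() is called */

static void report_exit(void)
{
    syslog(LOG_INFO, "exiting with status %d", g_status);
    closelog();
}

int main(void)
{
    openlog("myprog", LOG_PID, LOG_USER);
    atexit(report_exit);
    /* ... the program's real work ... */
    g_status = 0;
    exit(g_status);
}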
This is more of a general question than a coding question and I would appreciate some directions or a general approach.
My programming task is to implement a simple job scheduler that executes non-interactive jobs. At any given time, only 4 jobs should be executing. If more than 4 jobs are submitted, the additional jobs must wait until one of the 4 executing jobs is completed. A prompt keeps asking the user to enter the next command to execute.
This means that the main thread, or to be precise the main function running the infinite loop that asks the user for a command, should never block waiting on a process to finish. Using fork(), exec(), and wait() would cause my main process to wait, which is not the desired behavior. Therefore, I thought of omitting wait() in the parent and installing a signal handler for SIGCHLD to catch the moment a forked process terminates, as sketched below. I would keep a global variable holding the number of processes running at any given time.
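Something like this (a sketch; MAX_JOBS and running_jobs are my placeholder names):

#include <errno.h>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_JOBS 4

static volatile sig_atomic_t running_jobs = 0;

static void on_sigchld(int sig)
{
    (void)sig;
    int saved = errno;
    /* reap every finished child without blocking */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        running_jobs--;
    errno = saved;
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* don't let the handler break the prompt's read */
    sigaction(SIGCHLD, &sa, NULL);

    for (;;) {
        /* ... read a command; if running_jobs < MAX_JOBS, fork()/exec() it
           and increment running_jobs, otherwise queue it for later ... */
    }
}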
Is this the right approach or is there a better/more elegant solution to that?
Thanks a lot in advance!
I'm trying to detect a fork bomb, and as part of that I am trying to calculate the descendant count of each process. However, I only want to calculate the descendant count for non-system processes, as the fork bomb will be a non-system process, and I'm unsure how to do that. This is what I have so far:
struct task_struct *pTask;
for_each_process(pTask)
{
    struct task_struct *p;
    /* trace back through every ancestor of this task */
    for (p = pTask; p != &init_task; p = p->parent)
    {
        /* increment the descendant count of p's parent */
    }
}
Does this loop correctly walk up to the init task, since it stops at &init_task? Is there any way to instead walk up only to the first system process and stop? Because, for example, the fork bomb will be the immediate child of a system process. Any help would be greatly appreciated, thank you!!
[EDIT]
And by system process, I mean things like, for example, bash. I should have explained more clearly: at the basic level, I don't want to kill any process that runs from boot-up. Any process originating from user space after boot-up is fair game, but other processes are not. And I will not be checking for anything like tomcat or httpd, because I know 100% that those processes will not be running.
A login bash shell is execed by another process (which process depends on whether it is a console login shell, an ssh login shell, a gnome-terminal shell, etc.). The process execing bash is itself execed by init, or by some other process launched by init, not by the kernel.
A user can easily create a bash script that forks itself, so if you exempt /bin/bash from your checking, fork bombs written in bash will not be detected. For example, the following script, saved in a file called foo and executed from the current directory, creates a fork bomb.
#!/bin/bash
while true
do
./foo &
done
Take a look at ulimit in bash(1) or setrlimit(2) if you want to limit the number of processes a particular user can run.
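For instance, a sketch with setrlimit(2) (the limit of 100 is arbitrary):

#include <sys/resource.h>

int limit_user_processes(void)
{
    /* cap this user's process count; pick a value sane for your system */
    struct rlimit rl = { .rlim_cur = 100, .rlim_max = 100 };
    return setrlimit(RLIMIT_NPROC, &rl);  /* fork() then fails with EAGAIN at the cap */
}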
Or, you could set a really high threshold for the descendant count that triggers killing a process. If the chain of parents back to init is several hundred deep, then probably something fishy is going on.
You can use the logind / ConsoleKit2 D-Bus APIs to determine the session of a process by its PID. "System" processes (IIRC) will have no session.
Obviously this requires that logind (part of systemd) or ConsoleKit2 is running.
Without such a service that tracks user sessions, there may be no reliable way to differentiate a "system" process from a user process, except perhaps by user-id (assuming that your fork bomb won't be running as a system user). On many distributions system user IDs are less than 1000, and regular user IDs are >= 1000, except for the "nobody" user (normally 65534).
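In kernel code, that heuristic might look roughly like this (a sketch; the 1000 cutoff is the distribution convention mentioned above, not a kernel guarantee):

#include <linux/cred.h>
#include <linux/sched.h>
#include <linux/types.h>
#include <linux/uidgid.h>
#include <linux/user_namespace.h>

/* Heuristic: treat a task as a "user" task if its UID is a regular
 * user ID (>= 1000) and not "nobody" (65534). */
static bool is_user_task(struct task_struct *p)
{
    uid_t uid = from_kuid_munged(&init_user_ns, task_uid(p));
    return uid >= 1000 && uid != 65534;
}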
Current scenario: I launch a process that forks, and after a while the original (parent) process abort()s.
The thing is that both the child and the original process print to the shell, but after the original one dies, the shell "returns" to the prompt.
I'd like to prevent the shell from returning to the prompt, keeping things as if the process hadn't died, with the child handling the situation from there.
I'm trying to figure out how to do it but haven't found anything yet; my first guess involves tty handling, but I'm not sure how that works.
I forgot to mention: the shell takeover by the child could be done at fork time, if that makes it easier, via fd duplication or some redirection.
I think you'll probably have to go with a third process that handles user interaction, communicating with the "parent" and "child" through pipes.
You can even make it a fairly lightweight wrapper, just passing data back and forth to the parent and terminal until the parent dies, and then switching to passing to/from the child.
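A minimal sketch of the output half of that wrapper (./parent-and-child stands in for your real program): because the child inherits the pipe's write end across fork(), the wrapper's read() only hits EOF once both the parent and the child are gone, so the controlling shell doesn't get its prompt back when the parent abort()s.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fd[2];
    if (pipe(fd) < 0)
        return 1;

    if (fork() == 0) {                    /* run the real program behind the pipe */
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        execl("./parent-and-child", "parent-and-child", (char *)NULL);
        _exit(127);                       /* exec failed */
    }

    close(fd[1]);                         /* wrapper keeps only the read end */
    char buf[4096];
    ssize_t n;
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        write(STDOUT_FILENO, buf, (size_t)n);  /* EOF only after parent AND child exit */
    wait(NULL);
    return 0;
}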
To add a little further, as well, I think the fundamental problem you're going to run into is that the execution of a command by the shell just doesn't work that way. The shell is doing the equivalent of calling system() -- it's going to wait for the process it just spawned to die, and once it does, it's going to present the user with a prompt again. It's not really a tty issue, it's how the shell works.
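For a foreground command, the shell effectively does this (a simplified sketch):

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* simplified: what the shell does for one foreground command */
void run_foreground(char *const argv[])
{
    int status;
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);       /* child becomes the command */
        _exit(127);                  /* exec failed */
    }
    waitpid(pid, &status, 0);        /* shell blocks until *this* process dies */
    /* ...then the shell prints its next prompt, even if grandchildren live on */
}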
bash (and I believe other shells) have the wait command:
wait: wait [n]
Wait for the specified process and report its termination status. If
N is not given, all currently active child processes are waited for,
and the return code is zero. N may be a process ID or a job
specification; if a job spec is given, all processes in the job's
pipeline are waited for.
Have you considered inverting the parent-child relationship?
If the order in which the processes will die is predictable, run the code that will abort in the "child" and the code that will continue in the parent.
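In outline (do_abort_work and keep_going are placeholders for your two code paths):

#include <sys/wait.h>
#include <unistd.h>

void do_abort_work(void);   /* the code that will eventually abort() */
void keep_going(void);      /* the code that should keep the shell occupied */

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        do_abort_work();    /* child: its death doesn't end the foreground job */
        _exit(0);
    }
    keep_going();           /* parent: the shell waits on this process */
    waitpid(pid, NULL, 0);  /* reap the child if it hasn't exited already */
    return 0;
}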