On Windows do I *have to* call WaitForSingleObject() after calling CreateProcess()?

On Linux, I have to call wait() after fork() in the parent process; otherwise the child process will remain a zombie after it completes, until the parent process itself exits.
I wonder whether I must follow similar steps on Windows,
i.e. whether I must call WaitForSingleObject() after calling CreateProcess().
I know that Windows' CreateProcess() is different from Linux's fork() and it seems that 'zombie' is a UNIX/Linux concept that does not exist on Windows. But maybe I still must call WaitForSingleObject() to free some OS resources allocated for CreateProcess(), similar to the Linux case.

If CreateProcess() succeeds you must close the two handles returned in PROCESS_INFORMATION, but you don't have to wait for the child process first; the handles can be closed at any point if you don't need them.
An open handle to a process will keep the process object alive in a zombie state after it has finished running.
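A minimal sketch of what that looks like in practice (notepad.exe is just a placeholder command line, and error handling is trimmed): the parent closes both handles immediately and never waits, and the child is still cleaned up by the OS when it exits, because no open handle keeps the process object alive.

#include <windows.h>

int main(void)
{
    STARTUPINFOA si = { sizeof si };
    PROCESS_INFORMATION pi;
    char cmd[] = "notepad.exe";          /* placeholder command line */

    if (CreateProcessA(NULL, cmd, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
    {
        /* Optional: WaitForSingleObject(pi.hProcess, INFINITE); */
        CloseHandle(pi.hThread);         /* always close both handles...     */
        CloseHandle(pi.hProcess);        /* ...waiting first is not required */
    }
    return 0;
}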

Related

Is a C program's process waited on by some OS routine?

Well, I'm learning about processes using the C language, and I have seen that when you call the exit function a process is terminated, and if nobody waits for it, it becomes a zombie process. My question is: since the first process created when executing the program is a process itself, is there an OS routine that waits for it after an exit() call, preventing it from becoming a zombie process? I'm curious about it.
For Unix systems at least (and I expect Windows is similar), when the system boots, it creates one special first process. Every process after that is created by some existing process.
When you log into a windowed desktop interface, there is some desktop manager process (that has been created by the first process or one of its descendants) managing windows. When you start a program by clicking on it, that desktop manager or one of its children (maybe some file manager software) creates a process to run the program. When you start a program by executing a command in a terminal window, there is a command line shell process that is interpreting the things you type, and it creates a process to run the program.
So, in all cases, your user program has a parent process, either a command-line shell or some desktop software.
If a child process creates another child (even as its first instruction), then it, being the parent of that grandchild, also has to wait for it, or the grandchild becomes a zombie.
Basically, processes always become zombies until they are removed from the process table. The OS (via the init process) will adopt and wait() for orphaned zombies (zombies whose parent has exited); it does this routinely, so normally you won't have them lingering for very long.
On Linux, the topmost (parent) process is init. This is the only process that has no parent. Every other process (without exception) does have a parent and hence is the child of another process.
See:
init
Section NOTES on wait
A child that terminates, but has not been waited for becomes a "zombie". The kernel maintains a minimal set of information about the zombie process (PID, termination status, resource usage information) in order to allow the parent to later perform a wait to obtain information about the child. As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills, it will not be possible to create further processes. If a parent process terminates, then its "zombie" children (if any) are adopted by init(1), ... init(1) automatically performs a wait to remove the zombies.
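For comparison with the Windows question above, here is a minimal sketch of the Linux pattern the NOTES section describes ("ls -l" is just a placeholder child program): the parent reaps the child with waitpid(), so no zombie is left behind.

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        execlp("ls", "ls", "-l", (char *)NULL);   /* placeholder child program */
        _exit(127);                               /* only reached if exec fails */
    }
    int status;
    waitpid(pid, &status, 0);                     /* reaps the child; no zombie remains */
    printf("child exited with status %d\n", WEXITSTATUS(status));
    return 0;
}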

Calling exec() or spawn(P_OVERLAY) unblocks the console; any way to keep the console blocked?

In a console application, passing _P_OVERLAY to a spawn function (which has the same effect as calling exec) destroys the current process.
This would be the desired behavior were it not for the fact that doing so causes the calling process (which is often cmd.exe) to assume its callee has returned, whereas in reality the sub-process spawned by that callee is still running, so the caller should keep waiting until that sub-process has terminated before continuing to use the console.
So, if the caller is cmd.exe (the command prompt), what happens is that as soon as the callee spawns the sub-process, the user is immediately prompted with the C:\Users\User> prompt, and becomes free to type in more commands, even though the sub-process is still running.
The best solution I have is to avoid terminating the current process until the child has terminated, but I'm wondering: is there any way to have the calling process wait on the spawned sub-process before continuing, even though the callee has terminated?
No, there is no way to do this - if you want cmd.exe to wait for your child to exit, then you need to wait for your child to exit.
The reason is that when cmd.exe launches your process it receives a process handle; it then waits for that process handle to become signaled. Most other parents (for example, the C runtime library) will behave the same way. Process handles are signaled when the process they refer to exits, and there is no way to change that behaviour.
Workaround: presumably you are using _P_OVERLAY because you're porting from UNIX code. If there is too much code to conveniently change all of the instances to wait for the child before exiting, you could start a child process as soon as your process starts, and run all of the UNIX-based code in the child. In this model, the only thing the top-level process does is to wait for the rest of the process tree to exit. (You can use a job object to keep track of the process tree.)
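The simplest form of "wait for your child to exit" is to replace _P_OVERLAY with _P_WAIT and forward the child's exit code. A hedged sketch under that assumption (child.exe is a placeholder for whatever you were overlaying):

#include <process.h>
#include <stdint.h>

int main(void)
{
    /* _P_WAIT suspends this process until the child exits and returns its
       exit status, so cmd.exe keeps blocking on the original process. */
    intptr_t status = _spawnl(_P_WAIT, "child.exe", "child.exe", NULL);
    if (status == -1)
        return 127;            /* the spawn itself failed */
    return (int)status;        /* forward the child's exit code to the caller */
}

If the overlaid code itself spawns further children, this alone is not enough; the job-object approach described above is still needed to wait for the whole process tree.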

Wait for child exec

Short question:
I want to wait in the parent for the child to be replaced by some exec call, not wait for it to terminate.
How can I do it?
(C language, Linux platform)
Basile's answer is incorrect.
While it is true that there's no real way to wait for an exec after a call to fork(2), this is not the only way to create a child process. What you can do instead is use the vfork(2) call. This will block in the parent until the child calls either _exit or one of the exec functions.
Note that part of the reason this works the way it does is that the child process from vfork(2) does not, in fact, clone the entirety of the parent's address space. This means it is undefined behaviour to modify data in the child process before exec. If you need to do anything weird, you may be better off with for example using pause(2) and installing a signal handler for SIGUSR1 or some other signal of your choice, then using that signal immediately before the exec, or using some other IPC mechanism as mentioned above.
If you don't need to do anything special at all, and only want to call fork/exec right after one another, but want to be sure that execution of the child process has started, you can instead use posix_spawn(3), which should also start an external program immediately, effectively blocking the parent until after the exec.
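A minimal sketch of the vfork() approach, assuming the child only needs to exec an external program ("ls" is just a placeholder): vfork() suspends the parent until the child calls _exit() or one of the exec functions, so when it returns in the parent the exec has already happened (or failed).

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = vfork();
    if (pid == 0) {
        /* Do not modify the parent's data here; only exec or _exit is safe. */
        execlp("ls", "ls", (char *)NULL);   /* placeholder program */
        _exit(127);                         /* exec failed */
    }
    /* Reaching this point means the child has already exec'd (or exited). */
    printf("child %d has been replaced (or has exited)\n", (int)pid);
    waitpid(pid, NULL, 0);                  /* still reap it eventually */
    return 0;
}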
You can't wait in a parent for the child to do some exec, except by having some convention about IPC, e.g. deciding to send something (in the child) on a pipe(7) just before the exec. You'll set up the pipe(2) before the fork(2). You might also use the Linux specific eventfd(2) for such IPC.
After the fork(2) and before any exec you are running (in the child process) the same code as the parent. So it is up to you to implement such conventional communications.
BTW, generally, the child process does not do a lot of things after the fork and before the exec, so waiting for the exec to happen is usually useless. In the unlikely case an error happens (including failure of exec) you just _exit (usually with an exit code like 127).
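A sketch of the pipe convention described above (with "sleep 5" as a placeholder program): the child writes a single byte just before the exec, and the parent blocks on read() until that byte arrives. Note this signals "about to exec", not "exec succeeded"; detecting the exec itself would need a different trick, such as a close-on-exec pipe whose read end returns EOF.

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return 1;

    pid_t pid = fork();
    if (pid == 0) {                       /* child */
        close(fds[0]);
        write(fds[1], "x", 1);            /* "about to exec" notification */
        close(fds[1]);
        execlp("sleep", "sleep", "5", (char *)NULL);   /* placeholder program */
        _exit(127);
    }
    close(fds[1]);                        /* parent */
    char c;
    read(fds[0], &c, 1);                  /* returns once the child reaches the exec */
    close(fds[0]);
    printf("child %d is at (or past) its exec call\n", (int)pid);
    waitpid(pid, NULL, 0);
    return 0;
}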
You might consider ptrace(2) (with PTRACE_SYSCALL ...) but I would not do it that way.
Read Advanced Linux Programming and study the source code of some free software shells (sash or bash). Use also strace to understand what is happening in a shell.

How to kill a process synchronously on Linux?

When I call kill() on a process, it returns immediately, because it just sends a signal. I have code that checks some (foreign, not written nor modifiable by me) processes in an infinite loop, and if they exceed some limits (too much RAM consumed, etc.) it kills them (and writes to syslog, etc.).
The problem is that when the processes are heavily swapped out, it takes many seconds to kill them, and because of that my process runs the same check against the same processes several times, sends the signal to the same process several times, and writes to syslog each time. (This is not done on purpose; it's just a side effect I am trying to fix.)
I don't care how many times it sends a signal to a process, but I do care how many times it writes to syslog. I could keep a list of PIDs that were already sent the kill signal, but in theory, even if the probability is low, another process could be spawned with the same PID as a previously killed one and might also need to be killed, and in that case the log entry would be missing.
I don't know if there is a unique identifier for any process, but I doubt it. How could I either kill a process synchronously, or keep track of processes that already got the signal so they are not logged again?
Even if you could do a "synchronous kill", you still have the race condition where you could kill the wrong process. It can happen whenever the process you want to kill exits by its own volition, or by third-party action, after you see it but before you kill it. During this interval, the PID could be assigned to a new process. There is basically no solution to this problem. PIDs are inherently a local resource that belongs to the parent of the identified process; use of the PID by any other process is a race condition.
If you have more control over the system (for example, controlling the parent of the processes you want to kill) then there may be special-case solutions. There might also be (Linux-specific) solutions based on using some mechanisms in /proc to avoid the race, though I'm not aware of any.
One other workaround may be to use ptrace on the target process as if you're going to debug it. This allows you to partially "steal" the parent role, avoiding invalidation of the PID while you're still using it and allowing you to get notification when the process terminates. You'd do something like:
1. Check the process info (e.g. from /proc) to determine that you want to kill it.
2. ptrace it, temporarily stopping it.
3. Re-check the process info to make sure you got the process you wanted to kill.
4. Resume the traced process.
5. kill it.
6. Wait (via waitpid) for notification that the process exited.
This will make the script wait for process termination.
kill $PID
while kill -0 $PID 2>/dev/null
do
sleep 1
done
kill -0 [pid] tests the existence of a process
The following solution works for most processes that aren't debuggers or processes being debugged in a debugger.
Use ptrace with argument PTRACE_ATTACH to attach to the process. This stops the process you want to kill. At this point, you should probably verify that you've attached to the right process.
Kill the target with SIGKILL. It's now gone.
I can't remember whether the process is now a zombie that you need to reap or whether you need to PTRACE_CONT it first. In either case, you'll eventually have to call waitpid to reap it, at which point you know it's dead.
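A hedged C sketch of that sequence (the helper name kill_synchronously is made up for illustration): attach with PTRACE_ATTACH, wait for the attach-stop, re-check that it is still the right process, send SIGKILL, then waitpid() until it is reported dead.

#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

int kill_synchronously(pid_t pid)
{
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;                          /* no such process, or no permission */

    int status;
    waitpid(pid, &status, 0);               /* wait for the attach-stop */

    /* ...re-check /proc/<pid> here to be sure it is still the intended target... */

    kill(pid, SIGKILL);
    /* If the tracee does not die while stopped (see the caveat above),
       ptrace(PTRACE_CONT, pid, NULL, SIGKILL) would resume it so the signal
       can take effect. */
    do {
        if (waitpid(pid, &status, 0) == -1)
            return -1;
    } while (!WIFEXITED(status) && !WIFSIGNALED(status));

    return 0;                               /* the process is definitely gone now */
}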
If you are writing this in C, you are sending the signal with the kill system call. Rather than repeatedly sending the terminating signal, just send it once and then loop (or otherwise periodically check) with kill(pid, 0); a signal value of zero just tells you whether the process is still alive, and you can act accordingly. When it dies, kill will fail with errno set to ESRCH.
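A small sketch of that approach (the helper name kill_and_wait and the one-second poll interval are arbitrary choices): send SIGTERM once, then probe with signal 0 until the process disappears.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

void kill_and_wait(pid_t pid)
{
    kill(pid, SIGTERM);                    /* send the terminating signal once */
    while (kill(pid, 0) == 0)              /* signal 0: "does it still exist?" probe */
        sleep(1);
    if (errno == ESRCH)
        printf("process %d is gone\n", (int)pid);
}

Checking errno distinguishes "gone" (ESRCH) from "exists but not signalable by us" (EPERM), which matters if a new process reuses the PID under a different user.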
- When you spawn these processes yourself, the classical waitpid(2) family can be used.
- When it is not used for anything else, you can move the processes that are going to be killed into a cgroup of their own; notifiers can be attached to such cgroups that get triggered when a process is exiting.
- To find out whether a process has been killed, you can chdir(2) into /proc/<pid> or open(2) that directory. After the process terminates, the status files there can no longer be accessed. This method is racy (between your check and the action, the process can terminate and a new one with the same PID can be spawned).

Operating system inside

I have three questions that I am unsure about:
1. If one thread in a program calls fork(), does the new process duplicate all threads, or is the new process single-threaded?
2. If a thread invokes exec(), will the program specified in the parameter to exec() replace the entire process, including ALL the threads?
3. Are system calls preemptive? For example, can a process be scheduled in the middle of a system call?
For exec, from man execve:
All threads other than the calling thread are destroyed during an execve().
From man fork:
The child process is created with a single thread — the one that called fork().
W.r.t. #3: yes, you can invoke a system call that directly or indirectly makes another thread ready to run. And if that thread has a higher priority than the current one and the system is designed to schedule it right then, it can do so.
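A small sketch illustrating the fork() answer (#1) on Linux (the Threads: field of /proc/self/status reports the thread count, and the one-second sleep is a crude way to let the extra thread start): the parent reports two threads, while the forked child reports only one, the thread that called fork(). Build with -pthread.

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void *worker(void *arg)
{
    (void)arg;
    pause();                               /* sleeps forever; exists only to raise the thread count */
    return NULL;
}

static void print_thread_count(const char *who)
{
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");
    if (!f) return;
    while (fgets(line, sizeof line, f))
        if (strncmp(line, "Threads:", 8) == 0)
            printf("%s %s", who, line);    /* the line already ends with '\n' */
    fclose(f);
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);
    sleep(1);                              /* crude: give the worker time to start */

    print_thread_count("parent:");         /* expect "Threads: 2" */

    pid_t pid = fork();
    if (pid == 0) {
        print_thread_count("child: ");     /* expect "Threads: 1" - only the caller of fork() survives */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return 0;
}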
