Does a process have to have at least one thread in it? Is it possible for a process to be void of any threads, or does this not make sense?
A process usually has at least one thread. Wikipedia has the definition:
a thread of execution is the smallest unit of processing that can be scheduled by an operating system. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process.
The MSDN backs this up:
A processor executes threads, not processes, so each application has at least one process, and a process always has at least one thread of execution, known as the primary thread.
Though it does go on to say:
A process can have zero or more single-threaded apartments and zero or one multithreaded apartment.
Which implies that if both the number of single-threaded apartments and multithreaded apartments could be zero. However, the process wouldn't do much :)
In Unix-like operating systems, it's possible to have a zombie process, where an entry still exists in the process table even though there are no (longer) any threads.
You can choose not to use an explicit threading library, or an operating system that has no concept of threads (and so doesn't call it a thread), but for most modern programming all programs have at least one thread of execution (generally referred to as a main thread or UI thread or similar). If that exits, so does the process.
Thought experiment: what would a process with zero threads of execution do?
In theory, I don't see why not. But it would be impossible with the popular operating systems.
A process typically consists of a few different parts:
Threads
Memory space
File discriptors
Environment (root directory, current directory, etc.)
Privileges (UID, etc.)
Et cetera
In theory, a process could exist with no threads as an RPC server. Other processes would make RPC calls which spawn threads in the server process, and then the threads disappear when the function returns. I don't know of any operating systems that work this way.
On most OSs, the process exits either when the last thread exits, or when the main thread exits.
Note: This ignores the "useless" cases such as zombie processes, which have no threads but don't do anything.
"main" itself is thread. Its a thread that gets executed. So, every process runs on at least one thread.
Related
If I have a program running with threads and call fork() on a unix-based system, are the threads copied? I know that the virtual memory for the current process is copied 1:1 to the new process spawned. I know that threads have their own stack in the virtual memory of a process. Thus, at least the stack of threads should be copied too. However, I do not know if there is anything more to threads that does not reside in virtual memory and is thus NOT copied over. If there is not, do the two processes share the threads or are they independent copies?
No.
Threads are not copied on fork(). POSIX specification says (emphasize is mine):
fork - create a new process
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
To circumvent this problem, there exists a pthread_atfork() function to help.
man fork:
The child process is created with a single thread—the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.
From The Open Group Base Specifications Issue 7, 2018 edition's fork:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not async-signal-safe, the behavior is undefined.
Originally, "fork" was achieved by writing the task to disk and then, rather than reading in a different thread (which would be done if swapping the task with a different one), modifying the task ID of the image still in memory and continuing with its execution (as the new task). This was a very simple modification to the basic task switching mechanism, where only one task would occupy RAM memory at a time.
Of course, as memory management got more elaborate this scheme was modified to suit the new environment.
It is expected that threads, on which pthread_detach() was not called, should be pthread_join()ed before the main thread returns from main() or calls exit().
However, what happens when this requirement is not met? What happens when a process terminates when it still contains unjoined and not detached threads?
I would find it odd to learn that these other threads’ resources will not be reclaimed until system reboot. However, if these resources will be reclaimed, then there may be little need to bother about joining or detaching, mightn’t it?
It is up to the operating system. Typical modern operating systems will indeed reclaim the memory and descriptors (handles) used by abandoned threads. This is similar to how dynamically allocated memory works: typical modern systems will reclaim it when a process exits, even if the process never explicitly freed the memory. For certain unusual programs, this can be a meaningful performance optimization, because freeing lots of small resources takes time and the OS may be able to do it more quickly.
However, what happens when this requirement is not met? What happens when a process terminates when it still contains unjoined and not detached threads?
On any system with POSIX threads that is not ancient, the non-joined threads simply "evaporate" into space when the SYS_exit system call is performed by the main thread.
I would find it odd to learn that these other threads’ resources will not be reclaimed until system reboot.
They will be.
However, if these resources will be reclaimed, then there may be little need to bother about joining or detaching, mightn’t it?
It depends on what these threads do. The danger is at-exit data races.
In C++, global variables are destructed (usually via atexit or equivalent registration mechanism), FILE handles are deleted, etc. etc.
If non-joined thread tries to access any such resource, it will likely crash with SIGSEGV, possibly producing core dump, and an unclean process exit code, both of which are often quite undesirable.
I have a Linux C program that runs well in a Raspberry 3. When I run it in a low memory situation in another sbc (Raspberry Zero) it runs about 2-3 days then freezes. I believe it's a stack overflow situation.
I've put a thread to check periodically when the main program has frozen. Unfortunately it appears that if the main process crashes, it takes down all of the other threads in the process.
I can avoid this by having another process checking upon the first process, but I'd prefer a thread. Is it possible to have thread that is safe and does not freeze it the main process freezes?
Easily no, it's not possible because per thread definition they share memory and they are part of the main process and it own them all. So everything afflict the main process afflict all the threads.
I'm going to write a program in which the main thread creates new thread and then the new thread creates a child process. Since I have a hard time keeping track of the new thread and forked process, I'd like to gain a wise answer from someone.
My question is
1. Does a created process in a thread start to execute codes after pthread_create?
2. If 1 is not, where does the forked process start from if a call of fork in a thread occurs?
Thank you for reading my question.
Some of this is a bit OS-dependent, as different systems have different POSIX thread implementations and this can expose internals.
POSIX offers pthread_atfork as a somewhat blunt instrument for dealing with some of the issues, but it still looks pretty messy to me.
If your system uses a one-to-one map between "user land thread" and "kernel thread" using clone or rfork to achieve proper user-space sharing of data between threads, then fork will merely duplicate the (single) thread that calls it. However, if your system has a many-to-many style mapping (so that one user process is handling multiple threads, at least before they enter into blocking syscalls), fork may internally duplicate multiple threads. POSIX says it should look like it only duplicated one thread, so that's not supposed to be visible, but I'm not sure how well all systems implement this.
There's some general advice at http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them (Linux-centric, obviously, but still useful).
Is there some particular reason you want to fork inside a thread but not exec? In general, if you just want to run more code in parallel, you just spin off yet another thread (i.e., once you choose to run any threads, you do everything in threads, except if you have to fork for exec; if the exec fails, just _exit).
Is something like the following possible in C on Linux platform:
I have a thread say A reading system calls(intercepting system calls) made by application processes. For each process A creates a thread, which performs the required system call and then sleeps till A wakes it up with another system call which was made by its corresponding application process. When a process exits, it worker thread ceases to exist.
So its like a number of processes converzing on a thread which then fans out to many threads with one thread per process.
Thanks
If you are looking for some kind of threadpool implementation and are not strictly limited to C I would recommend threadpool (which is almost Boost). Its easy to use and quite lean. The only logic you now need is the catching of the system event and then spawn a new task thread that will execute the call. The threadpool will keep track of all created threads and assign work automatically to the threads.
EDIT
Since you are limited to C, try this implementation. It looks fairly complete and rather simple, but it will basically do the job.