Multiple fork() Concurrency - c

How do you use fork() so that you can spawn 10 processes and have them do a small task concurrently?
Concurrent is the operative word; many places that show how to use fork only use one call to fork() in their demos. I thought you would use some kind of for loop, but I tried and it seems in my tests that the fork() calls are spawning a new process, doing work, then spawning a new process. So they appear to be running sequentially. How can I fork concurrently and have 10 processes do the work simultaneously, if that makes sense?
Thanks.
Update: Thanks for the answers guys, I think I just misunderstood some aspects of fork() initially but I understand it now. Cheers.

Call fork() in a loop. The code below also waits for the children, as requested in the comments:
/* Needs <stdlib.h>, <unistd.h>, <sys/types.h> and <sys/wait.h> */
int numberOfChildren = 10;
pid_t *childPids = NULL;
pid_t p;
/* Allocate array of child PIDs: error handling omitted for brevity */
childPids = malloc(numberOfChildren * sizeof(pid_t));
/* Start up children */
for (int ii = 0; ii < numberOfChildren; ++ii) {
    if ((p = fork()) == 0) {
        // Child process: do your work here
        exit(0);
    }
    else {
        /* Parent: remember the child's PID (-1 if fork() failed) */
        childPids[ii] = p;
    }
}
/* Wait for children to exit */
int stillWaiting;
do {
    stillWaiting = 0;
    for (int ii = 0; ii < numberOfChildren; ++ii) {
        if (childPids[ii] > 0) {
            if (waitpid(childPids[ii], NULL, WNOHANG) != 0) {
                /* Child is done */
                childPids[ii] = 0;
            }
            else {
                /* Still waiting on this child */
                stillWaiting = 1;
            }
        }
    }
    /* Once per pass: give up timeslice and prevent hard loop (this may not work on all flavors of Unix) */
    sleep(0);
} while (stillWaiting);
/* Cleanup */
free(childPids);

When you fork off processes they WILL be running concurrently. But note that unless you have enough idle processors, they might not all be executing at the same instant, which shouldn't really matter...
Your second paragraph makes it seem like you aren't understanding how fork works: you have to check the return value to see whether you are in the parent or in the forked process. So you would have the parent run a loop to fork off 10 processes, and in the children you do whatever you wanted to do concurrently.
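As a minimal sketch of that return-value check (the helper name run_children and the use of the child's index as its exit status are assumptions made purely for illustration), the parent below forks n children, each child leaves immediately via _exit(), and the parent collects every status with a blocking waitpid():

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork n children that all run concurrently.
 * Each child exits with its own index as status; the parent returns
 * the sum of the collected exit statuses, or -1 on error. */
int run_children(int n)
{
    pid_t *pids = malloc((size_t)n * sizeof *pids);
    if (pids == NULL)
        return -1;

    int started = 0;
    for (int i = 0; i < n; ++i) {
        pid_t p = fork();
        if (p == 0) {
            /* fork() returned 0: this is the child. Do the work here,
             * then leave so the child never re-enters the loop. */
            _exit(i);
        }
        if (p < 0)
            break;              /* fork failed: stop spawning */
        pids[started++] = p;    /* parent: remember the child */
    }

    int sum = 0;
    for (int i = 0; i < started; ++i) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] && WIFEXITED(status))
            sum += WEXITSTATUS(status);
    }
    free(pids);
    return started == n ? sum : -1;
}
```

Calling run_children(10) starts all ten children before the parent waits on any of them, which is what makes the work overlap.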

Just loop in the "main" process, spawning one child after another and assigning each a particular task.

You might also want to look into POSIX Threads (or pthreads). Here is a tutorial:
https://computing.llnl.gov/tutorials/pthreads/

Related

How can I spawn n child processes to run concurrently, measure their execution time, and prevent each of them from exceeding a max execution time?

So my objective is to spawn n child processes and let them run concurrently (each one execs a different program).
The tricky thing is that, for each one of them, I have to ensure that it doesn't exceed a predetermined amount of execution time (not globally, but relative to its start time).
I only have working code that spawns the processes, execs each one and then waits for all of them to finish (based on this answer).
I've tried to use SIGALRM but I cannot figure out how to set one alarm per fork so that it times out each process relative to its own start time and not based on the parent's starting time.
Regarding time measurement, I'm not sure how I could get the execution time of each fork.
In a regular situation I would just take the delta of the start and finish times inside the child code, but in this case, if I'm not mistaken, all of them exec something, so I lose any reference to the following code.
for (int j = 0; j < prog_count; j++) {
    pid_t monitor_pid = fork();
    if (monitor_pid == 0) {
        execvp(programs[j]->executable, programs[j]->exec_args);
    }
}
while ((wpid = wait(&status)) > 0);
I've seen a lot of examples that spawn one child and control its execution time with a sleeping timer process that runs in parallel, but I can't figure out how to extend that solution to my situation.
From man alarm
Application Usage
[...] A new process image created by one of the exec functions inherits the time left to an alarm signal in the old process' image.
So you can try to write something like:
#define LIMIT_TIME_SEC 10
for (int j = 0; j < prog_count; j++) {
    pid_t monitor_pid = fork();
    if (monitor_pid == 0) {
        alarm(LIMIT_TIME_SEC);
        execvp(programs[j]->executable, programs[j]->exec_args);
        /* only reached if execvp failed */
        _exit(127);
    }
}
while ((wpid = wait(&status)) > 0);
In this code, alarm() is called after the fork, so it affects only the child process. When the time given to alarm() elapses, SIGALRM is sent to the process started by the exec function, which terminates it unless that program handles, ignores or blocks the signal.
If each sub-program has a different timeout, you can replace LIMIT_TIME_SEC with an array indexed like the programs array.
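A small self-contained sketch of the same idea, with the limit passed per child instead of through a single macro. The names run_with_limit and timed_out are hypothetical, and resetting SIGALRM to its default action before the exec is an extra precaution added here, not something the answer above requires:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its own time limit.
 * Returns the raw wait status, or -1 if fork/waitpid failed. */
int run_with_limit(char *const argv[], unsigned limit_sec)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGALRM, SIG_DFL);   /* make sure the default action (terminate) applies */
        alarm(limit_sec);           /* timer starts now, in this child only */
        execvp(argv[0], argv);      /* the pending alarm survives the exec */
        _exit(127);                 /* only reached if execvp failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}

/* Returns 1 if the wait status says the child was killed by SIGALRM. */
int timed_out(int status)
{
    return status != -1 && WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM;
}
```

For example, assuming a sleep program is on the PATH, running {"sleep", "5", NULL} with a limit of 1 second should end with timed_out() reporting 1, because the child is killed by the alarm it inherited across the exec.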
If you want to know how long each child process took to execute, you can build something like this:
record the child's pid in the program struct, adding the necessary member (pid_t pid;)
record, in the parent, the time at which the child was launched, adding a member (time_t launch_time;)
for (int j = 0; j < prog_count; j++) {
    pid_t monitor_pid = fork();
    if (monitor_pid == 0)
    {
        /** in child **/
        /* set a limit time of execution */
        alarm(LIMIT_TIME_SEC);
        /* execute the program */
        execvp(programs[j]->executable, programs[j]->exec_args);
        /* only reached if execvp failed */
        _exit(127);
    }
    else
    {
        /** in parent **/
        /* record the starting time */
        programs[j]->launch_time = time(NULL);
        /* record the pid */
        programs[j]->pid = monitor_pid;
    }
}
while ((wpid = wait(&status)) > 0)
{
    /* a child terminates, let's find it */
    for (int j = 0; j < prog_count; ++j)
    {
        if (programs[j]->pid == wpid)
        {
            /* print process information; difftime() takes (end, start) and returns a double */
            printf("program %s terminated in %.0f seconds\n",
                   programs[j]->executable,
                   difftime(time(NULL), programs[j]->launch_time));
            /* process found, not necessary to go further */
            break;
        }
    }
}

C: How to share information between processes?

I would need some help with some C code.
Basically I have n processes which execute some code. Once they're almost done, I'd like the "Manager Process" (which is the main function) to send to each of the n processes an int variable, which may be different for every process.
My idea was to call signal(SIGALRM, handler_function) once all processes have started. When a process is almost done, it uses kill(getpid(), SIGSTOP) in order to wait for the Manager Process.
After SIM_TIME seconds passed, handler_function sends int variable on a Message Queue then uses kill(process_pid, SIGCONT) in order to wake up waiting processes. Those processes, after being woken up should receive that int variable from Message Queue, print it and simply terminate, letting Manager Process take control again.
Here's some code:
/**
 * Child Process creation using fork() system call
 * Parent Process allocates and initializes necessary variables in shared memory
 * Child Process executes Student Process code defined in childProcess function
 */
pid_t runChild(int index, int (*func)(int index))
{
    pid_t pid;
    pid = fork();
    if (pid == -1)
    {
        printf(RED "Fork ERROR!\n" RESET);
        exit(EXIT_FAILURE);
    }
    else if (pid == 0)
    {
        int res = func(index);
        return getpid();
    }
    else
    {
        /*INSIGNIFICANT CODE*/
        currentStudent = createStudent(pid);
        currentStudent->status = FREE;
        students[index] = *currentStudent;
        currentGroup = createGroup(index);
        addMember(currentStudent, currentGroup);
        currentGroup->closed = FALSE;
        groups[index] = *currentGroup;
        return pid;
    }
}
Code executed by each Process
/**
 * Student Process Code
 * Each Student executes this code
 */
int childProcess(int index)
{
    /*NOTICE: showing only relevant part of code*/
    printf("Process Index %d has almost done, waiting for manager!\n", index);
    /* PROGRAM GETS STUCK HERE!*/
    kill(getpid(), SIGSTOP);
    /* mex variable is already defined, it's a struct implementing Message Queue message struct*/
    receiveMessage(mexId, mex, getpid());
    printf(GREEN "Student %d has received variable %d\n" RESET, getpid(), mex->variable);
}
Handler Function:
/**
 * Handler function
 * Will be launched when SIM_TIME is reached
 */
void end_handler(int sig)
{
    if (sig == SIGALRM)
    {
        usleep(150000);
        printf(RED "Time's UP!\n" RESET);
        printGroups();
        for(int i = 0; i < POP_SIZE; i++){
            mex->mtype = childPids[i];
            mex->variable = generateInt(18, 30);
            sendMessage(mexId, mex);
            //childPids is an array containing PIDs of all previously launched processes
            kill(childPids[i], SIGCONT);
        }
    }
}
I hope my code is understandable.
I have an issue though: using the provided code, the entire program gets stuck at the kill(getpid(), SIGSTOP) system call.
I also tried to launch ps in terminal and no active processes are detected.
I think handler_function doesn't send kill(childPids[i], SIGCONT) system call for some reason.
Any idea how to solve this problem?
Thank you
You might want to start by reading the manual page for mq_overview (man mq_overview). It provides a portable and flexible communication mechanism between processes which permits sync and async mechanisms to communicate.
In your approach, there is a general problem of “how does one process know if another is waiting”. If the process hasn’t stopped itself, the SIGCONT is ignored, and when it subsequently suspends itself, nobody will continue it.
In contrast, message-based communication between the two can be viewed as a little language. For simple exchanges (such as yours), the completeness of the grammar can be readily hand checked. For more complex ones, state machines or even nested state machines can be constructed to analyze their behaviour.
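To illustrate why a blocking receive removes the "is the other side already waiting?" race, here is a minimal sketch. Note that it swaps in a plain pipe rather than the POSIX message queue suggested above, only to keep the example short. The helper name send_int_to_child and the trick of echoing the value back through the exit status (so it must fit in 0..255) are assumptions made for illustration:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: hand one int to a child through a pipe.
 * The child simply blocks in read() until the value arrives, so no
 * SIGSTOP/SIGCONT handshake is needed. Returns what the child
 * reported back via its exit status, or -1 on error. */
int send_int_to_child(int value)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wait for the manager's value, however late it comes */
        close(fds[1]);
        int received = -1;
        ssize_t n = read(fds[0], &received, sizeof received);
        _exit(n == (ssize_t)sizeof received ? (received & 0xff) : 255);
    }

    /* Parent (manager): send the value whenever it is ready */
    close(fds[0]);
    ssize_t w = write(fds[1], &value, sizeof value);
    close(fds[1]);

    int status;
    if (waitpid(pid, &status, 0) != pid || w != (ssize_t)sizeof value || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

It does not matter whether the child reaches read() before or after the parent writes: the data waits in the pipe in one case and the child waits in read() in the other.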

A struct for each child process and accessing the members

So I'm forking a couple of child processes and each of them is supposed to take a line that I've read from a file and do operations on them.
What I have is a struct containing the lines like :
struct query {
    char lines[LINESIZE];
};
and I have an array of structs. So each struct serves to one child process.
This is how I forked my child processes :
for(i=0; i<5; i++) {
    n = fork();
}
And say I have five structs to serve for each of these processes.
struct query query[5];
So the first process takes query[0].lines and does some operations on it, the second process gets query[1].lines and does the same operations on it, and so on ...
Should I use pipe to pass values between processes? I feel like there's a much simpler solution to this but my lack of practice and knowledge in C is really slowing me down.
I suppose you're trying to spawn 5 processes, but the code you posted creates far more than 5. In:
for(i = 0; i < 5; ++i) {
    n = fork();
}
when i = 0 you fork a process. Since the forked process is an exact copy of the parent, it continues in the for loop too, so at that point you have two processes, each with i = 1 and each forking a new process. Now there are 4 processes, and the count doubles on every iteration: when the loop completes you have 2^5 = 32 processes in total (the original plus 31 new ones).
Allocating and initializing the array "query" before forking is perfectly fine; what you have to fix is the spawning. The fork() call returns 0 in the child process, the process ID of the child in the parent process, or -1 if there was an error. Knowing whether the current process is the parent or the child, we can either continue the loop or do the computation and exit:
for(i = 0; i < 5; ++i) {
    if(fork() == 0) {
        /* child process */
        process_query(query[i]);
        exit(0);
    }
}
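To see how many processes a loop that calls fork() unconditionally really produces, the following sketch counts them: every process writes one byte into a shared pipe, and the original process tallies the bytes once all descendants have exited. The function name is hypothetical and fork() failures are simply ignored:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the naive "fork() n times in a loop" pattern
 * and return the total number of processes that result, counting the
 * original one. Returns -1 if the pipe cannot be created. */
int count_processes_after_naive_loop(int n)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t root = getpid();
    for (int i = 0; i < n; ++i)
        fork();                     /* naive: children keep looping too */

    /* Every process, original or forked, announces itself once. */
    char mark = 'x';
    (void)write(fds[1], &mark, 1);
    close(fds[1]);

    /* Reap our own children; they in turn reap theirs before exiting. */
    while (wait(NULL) > 0)
        ;

    if (getpid() != root)
        _exit(0);                   /* descendants stop here */

    /* Original process: all write ends are closed now, so read() hits EOF. */
    int total = 0;
    while (read(fds[0], &mark, 1) == 1)
        ++total;
    close(fds[0]);
    return total;
}
```

For n = 5 this returns 32, i.e. the count doubles with each iteration, whereas the corrected loop with exit() in the child creates exactly 5 children.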

understanding forking - simple

if I have a program like this:
int i = 0;
int status;
bool result = true;
for(i = 0; i < numfiles; i++) { // LOOP 1
    if (fork() == 0) {/* Child */
        if (substLines(s1, s2, filenames[i])) {
            exit(0);
        } else {
            exit(2);
        }
    }
}
for(i = 0; i < numfiles; i++) { // LOOP 2
    wait(&status);
    ....
}
return result;
}
I have the following question.
What happens if a child process exits before the program even knows about the wait()? I guess my question is regarding how a program is 'read'. Again, for example: if I exit from the first child whilst still going through LOOP 1, what happens (does it even know about LOOP 2 at this point)?
Is this a concurrent program? The parent seems to be waiting on the children after it forked them all, so I would say yes?
The man page of wait says
If a child has already changed state, then these calls return immediately. Otherwise they block until either a child changes state or a signal handler interrupts the call
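So for question 1 it doesn't matter: a child that exits before the parent reaches wait() is simply collected as soon as wait() is called.
For question 2, the answer is yes, the program is concurrent.
Concurrency means several tasks are in progress at once, with the scheduler free to interleave them. Parallelism means they literally execute at the same instant, which needs a multi-core CPU or more than one computer, such as a distributed system.
Your program is multi-process, so it is concurrent; whether the children also run in parallel depends on how many cores are free. On a single core they take turns under the CPU scheduler. For more info: Scheduling_(computing)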
Just an addition to @simon_xia's excellent answer.
A killed or exited process becomes a zombie until its parent calls wait for it. And yes, this is the official terminology. :-) In zombie state everything is cleaned up (memory pages, open files, env, etc), just the exit status or killing signal number are kept.
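A short sketch of that behaviour: the child exits at once, the parent deliberately dawdles (so the child sits in zombie state for a while), and the exit status is still there when the parent finally waits. The helper name reap_late and the 200 ms delay are arbitrary choices for illustration, and waitpid() is used instead of wait() only to target this specific child:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: let a child exit long before the parent waits,
 * then return the exit status the kernel kept for it (-1 on error). */
int reap_late(int exit_code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code);           /* child is gone almost immediately */

    /* Parent is busy elsewhere; the child is a zombie in the meantime. */
    struct timespec nap = { 0, 200 * 1000 * 1000 };
    nanosleep(&nap, NULL);

    /* The child already changed state, so this returns immediately. */
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```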

Why does GNU script use two forks instead of select and one fork?

I just realised that the "script" binary on GNU/Linux uses two forks instead of one.
It could simply use select instead of doing a first fork(). Why would it use two forks?
Is it simply because select did not exist when it was written and nobody had the motivation to recode it, or is there a valid reason?
man 1 script: http://linux.die.net/man/1/script
script source: http://pastebin.com/raw.php?i=br8QXRUT
The clue is in the code, which I have added some comments to.
child = fork();
sigprocmask(SIG_SETMASK, &unblock_mask, NULL);
if (child < 0) {
    warn(_("fork failed"));
    fail();
}
if (child == 0) {
    /* child of first fork */
    sigprocmask(SIG_SETMASK, &block_mask, NULL);
    subchild = child = fork();
    sigprocmask(SIG_SETMASK, &unblock_mask, NULL);
    if (child < 0) {
        warn(_("fork failed"));
        fail();
    }
    if (child) {
        /* parent of second fork (child != 0) runs 'dooutput' */
        if (!timingfd)
            timingfd = fdopen(STDERR_FILENO, "w");
        dooutput(timingfd);
    } else
        /* child of second fork (child == 0) runs 'doshell' */
        doshell();
} else {
    sa.sa_handler = resize;
    sigaction(SIGWINCH, &sa, NULL);
}
/* parent of first fork runs doinput */
doinput();
There are thus three processes running:
dooutput()
doshell()
doinput()
I think you are asking why it uses three processes rather than one process and select(). select() has existed since ancient UNIX history, so the answer is unlikely to be that select() did not exist. The answer is more prosaic. doshell() needs to be in a separate process anyway, as what it does is exec the shell with appropriately piped fds. You thus need at least one fork. Writing dooutput() and doinput() within a select() loop looks perfectly possible to me, but it is actually easier to use blocking I/O than to worry about select() and friends. As fork() is relatively lightweight (given UNIX's copy-on-write semantics) and there is little need for communication between the two processes, why use select() when fork() is perfectly good and produces smaller code? In other words, the real answer is 'why not?'
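For comparison, this is roughly what the select() alternative involves: one process watching two descriptors and reading from whichever is ready. It is a generic sketch, not code from script; the function name drain_both is hypothetical and it only counts bytes instead of copying them anywhere:

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: read fd_a and fd_b until both reach EOF, using
 * select() to find out which one has data. Stores the byte counts and
 * returns 0 on success, -1 on error. */
int drain_both(int fd_a, int fd_b, size_t *bytes_a, size_t *bytes_b)
{
    int open_a = 1, open_b = 1;
    char buf[256];

    *bytes_a = 0;
    *bytes_b = 0;
    while (open_a || open_b) {
        fd_set readable;
        int maxfd = -1;

        /* The set must be rebuilt on every pass: select() modifies it. */
        FD_ZERO(&readable);
        if (open_a) { FD_SET(fd_a, &readable); if (fd_a > maxfd) maxfd = fd_a; }
        if (open_b) { FD_SET(fd_b, &readable); if (fd_b > maxfd) maxfd = fd_b; }

        if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
            return -1;

        if (open_a && FD_ISSET(fd_a, &readable)) {
            ssize_t n = read(fd_a, buf, sizeof buf);
            if (n < 0) return -1;
            if (n == 0) open_a = 0; else *bytes_a += (size_t)n;
        }
        if (open_b && FD_ISSET(fd_b, &readable)) {
            ssize_t n = read(fd_b, buf, sizeof buf);
            if (n < 0) return -1;
            if (n == 0) open_b = 0; else *bytes_b += (size_t)n;
        }
    }
    return 0;
}
```

Even this stripped-down loop carries bookkeeping (rebuilding the fd set, tracking which side has closed) that two blocking processes avoid, which fits the "smaller code" argument above.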

Resources