Gracefully exit with MPI - c

I'm trying to gracefully exit my program after if Rdinput returns an error.
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#define MASTER 0
#define Abort(x) MPI_Abort(MPI_COMM_WORLD, x)
#define Bcast(send_data, count, type) MPI_Bcast(send_data, count, type, MASTER, GROUP) //root --> MASTER
#define Finalize() MPI_Finalize()
int main(int argc, char **argv){
//Code
if( rank == MASTER ) {
time (&start);
printf("Initialized at %s\n", ctime (&start) );
//Read file
error = RdInput();
}
Bcast(&error, 1, INT); Wait();
if( error = 1 ) MPI_Abort(1);
//Code
Finalize();
}
Program output:
mpirun -np 2 code.x
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
Initialized at Wed May 30 11:34:46 2012
Error [RdInput]: The file "input.mga" is not available!
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 7369 on
node einstein exiting improperly. There are two reasons this could occur:
//More error message.
What can I do to gracefully exit an MPI program without printing this huge error message?

If you have this logic in your code:
Bcast(&error, 1, INT);
if( error = 1 ) MPI_Abort(1);
then you're just about done (although you don't need any kind of wait after a broadcast). The trick, as you've discovered, is that MPI_Abort() does not do "graceful"; it basically is there to shut things down in whatever way possible when something's gone horribly wrong.
In this case, since now everyone agrees on the error code after the broadcast, just do a graceful end of your program:
MPI_Bcast(&error, 1, MPI_INT, MASTER, MPI_COMM_WORLD);
if (error != 0) {
if (rank == 0) {
fprintf(stderr, "Error: Program terminated with error code %d\n", error);
}
MPI_Finalize();
exit(error);
}
It's an error to call MPI_Finalize() and keep on going with more MPI stuff, but that's not what you're doing here, so you're fine.

Related

Recoding Strace, why I can't catch the "write syscall ?"

I am currently recode the Strace command.
I understand the goal of this command and I can catch some syscalls from an exectuable file.
My question is : Why I don't catch the "write" syscall ?
this is my code :
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <wait.h>
int main(int argc, char* argv[]) {
int status;
pid_t pid;
struct user_regs_struct regs;
int counter = 0;
int in_call =0;
switch(pid = fork()) {
case -1:
perror("fork");
exit(1);
case 0:
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
execvp(argv[1], argv + 1);
break;
default:
wait(&status);
while (status == 1407) {
ptrace(PTRACE_GETREGS, pid, NULL, &regs);
if(!in_call) {
printf("SystemCall %lld called with %lld, %lld, %lld\n",regs.orig_rax,
regs.rbx, regs.rcx, regs.rdx);
in_call=1;
counter ++;
}
else
in_call = 0;
ptrace(PTRACE_SYSEMU, pid, NULL, NULL);
wait(&status);
}
}
printf("Total Number of System Calls = %d\n", counter);
return 0;
}
This is the output using my program :
./strace ./my_program
SystemCall 59 called with 0, 0, 0
SystemCall 60 called with 0, 4198437, 5
Total Number of System Calls = 2
59 represents the execve syscall.
60 represents the exit syscall.
This is the output using the real strace :
strace ./my_program
execve("./my_program", ["./bin_asm_write"], 0x7ffd2929ae70 /* 67 vars */) = 0
write(1, "Toto\n", 5Toto
) = 5
exit(0) = ?
+++ exited with 0 +++
As you can see, my program don't catch the write syscall.
I don't understrand why, do you have any idea ?
Thank You for your answer.
Your while loop is set up rather strangely -- you have this in_call flag that you toggle back and forth between 0 and 1, and you only print the system call when it is 0. The net result is that while you catch every system call, you only print every other system call. So when you catch the write call, the flag is 1 and you don't print anything.
Another oddness is that you're using PTRACE_SYSEMU rather than PTRACE_SYSCALL. SYSEMU is intended for emulating system calls, so the system call won't actually run at all (it will be skipped); normally your ptracing program would do whatever the systme call is supposed to do itself and then call PTRACE_SETREGS to set the tracee's registers with the appropriate return values before calling PTRACE_SYSEMU again to run to the next system call.
Your in_call flagging would make more sense if you were actually using PTRACE_SYSCALL, as that will stop twice for each syscall -- once on entry to the syscall and a second time when the call returns. However, it will also stop for signals, so you need to be decoding the status to see if a signal has occurred or not.

Im learning MPI with C and don't understand why my code don't work

I'm new to MPI and I want my C program, which needs to be launched with two process, to output this:
Hello
Good bye
Hello
Good bye
... (20 times)
But when I start it with mpirun -n 2 ./main it gives me this error and the program don't work:
*** An error occurred in MPI_Send
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[laptop:3786023] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
*** An error occurred in MPI_Send
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[laptop:3786024] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[61908,1],0]
Exit code: 1
--------------------------------------------------------------------------
Here's my code:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <unistd.h>
#include <string.h>
#define N 32
void send (int rank)
{
char mess[N];
if (rank == 0)
sprintf(mess, "Hello");
else
sprintf(mess, "Good bye");
MPI_Send(mess, strlen(mess), MPI_CHAR, !rank, 0, MPI_COMM_WORLD);
}
void receive (int rank)
{
char buf[N];
MPI_Status status;
MPI_Recv(buf, N, MPI_CHAR, !rank, 0, MPI_COMM_WORLD, &status);
printf("%s\n", buf);
}
int main (int argc, char **argv)
{
if (MPI_Init(&argc, &argv)) {
fprintf(stderr, "Erreur MPI_Init\n");
exit(1);
}
int size, rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
for (size_t i = 0; i < 20; i ++) {
if (rank == 0) {
send(rank);
receive(rank);
} else {
receive(rank);
send(rank);
}
}
MPI_Finalize();
return 0;
}
I don't understand that error and I don't know how to debug that.
Thank you if you can help me to solve that (probably idiot) problem !
The main error is your function is called send(), and that conflicts with the function of the libc, and hence results in this bizarre behavior.
In order to avoid an other undefined behavior caused by using uninitialized data, you also have to send the NULL terminating character, e.g.
MPI_Send(mess, strlen(mess)+1, MPI_CHAR, !rank, 0, MPI_COMM_WORLD);

Why should we check WIFEXITED after wait in order to kill child processes in Linux system call?

I came across some code in C where we check the return value of wait and if it's not an error there's yet another check of WIFEXITED and WIFEXITSTATUS. Why isn't this redundant? As far as I understand wait returns -1 if an error occurred while WIFEXITED returns non-zero value if wait child terminated normally. So if there wasn't any error in this line if ( wait(&status) < 0 ) why would anything go wrong durng WIFEXITED check?
This is the code:
#include <stdio.h>
#include <signal.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <unistd.h>
#define CHILDREN_NUM 5
int main () {
int i, status, pid, p;
for(i = 0; (( pid = fork() ) < 0) && i < CHILDREN_NUM;i++)
sleep(5);
if ( pid == 0 )
{
printf(" Child %d : successfully created!\n",i);
exit( 0 ); /* Son normally exits here! */
}
p = CHILDREN_NUM;
/* The father waits for agents to return succesfully */
while ( p >= 1 )
{
if ( wait(&status) < 0 ) {
perror("Error");
exit(1);
}
if ( ! (WIFEXITED(status) && (WEXITSTATUS(status) == 0)) ) /* kill all running agents */
{
fprintf( stderr,"Child failed. Killing all running children.\n");
//some code to kill children here
exit(1);
}
p--;
}
return(0);
}
wait returning >= 0 tells you a child process has terminated (and that calling wait didn't fail), but it does not tell you whether that process terminated successfully or not (or if it was signalled).
But, here, looking at your code, it's fairly obvious the program does care about whether the child process that terminated did so successfully or not:
fprintf( stderr,"Child failed. Killing all running children.\n");
So, the program needs to do further tests on the status structure that was populated by wait:
WIFEXITED(status): did the process exit normally? (as opposed to being signalled).
WEXITSTATUS(status) == 0: did the process exit with exit code 0 (aka "success"). For more information, see: Meaning of exit status 1 returned by linux command.
wait(&status) waits on the termination of a child process. The termination may be due to a voluntary exit or due to the receipt of an unhandled signal whose default disposition is to terminate the process.
WIFEXITED(status) and WIFSIGNALED(status) allow you to distinguish the two* cases, and you can later use either WEXITSTATUS or WTERMSIG to retrieve either the exit status (if(WIFEXITED(status)) or termination signal (if(WIFSIGNALED(status)).
*with waitpid and special flags (WUNTRACED, WCONTINUED), you can also wait on child process stops and resumptions, which you can detect with WIFSTOPPED or WIFCONTINUED (linux only) respectively. See waitpid(2) for more information.

Wait until slaves called MPI_finalize

I have a problem with the following codes:
Master:
#include <iostream>
using namespace std;
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define PB1 1
#define PB2 1
int main (int argc, char *argv[])
{
int np[2] = { 2, 1 }, errcodes[2];
MPI_Comm parentcomm, intercomm;
char *cmds[2] = { "./slave", "./slave" };
MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };
MPI_Init(NULL, NULL);
#if PB1
for(int i = 0 ; i<2 ; i++)
{
MPI_Info_create(&infos[i]);
char hostname[] = "localhost";
MPI_Info_set(infos[i], "host", hostname);
}
#endif
MPI_Comm_spawn_multiple(2, cmds, MPI_ARGVS_NULL, np, infos, 0, MPI_COMM_WORLD, &intercomm, errcodes);
printf("c Creation of the workers finished\n");
#if PB2
sleep(1);
#endif
MPI_Comm_spawn_multiple(2, cmds, MPI_ARGVS_NULL, np, infos, 0, MPI_COMM_WORLD, &intercomm, errcodes);
printf("c Creation of the workers finished\n");
MPI_Finalize();
return 0;
}
Slave:
#include "mpi.h"
#include <stdio.h>
using namespace std;
int main( int argc, char *argv[])
{
int rank;
MPI_Init(0, NULL);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
printf("rank = %d\n", rank);
MPI_Finalize();
return 0;
}
I do not know why when I run mpirun -np 1 ./master, my program stops with the following mesage when I set PB1 and PB2 to 1 (it works well when I set of of them to 0):
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application: ./slave Either
request fewer slots for your application, or make more slots available
for use.
For instance, when I set PB2 to 0, the program works well. Thus, I suppose that it is because the MPI_finalize does not finish its job ...
I googled, but I did not find any answer for my problem. I tried various things as: call MPI_comm_disconnect, add a barrier, ... but nothing worked.
I work on Ubuntu (15.10) and use the OpenMPI version 1.10.2.
The MPI_Finalize on the first set of salves will not finish until MPI_Finalize is called on the master. MPI_Finalize is collective over all connected processes. You can work around that by manually disconnecting the first batch of salves from the intercommunicator before calling MPI_Finalize. This way, the slaves will actually finish complete and exit - freeing the "slots" for the new batch of slaves. Unfortunately I don't see a standardized way to really ensure the slaves are finished in a sense that their slots are freed, because that would be implementation defined. The fact that OpenMPI freezes in the MPI_Comm_spawn_multiple instead of returning an error is unfortunate and one might consider that a bug. Anyway here is a draft of what you could do:
Within the master, each time is done with its slaves:
MPI_Barrier(&intercomm); // Make sure master and slaves are somewhat synchronized
MPI_Comm_disconnect(&intercomm);
sleep(1); // This is the ugly unreliable way to give the slaves some time to shut down
The slave:
MPI_Comm parent;
MPI_Comm_get_parent(&parent); // you should have that already
MPI_Comm_disconnect(&parent);
MPI_Finalize();
However, you still need to make sure OpenMPI knows how many slots should be reserved for the whole application (universe_size). You can do that for example with a hostfile:
localhost slots=4
And then mpirun -np 1 ./master.
Now this is not pretty and I would argue that your approach to dynamically spawning MPI workers isn't really what MPI is meant for. It may be supported by the standard, but that doesn't help you if implementations are struggling. However, there is not enough information on how you intend to communicate with the external processes to provide a cleaner, more ideomatic solution.
One last remark: Do check the return codes of MPI functions. Especially MPI_Comm_spawn_multiple.

c - Proper range of return status / value

Recently, when reading a book about linux programming, I got a message that:
The status argument given to _exit() defines the termination status of the process, which is available to the parent of this process when it calls wait(). Although defined as an int, only the bottom 8 bits of status are actually made available to the parent. And only 0 ~ 127 is recommanded to use, because 128 ~ 255 could be confusing in shell due to some reason. Due to that -1 will become 255 in 2's complement.
The above is about the exit status of a child process.
My question is:
Why the parent process only get the 8 bits of the child process's exit status?
What about return value of normal functions? Does it reasonable or befinit to use only 0 ~ 127 ? Because I do use -1 as return value to indicate error sometimes, should I correct that in future.
Update - status get by wait() / waitpid():
I read more chps in the book (TLPI), and found there are more trick in the return status & wait()/waitpid() that worth mention, I should have read more chps before ask the question. Anyhow, I have add an answer by myself to describe about it, in case it might help someone in future.
Why the parent process only get the 8 bits of the child process's exit status?
Because POSIX says so. And POSIX says so because that's how original Unix worked, and many operating system derived from it and modeled after it continue to work.
What about return value of normal functions?
They are unrelated. Return whatever is reasonable. -1 is as good as any other value, and is in fact a standard way to indicate an error in a huge lot of standard C and POSIX APIs.
The answer from #n.m. is good.
But later on, I read more chps in the book (TLPI), and found there are more trick in the return status & wait()/waitpid() that worth mention, and that might be another important or root reason why child process can't use full bits of int when exit.
Wait status
basicly:
child process should exit with value within range of 1 byte, which is set as part of the status parameter of wait() / waitpid(),
and only 2 LSB bytes of the status is used,
byte usage of status:
event byte 1 byte 0
============================================================
* normal termination exit status (0 ~ 255) 0
* killed by signal 0 termination signal (!=0)
* stopped by signal stop signal 0x7F
* continued by signal 0xFFFF
*
dissect returned status:
header 'sys/wait.h', defines a set of macros that help to dissect a wait status,
macros:
* WIFEXITED(status)
return true if child process exit normally,
*
* WIFSIGNALED(status)
return true if child process killed by signal,
* WTERMSIG(status)
return signal number that terminate the process,
* WCOREDUMP(status)
returns ture if child process produced a core dump file,
tip:
this macro is not in SUSv3, might absent on some system,
thus better check whether it exists first, via:
#ifdef WCOREDUMP
// ...
#endif
*
* WIFSTOPPED(status)
return true if child process stopped by signal,
* WSTOPSIG(status)
return signal number that stopp the process,
*
* WIFCONTINUED(status)
return true if child process resumed by signal SIGCONT,
tip:
this macro is part of SUSv3, but some old linux or some unix might didn't impl it,
thus better check whether it exists first, via:
#ifdef WIFCONTINUED
// ...
#endif
*
Sample code
wait_status_test.c
// dissect status returned by wait()/waitpid()
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/wait.h>
#define SLEEP_SEC 10 // sleep seconds of child process,
int wait_status_test() {
pid_t cpid;
// create child process,
switch(cpid=fork()) {
case -1: // failed
printf("error while fork()\n");
exit(errno);
case 0: // success, child process goes here
sleep(SLEEP_SEC);
printf("child [%d], going to exit\n",(int)getpid());
_exit(EXIT_SUCCESS);
break;
default: // success, parent process goes here
printf("parent [%d], child created [%d]\n", (int)getpid(), (int)cpid);
break;
}
// wait child to terminate
int status;
int wait_flag = WUNTRACED | WCONTINUED;
while(1) {
if((cpid = waitpid(-1, &status, wait_flag)) == -1) {
if(errno == ECHILD) {
printf("no more child\n");
exit(EXIT_SUCCESS);
} else {
printf("error while wait()\n");
exit(-1);
}
}
// disset status
printf("parent [%d], child [%d] ", (int)getpid(), (int)cpid);
if(WIFEXITED(status)) { // exit normal
printf("exit normally with [%d]\n", status);
} else if(WIFSIGNALED(status)) { // killed by signal
char *dumpinfo = "unknow";
#ifdef WCOREDUMP
dumpinfo = WCOREDUMP(status)?"true":"false";
#endif
printf("killed by signal [%d], has dump [%s]\n", WTERMSIG(status), dumpinfo);
} else if(WIFSTOPPED(status)) { // stopped by signal
printf("stopped by signal [%d]\n", WSTOPSIG(status));
#ifdef WIFCONTINUED
} else if(WIFCONTINUED(status)) { // continued by signal
printf("continued by signal SIGCONT\n", WSTOPSIG(status));
#endif
} else { // this should never happen
printf("unknow event\n");
}
}
return 0;
}
int main(int argc, char *argv[]) {
wait_status_test();
return 0;
}
Compile:
gcc -Wall wait_status_test.c
Execute:
./a.out and wait it to terminate normally, child process id is printed after fork(),
./a.out, then kill -9 <child_process_id> before it finish sleep,
./a.out, then kill -STOP <child_process_id> before it finish sleep, then kill -CONT <child_process_id> to resume it,

Resources