Subject: Re: [OMPI users] Did you break MPI_Abort recently?
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-06-27 22:05:33

For anyone else on the list who is interested:

There definitely was a bug in the system that was causing Open MPI to
not forcibly terminate all processes when one called abort. However,
we also have a backup mechanism that should catch things even if our
primary method fails.

MPI processes in Open MPI will automatically terminate whenever their
local daemon dies unexpectedly. However, the processes have to call
into the MPI library before this can happen.

This particular example has all the processes immediately enter an
infinite loop once a process aborts. Thus, they never called into the
MPI library, and hence never aborted.

I have fixed the bug that caused the primary method to fail. Just
wanted to explain why the backup method also failed - the fact that
OMPI cannot locally respond to failures until/unless the app enters
the MPI library is an important point to remember.
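
A minimal sketch of that last point, assuming that a cheap non-blocking
call such as MPI_Iprobe is enough to count as "entering the MPI
library" (the explanation above does not say which calls qualify). It
replaces the test program's bare busy loop with one that re-enters the
library every so often. USE_MPI, POLL_INTERVAL, poll_mpi and
spin_with_polling are hypothetical names made up for this illustration,
not part of Open MPI; the USE_MPI guard only exists so the loop logic
can be compiled without an MPI installation.

```c
#include <assert.h>
#include <stdio.h>
#ifdef USE_MPI
#include <mpi.h>
#endif

/* Hypothetical tuning knob: iterations of busy work between polls. */
#define POLL_INTERVAL 1000000L

/* Make one cheap, non-blocking call into the MPI library.
 * Assumption: this is sufficient for the library to notice a failure. */
static void poll_mpi(void)
{
#ifdef USE_MPI
    int flag;
    MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD,
               &flag, MPI_STATUS_IGNORE);
    (void) flag;
#endif
}

/* Do max_iters iterations of busy work, calling poll_mpi() once every
 * `interval` iterations. Returns the number of polls performed. */
static long spin_with_polling(long max_iters, long interval)
{
    volatile long foo = 1;
    long polls = 0;
    long i;

    assert(interval > 0);
    for (i = 1; i <= max_iters; ++i) {
        foo = foo + 3;
        foo = foo - 2;
        foo = foo - 1;
        if (i % interval == 0) {
            poll_mpi();
            ++polls;
        }
    }
    return polls;
}

#ifdef USE_MPI
int main(int argc, char *argv[])
{
    int rank;
    long polls;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Barrier(MPI_COMM_WORLD);

    if (rank == 2) {
        MPI_Abort(MPI_COMM_WORLD, -1);
    }

    /* Bounded busy work, so the program ends even if nothing aborts. */
    polls = spin_with_polling(1000L * POLL_INTERVAL, POLL_INTERVAL);
    printf("rank %d polled %ld times\n", rank, polls);

    MPI_Finalize();
    return 0;
}
#endif
```

Built with something like "mpicc -DUSE_MPI poll.c" and run with -np 4
as in the original script, the non-aborting ranks re-enter the library
about once per POLL_INTERVAL iterations instead of never.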


On Jun 27, 2009, at 7:37 PM, Mostyn Lewis wrote:

> Thank you.
> DM
> On Fri, 26 Jun 2009, Ralph Castain wrote:
>> Man, was this a PITA to chase down. Finally found it, though. Fixed
>> on trunk as of r21549
>> Thanks!
>> Ralph
>> So something else is wrong.
>> On Jun 25, 2009, at 3:19 PM, Mostyn Lewis wrote:
>>> Just local machine - direct from the command line with a script like
>>> the one below. So, no launch mechanism.
>>> Fails on SUSE Linux Enterprise Server 10 (x86_64) - SP2 and
>>> Fedora release 10 (Cambridge), for example.
>>> DM
>>> On Thu, 25 Jun 2009, Ralph Castain wrote:
>>>> Sorry - should have been more clear. Are you using rsh, qrsh
>>>> (i.e., SGE), SLURM, Torque, ....?
>>>> On Jun 25, 2009, at 2:54 PM, Mostyn Lewis wrote:
>>>>> Something like:
>>>>> #!/bin/ksh
>>>>> set -x
>>>>> export PATH=$OPENMPI_GCC_SVN/bin:$PATH
>>>>> MCA="--mca btl tcp,self"
>>>>> mpicc -g -O6 mpiabort.c
>>>>> NPROCS=4
>>>>> mpirun --prefix $PREFIX -x LD_LIBRARY_PATH $MCA -np $NPROCS -machinefile fred ./a.out
>>>>> DM
>>>>> On Thu, 25 Jun 2009, Ralph Castain wrote:
>>>>>> Using what launch environment?
>>>>>> On Jun 25, 2009, at 2:29 PM, Mostyn Lewis wrote:
>>>>>>> While using the BLACS test programs, I've seen that with
>>>>>>> recent SVN checkouts
>>>>>>> (including today's) the MPI_Abort test left procs running. The
>>>>>>> last SVN I
>>>>>>> have where it worked was 1.4a1r20936. By 1.4a1r21246 it fails.
>>>>>>> Works O.K. in the standard 1.3.2 release.
>>>>>>> A test program is below. GCC was used.
>>>>>>> DM
>>>>>>> #include <stdio.h>
>>>>>>> #include <sys/types.h>
>>>>>>> #include <unistd.h>
>>>>>>> #include <math.h>
>>>>>>> #include <mpi.h>
>>>>>>>
>>>>>>> #define NUM_ITERS 100000
>>>>>>>
>>>>>>> /* Prototype the function that we'll use below. */
>>>>>>> static double f(double);
>>>>>>>
>>>>>>> int
>>>>>>> main(int argc, char *argv[])
>>>>>>> {
>>>>>>>     int iter, rank, size, i;
>>>>>>>     int foo;
>>>>>>>     double PI25DT = 3.141592653589793238462643;
>>>>>>>     double mypi, pi, h, sum, x;
>>>>>>>     double startwtime = 0.0, endwtime;
>>>>>>>     int namelen;
>>>>>>>     char processor_name[MPI_MAX_PROCESSOR_NAME];
>>>>>>>
>>>>>>>     /* Normal MPI startup */
>>>>>>>     MPI_Init(&argc, &argv);
>>>>>>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>>>>     MPI_Get_processor_name(processor_name, &namelen);
>>>>>>>     printf("Process %d of %d on %s\n", rank, size, processor_name);
>>>>>>>
>>>>>>>     /* Do approximations for 1 to 100 points */
>>>>>>>     /* sleep(5); */
>>>>>>>     for (iter = 2; iter < NUM_ITERS; ++iter) {
>>>>>>>         h = 1.0 / (double) iter;
>>>>>>>         sum = 0.0;
>>>>>>>
>>>>>>>         /* A slightly better approach starts from large i and
>>>>>>>            works back */
>>>>>>>         if (rank == 0)
>>>>>>>             startwtime = MPI_Wtime();
>>>>>>>         for (i = rank + 1; i <= iter; i += size) {
>>>>>>>             x = h * ((double) i - 0.5);
>>>>>>>             sum += f(x);
>>>>>>>         }
>>>>>>>         mypi = h * sum;
>>>>>>>
>>>>>>>         if (iter == (NUM_ITERS - 1000)) {
>>>>>>>             MPI_Barrier(MPI_COMM_WORLD);
>>>>>>>             if (rank == 2) {
>>>>>>>                 MPI_Abort(MPI_COMM_WORLD, -1);
>>>>>>>             } else {
>>>>>>>                 /* Just loop */
>>>>>>>                 foo = 1;
>>>>>>>                 while (foo == 1) {
>>>>>>>                     foo = foo + 3;
>>>>>>>                     foo = foo - 2;
>>>>>>>                     foo = foo - 1;
>>>>>>>                 }
>>>>>>>             }
>>>>>>>         }
>>>>>>>         MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0,
>>>>>>>                    MPI_COMM_WORLD);
>>>>>>>     }
>>>>>>>
>>>>>>>     /* All done */
>>>>>>>     if (rank == 0) {
>>>>>>>         printf("%d points: pi is approximately %.16f, error = %.16f\n",
>>>>>>>                iter, pi, fabs(pi - PI25DT));
>>>>>>>         endwtime = MPI_Wtime();
>>>>>>>         printf("wall clock time = %f\n", endwtime - startwtime);
>>>>>>>         fflush(stdout);
>>>>>>>     }
>>>>>>>     MPI_Finalize();
>>>>>>>     return 0;
>>>>>>> }
>>>>>>>
>>>>>>> static double
>>>>>>> f(double a)
>>>>>>> {
>>>>>>>     return (4.0 / (1.0 + a * a));
>>>>>>> }
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> users_at_[hidden]