Subject: Re: [OMPI users] Problem getting OpenMPI to run
From: Prentice Bisbal (prentice_at_[hidden])
Date: 2009-06-26 13:54:47


Jeff Layton wrote:
> Jeff Squyres wrote:
>> On Jun 1, 2009, at 2:04 PM, Jeff Layton wrote:
>>
>>> error: executing task of job 3084 failed: execution daemon on host
>>> "compute-2-2.local" didn't accept task
>>>
>>
>> This looks like an error message from the resource manager/scheduler
>> -- not from OMPI (i.e., OMPI tried to launch a process on a node and
>> the launch failed because something rejected it).
>>
>> Which one are you using?
>
> SGE

Jeff,

This sounds like an SGE problem to me, too. You might have better luck
on the SGE mailing list with this one. Does SGE show any of the queue
instances in an error state? Have you tried restarting sge_execd on
compute-2-2.local?
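
As a rough sketch of the queue-state check, assuming qstat is on your
PATH and your version supports the -explain option (check the qstat man
page for your SGE release), something like this should list the queue
instances on that host and report why any of them are in the error (E)
state. The "host" variable is just a name I picked for illustration.

```shell
#!/bin/sh
# Host named in the error message from the original post.
host="compute-2-2.local"

if command -v qstat >/dev/null 2>&1; then
    # Full listing of the queue instances on that host; an "E" in the
    # states column marks a queue instance in error state.
    qstat -f -q "*@${host}"
    # Ask SGE to explain why queue instances are in error state.
    qstat -f -explain E -q "*@${host}"
else
    # Assumption: this is being run somewhere without the SGE client tools.
    echo "qstat not found; run this on an SGE submit or admin host" >&2
fi
```

If a queue instance does show E, the qmod man page for your SGE version
describes how to clear the error state once the cause is fixed.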

I would try sending a non-MPI program to compute-2-2.local, and see if
that gets rejected, too. A simple "Hello, world" shell script should work.
That will help you determine whether it's a problem with SGE in general,
or with the configuration of your parallel environment (PE).

You should be able to specify the host to run on with

qsub -l hostname=compute-2-2.local submit.sh
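
For submit.sh, a minimal sketch of a non-MPI test job could look like
the following. The job name "hello_test" and the "msg" variable are
placeholders of mine, and the "#$" lines are ordinary embedded qsub
options that you can drop or adjust for your site.

```shell
#!/bin/sh
# submit.sh: hypothetical minimal non-MPI test job for SGE.
#$ -S /bin/sh
#$ -cwd
#$ -j y
#$ -N hello_test

# Print a greeting plus the node name, so the job output shows where
# the job actually ran.
msg="Hello, world from $(uname -n)"
echo "$msg"
```

If this job runs on compute-2-2.local, the output file should contain
the greeting with that node's name, which points at the PE
configuration rather than SGE in general.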

--
Prentice