From: Steven Truong (midair77_at_[hidden])
Date: 2007-05-18 20:15:44


Hi, Jeff. OK. After reading through the FAQ, I modified .bashrc to
set PATH and LD_LIBRARY_PATH, and now I can execute:

[struong_at_neptune ~]$ ssh node07 which orted
/usr/local/openmpi-1.2.1/bin/orted
[struong_at_neptune ~]$ /usr/local/openmpi-1.2.1/bin/mpirun --host node07 hostname
node07.nanostellar.com
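
A minimal sketch of the kind of ~/.bashrc addition that has this effect, assuming the install prefix /usr/local/openmpi-1.2.1 from the paths above. The exact lines added are not shown in this message, and the variable name OMPI_PREFIX is only illustrative. The idea is that bash typically reads ~/.bashrc, not ~/.bash_profile, when it is started by ssh to run a single command, so the settings have to live there and come before any test that exits early for non-interactive shells:

```shell
# Hypothetical ~/.bashrc fragment; OMPI_PREFIX is an illustrative name,
# and the prefix is assumed from the paths shown in this thread.
OMPI_PREFIX=/usr/local/openmpi-1.2.1

# Prepend the Open MPI bin directory unless it is already on PATH.
case ":$PATH:" in
  *":$OMPI_PREFIX/bin:"*) ;;                 # already present, do nothing
  *) PATH="$OMPI_PREFIX/bin:$PATH" ;;
esac

# Same for the library directory; LD_LIBRARY_PATH may be unset or empty.
case ":${LD_LIBRARY_PATH:-}:" in
  *":$OMPI_PREFIX/lib:"*) ;;                 # already present, do nothing
  *) LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
esac

export PATH LD_LIBRARY_PATH

# Any "stop here if not interactive" check, and anything meant only for
# interactive shells, should come after this point.
```

The guards keep PATH from growing each time the file is sourced, which matters here because .bash_profile also sources ~/.bashrc.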

Thank you.
Steven.

On 5/18/07, Steven Truong <midair77_at_[hidden]> wrote:
> Hi, Jeff. Thanks so very much for all your help so far. I decided
> that I needed to go back and check whether openmpi even works for
> simple cases, so here I am.
>
> So my shell might have exited when it detected that I ran
> non-interactively. But then again, how does this parameter
> MCA pls: parameter "pls_rsh_agent" (current value: "ssh :rsh")
>
> affect my outcome? How am I going to set PATH and LD_LIBRARY_PATH to
> be like those in .bash_profile in my Torque job files?
>
> Could you give me some tips here?
>
> Below is my current bash shell's settings.
>
> Thanks,
> Steven.
>
> [struong_at_neptune ~]$ echo $SHELL
> /bin/bash
> [struong_at_neptune ~]$ cat .bash_profile | grep -v ^#
>
> if [ -f ~/.bashrc ]; then
> . ~/.bashrc
> fi
>
> umask 027
> PATH=/opt/intel/fce/9.1.043/bin:/usr/local/openmpi-1.2.1/bin:/opt/c3-4:/opt/bin:/usr/local/torque/bin:/usr/local/torque/sbin:/usr/local/maui/bin:/usr/local/maui/sbin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/usr/local/rrdtool-1.2.12/bin:~/bin
> BASH_ENV=$HOME/.bashrc
> FC=/opt/intel/fce/9.1.043/bin/ifort
> F90=$FC
> F77=$FC
> F77_GETARGDECL=" "
> LD_LIBRARY_PATH=/usr/local/openmpi-1.2.1/lib
> RSHCOMMAND=/usr/bin/ssh
> PBS_DEFAULT="neptune"
> PBSLOGLEVEL=7
> BUILD_DIR=/tmp/rrdbuil
> INSTALL_DIR=/usr/local/rrdtool-1.2.12
> source /usr/local/ecce/scripts/runtime_setup.sh
> export F77 USERNAME BASH_ENV PATH RSHCOMMAND FC F90 PBS_DEFAULT BUILD_DIR INSTALL_DIR LD_LIBRARY_PATH
>
> [struong_at_neptune ~]$ ssh node07 which orted
> which: no orted in (/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin)
>
> [struong_at_neptune ~]$ /usr/local/openmpi-1.2.1/bin/mpirun --host node07 node07 hostname
> --------------------------------------------------------------------------
> Failed to find the following executable:
>
> Host: node07.nanostellar.com
> Executable: node07
>
> Cannot continue.
> --------------------------------------------------------------------------
>
>
> On 5/18/07, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> > On May 18, 2007, at 4:38 PM, Steven Truong wrote:
> >
> > > [struong_at_neptune 4cpu4npar10nsim]$ mpirun --mca btl tcp,self -np 1
> > > --host node07 hostname
> > > bash: orted: command not found
> >
> > As you noted later in your mail, this is the key problem: orted is
> > not found on the remote node.
> >
> > Notice that you are currently using the rsh launcher, not the Torque
> > launcher (presumably because you are not inside a Torque job). What
> > you want to check is:
> >
> > rsh node07 which orted
> >
> > (or use ssh -- whatever is correct for your cluster)
> >
> > I suspect that orted will not be found, and that you'll need to
> > modify your shell startup files to set PATH / LD_LIBRARY_PATH
> > properly. Note that some shell startup files will exit early if they
> > detect that they are running on a non-interactive login. See
> > http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
> > for more details.
> >
> > Alternatively, you can simply use the absolute pathname to mpirun,
> > which Open MPI will interpret to mean that you want OMPI to set the
> > PATH/LD_LIBRARY_PATH on the remote node for you. Something like this:
> >
> > /usr/local/openmpi-1.2.1/bin/mpirun --host node07 hostname
> >
> > (note that the "btl" MCA parameter is only relevant for MPI executables)
> >
> > --
> > Jeff Squyres
> > Cisco Systems
> >
>