include("../../include/msg-header.inc"); ?>
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-05-09 15:55:45
I have mailed the VASP maintainer asking for a copy of the code.
Let's see what happens.
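
In the meantime, it is worth separating "Open MPI install problem" from
"VASP build problem". A stripped-down program that only calls MPI_Init
and MPI_Comm_rank (the call that is failing below) is usually enough to
tell the two apart. A minimal sketch, assuming the mpicc and mpiexec
from the 1.2.1 install are first in your PATH (hello.c is just a
made-up file name):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* must come before any other MPI call */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* the call that aborts in the VASP runs below */
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

If "mpicc hello.c -o hello" followed by "mpiexec -np 4 ./hello" runs
cleanly inside the same PBS allocation, the Open MPI install is likely
fine and the problem is more likely in how VASP itself was compiled.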
On May 9, 2007, at 2:44 PM, Steven Truong wrote:
> Hi, Jeff. Thank you very much for looking into this issue. I am
> afraid that I cannot give you the application/package because it is
> commercial software. I believe that a lot of people are using this
> VASP software package http://cms.mpi.univie.ac.at/vasp/.
>
> My current environment uses MPICH 1.2.7p1; however, a new set of
> dual-core machines has posed a new set of challenges, and I am
> looking into replacing MPICH with Open MPI on these machines.
>
> Could Mr. Radican, who wrote that he was able to run VASP with
> Open MPI, provide more detail on how he configured Open MPI, how he
> compiled and ran VASP jobs, and anything else relating to this issue?
>
> Thank you very much for all your help.
> Steven.
>
> On 5/9/07, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>> Can you send a simple test that reproduces these errors?
>>
>> That is, if there is a single, simple package for which you can send
>> build instructions, that would be most helpful: it would let us
>> reproduce the error (and therefore figure out how to fix it).
>>
>> Thanks!
>>
>>
>> On May 9, 2007, at 2:19 PM, Steven Truong wrote:
>>
>>> Oh, no. I tried with ACML and had the same set of errors.
>>>
>>> Steven.
>>>
>>> On 5/9/07, Steven Truong <midair77_at_[hidden]> wrote:
>>>> Hi, Kevin and all. I tried with the following:
>>>>
>>>> ./configure --prefix=/usr/local/openmpi-1.2.1 --disable-ipv6
>>>> --with-tm=/usr/local/pbs --enable-mpirun-prefix-by-default
>>>> --enable-mpi-f90 --with-threads=posix --enable-static
>>>>
>>>> and added the mpi.o rule to my VASP makefile, but I still got errors.
>>>>
>>>> I forgot to mention that our environment has Intel MKL 9.0 or 8.1,
>>>> and my machines are dual-processor, dual-core Xeon 5130s.
>>>>
>>>> Well, I am going to try ACML too.
>>>>
>>>> Attached is my makefile for VASP and I am not sure if I missed
>>>> anything again.
>>>>
>>>> Thank you very much for all your help.
>>>>
>>>> On 5/9/07, Steven Truong <midair77_at_[hidden]> wrote:
>>>>> Thanks, Kevin and Brook, for replying to my question. I am going
>>>>> to try out what Kevin suggested.
>>>>>
>>>>> Steven.
>>>>>
>>>>> On 5/9/07, Kevin Radican <radicak_at_[hidden]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We use VASP 4.6 in parallel with Open MPI 1.1.2 without any
>>>>>> problems on x86_64 with openSUSE, compiled with gcc and Intel
>>>>>> Fortran, and we use Torque PBS.
>>>>>>
>>>>>> I used a standard configure to build Open MPI, something like:
>>>>>>
>>>>>> ./configure --prefix=/usr/local --enable-static --with-threads
>>>>>> --with-tm=/usr/local --with-libnuma
>>>>>>
>>>>>> I used the ACML LAPACK libs and built BLACS and ScaLAPACK with
>>>>>> them too.
>>>>>>
>>>>>> I attached my VASP makefile. I might have added
>>>>>>
>>>>>> mpi.o : mpi.F
>>>>>>         $(CPP)
>>>>>>         $(FC) -FR -lowercase -O0 -c $*$(SUFFIX)
>>>>>>
>>>>>> to the end of the makefile. It doesn't look like it is in the
>>>>>> example makefiles they give, but I compiled this a while ago.
>>>>>>
>>>>>> Hope this helps.
>>>>>>
>>>>>> Cheers,
>>>>>> Kevin
>>>>>>
>>>>>>
>>>>>> On Tue, 2007-05-08 at 19:18 -0700, Steven Truong wrote:
>>>>>>> Hi, all. I am new to Open MPI, and after the initial setup I
>>>>>>> tried to run my app but got the following errors:
>>>>>>>
>>>>>>> [node07.my.com:16673] *** An error occurred in MPI_Comm_rank
>>>>>>> [node07.my.com:16673] *** on communicator MPI_COMM_WORLD
>>>>>>> [node07.my.com:16673] *** MPI_ERR_COMM: invalid communicator
>>>>>>> [node07.my.com:16673] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>>> [node07.my.com:16674] *** An error occurred in MPI_Comm_rank
>>>>>>> [node07.my.com:16674] *** on communicator MPI_COMM_WORLD
>>>>>>> [node07.my.com:16674] *** MPI_ERR_COMM: invalid communicator
>>>>>>> [node07.my.com:16674] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>>> [node07.my.com:16675] *** An error occurred in MPI_Comm_rank
>>>>>>> [node07.my.com:16675] *** on communicator MPI_COMM_WORLD
>>>>>>> [node07.my.com:16675] *** MPI_ERR_COMM: invalid communicator
>>>>>>> [node07.my.com:16675] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>>> [node07.my.com:16676] *** An error occurred in MPI_Comm_rank
>>>>>>> [node07.my.com:16676] *** on communicator MPI_COMM_WORLD
>>>>>>> [node07.my.com:16676] *** MPI_ERR_COMM: invalid communicator
>>>>>>> [node07.my.com:16676] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>>> mpiexec noticed that job rank 2 with PID 16675 on node node07
>>>>>>> exited on signal 60 (Real-time signal 26).
>>>>>>>
>>>>>>> /usr/local/openmpi-1.2.1/bin/ompi_info
>>>>>>> Open MPI: 1.2.1
>>>>>>> Open MPI SVN revision: r14481
>>>>>>> Open RTE: 1.2.1
>>>>>>> Open RTE SVN revision: r14481
>>>>>>> OPAL: 1.2.1
>>>>>>> OPAL SVN revision: r14481
>>>>>>> Prefix: /usr/local/openmpi-1.2.1
>>>>>>> Configured architecture: x86_64-unknown-linux-gnu
>>>>>>> Configured by: root
>>>>>>> Configured on: Mon May 7 18:32:56 PDT 2007
>>>>>>> Configure host: neptune.nanostellar.com
>>>>>>> Built by: root
>>>>>>> Built on: Mon May 7 18:40:28 PDT 2007
>>>>>>> Built host: neptune.my.com
>>>>>>> C bindings: yes
>>>>>>> C++ bindings: yes
>>>>>>> Fortran77 bindings: yes (all)
>>>>>>> Fortran90 bindings: yes
>>>>>>> Fortran90 bindings size: small
>>>>>>> C compiler: gcc
>>>>>>> C compiler absolute: /usr/bin/gcc
>>>>>>> C++ compiler: g++
>>>>>>> C++ compiler absolute: /usr/bin/g++
>>>>>>> Fortran77 compiler: /opt/intel/fce/9.1.043/bin/ifort
>>>>>>> Fortran77 compiler abs: /opt/intel/fce/9.1.043/bin/ifort
>>>>>>> Fortran90 compiler: /opt/intel/fce/9.1.043/bin/ifort
>>>>>>> Fortran90 compiler abs: /opt/intel/fce/9.1.043/bin/ifort
>>>>>>> C profiling: yes
>>>>>>> C++ profiling: yes
>>>>>>> Fortran77 profiling: yes
>>>>>>> Fortran90 profiling: yes
>>>>>>> C++ exceptions: no
>>>>>>> Thread support: posix (mpi: no, progress: no)
>>>>>>> Internal debug support: no
>>>>>>> MPI parameter check: runtime
>>>>>>> Memory profiling support: no
>>>>>>> Memory debugging support: no
>>>>>>> libltdl support: yes
>>>>>>> Heterogeneous support: yes
>>>>>>> mpirun default --prefix: yes
>>>>>>> MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>>>>>>> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
>>>>>>> MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA coll: self (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA io: romio (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.1)
>>>>>>> MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.1)
>>>>>>> MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
>>>>>>> MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.1)
>>>>>>> MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.1)
>>>>>>> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>>>>>>> MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.1)
>>>>>>> MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.1)
>>>>>>> MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.1)
>>>>>>> MCA sds: env (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>> MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2.1)
>>>>>>>
>>>>>>> As you can see, I used GNU gcc and g++ with the Intel Fortran
>>>>>>> compiler to compile Open MPI, and I am not sure if there are any
>>>>>>> special flags that I need to have.
>>>>>>> ./configure --prefix=/usr/local/openmpi-1.2.1 --disable-ipv6
>>>>>>> --with-tm=/usr/local/pbs --enable-mpirun-prefix-by-default
>>>>>>> --enable-mpi-f90
>>>>>>>
>>>>>>> After getting mpif90, I compiled my application (VASP) with this
>>>>>>> new parallel compiler, but then I could not run it through PBS.
>>>>>>>
>>>>>>> #PBS -N Pt.CO.bridge.25ML
>>>>>>> ### Set the number of nodes that will be used. Ensure
>>>>>>> ### that the number of "nodes" matches the needs of your job
>>>>>>> ### DO NOT MODIFY THE FOLLOWING LINE FOR SINGLE-PROCESSOR JOBS!
>>>>>>> #PBS -l nodes=node07:ppn=4
>>>>>>> #PBS -l walltime=96:00:00
>>>>>>> ##PBS -M asit_at_[hidden]
>>>>>>> #PBS -m abe
>>>>>>> export NPROCS=`wc -l $PBS_NODEFILE |gawk '//{print $1}'`
>>>>>>> echo $NPROCS
>>>>>>> echo The master node of this job is `hostname`
>>>>>>> echo The working directory is `echo $PBS_O_WORKDIR`
>>>>>>> echo The node file is $PBS_NODEFILE
>>>>>>> echo This job runs on the following $NPROCS nodes:
>>>>>>> echo `cat $PBS_NODEFILE`
>>>>>>> echo "=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-"
>>>>>>> echo
>>>>>>> echo command to EXE:
>>>>>>> echo
>>>>>>> echo
>>>>>>> cd $PBS_O_WORKDIR
>>>>>>>
>>>>>>> echo "cachesize=4000 mpiblock=500 npar=4 procgroup=4 mkl ompi"
>>>>>>>
>>>>>>> date
>>>>>>> /usr/local/openmpi-1.2.1/bin/mpiexec -mca mpi_paffinity_alone 1 \
>>>>>>>     -np $NPROCS /home/struong/bin/vaspmpi_mkl_ompi >"$PBS_JOBID".out
>>>>>>> date
>>>>>>> ------------
>>>>>>>
>>>>>>> My environment is CentOS 4.4 x86_64, Intel Xeon, Torque, Maui.
>>>>>>>
>>>>>>> Could somebody here tell me what I missed or did incorrectly?
>>>>>>>
>>>>>>> Thank you very much.
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
--
Jeff Squyres
Cisco Systems