Subject: Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
From: Jim Kress ORG (jimkress_58_at_[hidden])
Date: 2009-06-23 17:49:44


For v 1.3.2:

Here is the ompi_info -config output. I've also attached a copy of the
config.log file, which seems to clearly indicate that configure found
the InfiniBand libraries.

[root_at_master ~]# ompi_info -config
           Configured by: root
           Configured on: Sun Jun 21 22:02:59 EDT 2009
          Configure host: master.org
                Built by: root
                Built on: Sun Jun 21 22:10:07 EDT 2009
              Built host: master.org
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
             C char size: 1
             C bool size: 1
            C short size: 2
              C int size: 4
             C long size: 8
            C float size: 4
           C double size: 8
          C pointer size: 8
            C char align: 1
            C bool align: 1
             C int align: 4
           C float align: 4
          C double align: 8
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
       Fort integer size: 4
       Fort logical size: 4
 Fort logical value true: 1
      Fort have integer1: yes
      Fort have integer2: yes
      Fort have integer4: yes
      Fort have integer8: yes
     Fort have integer16: no
         Fort have real4: yes
         Fort have real8: yes
        Fort have real16: no
      Fort have complex8: yes
     Fort have complex16: yes
     Fort have complex32: no
      Fort integer1 size: 1
      Fort integer2 size: 2
      Fort integer4 size: 4
      Fort integer8 size: 8
     Fort integer16 size: -1
          Fort real size: 4
         Fort real4 size: 4
         Fort real8 size: 8
        Fort real16 size: -1
      Fort dbl prec size: 4
          Fort cplx size: 4
      Fort dbl cplx size: 4
         Fort cplx8 size: 8
        Fort cplx16 size: 16
        Fort cplx32 size: -1
      Fort integer align: 4
     Fort integer1 align: 1
     Fort integer2 align: 2
     Fort integer4 align: 4
     Fort integer8 align: 8
    Fort integer16 align: -1
         Fort real align: 4
        Fort real4 align: 4
        Fort real8 align: 8
       Fort real16 align: -1
     Fort dbl prec align: 4
         Fort cplx align: 4
     Fort dbl cplx align: 4
        Fort cplx8 align: 4
       Fort cplx16 align: 8
       Fort cplx32 align: -1
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
           Sparse Groups: no
             Build CFLAGS: -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -pthread -fvisibility=hidden
          Build CXXFLAGS: -O3 -DNDEBUG -finline-functions -pthread
            Build FFLAGS:
           Build FCFLAGS:
           Build LDFLAGS: -export-dynamic
              Build LIBS: -lnsl -lutil -lm
    Wrapper extra CFLAGS: -pthread
  Wrapper extra CXXFLAGS: -pthread
    Wrapper extra FFLAGS: -pthread
   Wrapper extra FCFLAGS: -pthread
   Wrapper extra LDFLAGS:
       Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: no (checkpoint thread: no)
[root_at_master ~]#
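
If it would help, the list of BTL components that actually got built
(as opposed to just the configure summary above) can be pulled with
something like

  ompi_info | grep "MCA btl"

and I can post that output as well if needed.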

On Tue, 2009-06-23 at 15:20 -0600, Ralph Castain wrote:
> Hmmm...just to be clear - did you run this against OMPI 1.3.2, or
> 1.2.8? I see a 1.2.8 in your app name, hence the question.
>
> This option only works with 1.3.2, I'm afraid - it was a new feature.
>
> Ralph
>
> On Jun 23, 2009, at 2:31 PM, Jim Kress ORG wrote:
>
> > Ralph,
> >
> > I did the following:
> >
> > export OMPI_MCA_mpi_show_mca_params="file,env"
> >
> > then I checked with the set command and confirmed it shows up as
> >
> > OMPI_MCA_mpi_show_mca_params=file,env
> >
> > I then ran my application
> >
> > ./orca hexatriene_TDDFT_get_asa_input_parallel_1.inp > 1.2.8_test_crafted_input_file.out
> >
> > and got the expected ORCA output in the .out file, but nothing about
> > the MCA params showed up at the command line or in the .out file.
> >
> > What did I do wrong?
> >
> > Jim
> >
> > On Mon, 2009-06-22 at 19:40 -0600, Ralph Castain wrote:
> >> Sounds very strange, indeed. You might want to check that your app is
> >> actually getting the MCA param that you think it is. Try adding:
> >>
> >> -mca mpi_show_mca_params file,env
> >>
> >> to your cmd line. This will cause rank=0 to output the MCA params it
> >> thinks were set via the default files and/or environment (including
> >> cmd line).
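> >>
> >> For example, something along these lines (the process count and
> >> binary name are just placeholders):
> >>
> >>   mpirun -np 4 -mca mpi_show_mca_params file,env ./your_app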
> >>
> >> Ralph
> >>
> >> On Jun 22, 2009, at 6:14 PM, Jim Kress ORG wrote:
> >>
> >>> When the app I am using, ORCA (a quantum chemistry program), was
> >>> compiled with Open MPI 1.2.8 and run under 1.2.8 with the following
> >>> in the openmpi-mca-params.conf file:
> >>>
> >>> btl=self,openib
> >>>
> >>> the app ran fine, with no traffic over my Ethernet network and all
> >>> traffic over my InfiniBand network.
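> >>>
> >>> (The equivalent setting on the mpirun command line would be
> >>> something like "mpirun -mca btl self,openib ..."; the conf file
> >>> just makes it the default for every run.)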
> >>>
> >>> However, now that ORCA has been recompiled with Open MPI v1.3.2 and
> >>> run under 1.3.2 (using the same openmpi-mca-params.conf file), the
> >>> performance has been reduced by 50% and all the MPI traffic is going
> >>> over the Ethernet network.
> >>>
> >>> As a matter of fact, the Open MPI v1.3.2 performance now looks
> >>> exactly like the performance I get if I use MPICH 1.2.7.
> >>>
> >>> Anyone have any ideas:
> >>>
> >>> 1) How could this have happened?
> >>>
> >>> 2) How can I fix it?
> >>>
> >>> A 50% reduction in performance is just not acceptable. Ideas and
> >>> suggestions would be appreciated.
> >>>
> >>> Jim