Subject: Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
From: Gus Correa (gus_at_[hidden])
Date: 2009-06-23 14:00:40


Hi Jim, list

Have you checked whether configure detected your IB libraries properly?
IIRC there have been some changes since 1.2.8 in how configure searches
for libraries (e.g. finding libnuma was a problem, now fixed).
If you reused an old script or command line to run configure,
chances are it did not work as you expected.

Check the output of ompi_info -config.
It should show -lrdmacm -libverbs; if it does not, configure skipped IB.
In that case you can reconfigure, pointing to the IB library location.
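
If you would rather script that check than eyeball it, here is a minimal
sketch in Python. It only looks for the two linker flags mentioned above.
The function names are my own invention, and it assumes ompi_info is on
your PATH and accepts -config as written above.

```python
import subprocess

# Linker flags that should appear when IB support was built in
# (taken from the message above; adjust for your installation).
EXPECTED_FLAGS = ("-lrdmacm", "-libverbs")


def missing_ib_flags(config_text):
    """Return the expected IB linker flags that are absent from config_text."""
    tokens = set(config_text.split())
    return [flag for flag in EXPECTED_FLAGS if flag not in tokens]


def check_installed_ompi():
    """Run ompi_info and report which expected flags are missing.

    Hypothetical helper: assumes ompi_info is on PATH.
    """
    result = subprocess.run(
        ["ompi_info", "-config"], capture_output=True, text=True
    )
    missing = missing_ib_flags(result.stdout)
    if missing:
        print("configure probably skipped IB; missing:", " ".join(missing))
    else:
        print("IB linker flags found")
    return missing
```

Calling check_installed_ompi() on a compute node should tell you quickly
whether a rebuild is needed. An empty list means both flags showed up.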

If you have a log of your configure step you can also search it for
openib, libverbs, etc, to see if it did what you expected.
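
For the log search, something along these lines would do. Again, this is
just a sketch: the keyword list is my guess at what is worth looking for,
and the file name config.log in the usage note is an assumption, so point
it at wherever you saved your configure output.

```python
# Substrings worth looking for in a configure log (an assumed list;
# "ibverbs" matches both "-libverbs" and "libibverbs").
KEYWORDS = ("openib", "ibverbs", "rdmacm")


def find_ib_lines(log_lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines mentioning any keyword."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path):
    """Print every IB-related line found in the log at path."""
    with open(path, errors="replace") as handle:
        for number, text in find_ib_lines(handle):
            print(f"{number}: {text}")
```

Usage would be scan_log_file("config.log"). If it prints nothing, or only
lines ending in "no", configure did not pick up IB.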

I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Pavel Shamis (Pasha) wrote:
> Jim,
> Can you please share with us your MCA conf file?
>
> Pasha.
> Jim Kress ORG wrote:
>> For the app I am using, ORCA (a Quantum Chemistry program), when it was
>> compiled using openMPI 1.2.8 and run under 1.2.8 with the following in
>> the openmpi-mca-params.conf file:
>>
>> btl=self,openib
>>
>> the app ran fine with no traffic over my Ethernet network and all
>> traffic over my Infiniband network.
>>
>> However, now that ORCA has been recompiled with openMPI v1.3.2 and run
>> under 1.3.2 (using the same openmpi-mca-params.conf file), the
>> performance has been reduced by 50% and all the MPI traffic is going
>> over the Ethernet network.
>>
>> As a matter of fact, the openMPI v1.3.2 performance now looks exactly
>> like the performance I get if I use MPICH 1.2.7.
>>
>> Anyone have any ideas:
>>
>> 1) How could this have happened?
>>
>> 2) How can I fix it?
>>
>> A 50% reduction in performance is just not acceptable. Ideas/
>> suggestions would be appreciated.
>>
>> Jim
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>