Subject: Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-06-23 17:20:58


Hmmm... just to be clear - did you run this against OMPI 1.3.2, or
1.2.8? I see a 1.2.8 in your output file name, hence the question.

This option only works with 1.3.2, I'm afraid - it was a new feature.
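
For example, assuming the 1.3.2 install is the one on your PATH (and not a
leftover 1.2.8), you can double-check what the runtime reports with
something like:

  mpirun --version
  ompi_info | grep "Open MPI:"

If either of those still reports 1.2.8, the app is picking up the old
installation.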

Ralph

On Jun 23, 2009, at 2:31 PM, Jim Kress ORG wrote:

> Ralph,
>
> I did the following:
>
> export OMPI_MCA_mpi_show_mca_params="file,env"
>
> then I checked with the set command and confirmed it was set:
>
> OMPI_MCA_mpi_show_mca_params=file,env
>
> I then ran my application
>
> ./orca hexatriene_TDDFT_get_asa_input_parallel_1.inp >
> 1.2.8_test_crafted_input_file.out
>
> and got the expected ORCA output in the .out file, but nothing about
> the MCA params appeared at the command line or in the .out file.
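
For what it's worth, a quick sanity check here (assuming a bash-like shell)
is to confirm the variable really is exported into the environment the job
is launched from, e.g.

  env | grep OMPI_MCA_

and to capture stderr as well as stdout when redirecting (e.g.
"> run.out 2>&1"), since the parameter report may not be written to stdout.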
>
> What did I do wrong?
>
> Jim
>
> On Mon, 2009-06-22 at 19:40 -0600, Ralph Castain wrote:
>> Sounds very strange, indeed. You might want to check that your app is
>> actually getting the MCA param that you think it is. Try adding:
>>
>> -mca mpi_show_mca_params file,env
>>
>> to your cmd line. This will cause rank=0 to output the MCA params it
>> thinks were set via the default files and/or environment (including
>> cmd line).
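
A minimal sketch of what that command line might look like (the process
count and executable name here are placeholders, not taken from this
thread):

  mpirun -np 4 -mca btl self,openib -mca mpi_show_mca_params file,env ./my_mpi_app

With a 1.3.2 mpirun, rank 0 should then print the effective MCA settings
before the normal application output. The btl setting is shown on the
command line only for illustration; with it already in
openmpi-mca-params.conf it is not strictly needed there.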
>>
>> Ralph
>>
>> On Jun 22, 2009, at 6:14 PM, Jim Kress ORG wrote:
>>
>>> When the app I am using, ORCA (a Quantum Chemistry program), was
>>> compiled using openMPI 1.2.8 and run under 1.2.8 with the following
>>> in the openmpi-mca-params.conf file:
>>>
>>> btl=self,openib
>>>
>>> the app ran fine with no traffic over my Ethernet network and all
>>> traffic over my Infiniband network.
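
For reference, that system-wide file typically lives under the install's
etc/ directory (e.g. <prefix>/etc/openmpi-mca-params.conf), and the line in
question is simply:

  # use only the loopback (self) and InfiniBand (openib) transports;
  # TCP is not listed, so it is excluded
  btl = self,openib

Assuming ompi_info comes from the same install, the BTL components it
actually knows about can be listed with:

  ompi_info | grep btl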
>>>
>>> However, now that ORCA has been recompiled with openMPI v1.3.2 and
>>> run
>>> under 1.3.2 (using the same openmpi-mca-params.conf file), the
>>> performance has been reduced by 50% and all the MPI traffic is going
>>> over the Ethernet network.
>>>
>>> As a matter of fact, the openMPI v1.3.2 performance now looks
>>> exactly
>>> like the performance I get if I use MPICH 1.2.7.
>>>
>>> Anyone have any ideas:
>>>
>>> 1) How could this have happened?
>>>
>>> 2) How can I fix it?
>>>
>>> A 50% reduction in performance is just not acceptable. Ideas/
>>> suggestions would be appreciated.
>>>
>>> Jim
>>>
>>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users