Subject: Re: [OMPI users] Machinefile option in opempi-1.3.2
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-06-20 10:28:58
Ah, yes - that is definitely true. What you need to use is the "seq" (for
"sequential") mapper. Add the following to your command line:
--hostfile hostfile -mca rmaps seq
This will cause OMPI to map the process ranks according to the order in the
hostfile. You need to specify one line for each node/rank, just as you have
done.
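
As a rough illustration of the idea (a minimal Python sketch, not Open MPI's actual code), sequential mapping can be modeled as "rank i runs on the host named on line i of the hostfile". The helper names read_hosts and sequential_map are hypothetical, and treating "#" as a comment marker is an assumption of this sketch. The slot count per node is printed as well, since that per-node total is what a slot-based mapper works from once the per-line order is ignored.

```python
from collections import Counter


def read_hosts(text):
    """Return the host name from each non-empty, non-comment line, in file order."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if line:
            hosts.append(line.split()[0])
    return hosts


def sequential_map(hosts, nprocs):
    """Model of sequential mapping: rank i is placed on the host listed on line i."""
    if nprocs > len(hosts):
        raise ValueError("need one hostfile line per rank")
    return {rank: hosts[rank] for rank in range(nprocs)}


# The 16-line machinefile from the quoted message below:
# 4 x n105, 8 x n106, 4 x n105.
HOSTFILE = "n105\n" * 4 + "n106\n" * 8 + "n105\n" * 4

hosts = read_hosts(HOSTFILE)
mapping = sequential_map(hosts, 16)
slots = Counter(hosts)  # per-node line counts, i.e. the slot totals

if __name__ == "__main__":
    for rank, host in mapping.items():
        print(f"rank {rank:2d} -> {host}")
    print(dict(slots))
```

In this model ranks 0-3 and 12-15 land on n105 and ranks 4-11 on n106, which is the line order of the file, while the slot totals alone (8 per node) carry no ordering information.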
Ralph
On Fri, Jun 19, 2009 at 10:24 PM, Rajesh Sudarsan <rsudarsan_at_[hidden]> wrote:
> Hi Ralph,
>
> Thanks for the reply. The default mapper does round-robin assignment
> as long as I do not specify the machinefile in the following format:
>
> n1
> n2
> n2
> n1
>
> where n1 and n2 are two nodes in the cluster and I use two slots
> within each node.
>
>
> I have pasted the output and the display map for execution on 2, 4, 8
> and 16 processors. The mapper does not use the nodes in the order in
> which they are listed in the file.
>
> The machinefile that I tested with uses two nodes n105 and n106 with 8
> cores in each node.
>
> n105
> n105
> n105
> n105
> n106
> n106
> n106
> n106
> n106
> n106
> n106
> n106
> n105
> n105
> n105
> n105
>
> When I run a hello world program on 2 processors which prints the
> hostname, the output and the display map are as follows:
>
>
> $ mpiexec --display-map -machinefile m3 -np 2 ./hello
>
> ======================== JOB MAP ========================
>
> Data for node: Name: n106 Num procs: 2
> Process OMPI jobid: [7838,1] Process rank: 0
> Process OMPI jobid: [7838,1] Process rank: 1
>
> =============================================================
> Rank 0 is present in C version of Hello World...hostname = n106
> Rank 1 of C version says: Hello world!..hostname = n106
>
>
>
>
> On 4 processors the output is as follows
>
> $ mpiexec --display-map -machinefile m3 -np 4 ./hello
>
> ======================== JOB MAP ========================
>
> Data for node: Name: n106 Num procs: 4
> Process OMPI jobid: [7294,1] Process rank: 0
> Process OMPI jobid: [7294,1] Process rank: 1
> Process OMPI jobid: [7294,1] Process rank: 2
> Process OMPI jobid: [7294,1] Process rank: 3
>
> =============================================================
> Rank 0 is present in C version of Hello World...hostname = n106
> Rank 1 of C version says: Hello world!..hostname = n106
> Rank 3 of C version says: Hello world!..hostname = n106
> Rank 2 of C version says: Hello world!..hostname = n106
>
>
>
>
> On 8 processors the output is as follows:
>
> $ mpiexec --display-map -machinefile m3 -np 8 ./hello
>
> ======================== JOB MAP ========================
>
> Data for node: Name: n106 Num procs: 8
> Process OMPI jobid: [7264,1] Process rank: 0
> Process OMPI jobid: [7264,1] Process rank: 1
> Process OMPI jobid: [7264,1] Process rank: 2
> Process OMPI jobid: [7264,1] Process rank: 3
> Process OMPI jobid: [7264,1] Process rank: 4
> Process OMPI jobid: [7264,1] Process rank: 5
> Process OMPI jobid: [7264,1] Process rank: 6
> Process OMPI jobid: [7264,1] Process rank: 7
>
> =============================================================
> Rank 3 of C version says: Hello world!..hostname = n106
> Rank 7 of C version says: Hello world!..hostname = n106
> Rank 0 is present in C version of Hello World...hostname = n106
> Rank 2 of C version says: Hello world!..hostname = n106
> Rank 4 of C version says: Hello world!..hostname = n106
> Rank 6 of C version says: Hello world!..hostname = n106
> Rank 5 of C version says: Hello world!..hostname = n106
> Rank 1 of C version says: Hello world!..hostname = n106
>
>
>
> On 16 processors the output is as follows:
>
> $ mpiexec --display-map -machinefile m3 -np 16 ./hello
>
> ======================== JOB MAP ========================
>
> Data for node: Name: n106 Num procs: 8
> Process OMPI jobid: [7266,1] Process rank: 0
> Process OMPI jobid: [7266,1] Process rank: 1
> Process OMPI jobid: [7266,1] Process rank: 2
> Process OMPI jobid: [7266,1] Process rank: 3
> Process OMPI jobid: [7266,1] Process rank: 4
> Process OMPI jobid: [7266,1] Process rank: 5
> Process OMPI jobid: [7266,1] Process rank: 6
> Process OMPI jobid: [7266,1] Process rank: 7
>
> Data for node: Name: n105 Num procs: 8
> Process OMPI jobid: [7266,1] Process rank: 8
> Process OMPI jobid: [7266,1] Process rank: 9
> Process OMPI jobid: [7266,1] Process rank: 10
> Process OMPI jobid: [7266,1] Process rank: 11
> Process OMPI jobid: [7266,1] Process rank: 12
> Process OMPI jobid: [7266,1] Process rank: 13
> Process OMPI jobid: [7266,1] Process rank: 14
> Process OMPI jobid: [7266,1] Process rank: 15
>
> =============================================================
> Rank 10 of C version says: Hello world!..hostname = n105
> Rank 12 of C version says: Hello world!..hostname = n105
> Rank 13 of C version says: Hello world!..hostname = n105
> Rank 14 of C version says: Hello world!..hostname = n105
> Rank 0 is present in C version of Hello World...hostname = n106
> Rank 1 of C version says: Hello world!..hostname = n106
> Rank 3 of C version says: Hello world!..hostname = n106
> Rank 6 of C version says: Hello world!..hostname = n106
> Rank 7 of C version says: Hello world!..hostname = n106
> Rank 15 of C version says: Hello world!..hostname = n105
> Rank 8 of C version says: Hello world!..hostname = n105
> Rank 11 of C version says: Hello world!..hostname = n105
> Rank 4 of C version says: Hello world!..hostname = n106
> Rank 2 of C version says: Hello world!..hostname = n106
> Rank 5 of C version says: Hello world!..hostname = n106
> Rank 9 of C version says: Hello world!..hostname = n105
>
>
>
> Thanks,
> Rajesh
>
>
>
>
>
> On Fri, Jun 19, 2009 at 10:40 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> > If you do "man orte_hosts", you'll see a full explanation of how the
> > various machinefile options work.
> > The default mapper doesn't do any type of sorting - it is a round-robin
> > mapper that just works its way through the provided nodes. We don't
> > reorder them in any way.
> > However, it does depend on the number of slots we are told each node
> > has, so that might be what you are encountering. If you do a
> > --display-map and send it along, I might be able to spot the issue.
> > Thanks
> > Ralph
> >
> > On Fri, Jun 19, 2009 at 1:35 PM, Rajesh Sudarsan <rsudarsan_at_[hidden]>
> > wrote:
> >>
> >> Hi,
> >>
> >> I tested a simple hello world program on 5 nodes each with dual
> >> quad-core processors. I noticed that openmpi does not always follow
> >> the order of the processors indicated in the machinefile. Depending
> >> upon the number of processors requested, openmpi does some type of
> >> sorting to find the best node fit for a particular job and runs on
> >> them. Is there a way to make openmpi turn off this sorting and
> >> strictly follow the order indicated in the machinefile?
> >>
> >> mpiexec supports three options to specify the machinefile:
> >> default-machinefile, hostfile, and machinefile. Can anyone tell me
> >> what the difference between these three options is?
> >>
> >> Any help would be greatly appreciated.
> >>
> >> Thanks,
> >> Rajesh
> >> _______________________________________________
> >> users mailing list
> >> users_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users