[GE users] Re: [OMPI devel] [GE users] OpenMPI 1.2 integration and dedicated MPI networks

Orion Poplawski orion at cora.nwra.com
Fri Oct 20 19:44:34 BST 2006


Pak Lui wrote:
> Hi Orion and Reuti,
> Let me see if I can understand the issues by breaking them down first:
> (1) First, I am curious to know why you would need to create a 
> PE_HOSTFILE yourself, because that file is generated by SGE/N1GE when 
> you specify you are running a parallel job under SGE/N1GE, by doing 
> something like this with qsub/qsh/qrsh, etc:

In our setup (which I don't believe is unusual), the nodes are 
connected by two networks:  an "admin" network which allows for 
connections from outside the cluster and an "MPI" network that is a 
private GigE network connecting the nodes for MPI traffic:

        +---------admin net (192.168.0.X)--------+
        |                           |            |
+-----------+                 +--------+    +--------+
| SGE Master|                 | coop00 |    | coop01 |
|           |                 | coop00x|    | coop01x|
+-----------+                 +--------+    +--------+
                                    |            |
                                    +------------+
                                     MPI net (192.168.1.X)

So the "x" suffix names are the addresses on the MPI network.

Currently (loose integration), we create machines files like:

coop00x.cora.nwra.com cpu=2
coop01x.cora.nwra.com cpu=2

which makes the MPI traffic travel over the MPI network.  I'm trying to 
duplicate this under "tight" integration.
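
A minimal sketch of how such a machines-file line can be derived from a 
node's primary hostname. The helper name machines_line is hypothetical, 
and it assumes the MPI-network name is always the short hostname with an 
"x" appended, as in our naming scheme:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn "coop00.cora.nwra.com" plus a CPU count into
 * the machines-file line "coop00x.cora.nwra.com cpu=2".
 * Assumes the MPI-network name is the short hostname with "x" appended.
 * Returns 0 on success, -1 if the output buffer is too small. */
int machines_line(const char *primary, int cpus, char *out, size_t outlen)
{
    /* Split at the first dot: short hostname before it, domain after it. */
    const char *dot = strchr(primary, '.');
    int shortlen = dot ? (int)(dot - primary) : (int)strlen(primary);

    /* Short name, then the literal "x", then the domain part (if any). */
    int n = snprintf(out, outlen, "%.*sx%s cpu=%d",
                     shortlen, primary, dot ? dot : "", cpus);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Calling it once per allocated node and writing each result on its own line 
gives a machines file like the one above.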

> (2) As for the following error message:
>  > error: commlib error: access denied (client IP resolved to host name
>  > "coop01x.cora.nwra.com". This is not identical to clients host name
>  > "coop01.cora.nwra.com")
> As you mentioned in your setup, each node has 2 interfaces. And this 
> message is an SGE error and it seems to tell you that SGE cannot resolve 
> the host name.

No, it says that the IP resolved to "coop01x.cora.nwra.com", but the 
machine's primary hostname is "coop01.cora.nwra.com".

> (4) As for what you have mentioned here:
>  > Now, looking at the OpenMPI gridengine code, it looks like it gets the
>  > node name from the first entry in the pe_hostfile, and never really uses
>  > the queue name for anything.
>  >
>  >          ptr = strtok_r(buf, " \n", &tok);
>  >          num = strtok_r(NULL, " \n", &tok);
>  >          queue = strtok_r(NULL, " \n", &tok);
>  >          arch = strtok_r(NULL, " \n", &tok);
>  > ...
>  >          node->node_name = strdup(ptr);
>  >          node->node_arch = strdup(arch);
>  >
>  > Perhaps it can be modified so that it uses the queue name hostname when doing
>  > SGE/qrsh calls, but the first hostname when doing MPI communication.
>  > Not really sure what the intent of the two fields in SGE's pe_hostfile
>  > is, or if OpenMPI can handle the idea of two hostnames for different
>  > purposes.
>  >
> Once it is in a parallel environment of SGE (e.g. when you have started 
> a parallel job with "qsh/qsub/qrsh -pe name_of_pe"), ORTE would use 
> the -inherit flag of qrsh to tell qrsh to start a task in an already 
> scheduled parallel job; therefore we cannot assign another queue to the 
> job.

I don't want to assign another queue to the job.  Assuming a pe_hostfile of:

coop01x.cora.nwra.com 2 coop.q@coop01.cora.nwra.com <NULL>
coop00x.cora.nwra.com 2 coop.q@coop00.cora.nwra.com <NULL>

I would like ORTE to use "coop01.cora.nwra.com" and 
"coop00.cora.nwra.com" (taken from the hostname parts of the queue 
names) for use with qrsh, and "coop01x.cora.nwra.com" and 
"coop00x.cora.nwra.com" as the hostnames to use for MPI traffic.
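
A rough sketch of the split I have in mind, not a patch against the actual 
OpenMPI gridengine code. The struct pe_host and the function parse_pe_line 
are hypothetical names, and it assumes the third pe_hostfile field always 
has the form queue@host:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the two hostnames of one pe_hostfile entry. */
struct pe_host {
    char mpi_name[256];   /* first field: name to use for MPI traffic */
    char exec_name[256];  /* host part of the queue field: name for qrsh */
    int  slots;           /* second field: number of slots on the node */
};

/* Parse one pe_hostfile line of the assumed form
 *   <hostname> <slots> <queue>@<exec host> <processor range>
 * Returns 0 on success, -1 if the line does not match that form. */
int parse_pe_line(const char *line, struct pe_host *out)
{
    char queue[512];
    const char *at;
    int n;

    /* The first three whitespace-separated fields are all we need. */
    if (sscanf(line, "%255s %d %511s", out->mpi_name, &out->slots, queue) != 3)
        return -1;

    /* The exec host is whatever follows '@' in the queue field. */
    at = strchr(queue, '@');
    if (at == NULL || at[1] == '\0')
        return -1;

    n = snprintf(out->exec_name, sizeof out->exec_name, "%s", at + 1);
    if (n < 0 || (size_t)n >= sizeof out->exec_name)
        return -1;

    return 0;
}
```

With the first pe_hostfile line above, mpi_name would hold 
"coop01x.cora.nwra.com" and exec_name "coop01.cora.nwra.com"; the launcher 
would pass exec_name to qrsh and keep mpi_name for MPI connections.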

Again, this is just a stab at some way to hack what I want to achieve, 
which is SGE/admin traffic traveling over one network, and MPI traffic 
traveling over another.  I don't care how it is done.


- Orion

Orion Poplawski
System Administrator                  303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                  orion at cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com
