[GE users] FW: Pick nodes from one queue plus 1 node from another queue

William Burke wburke999 at msn.com
Wed Mar 30 16:49:51 BST 2005


Reuti

 

<snip>

> I think he means an allocation in PBS like:

> 

> -l nodes=ionode:1+compute:5

 

Yes, this is exactly the functionality I need. It would ensure that the
$pe_hostfile for the job includes 5 compute hosts and one ionode host.
Then I could:

 

1.    direct the contents of $pe_hostfile to a file that can be manipulated

2.    in startmpi.sh, ensure that the PEHostfiletoMachinefile conversion
makes the one ionode host in $pe_hostfile the last node in the
machinefile
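
The two steps above could be sketched roughly as follows. This is only an
illustration in Python, not a patch to startmpi.sh: the function name
pe_hostfile_to_machinefile, the host names, and the queue name passed in are
all hypothetical. It assumes each $pe_hostfile line has the layout
"host slots queue processor-range" and that the machinefile wants one host
entry per slot.

```python
def pe_hostfile_to_machinefile(pe_lines, io_queue):
    """Return machinefile entries (one host name per slot), I/O-queue hosts last.

    pe_lines: lines in the assumed $pe_hostfile layout
              "<host> <slots> <queue> <processor-range>".
    io_queue: name of the queue whose hosts must come last (hypothetical).
    """
    regular, io_hosts = [], []
    for line in pe_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        host, slots, queue = fields[0], int(fields[1]), fields[2]
        # The queue field may be "queue" or "queue@host"; compare the queue part.
        queue_name = queue.split("@", 1)[0]
        target = io_hosts if queue_name == io_queue else regular
        target.extend([host] * slots)
    return regular + io_hosts


# Made-up example: the fat node happens to be listed first by the scheduler.
example = [
    "fat01 1 FatQueueB.q@fat01 UNDEFINED",
    "node01 2 QueueA.q@node01 UNDEFINED",
    "node02 2 QueueA.q@node02 UNDEFINED",
]
machinefile = pe_hostfile_to_machinefile(example, "FatQueueB.q")
print("\n".join(machinefile))
```

Matching on the queue field rather than on the host name means the script
does not need to know in advance which fat node was granted.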

 

Does that make sense?

 

Regards,

William



 

-----Original Message-----
From: Reuti [mailto:reuti at staff.uni-marburg.de] 
Sent: Wednesday, March 30, 2005 9:26 AM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] FW: Pick nodes from one queue plus 1 node from
another queue

 

Stephan,

 

I was thinking the same. But wouldn't this allow the job to get additional 

slots from the wrong queue as slaves?

 

I think he means an allocation in PBS like:

 

-l nodes=ionode:1+compute:5

 

to get 6 CPUs - one from the nodes with the feature ionode and 5 with 

the feature compute. - Reuti
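
As a rough sketch of what such a request expresses, the following Python
snippet parses a string in the feature:count notation written above and checks
whether a given allocation satisfies it. The function names, host names, and
the notation itself are assumptions made for illustration; the ordering of
count and feature in real PBS resource strings may differ, and nothing here is
an SGE or PBS API.

```python
def parse_node_request(spec):
    """Parse "feature:count+feature:count" into a {feature: count} dict."""
    request = {}
    for part in spec.split("+"):
        feature, count = part.split(":")
        request[feature] = request.get(feature, 0) + int(count)
    return request


def satisfies(request, allocation):
    """True if the allocated (host, feature) pairs match the requested counts exactly."""
    granted = {}
    for _host, feature in allocation:
        granted[feature] = granted.get(feature, 0) + 1
    return granted == request


request = parse_node_request("ionode:1+compute:5")
total_cpus = sum(request.values())

# Hypothetical allocation: one I/O node plus five compute nodes.
allocation = [("io1", "ionode")] + [("c%d" % i, "compute") for i in range(1, 6)]
ok = satisfies(request, allocation)
print(request, total_cpus, ok)
```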

 

 

Stephan Grell - Sun Germany - SSG - Software Engineer wrote:

> William Burke wrote:

> 

>> Hi,

>> 

>> <Snip>

>>  

>> 

>>> You don't need to set up a special queue for the FatQueue machine. 

>>> You can submit with "-masterq QueueA.q@myhost" in qsub.

>>>   

>> 

>> 

>> The thing is, if I pick myhost as the masterq, what happens if that host is

>> busy with another job while other hosts are available to be picked?

>> 

>> The robustness that I need from SGE is for it to arbitrarily pick M-1

>> nodes from QueueA.q and the Mth one from FatQueueB.q. I do not see how

>> "-masterq QueueA.q@myhost" in qsub will achieve this. Please help me

>> understand your suggestion.

>>  

>> 

> You can put the M-1 hosts in one cluster queue and the other hosts into 

> another cluster queue.

> 

> A simple approach would be to put all M-1 hosts into the M-1 hostgroup 

> and all other hosts

> into a second hostgroup (others).

> 

> You then define a cluster queue on the M-1 and others hostgroups.

> 

> The qsub command would look like:

> 

> qsub -pe .... -masterq "cluster_queue@@M-1" .....

> 

> This ensures that the master task is started on one of the M-1 machines.

> 

> Does it help?

> 

> Stephan

> 

>> I think that PBS's qsub has a way to specify a queue and the number of 

>> nodes from that queue (Queue:Num_nodes). Does SGE have this functionality 

>> built in?

>> 

>> William

>> 

>> -----Original Message-----

>> From: Reuti [mailto:reuti at staff.uni-marburg.de]

>> Sent: Wednesday, March 30, 2005 6:23 AM

>> To: users at gridengine.sunsource.net

>> Subject: Re: [GE users] FW: Pick nodes from one queue plus 1 node from

>> another queue

>> 

>> Hi,

>> 

>> You don't need to set up a special queue for the FatQueue machine. You 

>> can submit with "-masterq QueueA.q@myhost" in qsub.

>> 

>> Small problem: SGE may select another slot from this machine unless 

>> you choose an allocation rule of 1. Then you can be sure that exactly one 

>> slot (the special one) is on the extra machine (so you may give this 

>> machine more slots than the other machines). The other slots will be on 

>> other machines this way. But as this can only be done for the head node 

>> of the parallel job, you may have to do some reordering in your 

>> script, since you requested that it be the last machine.

>> 

>> Cheers - Reuti

>> 

>> William Burke wrote:

>>  

>> 

>>> Ultimately I would like to submit a parallel job that uses N-1 nodes 

>>> from QueueA.q and 1 node from FatQueueB.q as long as the node from 

>>> FatQueueB.q is the last node in the machinefile list.

>>> 

>>> 

>>> 

>>> ------------------------------------------------------------------------

>>> 

>>> From: William Burke [mailto:wburke999 at msn.com]

>>> Sent: Wednesday, March 30, 2005 1:23 AM

>>> To: users at gridengine.sunsource.net

>>> Subject: RE: Pick nodes from one queue plus 1 node from another queue

>>> 

>>> 

>>> 

>>> There are M nodes in the machine list, and let's say that I want to 

>>> submit a job that explicitly picks an exact number of nodes from one 

>>> particular queue and only one node from another queue, which together 

>>> make up the total number of nodes found in the $pe_hostfile.

>>> 

>>> 

>>> 

>>> So for instance:

>>> 

>>> 

>>> 

>>> The user launches a parallel job that requests 33 processors. Two 

>>> queues exist: QueueA.q (consisting of 45 nodes) and FatQueueB.q 

>>> (consisting of 2 of QueueA.q's nodes). The user wants the ability to 

>>> specify 32 processors from QueueA.q and only 1 processor from 

>>> FatQueueB.q. What is the best way to implement that?

>>> 

>>> 

>>> 

>>> This is the situation:

>>> 

>>> 

>>> 

>>> 1.    The particular application needs N processors for a job

>>> 

>>> 2.    I request this in -pe mpich N parameter

>>> 

>>> 3.    SGE generates a list of M machines in its $pe_hostfile based on 

>>> the N requested processors

>>> 

>>> 4.    As we already know, the algorithm that creates $pe_hostfile 

>>> produces M nodes {if N is even, then M = N/2; otherwise 

>>> M = (N+1)/2}

>>> 

>>> a.    I need some way to tell SGE that the Mth (or last) node of the 

>>> machinefile list always has to be a node from FatQueueB.q, because I 

>>> use that type of node for the heavy I/O processing of the job.

>>> 

>>> b.    I do not want a job to run unless the Mth (or last) node in the 

>>> machinefile is a node from FatQueueB.q; otherwise the job should wait 

>>> until that request can be filled.

>>> 

>>> 5.    Ultimately the correctly formatted mpirun machinefile gets 

>>> created from the final $pe_hostfile of M nodes.

>>> 

>>> 

>>> 

>>> FWIW, the number of processors is usually odd.

>>> 

>>> What is very important is that the last node of the mpirun 

>>> machinefile list is always from the FatQueueB.q.

>>> 

>>> 

>>> 

>>> Regards,

>>> 

>>> 

>>> 

>>> William

>>> 

>>> 

>>> 

>>>   

>> 

>> 

>> 

>> ---------------------------------------------------------------------

>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net

>> For additional commands, e-mail: users-help at gridengine.sunsource.net

>> 

>> 


>> 

>>  

>> 

> 

> 


 

 


 



