[GE users] FW: Pick nodes from one queue plus 1 node from another queue

Stephan Grell - Sun Germany - SSG - Software Engineer stephan.grell at sun.com
Wed Mar 30 15:04:07 BST 2005


William Burke wrote:

>Hi,
>
><Snip>
>  
>
>>you don't need a special queue to set up for the FatQueue machine. You 
>>can submit with "-masterq QueueA.q@myhost" in qsub.
>>    
>>
>
>The thing is, if I pick myhost to be the masterq, what happens if that host
>is busy with another job and there are other hosts that can be picked?
>
>The robustness that I need in SGE is for it to arbitrarily pick those M-1
>nodes from QueueA.q and the Mth one from FatQueueB.q. I do not see how
>the "-masterq QueueA.q@myhost" in qsub will achieve this. Help me to
>understand your suggestion.
>  
>
You can put the M-1 hosts in one cluster queue and the other hosts into 
another cluster queue.

A simple approach would be to put all M-1 hosts into the M-1 hostgroup 
and all other hosts into a second hostgroup (others).

You then define a cluster queue on the M-1 and others hostgroups.

The qsub command would look like:

qsub -pe .... -masterq "cluster_queue@@M-1" .....

This ensures that the master task is started on one of the M-1 machines.
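
For the ordering requirement raised earlier in the thread (the special node 
has to be last in the mpirun machinefile), here is a rough sketch of a 
job-script helper. It is only an illustration: make_machinefile is a 
hypothetical name, not part of SGE, and the sketch assumes that the first 
line of the pe_hostfile is the host running the master task and that every 
line starts with a hostname followed by a slot count. Please check both 
assumptions on your installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE): print an mpirun machinefile built
# from a pe_hostfile, with the first host moved to the end of the list.
# Assumption: line 1 of the pe_hostfile is the master task host.
# Assumption: each line looks like "<hostname> <slots> <queue> ...".
make_machinefile() {
    awk '
        # Remember the first entry and hold it back until the end.
        NR == 1 { master = $1; master_slots = $2 + 0; next }
        # Every other host: one output line per granted slot.
        { n = $2 + 0; for (i = 0; i < n; i++) print $1 }
        # Finally emit the held-back host, so it ends up last.
        END { for (i = 0; i < master_slots; i++) print master }
    ' "$1"
}
```

Inside a job script this might be called as 
make_machinefile "$PE_HOSTFILE" > "$TMPDIR/machines". If the -masterq 
request selects the special machine, that machine would then appear at the 
end of the generated list.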

Does it help?

Stephan

>I think that PBS's qsub has a way to specify a queue and the number of nodes
>from that queue: Queue:Num_nodes.
>Does SGE have this functionality built in?
>
>William
>
>-----Original Message-----
>From: Reuti [mailto:reuti at staff.uni-marburg.de] 
>Sent: Wednesday, March 30, 2005 6:23 AM
>To: users at gridengine.sunsource.net
>Subject: Re: [GE users] FW: Pick nodes from one queue plus 1 node from
>another queue
>
>Hi,
>
>you don't need a special queue to set up for the FatQueue machine. You 
>can submit with "-masterq QueueA.q@myhost" in qsub.
>
>Small problem: SGE may select another slot from this machine unless you 
>choose an allocation rule of 1. With that rule you can be sure that exactly 
>one slot (the special one) is on the extra machine, so you may give this 
>machine more slots than the other machines. The other slots will then be 
>on other machines. But as this can only be done for the head node of the 
>parallel job, you may have to reorder the operations in your script, since 
>you requested it to be the last machine.
>
>Cheers - Reuti
>
>William Burke wrote:
>  
>
>>Ultimately I would like to submit a parallel job that uses N-1 nodes 
>>from QueueA.q and 1 node from FatQueueB.q as long as the node from 
>>FatQueueB.q is the last node on the machinefile list.
>>
>> 
>>
>>------------------------------------------------------------------------
>>
>>From: William Burke [mailto:wburke999 at msn.com]
>>Sent: Wednesday, March 30, 2005 1:23 AM
>>To: users at gridengine.sunsource.net
>>Subject: RE: Pick nodes from one queue plus 1 node from another queue
>>
>> 
>>
>>There are M nodes in the machine list, and let's say that I want to submit 
>>a job that can explicitly pick an exact number of nodes from one 
>>particular queue and only one node from another queue, which together 
>>equal the total number of nodes found in the $pe_hostfile.
>>
>> 
>>
>>So for instance:
>>
>> 
>>
>>The user launches a parallel job that requests 33 processors. If two 
>>queues exist, QueueA.q (consisting of 45 nodes) and FatQueueB.q 
>>(consisting of 2 nodes from QueueA.q's nodes) the user wants the ability 
>>to specify 32 processors from QueueA.q and only 1 processor from 
>>FatQueueB.q, what is the best way to implement that?
>>
>> 
>>
>>This is the situation:
>>
>> 
>>
>>1.    The particular application needs N processors for a job
>>
>>2.    I request this in -pe mpich N parameter
>>
>>3.    SGE generates M machines in its $pe_hostfile list based on the Nth 
>>processor
>>
>>4.    As we already know the algorithm that creates $pe_hostfile says 
>>create M nodes {if N is an even number then the Mth node should be N/2 
>>else the Mth node should be (Nth+1)/2}
>>
>>a.    I need some way to tell SGE that the Mth (or last) node of the 
>>machinefile list always has to be a node from FatQueueB.q, because I use 
>>that type of node for the heavy I/O processing of the job.
>>
>>b.    I do not want a job to run unless the Mth (or last) node in the 
>>machinefile is a node from FatQueueB.q; otherwise the job should wait 
>>until that request can be filled.
>>
>>5.    Ultimately the correctly formatted mpirun machinefile gets created 
>>from the final $pe_hostfile of M nodes.
>>
>> 
>>
>>FWIW, the number of processors is usually odd.
>>
>>What is very important is that the last node of the mpirun machinefile 
>>list is always from the FatQueueB.q.
>>
>> 
>>
>>Regards,
>>
>> 
>>
>>William
>>
>> 
>>
>>    
>>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
>
>  
>

