[GE users] FW: Pick nodes from one queue plus 1 node from another queue

Reuti reuti at staff.uni-marburg.de
Wed Mar 30 17:12:09 BST 2005


William,

unfortunately this is not directly available in SGE with a -l option.

But, depending on the nodes you have and the nodes you want to use for
calculations: what about leaving the ionodes out of SGE? Then you request
only 5 slots in this example. The script in start_proc_args will select
one of the ionodes for you and append it to the already created
$TMPDIR/machines file. The open question is how to select a free ionode.
That depends on your configuration: can you give more details? One job
per ionode, or two? How many of them do you have, and will this limit the
total number of jobs in your cluster (at least for this type of job)?
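
A minimal sketch of that append step, assuming one job per ionode and a
lock directory on a shared filesystem. The function names, the lock
directory and the ionode host names are all hypothetical; nothing here
ships with SGE, and the right way to pick a free ionode depends on the
answers to the questions above.

```shell
#!/bin/sh
# Hypothetical helpers for a start_proc_args wrapper; not part of SGE.
# Assumption: LOCKDIR lives on a filesystem shared by all job hosts.

# claim_ionode LOCKDIR HOST...
# Prints the first host whose lock directory could be created (one job
# per ionode). Returns 1 if every ionode is already claimed.
claim_ionode() {
    lockdir=$1
    shift
    for host in "$@"; do
        # mkdir fails if the directory already exists, so it acts as a lock.
        if mkdir "$lockdir/$host" 2>/dev/null; then
            echo "$host"
            return 0
        fi
    done
    return 1
}

# append_ionode MACHINEFILE LOCKDIR HOST...
# Appends the claimed ionode as the last line of the machines file.
append_ionode() {
    machines=$1
    shift
    ionode=$(claim_ionode "$@") || return 1
    echo "$ionode" >> "$machines"
}
```

A wrapper configured as start_proc_args could run the stock startmpi.sh
first and then call something like
append_ionode "$TMPDIR/machines" /shared/ionode-locks ionode1 ionode2
(path and host names made up). A matching stop_proc_args script would
have to remove the lock directory again.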

CU - Reuti

William Burke wrote:
> Reuti
> 
>  
> 
> <snip>
> 
>>  I think, he means an allocation in PBS like:
> 
>> 
> 
>>  -l nodes=ionode:1+compute:5
> 
>  
> 
> Yes this is the exact functionality that I need and this would ensure 
> that the job would include 5 compute hosts and that one ionode in the 
> $pe_hostfile. Then I could
> 
>  
> 
> 1.    direct the output of $pe_hostfile to a file that could be manipulated
> 
> 2.    in the startmpi.sh ensure that in the PEHostfiletoMachinefile 
> conversion the 1 ionode node in $pe_hostfile becomes the last node in 
> the Machinefile
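
For step 2, a rough sketch of such a conversion, in the spirit of the
pe_hostfile to machinefile conversion in startmpi.sh. It assumes that the
second column of $pe_hostfile is the slot count and the third column is
the queue (optionally written as queue@host); the function name is made
up. It only reorders the hosts and cannot make the scheduler grant
exactly one slot from the ionode queue.

```shell
#!/bin/sh
# Hypothetical variant of the pe_hostfile conversion; not part of SGE.

# pe_hostfile_to_machinefile PE_HOSTFILE FAT_QUEUE
# Prints one host name per granted slot. Hosts whose queue column matches
# FAT_QUEUE are held back and printed last.
pe_hostfile_to_machinefile() {
    awk -v fatq="$2" '
        {
            # Column 3 is assumed to be the queue, e.g. FatQueueB.q@node46.
            split($3, q, "@")
            for (i = 0; i < $2; i++) {
                if (q[1] == fatq) held[++n] = $1
                else print $1
            }
        }
        END { for (i = 1; i <= n; i++) print held[i] }
    ' "$1"
}
```

In a start script this would be called roughly as
pe_hostfile_to_machinefile "$PE_HOSTFILE" FatQueueB.q > "$TMPDIR/machines"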
> 
>  
> 
> Does that make sense?
> 
>  
> 
> Regards,
> 
> William
> 
>  
> 
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Wednesday, March 30, 2005 9:26 AM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] FW: Pick nodes from one queue plus 1 node from 
> another queue
> 
>  
> 
> Stephan,
> 
>  
> 
> I was thinking of the same. But wouldn't this allow additional
> 
> slots from the wrong queue to be picked as slaves?
> 
>  
> 
> I think, he means an allocation in PBS like:
> 
>  
> 
> -l nodes=ionode:1+compute:5
> 
>  
> 
> to get 6 CPUs - one from the nodes with the feature ionode and 5 with
> 
> the feature compute. - Reuti
> 
>  
> 
>  
> 
> Stephan Grell - Sun Germany - SSG - Software Engineer wrote:
> 
>>  William Burke wrote:
> 
>>
> 
>> > Hi,
> 
>> > 
> 
>> > <Snip>
> 
>> > 
> 
>> > 
> 
>> >> you don't need a special queue to set up for the FatQueue machine.
> 
>> >> You can submit with "-masterq QueueA.q@myhost" in qsub.
> 
>> >>  
> 
>> > 
> 
>> > 
> 
>> > The thing is if I pick myhost to be masterq what happens if that host is
> 
>> > busy with another job and there are other hosts that can be picked?
> 
>> > 
> 
>> > The robustness that I need in SGE is for it to arbitrarily pick those M-1
> 
>> > nodes from the QueueA.q and the Mth one from FatQueueB.q. I do not see
> 
>> > how
> 
>> > the "-masterq QueueA.q@myhost" in qsub will achieve this. Help me to
> 
>> > understand your suggestion.
> 
>> > 
> 
>> > 
> 
>>  You can put the M-1 hosts in one cluster queue and the other hosts into
> 
>>  another cluster queue.
> 
>>
> 
>>  A simple approach would be to put all M-1 hosts into the M-1 hostgroup
> 
>>  and all other hosts
> 
>>  into a second hostgroup (others).
> 
>>
> 
>>  You then define a cluster queue on the M-1 and others hostgroups.
> 
>>
> 
>>  The qsub command would look like:
> 
>>
> 
>>  qsub -pe .... -masterq "cluster_queue@@M-1" ....
> 
>>
> 
>>  This ensures that the master task is started on one of the M-1 machines.
> 
>>
> 
>>  Does it help?
> 
>>
> 
>>  Stephan
> 
>>
> 
>> > I think that PBS's qsub has a way to specify a queue and the number of
> 
>> > nodes
> 
>> > from that queue - Queue:Num_nodes. Does SGE have this built-in
> 
>> > functionality?
> 
>> > 
> 
>> > William
> 
>> > 
> 
>> > -----Original Message-----
> 
>> > From: Reuti [mailto:reuti at staff.uni-marburg.de] Sent: Wednesday, March
> 
>> > 30, 2005 6:23 AM
> 
>> > To: users at gridengine.sunsource.net
> 
>> > Subject: Re: [GE users] FW: Pick nodes from one queue plus 1 node from
> 
>> > another queue
> 
>> > 
> 
>> > Hi,
> 
>> > 
> 
>> > you don't need a special queue to set up for the FatQueue machine. You
> 
>> > can submit with "-masterq QueueA.q@myhost" in qsub.
> 
>> > 
> 
>> > Small problem: SGE may select another slot from this machine, unless
> 
>> > you choose an allocation rule of 1. Then you can be sure that one slot
> 
>> > (the special one) is on the extra machine (so you may give this machine
> 
>> > more slots than the other machines). The other slots will be on other
> 
>> > machines this way. But as this can only be done for the head node of
> 
>> > the parallel job, maybe you have to reorder any operation in your
> 
>> > script, as you requested it to be the last machine.
> 
>> > 
> 
>> > Cheers - Reuti
> 
>> > 
> 
>> > William Burke wrote:
> 
>> > 
> 
>> > 
> 
>> >> Ultimately I would like to submit a parallel job that uses N-1 nodes
> 
>> >> from QueueA.q and 1 node from FatQueueB.q as long as the node from
> 
>> >> FatQueueB.q is the last node on the machinefile list.
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> ------------------------------------------------------------------------
> 
>> >> 
> 
>> >> From: William Burke [mailto:wburke999 at msn.com]
> 
>> >> Sent: Wednesday, March 30, 2005 1:23 AM
> 
>> >> To: users at gridengine.sunsource.net
> 
>> >> Subject: RE: Pick nodes from one queue plus 1 node from another queue
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> There are M nodes in machine list and let's say that I want to submit
> 
>> >> a job that can explicitly pick an exact number of nodes from one
> 
>> >> particular queue and  only one
> 
>> >> 
> 
>> >> node from another queue which equals the total # of nodes found in
> 
>> >> the $pe_hostfile.
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> So for instance:
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> The user launches a parallel job that requests 33 processors. If two
> 
>> >> queues exist, QueueA.q (consisting of 45 nodes) and FatQueueB.q
> 
>> >> (consisting of 2 nodes from QueueA.q's nodes) the user wants the
> 
>> >> ability to specify 32 processors from QueueA.q and only 1 processor
> 
>> >> from FatQueueB.q, what is the best way to implement that?
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> This is the situation:
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> 1.    The particular application needs N processors for a job
> 
>> >> 
> 
>> >> 2.    I request this in -pe mpich N parameter
> 
>> >> 
> 
>> >> 3.    SGE generates M machines in its $pe_hostfile list based on the
> 
>> >> Nth processor
> 
>> >> 
> 
>> >> 4.    As we already know the algorithm that creates $pe_hostfile says
> 
>> >> create M nodes {if N is an even number then the Mth node should be
> 
>> >> N/2 else the Mth node should be (N+1)/2}
> 
>> >> 
> 
>> >> a.    I need some way to tell SGE that the Mth (or last) node of the
> 
>> >> Machinefile list always has to be a node from FatQueueB.q, because I
> 
>> >> use that type of node for the heavy I/O processing of the job.
> 
>> >> 
> 
>> >> b.    I do not want a job to run unless the Mth (or last) node in the
> 
>> >> Machinefile is a node from FatQueueB.q; otherwise the job should wait
> 
>> >> until that request is filled.
> 
>> >> 
> 
>> >> 5.    Ultimately the correctly formatted mpirun machinefile gets
> 
>> >> created from the final $pe_hostfile of M nodes.
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> FWIW, usually the number of processors is odd.
> 
>> >> 
> 
>> >> What is very important is that the last node of the mpirun
> 
>> >> machinefile list is always from the FatQueueB.q.
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> Regards,
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >> William
> 
>> >> 
> 
>> >> 
> 
>> >> 
> 
>> >>  
> 
>> > 
> 
>> > 
> 
>> > 
> 
>> > ---------------------------------------------------------------------
> 
>> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> 
>> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> 
>> > 
> 
>> > 
> 
> 
>> > 
> 
>> > 
> 
>> > 
> 
>>
> 
>>
> 
> 
>  
> 
>  
> 
> 
>  
> 

