[GE users] Help on designing simple queues

Reuti reuti at staff.uni-marburg.de
Sun May 13 15:21:46 BST 2007



Hi,

On 13.05.2007 at 15:28, Lönroth Erik wrote:

> Hello!
>
> My first post here, so bear with me.
>
> I have a single SGE cell with 70 nodes. Each node has 4 CPUs. (2 x 2)
>
> I have 2 separate HSIs (Myrinet MX) serving 35 + 35 nodes. The HSIs
> are separated, so a submitted job needs to run on one specific
> HSI partition for performance.
>
> I'm referring to the two partitions as:
>
> Partition:     [ts102]   [ts103]
> Nodes x CPUs:  [35x4]    [35x4]
>
> In summary, there are 2 "subclusters" with regard to the HSIs. I
> want this to be transparent to our users (who see this as a single
> cluster cell) and to have SGE ensure that jobs are never split
> between the two "subclusters". I also never want more than 4
> job-processes running on any node at a time.
>
> Question 1: What is the best-practice setup so that a single
> job never crosses the two HSIs (subclusters) and the user won't have
> to change job scripts or anything between submits? (I don't want to
> specify different -pe mpi_x.)

you will need a setup like:

http://gridengine.info/articles/2006/02/14/grouping-jobs-to-nodes-via-wildcard-pes

So the necessary changes to the jobscripts are minimal.
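
To illustrate the idea behind wildcard PEs, here is a toy model in Python. It is only a sketch and not how the SGE scheduler is implemented: the PE names `mpi_ts102` and `mpi_ts103` are hypothetical (one PE per Myrinet partition), and the slot counts simply assume 35 nodes x 4 CPUs per partition. The point it shows is that a request such as `mpi_*` matches both PEs, but a job is placed entirely inside one of them.

```python
from fnmatch import fnmatchcase

# Hypothetical PE names, one per Myrinet partition.
# Free slots assume 35 nodes x 4 CPUs in each partition.
FREE_SLOTS = {"mpi_ts102": 140, "mpi_ts103": 140}


def pick_pe(request, slots_needed, free_slots):
    """Return the first PE that matches the wildcard and fits the whole job."""
    for pe in sorted(free_slots):
        if fnmatchcase(pe, request) and free_slots[pe] >= slots_needed:
            return pe
    return None  # no single PE can hold the job, so it would stay pending


def submit(request, slots_needed, free_slots):
    """Book the slots in exactly one PE; never combine slots of two PEs."""
    pe = pick_pe(request, slots_needed, free_slots)
    if pe is not None:
        free_slots[pe] -= slots_needed
    return pe


if __name__ == "__main__":
    free = dict(FREE_SLOTS)
    for size in (100, 100, 60):
        print(size, "slots ->", submit("mpi_*", size, free))
```

In this model the third job is not placed even though 80 slots are free in total, because they are spread over both partitions. On the submit side the user would request the wildcard, e.g. `qsub -pe "mpi_*" 8 job.sh`, and would not need to name a partition.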

> Question 2: How can I set up my queues so that a single parallel
> job won't be split over two different queues once it has been assigned
> one? (This happens in my lab sessions a lot.)

Unfortunately, if you attach the same PE to different queues, you can
always get a mixture of slots. But there is already an RFE to limit the
slot allocation once a queue/hostgroup is selected:

http://gridengine.sunsource.net/issues/show_bug.cgi?id=1311

You will probably need four (identical) PEs in your cluster, so that
you have only one PE per queue.
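
As a rough sketch of the "one PE per queue" layout, the following Python snippet derives one PE name for each of the four queues mentioned in the question and prints the commands that would attach them. The naming scheme `mpi_<kind>_<partition>` is purely an assumption for illustration. The `qconf -rattr queue pe_list ...` form (replace a queue's list attribute) should be checked against the qconf man page of your SGE version, and the PEs themselves would have to be created first (e.g. with `qconf -ap`).

```python
# Queue names as given in the original question.
QUEUES = ["short.ts102.q", "short.ts103.q", "long.ts102.q", "long.ts103.q"]


def pe_name_for(queue):
    """Map 'short.ts102.q' to 'mpi_short_ts102' (hypothetical naming scheme)."""
    kind, partition, _suffix = queue.split(".")
    return f"mpi_{kind}_{partition}"


def pe_plan(queues):
    """Give every queue exactly one PE of its own."""
    return {queue: pe_name_for(queue) for queue in queues}


def attach_commands(plan):
    """Build qconf calls that set each queue's pe_list to its single PE."""
    return [f"qconf -rattr queue pe_list {pe} {queue}" for queue, pe in plan.items()]


if __name__ == "__main__":
    for command in attach_commands(pe_plan(QUEUES)):
        print(command)
```

Because no PE is shared between queues in this layout, a job that is granted one PE cannot pick up slots from a second queue.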

HTH - Reuti


> I have started a setup of 4 queues at the moment (ts102 and ts103  
> are my subclusters):
> short.ts102.q
> short.ts103.q
> long.ts102.q
> long.ts103.q
>
> All queues contain all my nodes.
> All queues should be able to run all applications.
> All queues have all parallel environments.
>
> I'm not afraid of ripping this apart to enforce a "best practice"  
> for my environment.
>
> I realize I'm a beginner here and I appreciate all help and
> pointers to reading on this; hopefully you can assist me in
> deriving a good strategy for setting up my environment. We run some
> 5-6 different applications on the cluster and we need a
> maintainable situation without too many queues to maintain. SGE
> has so far impressed us at Scania and we will probably expand our
> usage of it as we learn.
>
> Best regards.
> /Erik Lonroth - Scania Infomate AB - Sodertalje, Sweden
