[GE users] Limiting jobs by available resources.

Lönroth Erik erik.lonroth at scania.com
Fri Jun 1 13:33:49 BST 2007


In response to myself here =)

Would setting "Load Thresholds" in qmon (Queue Control -> Modify Queue) as ...

Load            Value
"mem_used"       4G

... at least prevent scheduling of new job slots when more than 4G of RAM is in use on any host in that cluster queue?


/Erik

-----Original Message-----
From: Lönroth Erik [mailto:erik.lonroth at scania.com] 
Sent: 31 May 2007 18:21
To: users at gridengine.sunsource.net
Subject: [GE users] Limiting jobs by available resources.


Hello again!

Thanks for all the help so far. I've been reading the posts on the forum here and have gotten a lot of help already! I'd be happy to help out where I can while I learn more. We are currently running SGE 6.0u8 on a Rocks cluster.

The applications we are running have three things in common that we want to address initially. (There are more, of course, but anyway...)

1. They eat all RAM they get.
2. They consume 100% CPU while running.
3. They need to be run on a dedicated core per process.

Our machines have 2x2=4 cores and 8G RAM each. All of the following limits must be honored at the same time, meaning:


A) A node loading at - let's say "4" - may never accept a new job/task. (regardless of # slots)
   A node loading at - let's say "3" - may only accept <= 1 new job/task. (regardless of # slots)
   A node loading at - let's say "2" - may only accept <= 2 new jobs/tasks. (regardless of # slots)
   A node loading at - let's say "1" - may only accept <= 3 new jobs/tasks. (regardless of # slots)
   A node loading at - let's say "0" - may only accept <= 4 new jobs/tasks. (regardless of # slots)

B) A node with 4 occupied slots may never accept a new job/task.
   A node with 3 occupied slots may accept <= 1 new job/task.
   A node with 2 occupied slots may accept <= 2 new jobs/tasks.
   A node with 1 occupied slot may accept <= 3 new jobs/tasks.
   A node with 0 occupied slots may accept <= 4 new jobs/tasks.

C) A node with <1G RAM free may never accept a new job/task.
   A node with <2G RAM free may accept <= 1 new job/task.
   A node with <4G RAM free may accept <= 2 new jobs/tasks.
   A node with <6G RAM free may accept <= 3 new jobs/tasks.
   A node with <8G RAM free may accept <= 4 new jobs/tasks.

The lowest of these "possible slots" numbers will thus decide how many job slots SGE allocates.
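
To make the intended logic concrete, here is a minimal Python sketch of rules A, B and C combined. This is only an illustration of the arithmetic, not anything SGE provides: all function names are made up, rounding the load average up to a whole number is my assumption, and so is allowing 4 jobs when 8G or more is free.

```python
import math

CORES = 4  # 2x2 cores per node, as described above


def slots_by_load(load_avg):
    # Rule A: a load of N leaves room for (4 - N) new jobs.
    # Rounding a fractional load up is an assumption, chosen to be conservative.
    return max(0, CORES - math.ceil(load_avg))


def slots_by_occupancy(used_slots):
    # Rule B: every occupied slot removes one possible new job.
    return max(0, CORES - used_slots)


def slots_by_free_ram(free_gb):
    # Rule C: thresholds taken from the list above (free RAM in GB).
    for limit_gb, allowed in ((1, 0), (2, 1), (4, 2), (6, 3)):
        if free_gb < limit_gb:
            return allowed
    # 6G or more free; treating >= 8G the same way is an assumption.
    return 4


def acceptable_new_jobs(load_avg, used_slots, free_gb):
    # The lowest of the three per-rule limits decides.
    return min(
        slots_by_load(load_avg),
        slots_by_occupancy(used_slots),
        slots_by_free_ram(free_gb),
    )
```

For example, acceptable_new_jobs(1.0, 1, 3.5) gives 2: the load and slot rules would each allow 3 new jobs, but with less than 4G free the RAM rule only allows 2.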

This would guarantee (at job dispatch time) that jobs currently running on the node won't be heavily affected by a new job spawning on the same node, and this is what we want.

What would be the best way to address this?

/Erik

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net
