[GE users] Limiting jobs by available resources.

Lönroth Erik erik.lonroth at scania.com
Thu May 31 17:20:51 BST 2007


Hello again!

Thanks for all the help so far. I've been reading the posts on this list and have already gotten a lot of help. I'd be happy to help out where I can as I learn more. We are currently running SGE 6.0u8 on a Rocks cluster.

The applications we run have 3 things in common that we want to address first. (There are more, of course, but anyway...)

1. They eat all RAM they get.
2. They consume 100% CPU while running.
3. They need to be run on a dedicated core per process.

Our machines each have 2x2=4 cores and 8G RAM. The limits below must all be honored at the same time, meaning:


A) A node with a load of, let's say, "4" may never accept a new job/task. (regardless of # slots)
   A node with a load of, let's say, "3" may only accept <= 1 new job/task. (regardless of # slots)
   A node with a load of, let's say, "2" may only accept <= 2 new jobs/tasks. (regardless of # slots)
   A node with a load of, let's say, "1" may only accept <= 3 new jobs/tasks. (regardless of # slots)
   A node with a load of, let's say, "0" may only accept <= 4 new jobs/tasks. (regardless of # slots)

B) A node with 4 occupied slots may never accept a new job/task.
   A node with 3 occupied slots may accept <= 1 new job/task.
   A node with 2 occupied slots may accept <= 2 new jobs/tasks.
   A node with 1 occupied slot may accept <= 3 new jobs/tasks.
   A node with 0 occupied slots may accept <= 4 new jobs/tasks.

C) A node with <1G RAM free may never accept a new job/task.
   A node with <2G RAM free may accept <= 1 new job/task.
   A node with <4G RAM free may accept <= 2 new jobs/tasks.
   A node with <6G RAM free may accept <= 3 new jobs/tasks.
   A node with <8G RAM free may accept <= 4 new jobs/tasks.

The lowest number of "possible slots" given by A, B and C thus decides how many job slots SGE may allocate on the node.
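
To make the intended rule concrete, here is a rough Python sketch of the arithmetic only. It is not SGE configuration and does not call any SGE API. The function names are made up for illustration, and two details are assumptions the rules do not spell out: fractional load values are rounded up, and a node with 8G or more free RAM is treated as allowing 4 jobs.

```python
import math

CORES_PER_NODE = 4  # 2x2 cores per machine, as described above

# Rule C as (free-RAM upper bound in G, allowed new jobs) pairs.
RAM_RULES = [(1.0, 0), (2.0, 1), (4.0, 2), (6.0, 3), (8.0, 4)]


def slots_by_load(load_avg, cores=CORES_PER_NODE):
    """Rule A: every unit of load counts as one busy core.

    Assumption: fractional loads are rounded up (a load of 2.3 counts as 3).
    """
    return max(0, cores - math.ceil(load_avg))


def slots_by_occupancy(occupied_slots, cores=CORES_PER_NODE):
    """Rule B: only the slots that are not yet occupied may be used."""
    return max(0, cores - occupied_slots)


def slots_by_free_ram(free_ram_gb, cores=CORES_PER_NODE):
    """Rule C: look up the allowed number of new jobs from free RAM.

    Assumption: 8G or more free RAM allows the full core count.
    """
    for upper_bound_gb, allowed in RAM_RULES:
        if free_ram_gb < upper_bound_gb:
            return allowed
    return cores


def allowed_new_jobs(load_avg, occupied_slots, free_ram_gb):
    """The most restrictive of the three rules wins."""
    return min(
        slots_by_load(load_avg),
        slots_by_occupancy(occupied_slots),
        slots_by_free_ram(free_ram_gb),
    )
```

For example, a node with a load of 1, 2 occupied slots and 3G RAM free gives min(3, 2, 2), so at most 2 new jobs may be dispatched to it.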

This would guarantee (at job dispatch time) that jobs already running on a node won't be heavily affected by a new job starting on the same node, which is what we want.

What would be the best way to set this up in SGE?

/Erik
