[GE users] max array job tasks - and load

Joseph Hargitai joseph.hargitai at nyu.edu
Fri Oct 31 00:29:29 GMT 2008


Our nodes have 16 cores. We will be using 20 nodes for job arrays and serial jobs in a distinct queue, and 60 nodes in another queue for parallel jobs.

What is the best way to control the load threshold on the serial/job array part? On other clusters we usually like the number of processes to equal the core count, so we set the load limit in PBS to 7.8 or 8 for 8-core nodes. For this cluster we would set it to 15.5 or 16.

It appears you can set/control load in two ways, possibly many more:

a, via load/suspend thresholds
I am not clear on the SGE nomenclature regarding np_load_avg. What does np_load_avg = 1.75 mean on a quad-socket, quad-core node (16 cores)? Will SGE schedule jobs up to a load of 16 x 1.75? In other words, I would like to set a parameter that stops scheduling more jobs to a node once the node reaches load 16.
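
A minimal sketch of the arithmetic behind this question, under the assumption that np_load_avg is the host load average divided by the number of processors on that host. The function names are illustrative only and are not part of SGE.

```python
def np_load_avg(load_avg: float, num_proc: int) -> float:
    """Normalize a raw load average by the processor count (assumed definition)."""
    if num_proc <= 0:
        raise ValueError("num_proc must be positive")
    return load_avg / num_proc


def raw_load_at_threshold(np_threshold: float, num_proc: int) -> float:
    """Raw load average at which a normalized threshold would be reached."""
    if num_proc <= 0:
        raise ValueError("num_proc must be positive")
    return np_threshold * num_proc


NUM_PROC = 16  # quad-socket, quad-core node

# Raw load implied by a normalized threshold of 1.75 on this node.
print(raw_load_at_threshold(1.75, NUM_PROC))

# Normalized values matching the raw limits of 16 and 15.5 mentioned above.
print(np_load_avg(16.0, NUM_PROC))
print(np_load_avg(15.5, NUM_PROC))
```

If that assumption holds, np_load_avg = 1.75 on a 16-core node would correspond to a raw load of 28, and a raw load limit of 16 would correspond to np_load_avg = 1.0.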

b, configure max array job tasks - 
Can you set this per host, or only globally? It would be very useful if you could set this per host in addition to a global total.

