[GE users] Running 1, 2, 3, or 4 jobs on a host with 4 slots

Marconnet, James E Mr /Computer Sciences Corporation james.marconnet at smdc.army.mil
Mon Mar 14 22:25:43 GMT 2005


Thanks so much for your simple and understandable suggestion for using
several subordinated queues, each with a different slots setting.

I reread several times what you suggested (down below), and I wonder if what
you really meant to suggest would be:

slots 1
subordinate_list max2=1,max3=1,max4=1

slots 2
subordinate_list max3=1,max4=1

slots 3
subordinate_list max4=1

slots 4
subordinate_list n/a

Probably a good idea for me to get this scheme understood and correct before
I implement it. Otherwise every queue might shut down every other queue. Or
is that the general idea?
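
To make the one-directional variant above easier to check before typing it
into qmon, here is a small sketch that just prints the stanzas. The queue
names max1..max4 are the ones used in this thread; the helper name
cascading_scheme is made up for illustration, and I am assuming NONE is how
an empty subordinate_list is written in the queue configuration.

```python
# Sketch only: generate the one-directional scheme, where each queue
# subordinates only the queues with a larger slot count.
# cascading_scheme is a hypothetical helper, not part of SGE.

def cascading_scheme(max_slots=4):
    """Return {queue_name: (slots, subordinate_list_string)}."""
    scheme = {}
    for n in range(1, max_slots + 1):
        # Queues maxM with M > n are suspended once 1 job runs in maxN.
        subs = ",".join(f"max{m}=1" for m in range(n + 1, max_slots + 1))
        # Assumption: an empty list is written as NONE.
        scheme[f"max{n}"] = (n, subs or "NONE")
    return scheme


if __name__ == "__main__":
    for name, (slots, subs) in cascading_scheme().items():
        print(f"# queue {name}")
        print(f"slots {slots}")
        print(f"subordinate_list {subs}")
        print()
```

The output is only text to compare against what qmon shows; it does not
touch the cluster configuration.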

I see where in general the multiple subordinated queues could result in some
jobs waiting a long time to start running, encouraging folks with just a few
jobs to always use the max1 queue. Perhaps there is a better way to accomplish
this on a node-by-node basis rather than a subordinated-queue basis?

Excellent point about the potential inadvisability of running 3 jobs on a
node from a performance point of view. Anyone know a simple, clever way to
always have 1, 2, or 4 jobs running on an individual node, but never ever 3
jobs? ;-) And then if a job completes, leaving 3 running, would you somehow
suspend a job to keep the node running 1 or 2 jobs? Nah!

We recently had a few jobs out of a large set of similar jobs that
inexplicably went haywire, and it may well have happened on a node running
3 jobs at once. Your suggestion made me wonder if somehow the cache got
confused and sent the wrong program/data to the jobs. Or if running 3 jobs
at once on a node might have had nothing to do with anything in this

On a wider note, does anyone here have anything to share about similar queue
experiences, or about how reasonable it is to set up multiple, seemingly
redundant queues like this for running 1, 2, or 4 jobs on a node at user
discretion? In this specific case, we have a technical subgroup with a tight
deadline who arranged absolute top/exclusive priority on all these fast
nodes, kicking everyone else off them for the duration. Setting up multiple
queues as suggested is relatively easy using qmon and repeatedly copying
queues, but is this perhaps micromanaging what ought to be simply: "Throw it
into a big cluster/nodes/SGE hopper and let it run whatever, whenever, and
however the scheduler decides"? The queue-of-the-week club vs. lots of user
qsub settings? I'd rather not overwhelm the users with what might seem like a
crazy quilt of queue choices. But then if they could/would control all this
from their qsub line, that might be really confusing too! Clearly I've not
explored tickets yet, and I rather hope not to have to.

Thanks for all your help and encouragement in this group!
Jim Marconnet

So, define 4 queues with the different settings in each of them:

slots 1
subordinate_list max2=1,max3=1,max4=1

slots 2
subordinate_list max1=1,max3=1,max4=1

slots 3
subordinate_list max1=1,max2=1,max4=1

slots 4
subordinate_list max1=1,max2=1,max3=1

In qsub you can submit to e.g. the max3 queue, and then on this machine only
jobs in the max3 queue are allowed, since the other queues on this node are
suspended. This might of course lead to some endlessly waiting jobs, depending
on your average use of the different types of jobs.
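
To see how the two schemes in this thread differ, here is a toy model of
subordination on a single host. It is not SGE code. It only encodes the rule
that when a queue runs at least the given number of jobs, every queue named
in its subordinate_list is suspended on that host. The names SYMMETRIC,
ONE_WAY, suspended_queues and open_queues are invented for this sketch, and
the model ignores finer points such as whether an already suspended queue
still counts towards the threshold.

```python
# Toy model of queue subordination on ONE host (illustration, not SGE).

# Scheme from this message: every queue subordinates all the others.
SYMMETRIC = {
    "max1": {"max2": 1, "max3": 1, "max4": 1},
    "max2": {"max1": 1, "max3": 1, "max4": 1},
    "max3": {"max1": 1, "max2": 1, "max4": 1},
    "max4": {"max1": 1, "max2": 1, "max3": 1},
}

# Variant from the reply: only the larger queues are subordinated.
ONE_WAY = {
    "max1": {"max2": 1, "max3": 1, "max4": 1},
    "max2": {"max3": 1, "max4": 1},
    "max3": {"max4": 1},
    "max4": {},
}


def suspended_queues(scheme, running):
    """Return the set of queues suspended on this host.

    `running` maps a queue name to the number of jobs it runs here.
    """
    suspended = set()
    for queue, subs in scheme.items():
        for target, threshold in subs.items():
            if running.get(queue, 0) >= threshold:
                suspended.add(target)
    return suspended


def open_queues(scheme, running):
    """Return the queues that are not suspended, sorted by name."""
    return sorted(set(scheme) - suspended_queues(scheme, running))


if __name__ == "__main__":
    print(open_queues(SYMMETRIC, {"max3": 1}))  # ['max3']
    print(open_queues(ONE_WAY, {"max3": 1}))    # ['max1', 'max2', 'max3']
    print(open_queues(ONE_WAY, {"max1": 1}))    # ['max1']
```

In this model the symmetric scheme locks a host to whichever queue got the
first job, while the one-way variant behaves more like a priority order: a
job in max1 or max2 can still arrive on a host busy with max3 jobs, and it
then suspends max3.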

Another (more Linux in general) point is: is it good to have 3 jobs on a
dual node? If you have just two jobs, the scheduler in the Linux kernel
should always place the same job on the same CPU, because there may still be
some valid data in the cache of this CPU. With 4 jobs it seems the same. But
what to do with three jobs?
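
If you want to test the cache argument rather than rely on the scheduler,
one option is to pin a process to a CPU by hand and compare run times. A
minimal sketch, assuming Linux and a Python that provides
os.sched_setaffinity; pin_to_cpu is a made-up helper name, and this is not
something SGE does for you here:

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict process `pid` (0 = this process) to one CPU.

    Returns the resulting affinity set, or None on platforms that do
    not offer os.sched_setaffinity (it is Linux-only).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        # Pick a CPU this process is already allowed to run on.
        first = min(os.sched_getaffinity(0))
        print(pin_to_cpu(first))
```

With three jobs on two CPUs there is no pinning that gives every job a CPU
of its own, so at least two of them have to share one.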

Same on Opteron - and maybe worse, if you have dual Opterons and the memory
is attached NUMA-style to both CPUs, so memory latency may be higher if you
access memory attached to the other CPU. Does anyone have more detailed
information on this?
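
As a starting point for looking into this, the sketch below lists which CPUs
belong to which NUMA node on a Linux box. It assumes the kernel exposes the
layout under /sys/devices/system/node, which may not be the case on every
kernel or machine; parse_cpulist and numa_nodes are names made up for this
example.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a cpulist string such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(sysfs="/sys/devices/system/node"):
    """Return {node_id: [cpu, ...]}; empty if the directory is absent."""
    nodes = {}
    for path in glob.glob(os.path.join(sysfs, "node[0-9]*")):
        match = re.search(r"node(\d+)$", path)
        cpulist = os.path.join(path, "cpulist")
        if match and os.path.isfile(cpulist):
            with open(cpulist) as fh:
                nodes[int(match.group(1))] = parse_cpulist(fh.read())
    return nodes


if __name__ == "__main__":
    for node, cpus in sorted(numa_nodes().items()):
        print(f"node {node}: CPUs {cpus}")
```

This only shows the topology. Whether remote memory access actually hurts a
given job still has to be measured.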

Cheers - Reuti

