[GE users] Running 1, 2, 3, or 4 jobs on a host with 4 slots

Reuti reuti at staff.uni-marburg.de
Sat Mar 12 15:49:34 GMT 2005

Hi James,

the intention of that earlier thread was to drain a low-priority queue while
the high-priority queue is in use, hence the use of load_thresholds. You don't
need that in your setup.

So, define 4 queues, max1 to max4, with these settings (listed in that order):

slots 1
subordinate_list max2=1,max3=1,max4=1

slots 2
subordinate_list max1=1,max3=1,max4=1

slots 3
subordinate_list max1=1,max2=1,max4=1

slots 4
subordinate_list max1=1,max2=1,max3=1
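
The pattern in these four settings is regular: queue maxN gets N slots and
lists every other queue as subordinate with a threshold of 1. A minimal Python
sketch that prints the fragments, assuming the queues are named max1 to max4
(the names are implied by the subordinate lists; the helper queue_settings is
made up for this illustration and is not part of SGE):

```python
def queue_settings(n_queues=4, prefix="max"):
    """Return {queue_name: (slots, subordinate_list)} for max1..maxN.

    Hypothetical helper: it only builds the text fragments shown above
    and does not talk to Grid Engine in any way.
    """
    settings = {}
    for n in range(1, n_queues + 1):
        # Every queue except this one is subordinated with threshold 1.
        others = [f"{prefix}{m}=1" for m in range(1, n_queues + 1) if m != n]
        settings[f"{prefix}{n}"] = (n, ",".join(others))
    return settings


if __name__ == "__main__":
    for name, (slots, subs) in queue_settings().items():
        print(f"# queue {name}")
        print(f"slots {slots}")
        print(f"subordinate_list {subs}")
        print()
```

The printed lines still have to be entered into each queue's configuration by
hand; the sketch only saves typing and avoids mistakes in the lists.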

With qsub you can then submit to e.g. the max3 queue (qsub -q max3 ...). As
soon as one job runs in max3 on a machine, the other queues on this node are
blocked, so only jobs in the max3 queue are allowed there, up to its 3 slots.
This might of course leave some jobs waiting for a very long time, depending
on your usual mix of the different types of jobs.
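
To make the blocking and the waiting more concrete, here is a small Python
sketch of a single node. It is a simplified model under my own assumptions,
not how the SGE scheduler is implemented: jobs never finish, and a queue with
at least one running job blocks all other queues on the node. The names QUEUES
and dispatch are made up for this example.

```python
# Queue name -> slots, matching the four queue definitions (max1..max4).
QUEUES = {f"max{n}": n for n in range(1, 5)}


def dispatch(requests):
    """Place jobs on one node; each request is a queue name (as in qsub -q).

    Simplified model (an assumption, not SGE code): once a job runs in
    one queue, every other queue on the node is blocked, and a queue
    accepts jobs only while it has free slots. Jobs never finish here.
    Returns (running, waiting) as lists of queue names.
    """
    running, waiting = [], []
    for queue in requests:
        used = running.count(queue)
        # Blocked if any job is already running in a different queue.
        blocked = any(other != queue for other in running)
        if not blocked and used < QUEUES[queue]:
            running.append(queue)
        else:
            waiting.append(queue)
    return running, waiting


if __name__ == "__main__":
    running, waiting = dispatch(["max3", "max3", "max1", "max3", "max3"])
    print("running:", running)
    print("waiting:", waiting)
```

In this run three max3 jobs start, the max1 job waits because its queue is
blocked, and the fourth max3 job waits because max3 has only 3 slots. On a
real cluster the max1 job would have to wait until a node is free of jobs from
the other queues.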

Another (more general Linux) point: is it good to have 3 jobs on a dual-CPU
node? If you have just two jobs, the scheduler in the Linux kernel should
always place the same job on the same CPU, because there may still be some
valid data in that CPU's cache. With 4 jobs it seems to be the same. But what
does it do with three jobs?

The same applies to Opteron, and it may be worse: on dual Opterons the memory
is attached to both CPUs in a NUMA-like fashion, so memory latency may be
higher if you access memory attached to the other CPU. Does anyone have more
detailed information on this?

Cheers - Reuti

Quoting "Marconnet, James E Mr /Computer Sciences Corporation" 
<james.marconnet at smdc.army.mil>:

> Back in 2/2004 there was a thread called: Exclusive execution on a host
> with multiple slots. I read thru it, but I got totally confused.
> We are using SGE 6.something and have multiple identical Penguin nodes with
> dual-processors, hyperthreaded, and slots set to 4. So there tend to be 4
> jobs running at a time on each node.
> I'm trying to figure out how a user can easily specify, hopefully using
> just qsub, the maximum number of jobs which can be run on a node,
> including his job. There is no need to suspend existing jobs, just don't
> let too many start on that node.
> Specifically, the user may want to run a maximum of 1, 2, 3, or 4 of his
> (or someone else's) jobs on a node, depending on how much of a hurry he is
> for results, how many jobs he has to run, how many nodes are currently
> available, etc. etc.
> The term "load threshold" came up as a suggestion in that earlier thread,
> but that seems to be something que-specific, rather than something one
> controls using qsub. But I certainly could be wrong.
> Thanks for specifics if possible,
> Jim
