[GE users] Slots and CPU's
dan.templeton at sun.com
Fri Mar 5 20:10:14 GMT 2010
Welcome aboard! Is this that single-slot desktop queue, or are we
talking about one of the other queues?
In general, the way SGE tracks CPU usage is through slot usage. The
default assumption is that a single job consumes a single slot and hence
a single CPU. If a job needs to consume more than one slot (on one
machine or across multiple) it has to be submitted as a parallel job.
For example, "qsub -pe fea 8 job.sh" will submit a job that uses the
'fea' parallel environment to consume 8 slots. ("qconf -sp fea" will
tell you about that PE. If I remember correctly, we set that one up to
require that all the assigned slots be on the same node.) If you submit
your jobs that way, SGE won't let more slots be consumed than are available.
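To make the slot accounting concrete, here is a minimal sketch in Python of the bookkeeping described above, using the 16-slot machine from the question below. It is a toy model, not SGE code: the Host class, its submit method, and the qsub_command helper are hypothetical names made up for illustration. Only the "qsub -pe fea 8 job.sh" form comes from this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy model of one execution host's slot pool (hypothetical, not SGE code)."""
    total_slots: int
    running: dict = field(default_factory=dict)  # job name -> slots held

    @property
    def free_slots(self) -> int:
        # Slots not currently held by any running job.
        return self.total_slots - sum(self.running.values())

    def submit(self, job: str, slots: int = 1) -> str:
        """Start the job if enough slots are free; otherwise it stays pending."""
        if slots <= self.free_slots:
            self.running[job] = slots
            return "running"
        return "pending"


def qsub_command(pe: str, slots: int, script: str) -> list:
    """Build the argument list for a parallel submission, e.g. qsub -pe fea 8 job.sh."""
    return ["qsub", "-pe", pe, str(slots), script]


host = Host(total_slots=16)
print(host.submit("four_way", 4))       # 4 of 16 slots taken
print(host.submit("thirteen_way", 13))  # only 12 free, so this one waits
print(host.submit("twelve_way", 12))    # exactly fits the remaining 12
print(" ".join(qsub_command("fea", 8, "job.sh")))
```

The point of the sketch is that the limit is enforced on requested slots, not on measured load, which is why jobs have to ask for their slots through a parallel environment for the accounting to work.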
Now, in that single-slot desktop queue, the assumption is that only one
job will run there at a time, and the problem is that you're actually
competing with the desktop user. Because SGE has no handle on
what the user is doing beyond the CPU and memory state, the way to keep
SGE from competing with the user's processes is to set the queue's
load_thresholds low enough. That's where np_load_avg=1.0 comes in. And
actually, you probably want to set it even lower. If it's an 8-core
machine, and four cores are available to SGE, then you probably want to
set load_thresholds to np_load_avg=0.5 so that SGE won't schedule
anything there if the user is using more than 4 (half) of the cores.
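The arithmetic behind that threshold can be sketched as follows. np_load_avg is the host's load average divided by its number of processors, so 0.5 on an 8-core machine corresponds to a load average of 4. The function names here are hypothetical, and treating a value exactly at the threshold as "no new jobs" is an assumption of this sketch, not something confirmed in this thread.

```python
def np_load_avg(load_avg: float, cores: int) -> float:
    """Load average normalized by the number of processors."""
    return load_avg / cores


def accepts_new_jobs(load_avg: float, cores: int, threshold: float) -> bool:
    """True while the normalized load is below the threshold.

    Assumption: a load exactly at the threshold already blocks scheduling.
    """
    return np_load_avg(load_avg, cores) < threshold


# 8-core desktop, user busy on about 3 cores: 3/8 = 0.375, below 0.5.
print(accepts_new_jobs(3.0, 8, 0.5))
# User busy on about 5 cores: 5/8 = 0.625, above 0.5, so nothing new is scheduled.
print(accepts_new_jobs(5.0, 8, 0.5))
# The same load with np_load_avg=1.0 would still let jobs in.
print(accepts_new_jobs(5.0, 8, 1.0))
```

On a real cluster the value should be settable with something like "qconf -mattr queue load_thresholds np_load_avg=0.5 desktop.q", where desktop.q is a placeholder for the actual queue name; check the qconf man page for your version before relying on that form.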
On 03/05/10 11:40, murphygb wrote:
> Hello all. Newbie here, so forgive me if this is obvious. I've read and
> read, jumping from online manual to man pages and back again, and can't
> seem to figure out what I need to do.
>
> Imagine I have a 16 processor machine configured with 16 job slots;
> np_load_avg is set to 1. I could submit one 16 processor job or sixteen
> 1 processor jobs. Let's say someone is running a 4 processor job there.
> I would like the next job to only go to that machine if it is asking
> for 12 processors or less, and if it is asking for more than 12, to pend
> until the resources are available. If the next job only asks for 4
> processors, then it should run (now the machine is using 8 processors,
> and thus the next job will only run or pend depending on whether the
> user asks for 8 processors or less, and so on).
>
> Do I use a complex like cpu or s_cpu when issuing the qsub command? Do
> I need to define a consumable? Right now, the jobs always go there (as
> long as the load average is under 16, since np_load_avg is 1), which
> means I can have, say, three 12 processor jobs dispatched there, all
> competing for only 16 processors. Succinctly, I am looking for a 1 to 1
> correlation between available processors and what the user asks for
> when submitting. Any help or insight would be greatly appreciated.
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].