[GE users] Array jobs & job priorities
brendon.oliver at gmail.com
Tue Dec 2 23:59:23 GMT 2008
I've got a scenario here that I'm not sure how to deal with, so hopefully
someone can make some suggestions (even if what I want to be able to do just
isn't possible - just so I know for sure):
I have one queue on which two different (and unrelated) types of jobs can be
run. Call them job A & job B. They can't be run on separate queues due to
resource constraints. The queue has multiple slots, because there are other
job types that can run on the same queue in parallel. However job types A &
B must never be run at the same time on the same node, so each gets submitted
with a hard resource requirement ("-l xxx=1", where xxx is the name of a
consumable complex we have set up).
Job type A is much more important than B. So we use 'qsub -p 100 ...' when it
is submitted (job type B gets no '-p' priority when submitted).
Now job type B can often be an extremely large long-running job. We know this
in advance, so when this happens, the job gets submitted as an array job
('qsub -t 1-n ...'). There will frequently be more array tasks than there
are available machines in the cluster.
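
For concreteness, the three submission patterns described above might look
like the sketch below. The script names job_a.sh and job_b.sh and the task
range 1-10 are made up for illustration; xxx is the consumable complex
mentioned earlier. By default the helper only prints the command lines, so it
can be tried without touching the queue.

```shell
#!/bin/sh
# Sketch only. job_a.sh, job_b.sh and the range 1-10 are hypothetical.
# With QSUB unset, submit just echoes the command line;
# export QSUB=qsub to submit for real.
submit() {
    ${QSUB:-echo qsub} "$@"
}

# Type B, large: array job at the default priority (0)
submit -l xxx=1 -t 1-10 job_b.sh

# Type B, small: single job at the default priority
submit -l xxx=1 job_b.sh

# Type A: raised priority. Note that the qsub man page says ordinary users
# may only lower a job's priority; raising it needs manager/administrator rights.
submit -l xxx=1 -p 100 job_a.sh
```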
What I want / need / would like to happen: suppose an array job of type B is
currently running on the queue (e.g. there are 4 machines in the cluster,
tasks 1 through 4 are executing on those 4 boxes, and tasks 5 through 10 are
sitting in the queue as "pending"). If a job of type A is then submitted, it
should be scheduled before the next task of job type B. The type A job _must_
be executed before the remaining tasks of job B are run.
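
The desired ordering can be illustrated with a toy sort over the pending
list. This is not how the scheduler works internally, only the result being
asked for: higher '-p' value first, then submission order. The three-column
layout (submit sequence, job name, priority) and the sample rows are invented
for the example.

```shell
#!/bin/sh
# Toy illustration only, not Grid Engine's actual algorithm.
# Input columns (hypothetical): submit-sequence  job-name  priority
desired_order() {
    # priority descending, then submit sequence ascending
    sort -k3,3nr -k1,1n
}

# Pending B tasks 5 and 6, then a type A job submitted afterwards
printf '%s\n' '5 B.5 0' '6 B.6 0' '7 A 100' | desired_order
```

Here the type A row is listed first, ahead of the pending B tasks, even
though it was submitted last.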
At the moment, from the testing I've done, it seems that all tasks for job
type B are scheduled / completed before the type A job is scheduled. Now
it's quite possible that there's some queue or scheduler configuration I've
overlooked, but I've not been able to find any doco on how priorities might
work in this scenario (feel free to point me in the right direction). I did
see some mention in the docs about having to enable 'weight_priority' in the
scheduler config but no mention of where / how that is done & what other
ramifications it might have, or even if that will solve my current problem.
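
For reference, the scheduler configuration is normally displayed with
'qconf -ssconf' and edited with 'qconf -msconf', and weight_priority appears
there alongside weight_urgency and weight_ticket. Below is a minimal sketch
for checking the current values, assuming qconf is on the PATH. The
show_weights helper is just a name made up here, and whether changing these
weights fixes the ordering problem is not established.

```shell
#!/bin/sh
# show_weights is a hypothetical helper: it filters scheduler-config text
# on stdin down to the three policy weight lines.
show_weights() {
    grep -E '^weight_(priority|urgency|ticket)[[:space:]]'
}

# Only query the cluster when qconf is actually available.
if command -v qconf >/dev/null 2>&1; then
    qconf -ssconf | show_weights
else
    echo "qconf not found; run this on a Grid Engine submit or admin host" >&2
fi
```

These weights control how much the '-p' value counts relative to the urgency
and ticket policies, so it is worth noting the current values before editing
anything.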
Lastly, job type B can also be small enough that it doesn't require array
handling, in which case we only care that job type A gets scheduled before B.
If all slots are currently executing B-type jobs, that's fine. All that
matters is that when a type A job comes in, it jumps to the top of the list
and is executed on the next free slot, ahead of any other B jobs which
might be sitting in the queue waiting.
[Also, FWIW, all jobs run on this queue are always submitted / owned by the
same user ID]
Hope that all makes sense. If anyone's got any thoughts, it would be much
appreciated.