[GE users] problems with queueing and scheduling after upgrading to 6.2u5
reuti at staff.uni-marburg.de
Wed Mar 17 17:56:38 GMT 2010
On 16.03.2010 at 02:46, snosov wrote:
> After upgrading from 6.1u5 to 6.2u5 we are experiencing a whole
> slew of problems with SGE. The most notable and annoying one is
> sge_shepherd segfaulting. I wrote about it in a parallel thread
> and I need to paste some traces there.
> This time, however, I would like to discuss the queueing and
> scheduling problems.
> To overcome the lack of per-slot preemption in 6.1u5, we configured
> the following queues to use 4 slots per node:
> hight_1.q -> medium_1.q -> low_1.q
> hight_2.q -> medium_2.q -> low_2.q
> hight_3.q -> medium_3.q -> low_3.q
> hight_4.q -> medium_4.q -> low_4.q
> So, for example, medium_1.q would preempt low_1.q, and hight_1.q
> would preempt both medium_1.q and low_1.q.
> High queues had a hard wall-clock limit of 1 hour, medium queues
> 3 hours, and low queues were unlimited.
> To specify the type of queue to use, a user needed to request a
> complex "low", "medium", or "high", which could be satisfied only
> by corresponding queues.
> Also, these complexes had 1000, 2000, and 3000 urgency tickets
> respectively to push higher-priority jobs up in the scheduler.
> Everything worked fine with 6.1u5. After the upgrade, however, we
> see the following behaviour:
> - jobs get assigned to a different queue despite the requested
> complex, e.g., to low_3.q despite the "medium" complex being requested
> - those misassigned jobs are not killed when they exceed the hard
> wall-clock limit.
Yep, we also just faced this. It looks like SGE is having a hiccup,
as all misassigned jobs seem to be scheduled in one and the same
scheduling cycle. When it's over, it will schedule fine for the next
couple of hours.
I'll file an issue pointing to this thread.
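
For anyone who wants to check whether their own cluster shows the same symptom, here is a minimal sketch of the comparison described above: take each job's requested complex and the queue it actually landed in, and flag the mismatches. Everything in it is an assumption for illustration: the function name, the record layout, and the sample data are made up, and the queue-name prefixes simply follow the naming in the quoted setup (including the "hight_" spelling). It does not call any SGE command; how you collect the records is up to you.

```python
# Hypothetical mapping from the requested complex to the queue-name prefix
# that is supposed to satisfy it (names follow the quoted configuration).
EXPECTED_PREFIX = {"low": "low_", "medium": "medium_", "high": "hight_"}


def find_misassigned(jobs):
    """Return the (job_id, requested_complex, queue) records whose queue
    name does not start with the prefix expected for the requested complex."""
    misassigned = []
    for job_id, requested, queue in jobs:
        if not queue.startswith(EXPECTED_PREFIX[requested]):
            misassigned.append((job_id, requested, queue))
    return misassigned


# Made-up sample records: (job id, requested complex, assigned queue).
sample_jobs = [
    (101, "medium", "medium_1.q"),
    (102, "medium", "low_3.q"),   # the kind of mismatch reported above
    (103, "high", "hight_2.q"),
]

for job_id, requested, queue in find_misassigned(sample_jobs):
    print(f"job {job_id}: requested '{requested}' but runs in {queue}")
```

If all flagged jobs turn out to have the same start time, that would fit the observation that they were dispatched in a single scheduling cycle.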