[GE users] Slot-wise suspend on subordinate
harald.pollinger at sun.com
Mon Jul 27 16:56:45 BST 2009
On 07/27/09 14:40, Reuti wrote:
> Hi Harald,
> this is a great proposal, as it was on the mailing list quite often.
> Some things I want to add:
> " The algorithm will work like this:
> It is assumed the queues are configured before any jobs are submitted.
> For each group of queues and the superordinated queue, the sum of
> jobs (i.e. the number of occupied slots) is calculated. If the sum
> reaches <nr_of_slots>+1, a job in one of the most subordinate queues is
> suspended. Among all the most subordinate queues, the job with the shortest
> run time (wallclock time) is suspended."
> Well, for us the suspension of the job with the longest runtime would be
> more appropriate. When a job is supposed to run for two months, it doesn't
> matter whether it's 6 hrs shorter or longer. But a job which should run
> for 3 hrs shouldn't be suspended. The syntax could use the sign of the
> slot count: with slots=-8(father.1) the short jobs will be suspended first,
> with slots=+8(father.1) the longer ones will be suspended first (like the
> find command's -mtime test, which accepts positive and negative ranges).
Ok, this really makes sense. Thanks for the input!
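
To illustrate the selection rule quoted above together with the suggested
option to suspend the longest-running job instead, here is a minimal sketch.
It is not the actual scheduler code: the Job class, the queue_seq field
(higher value = more subordinate queue) and the longest_first flag are
hypothetical names chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    job_id: int
    queue_seq: int          # hypothetical: higher value = more subordinate queue
    runtime_s: int          # wallclock time the job has run so far, in seconds
    slots: int = 1          # slots occupied by this job
    suspended: bool = False


def pick_victim(jobs: List[Job], nr_of_slots: int,
                longest_first: bool = False) -> Optional[Job]:
    """Return the job to suspend, or None if the slot limit is not exceeded."""
    running = [j for j in jobs if not j.suspended]
    used = sum(j.slots for j in running)
    # Nothing to do until the sum reaches nr_of_slots + 1.
    if used <= nr_of_slots:
        return None
    # Only jobs in the most subordinate queue(s) are candidates.
    most_subordinate = max(j.queue_seq for j in running)
    candidates = [j for j in running if j.queue_seq == most_subordinate]
    # Shortest run time first by default; longest first on request.
    if longest_first:
        return max(candidates, key=lambda j: j.runtime_s)
    return min(candidates, key=lambda j: j.runtime_s)
```

With three running single-slot jobs and nr_of_slots=2, the default picks the
youngest job in the most subordinate queue, while longest_first=True picks
the oldest one in that queue.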
> Will the wallclock time of the suspended jobs be adjusted? AFAIK this
> isn't done for now, and this is one of the reasons not to use
By "adjusted", do you mean that the wallclock time counter should be halted
while the job is suspended?
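
To make the question concrete, a halted counter would roughly amount to the
following sketch. It only illustrates the idea and says nothing about how
Grid Engine accounts for time internally; the function name and the list of
(begin, end) suspension intervals are assumptions made for this example, and
the intervals are assumed not to overlap.

```python
from typing import List, Tuple


def effective_wallclock(start: float, now: float,
                        suspensions: List[Tuple[float, float]]) -> float:
    """Elapsed run time in seconds with suspended intervals subtracted."""
    suspended = 0.0
    for begin, end in suspensions:
        # Clip each suspension interval to the window [start, now].
        lo = max(begin, start)
        hi = min(end, now)
        if hi > lo:
            suspended += hi - lo
    return (now - start) - suspended
```

For example, a job started at t=0, inspected at t=100 and suspended from
t=10 to t=30 would have an adjusted wallclock time of 80 seconds instead of
100.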
> But OTOH this adjustment will also affect reservations.
> Maybe it should be implemented with
> http://gridengine.sunsource.net/issues/show_bug.cgi?id=2087 to specify a
> non-enforced run time which could be honored?
Yes, this would make sense. But we want to keep the changes small - at
least for the first version. Nevertheless I will add this to the spec.