[GE users] (another) slotwise preemption question

spow_ miomax_ at hotmail.com
Fri Aug 27 09:51:50 BST 2010


Hi,

reuti wrote:
> <snip>
>> So, question is, why is SGE trying to push a 5th job onto
>> a machine that has only 4 slots, and all 4 are "busy" ? And, is
>> there a way around this ?
>>
>
> What about using a checkpointing environment for the jobs in the secondary queue, where the suspension of the job will kill and requeue it (check-transparent will do already). You wouldn't need any special script like the one you used for the suspension right now.
>
Could you explain this further? I am also using a co-scheduler that runs qmod -rj on
jobs that have 'S' in their state, which means their slots were preempted, and I am
also concerned about the example the OP gave.
Does the check-transparent environment automatically requeue jobs that get
suspended?
Can it be used _without_ any end-user code/script modification (i.e. by just
setting parameters in SGE)?
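
For reference, the kind of co-scheduler mentioned above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: it assumes the default plain-text qstat listing, where the job state is the fifth whitespace-separated column, and the function names (suspended_job_ids, reschedule) are hypothetical, not part of SGE. Only the qmod -rj call itself comes from the setup described here.

```python
import subprocess


def suspended_job_ids(qstat_output):
    """Return the IDs of jobs whose state column contains 'S'.

    Assumes the default plain-text qstat layout, in which the
    state is the fifth whitespace-separated field of a job line.
    """
    ids = []
    for line in qstat_output.splitlines():
        fields = line.split()
        # Job lines start with a numeric job ID; the header and
        # the dashed separator line do not, so they are skipped.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        state = fields[4]
        if "S" in state and fields[0] not in ids:
            ids.append(fields[0])
    return ids


def reschedule(job_ids):
    """Run 'qmod -rj <id>' for each job (hypothetical helper)."""
    for job_id in job_ids:
        subprocess.run(["qmod", "-rj", job_id], check=True)
```

A cron job or loop would feed the output of qstat into suspended_job_ids() and pass the result to reschedule(). Whether the rescheduled jobs land somewhere useful still depends on the scheduler, which is exactly the concern raised in this thread.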

> Well, although the black hole is gone this way, one job is oscillating all the time when a checkpointing environment is used between "SR" and "Rq" states (with the schedule_interval period).
>
>
Thanks,
GQ


