[GE users] qlicserver behavior for suspended jobs

gutnik gutnik at gmail.com
Fri Nov 20 15:36:52 GMT 2009


On Fri, Nov 20, 2009 at 7:24 AM, olesen
<Mark.Olesen at emcontechnologies.com> wrote:
>> That said, this feels like a hack, even if it would work. I'd be happy
>> to find another solution, but I don't see one for the low-priority jobs.
>> Do you?
>
> Are the low-priority jobs themselves suspended (i.e., with SIGTSTP, but
> still residing in memory) or is the suspension more like a checkpoint
> (i.e., results written and application ended)?
>
> If it's the second, could you just restart/resubmit the stopped jobs?

Well, I'd like them to be suspended with SIGTSTP, in which case they release
their licenses. Checkpointing _might_ be possible, but even if it is, it is
much less appealing: jobs are often started from within scripts that would
have to be modified to know to start from a saved checkpoint, and we'd need
some sort of system to make sure that only the _right_ checkpoint is taken,
rather than an earlier one from some job that crashed after it was restarted.
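
To illustrate the kind of bookkeeping that picking the _right_ checkpoint
would need, here is a minimal Python sketch. Everything in it is an
assumption made for illustration: the Checkpoint record, its job_id,
sequence and complete fields, and the pick_checkpoint helper are all
hypothetical, not something Grid Engine or qlicserver provides.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical record describing one saved checkpoint."""
    job_id: str     # assumed: id of the job that wrote the checkpoint
    sequence: int   # assumed: counter that grows with each checkpoint written
    complete: bool  # assumed: False if the writer crashed before finishing


def pick_checkpoint(checkpoints: Iterable[Checkpoint],
                    job_id: str) -> Optional[Checkpoint]:
    """Return the newest complete checkpoint written by job_id, or None."""
    # Ignore checkpoints from other jobs and ones left half-written by a crash.
    candidates = [c for c in checkpoints
                  if c.job_id == job_id and c.complete]
    # Among the remaining ones, the highest sequence number is the newest.
    return max(candidates, key=lambda c: c.sequence, default=None)
```

Even this simple version needs every checkpoint to be tagged with its job
and marked as complete, which is bookkeeping that the wrapper scripts
would have to do on top of knowing how to restart from a checkpoint.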

       Vadim

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=228243
