[GE users] SGE Rescheduling

Gruhn Daniel J Contractor AF/A9IT Daniel.Gruhn.ctr at pentagon.af.mil
Thu Aug 3 13:05:16 BST 2006


One additional thing: I don't think the bug with rescheduling is fixed yet.
The bug is that rescheduling seems to be an asynchronous process. That is,
the rescheduled job may be able to start before the original job is
killed. In my case this makes a difference and I have to compensate for it.

Dan

//SIGNED//
Daniel J.Gruhn, CTR (Group W Inc.)
HQ USAF/A9IT
Studies & Analyses, Assessments and Lessons Learned


-----Original Message-----
From: Reuti [mailto:reuti at staff.uni-marburg.de] 
Sent: Thursday, August 03, 2006 7:33 AM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] SGE Rescheduling

Hi,

On 02.08.2006 at 23:14, Sreenath Nampally wrote:

> Hello,
>
> Could someone explain the sequence of events that happens in SGE (both
> on qmaster and exec host) when a job is rescheduled and suspended?
> What signals are sent to the job?

If the job gets suspended, it will get a SIGSTOP, which you can't catch. But
you could submit the job with -notify to get a warning beforehand, which you
can catch. Have a look at `man qsub`; you could even redefine the
signal, see `man sge_conf`, section execd_params. But be aware that the signal
will be sent to the whole process group, and this might need proper handling
in the jobscript and the compiled program.
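
As a rough sketch of catching the -notify warning, assuming the job itself is
a Python program and the signals have not been redefined in execd_params (by
default `man qsub` describes SIGUSR1 arriving ahead of SIGSTOP and SIGUSR2
ahead of SIGKILL). The names NOTIFY_SIGNALS, received, on_notify and
install_handlers are made up for this illustration:

```python
import signal

# Assumption: default -notify behaviour, i.e. SIGUSR1 warns of a pending
# SIGSTOP (suspend) and SIGUSR2 warns of a pending SIGKILL.
NOTIFY_SIGNALS = {
    signal.SIGUSR1: "suspend",
    signal.SIGUSR2: "kill",
}

# Warnings seen so far; the main loop of the job can poll this list.
received = []


def on_notify(signum, frame):
    # Keep the handler small: only record which warning arrived.
    received.append(NOTIFY_SIGNALS[signum])


def install_handlers():
    # Register the same handler for both warning signals.
    for sig in NOTIFY_SIGNALS:
        signal.signal(sig, on_notify)
```

The job would call install_handlers() once at startup and check `received`
at convenient points, e.g. to flush its state when "kill" shows up. Since the
signal goes to the whole process group, any child processes need their own
handling as well.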

If you reschedule a job, it will be killed, and before this you can also
get a warning via -notify. But I think you will only be told about the
kill, not the reason, i.e. that the job will be rescheduled. Only during
the next run can you test whether the variable RESTARTED is 1. If you
need more sophisticated handling, you can also try to use the
checkpointing interface.
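
A minimal sketch of the RESTARTED test, again assuming a Python job; the
helper name was_restarted is hypothetical, and treating a missing variable
as "not restarted" is my assumption for runs outside of SGE:

```python
import os


def was_restarted(environ=os.environ):
    # RESTARTED is 1 when the job is being run again; a missing variable
    # is treated as "not restarted" (assumption, e.g. when run by hand).
    return environ.get("RESTARTED", "0") == "1"
```

At the start of the job, `if was_restarted(): ...` could then clean up or
resume from whatever the previous run left behind.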

HTH - Reuti

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net
