[GE users] Spectre checkpoint

reuti reuti at staff.uni-marburg.de
Tue Sep 8 14:11:36 BST 2009


On 08.09.2009 at 14:36, veerendra_n wrote:

> Hi Reuti
>
> Thanks for the response.
>
> My requirement is that when I reschedule a Spectre job running on
> host x, it should resume on host y.

I answered this yesterday.


> What configuration needs to be in place to achieve that? If a
> checkpoint configuration is the answer, how do I go about it?

I still don't get it: you already have a working checkpointing facility
just by setting up the suspend_method and resume_method? Suspended
jobs stay on the same machine and will continue at a later point
in time on that machine.
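
To illustrate the difference: SIGTSTP and SIGCONT only stop and continue a process in place, so its state never leaves the memory of that host. For a job to continue on another host, it has to write its state to a file both hosts can read, and pick it up from there when it is restarted. Below is a minimal Python sketch of that idea, assuming a directory shared between the hosts. The function run_job, the file name and the JSON format are hypothetical, made up for illustration; this is not how Spectre writes its checkpoints, and it is not a Grid Engine interface.

```python
import json
import os
import tempfile


def run_job(state_path, total_steps, stop_after=None):
    """Sum the integers 0..total_steps-1, saving progress after every step.

    If state_path already exists, the job resumes from it instead of
    starting over. stop_after is a hypothetical knob that simulates the
    job being taken off a host part-way through.
    """
    state = {"step": 0, "acc": 0}
    if os.path.exists(state_path):
        with open(state_path) as fh:
            state = json.load(fh)

    done_here = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_here >= stop_after:
            return None  # interrupted; the progress so far is in the file
        state["acc"] += state["step"]
        state["step"] += 1
        done_here += 1
        # Write to a temporary file and rename it, so an interruption
        # never leaves a half-written checkpoint behind.
        tmp_path = state_path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump(state, fh)
        os.replace(tmp_path, state_path)
    return state["acc"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as shared_dir:
        ckpt = os.path.join(shared_dir, "job.ckpt")
        run_job(ckpt, 10, stop_after=4)  # "host x": stopped after 4 steps
        print(run_job(ckpt, 10))         # "host y": continues from step 4
```

The second call does not start from step 0 because everything it needs is in the file, not in the memory of the first process. With only a suspend and resume signal there is no such file, so a job started on another host can only begin from scratch.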

-- Reuti


> Regards
> Veeru!
>
> -----Original Message-----
> From: reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Tuesday, September 08, 2009 5:32 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Spectre checkpoint
>
> Hi,
>
> On 08.09.2009 at 12:01, veerendra_n wrote:
>
>> I'm trying to configure checkpointing for a Spectre job. I pass
>> SIGTSTP and SIGCONT in the execution method, and it works very
>> well when the job is rescheduled on the same host.
>>
>> However, a problem arises when the rescheduled job resumes on a
>> different host from the one where it started: it restarts from the
>> beginning instead of resuming. Right now we have only configured the
>> execution method in the queue configuration (suspend method SIGTSTP,
>> resume method SIGCONT).
>>
>> How should I configure checkpointing?
>
> Does the job quit by itself after writing the checkpoint file when it
> receives SIGTSTP? If you only defined the suspend and resume methods,
> the job stays on the node and won't get rescheduled at all. Therefore
> I don't understand your question in detail.
>
> -- Reuti
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=216398
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=216402

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=216407



