[GE users] About checkpointing and job migration

Lip Kian lkng at eblackprint.com
Wed Jan 4 12:59:45 GMT 2006



Sorry, got myself confused. :p

The difference between the various interfaces lies in which commands are
used at the different stages of a checkpointable job. The illustration
in the document should make this clearer.

> Hi,
>
>> Hi, I have 2 questions:
>>
>> 1.
>> I have read some references, but I'm still confused about what the
>> differences are between user-level and kernel-level checkpointing. If you
>> know the answer, please explain it "with some examples". Thank you!!
>
> You really shouldn't treat kernel-level and user-level checkpointing as
> two vastly different things.
>
> They differ only in how a migrated job is restarted. The kernel-level
> checkpoint mechanism executes the restart script that is specified; the
> user-level mechanism does NOT. Instead, on a migration restart at the
> user level, the submitted script is re-executed. Hence, to tell whether
> the submission script is being restarted or started for the first time,
> the script tests the environment variable $RESTARTED.
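
To make the restart test concrete, here is a minimal sketch of a user-level
job script. It assumes the variable is the one qsub(1) documents as RESTARTED
(1 after a restart or migration, 0 otherwise). The state file name and the
start_mode helper are hypothetical, invented only for illustration; a real job
would launch the application where the echo lines are.

```shell
#!/bin/sh
# Sketch only: STATE_FILE and start_mode are hypothetical names, not part of Grid Engine.
# With user-level checkpointing this whole script is re-executed after a migration,
# so it has to decide for itself whether to resume or to start from scratch.

STATE_FILE="${STATE_FILE:-./job.state}"   # assumed checkpoint file written by the application

start_mode() {
    # Print "resume" only if Grid Engine flagged a restart AND a checkpoint file exists;
    # otherwise print "fresh".
    if [ "${RESTARTED:-0}" = "1" ] && [ -f "$STATE_FILE" ]; then
        echo "resume"
    else
        echo "fresh"
    fi
}

case "$(start_mode)" in
    resume) echo "restart after migration: would resume from $STATE_FILE" ;;
    fresh)  echo "initial start: would begin from scratch" ;;
esac
```

Checking for the state file as well as the variable is a defensive choice: if
the job is rescheduled before the first checkpoint was written, the script
falls back to a fresh start instead of failing.
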
>
> Take a look at the diagram on page 5 of
> http://gridengine.sunsource.net/howto/APSTC-TB-2004-005.pdf
>
> In short, if you don't specify a restart script for the kernel-level
> checkpoint, it's really no different from the user-level checkpoint.
> However, the converse doesn't hold: if you specify a restart script for
> the user-level checkpoint, it is NEVER used during a migration restart.
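
For orientation, a checkpoint object as shown by "qconf -sckpt" has roughly
the shape below. The field names follow checkpoint(5) as I remember it; the
object name and every path are placeholders, not values from a real
installation, and the leading comment line is for the reader only.

```
# illustrative only: object name and all paths are placeholders
ckpt_name          my_ckpt
interface          cray-ckpt
ckpt_command       /path/to/checkpoint_cmd
migr_command       /path/to/migrate_cmd
restart_command    /path/to/restart_cmd
clean_command      /path/to/clean_cmd
ckpt_dir           /tmp
signal             none
when               sx
```

The restart_command line is the one the point above is about: with a
kernel-level interface it is run on a migration restart, while with a
user-level interface it is ignored and the job script is re-executed instead.
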
>
>>
>> 2.
>> I tried to test job migration with checkpointing. I found that the
>> migrated job always restarts from the beginning instead of continuing
>> execution. I use a cray checkpoint object.
>
> What is your checkpoint tool? Note that N1GE/GE/SGE does NOT provide any
> checkpoint tools. It only provides mechanisms to integrate 3rd party
> tools.
>
> Cheers,
>
> Lip Kian
>
>>
>> Thank you.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>






