[GE users] About checkpointing and job migration

Lip Kian lkng at eblackprint.com
Wed Jan 4 12:53:16 GMT 2006


Hi,

> Hi, I have 2 questions:
>
> 1.
> I have read some references, but I'm still confused about what the
> differences are between user-level and kernel-level checkpointing. If you
> know the answers, please tell me "with some examples". Thank you!!

You really shouldn't treat kernel-level and user-level checkpointing as two
vastly different things.

The only difference is in how a migrated job is restarted. With the
kernel-level checkpoint mechanism, the restart script that is specified is
executed. With the user-level checkpoint mechanism it is NOT; instead, the
submitted script is re-executed. Hence, to tell whether the submission
script is being restarted or started for the first time, test the
environment variable $RESTART inside the script (the qsub(1) man page lists
this variable as RESTARTED).
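
As a rough illustration of that test, here is a minimal sketch of a
submission script for a user-level checkpoint. It assumes the variable is
the one qsub(1) documents as RESTARTED (1 when the job was restarted, 0
otherwise); the function name start_mode and the echoed messages are made
up for the example, and the real resume logic depends on your checkpoint
tool.

```shell
#!/bin/sh
# Sketch only: decide between a first start and a migration restart.
# Assumption: Grid Engine sets RESTARTED=1 on a restart (see qsub(1)).
# start_mode is a hypothetical helper, not part of Grid Engine.

start_mode() {
    # Print "resume" when a restart is reported, "initial" otherwise.
    # An unset variable is treated like 0, i.e. a first start.
    if [ "${RESTARTED:-0}" = "1" ]; then
        echo resume
    else
        echo initial
    fi
}

case "$(start_mode)" in
    resume)
        # Placeholder: reload state from the last checkpoint here.
        echo "restarted after migration: resuming from checkpoint"
        ;;
    initial)
        # Placeholder: start the computation from scratch here.
        echo "first start: beginning from scratch"
        ;;
esac
```

Without such a test, a re-executed script simply starts the computation
over from the beginning every time the job migrates.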

Take a look at the diagram on page 5 of
http://gridengine.sunsource.net/howto/APSTC-TB-2004-005.pdf

In short, if you don't specify a restart script for the kernel-level
checkpoint, it is really no different from the user-level checkpoint.
However, the converse doesn't hold: if you specify a restart script for the
user-level checkpoint, it is NEVER used during a migration restart.
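
The rule above can be written down as a tiny decision function. This is
only a sketch of the behaviour described in this message, not Grid Engine
code: restart_action is a hypothetical name, the "kernel"/"user" labels
stand in for whatever interface your checkpoint environment uses, NONE is
assumed to mean "no restart script configured", and /path/to/restart.sh is
a placeholder.

```shell
#!/bin/sh
# Sketch only: which script runs when a migrated job is restarted.
# $1: "kernel" or "user" (kind of checkpoint mechanism)
# $2: configured restart script, or NONE / empty if none is set
restart_action() {
    kind=$1
    restart_script=${2:-NONE}
    if [ "$kind" = "kernel" ] && [ "$restart_script" != "NONE" ]; then
        echo "run restart script: $restart_script"
    else
        # User level always ends up here; kernel level does too
        # when no restart script has been specified.
        echo "re-run submitted job script"
    fi
}

restart_action kernel /path/to/restart.sh
restart_action kernel NONE
restart_action user /path/to/restart.sh
```

Only the first call selects the restart script; the other two fall back to
re-running the submitted job script.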

>
> 2.
> I tried to test job migration with checkpointing. I found that the
> migrated job always restarts execution instead of continuing. I use the
> cray checkpoint object.

What is your checkpoint tool? Note that N1GE/GE/SGE does NOT provide any
checkpoint tools. It only provides mechanisms to integrate 3rd party
tools.

Cheers,

Lip Kian

>
> Thank you.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>






