[GE users] checkpointing and SGE

Chris Dagdigian dag at sonsorol.org
Sun Jun 24 19:29:42 BST 2007

Hi Jerry,

Grid Engine can't magically checkpoint your application for migration
to another node -- all it really does is cooperate with applications
or operating systems that are themselves checkpoint-aware.

Either the code itself needs to be able to checkpoint locally or you
need to be running Grid Engine on an operating system that can do
system-level checkpointing. To my knowledge, Linux with the standard
Linux kernel does not have this sort of capability. I could not tell
from your messages which OS and kernel you are talking about.

Most people I know who seriously use checkpointing in production  
environments are doing it at the application level these days.
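
To illustrate what "application level" means here, a minimal Python
sketch follows. It is only an illustration under stated assumptions,
not anything Grid Engine provides: the function names
(save_checkpoint, load_checkpoint, run), the JSON state layout and the
stop_after parameter (which just simulates the job being killed) are
all hypothetical. It also assumes the checkpoint file lives on a
filesystem that every execution host can see, so that a restart on a
different node finds it.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a kill mid-write never leaves a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the new file into place in a single step.
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_step": 0, "total": 0}


def run(path, steps, every=10, stop_after=None):
    """Sum the integers 0..steps-1, checkpointing every `every` steps.

    stop_after is a hypothetical knob that simulates the job being
    killed after that many steps in this invocation.
    """
    state = load_checkpoint(path)  # resume from wherever we left off
    done = 0
    while state["next_step"] < steps:
        if stop_after is not None and done >= stop_after:
            return None  # simulated kill: work since last checkpoint is lost
        state["total"] += state["next_step"]
        state["next_step"] += 1
        done += 1
        if state["next_step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

The point of the design is that the job itself decides what state
matters and when it is safe to save it, so a restarted job (on the
same node or another one) only repeats the work done since the last
checkpoint.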


On Jun 24, 2007, at 5:49 AM, Jerry Mersel wrote:

> In addition, does the kernel have to be the same across all the nodes?
> It seems that the "N1GE6 Checkpointing and Berkeley lab
> Checkpoint/Restart" doc
> contradicts itself on whether a process can migrate across nodes.
>                                                           Regards,
>                                                               Jerry
> Jerry Mersel wrote:
>> Hi:
>>  I have to checkpoint a process and then restart the process on  
>> another node.
>>  I also have to use kernel checkpointing because I don't always  
>> have access to
>>  the code that is being run.
>>  I read the documentation, N1GE6 Checkpointing and Berkeley lab  
>> Checkpoint/Restart
>>  and it seemed to say that the checkpointed process can't
>> migrate to other nodes.
>>  Am I reading this correctly? Can someone recommend another method?
>> Regards,
>>   Jerry

Chris Dagdigian  <dag at sonsorol.org>
Current coordinates: Boston-area, USA
GPS: http://bioteam.net/dagbin/gps?42.385693+N+71.115535+W
