[GE users] How important is checkpointing support?

agay agay at cc.huji.ac.il
Thu May 14 15:11:38 BST 2009

SGE supports external checkpointing solutions. These let SGE restart or migrate jobs and processes, which limits the damage caused by host failures and helps with load balancing.

Do you find checkpointing useful in your work? Does it work well with all your programs? Is it reliable?

I understand that checkpointing is a non-trivial technology that is now under active development on UNIX. Do you know of any practical solution for Windows?

What solution would you suggest for a cluster that is used interactively during the day, maybe with occasional reboots?
