[GE users] Help: Checkpoint Problem

Lee Amy openlinuxsource at gmail.com
Thu Apr 3 15:50:28 BST 2008



2008/4/3, Reuti <reuti at staff.uni-marburg.de>:
>
> On 01.04.2008 at 15:00, Lee Amy wrote:
>
> > 2008/4/1, Reuti <reuti at staff.uni-marburg.de>:
> >
> > Hi,
> >
> > On 01.04.2008 at 13:51, Lee Amy wrote:
> >
> > > Hello,
> > >
> > > I use MPICH 1.2.6 on my cluster and I want to set up checkpointing
> > > through SGE. Is there a way to set it up step by step?
> >
> >
> > AFAIK it's not possible with MPICH unless you program it on your own
> > at an application level. LAM/MPI has it built-in since 7.1, and for
> > Open MPI it's scheduled for the upcoming 1.3 release.
> >
> > There are Howtos for checkpointing of serial applications with SGE,
> > but be aware that SGE will only, let's say, trigger an already
> > existing checkpointing facility of the application (which already has
> > to work without SGE).
> >
> > -- Reuti
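
To illustrate the point that the application must bring its own checkpointing facility, which an outside party merely triggers, here is a minimal Python sketch for a serial program. It assumes the trigger arrives as SIGUSR1. The signal choice, the file name state.json and all function names are hypothetical and are not taken from SGE or from this thread.

```python
import json
import os
import signal

CKPT_FILE = "state.json"  # hypothetical checkpoint file name

checkpoint_requested = False


def request_checkpoint(signum, frame):
    """Signal handler: only set a flag; the main loop does the actual write."""
    global checkpoint_requested
    checkpoint_requested = True


def save_state(state, path=CKPT_FILE):
    """Write to a temporary file first, then rename it into place,
    so an interrupted write never leaves a half-written checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run(total_steps, path=CKPT_FILE):
    """Toy computation that checkpoints only when asked to."""
    global checkpoint_requested
    state = {"step": 0, "total": 0}
    while state["step"] < total_steps:
        state["total"] += state["step"]  # stand-in for the real work
        state["step"] += 1
        if checkpoint_requested:
            save_state(state, path)
            checkpoint_requested = False
    return state


if __name__ == "__main__":
    # Assumption: the external trigger is SIGUSR1.
    signal.signal(signal.SIGUSR1, request_checkpoint)
    print(run(1_000_000))
```

The program works the same way with or without a queuing system: sending the signal by hand (for example with kill -USR1) is enough to test it before any integration is attempted.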
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> >
> > Thanks for your reply. Could you explain in more detail what you mean by
> > 'at an application level'? And if I can only use MPICH 1.2.6, what can I
> > do about checkpointing with it?
> >
>
> How to write checkpointable applications is far beyond the scope of this
> list. If your program is checkpointable without a queuing system, then we can
> start to integrate it into SGE. In addition, it would be good if your
> application could be moved between different sets of nodes, so take care
> where and how you store any temporary information and avoid recording
> node-specific dependencies.
>
> You may try Google with keywords like "Checkpointing Strategy  Parallel".
>
> -- Reuti
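
The advice about being movable between nodes can be sketched as follows, again in Python and only as an illustration under stated assumptions. The environment variable MYAPP_CKPT_DIR, the job name and the state layout are all hypothetical. The idea is that the checkpoint lives in a directory every node can reach, that its name depends only on the job, and that nothing host-specific is stored in it.

```python
import json
import os

# Hypothetical variable naming a directory visible from every node
# (for example on a shared file system); falls back to the current directory.
CKPT_DIR = os.environ.get("MYAPP_CKPT_DIR", ".")


def ckpt_path(job_name, ckpt_dir=CKPT_DIR):
    # The name depends only on the job, never on the host it ran on.
    return os.path.join(ckpt_dir, job_name + ".ckpt")


def load_or_init(job_name, ckpt_dir=CKPT_DIR):
    """Resume from an existing checkpoint, or start from scratch."""
    path = ckpt_path(job_name, ckpt_dir)
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"iteration": 0, "value": 0}


def save(job_name, state, ckpt_dir=CKPT_DIR):
    """Write to a temporary file and rename, so a crash cannot corrupt it."""
    path = ckpt_path(job_name, ckpt_dir)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(job_name, iterations, every=10, ckpt_dir=CKPT_DIR):
    """Toy loop that checkpoints every `every` iterations."""
    state = load_or_init(job_name, ckpt_dir)
    while state["iteration"] < iterations:
        state["value"] += state["iteration"]  # stand-in for the real work
        state["iteration"] += 1
        if state["iteration"] % every == 0:
            save(job_name, state, ckpt_dir)
    return state


if __name__ == "__main__":
    print(run("demo", 100))
```

Because the state holds only the iteration counter and the result so far, a run that is stopped on one node and started again on another picks up from the last saved iteration and reaches the same result as an uninterrupted run.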
>
>
> Thank you very much. That's quite clear.


