[GE users] checkpointing and SGE

Dave Love david.love at manchester.ac.uk
Tue Jun 26 11:37:00 BST 2007


Jerry Mersel <jerry.mersel at weizmann.ac.il> writes:

> I read the documentation, N1GE6 Checkpointing and Berkeley Lab
> Checkpoint/Restart, and it seemed to say that the checkpointed
> process can't migrate to other nodes.
> Am I reading this correctly? Can someone recommend another method?

The LAM-MPI integration migrated MPI processes between processors when
I had a brief play with it on Fedora (using the OSCAR-packaged version
with identical kernels on each node).

I was going to ask about BLCR integration myself.  A couple of
limitations in both BLCR and SGE have been removed since the HOWTO was
written.  Does anyone know if there is a more recent recipe around for
running it with SGE on Linux?




