[GE users] checkpointing with blcr

Jerry Mersel jerry.mersel at weizmann.ac.il
Tue Dec 11 13:31:26 GMT 2007


Hi:

  I have managed to checkpoint and restart an application successfully, with
migration.
  But the restart fails if the PID is already in use on the machine that the
process migrated to.

  What I want is for the job to wait in its queue until the PID becomes free.
  I simulated a situation where the PID is in use. When the batch script finds
that the PID is in use, it calls
  qalter -q $QUEUE $JOB_ID
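
For reference, the check-then-qalter step described above could look roughly like the sketch below. It is only an illustration under assumptions: the function names are hypothetical, the PID test relies on Linux's /proc (BLCR is Linux-only), and the saved PID is assumed to be passed in as the first argument. The qalter call is the same one quoted above; nothing here changes the fact that it did not keep the job alive.

```shell
#!/bin/sh
# Sketch only; pid_in_use and requeue_if_pid_busy are hypothetical helper names.

# Succeeds if the given PID is currently taken on this host.
# Assumes a Linux /proc filesystem.
pid_in_use() {
    [ -n "$1" ] && [ -d "/proc/$1" ]
}

# If the checkpointed PID is busy, issue the qalter call from the post and
# return non-zero so the caller skips the restart; otherwise return zero.
# QUEUE is assumed to be set by the script; JOB_ID is set by Grid Engine.
requeue_if_pid_busy() {
    if pid_in_use "$1"; then
        qalter -q "$QUEUE" "$JOB_ID"
        return 1
    fi
    return 0
}
```

A batch script would call requeue_if_pid_busy with the PID recorded at checkpoint time and only go on to the restart when it returns zero.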

But it didn't work. The job was just killed.

Any ideas?

                               Regards,
                                 Jerry

P.S. I use BLCR and application_level checkpointing as in the how-to.
