[GE users] lam and blcr checkpointing

Jerry Mersel jerry.mersel at weizmann.ac.il
Wed Nov 19 14:16:13 GMT 2008


Hi Reuti:

  Sorry, I was very unclear in my last email. I'll try to do better here.

  First, thanks for your response. Second, I am familiar with that
  very well-written and helpful HOWTO.

  I succeeded in getting checkpointing and GE working together for
  serial applications.

  I now need to do the same for parallel applications.
  Since I am using BLCR for checkpointing, and LAM has integrated
  BLCR support, I decided to try LAM (7.1.4).

  I managed to get things working from the command line, but when I
  take the checkpoint from GE, the resulting checkpoint files can't
  restart the application, neither from GE nor from the command line.

  I am getting this kernel message:  kernel: Skipping a socket
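
  In case it helps anyone compare logs, here is a minimal sketch of how
  the BLCR-related kernel messages could be filtered out of a syslog
  file. The function name find_blcr_messages and the pattern are my own
  assumptions, based only on the two message types seen in this thread;
  other BLCR messages may exist that this does not match.

```python
import re

# Assumed pattern: covers only the two kernel messages seen in this thread.
BLCR_PATTERN = re.compile(r"kernel: (Retry on -CR_\w+|Skipping a socket)")


def find_blcr_messages(lines):
    """Return (timestamp and host, message) pairs for matching syslog lines."""
    hits = []
    for line in lines:
        match = BLCR_PATTERN.search(line)
        if match:
            # Everything before "kernel:" is the syslog timestamp and host.
            prefix = line[:match.start()].strip()
            hits.append((prefix, match.group(1)))
    return hits


# The first two lines are taken from this thread; the third is a made-up
# unrelated line, included only to show that non-matching lines are skipped.
sample = [
    "Nov 19 11:34:09 hezi-1 kernel: Retry on -CR_ENOSUPPORT",
    "Nov 19 11:34:19 hezi-1 kernel: Skipping a socket.",
    "Nov 19 11:34:20 hezi-1 example-daemon: unrelated message",
]

for when, message in find_blcr_messages(sample):
    print(when, "|", message)
```

  To run it against a real log, pass an open file instead of the sample
  list, for example the /var/log/messages file on the execution host.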

  I'd appreciate any ideas.

                               Regards,
                                 Jerry





> Hi,
>
> On 19.11.2008 at 11:05, Jerry Mersel wrote:
>
>> Hi:
>>
>>   I got lam, with tight_integration, working with GE. I also have it
>>   working with blcr checkpointing outside of GE. Using GE however the
>>   checkpointing does not work properly.
>>
>>   I see in /var/log/messages:
>>
>> Nov 19 11:34:09 hezi-1 kernel: Retry on -CR_ENOSUPPORT
>> Nov 19 11:34:19 hezi-1 kernel: Skipping a socket.
>>
>>
>> I'm using lam 7.1.4, blcr 0.6.4 and GE 6.1U4.
>
> Did you follow the Howto at
> http://gridengine.sunsource.net/howto/APSTC-TB-2004-005.pdf ?
>
> -- Reuti
>
>> The checkpointed files appear not to be good.
>>
>> Has anyone succeeded with this?
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=89048
>>
>> To unsubscribe from this discussion, e-mail:
>> [users-unsubscribe at gridengine.sunsource.net].
>>



