[GE users] execd disappearing on Opteron/SuSE 9.0

Boone J. Severson severson at cray.com
Fri Apr 23 19:39:36 BST 2004


Nothing was really happening on the machines. The last two execd crashes
were on machines where one user (a different user each time) had an
INTERACTIVE session open, with CPU load near 0. It has never happened
while one or more processors held an active job. In the messages
spool there's mention of a shepherd terminating with exit status 28, but
that was many days earlier.

John Hearns wrote:

>On Fri, 2004-04-23 at 16:44, Boone J. Severson wrote:
>
>>Greetings,
>>
>>We're seeing execd crashes on a couple of our Opteron/SuSE 9.0 grid 
>>machines. Mailing list searches came up with a similar case from last 
>>summer,
>>
>I don't see problems like this with an Opteron/SuSE 9 cluster.
>A quick look at the version shows it's 5.3p3.
>
>Any ideas on what provokes the crashes?
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>For additional commands, e-mail: users-help at gridengine.sunsource.net
