[GE users] SGE6.2u5 - sge_execd on windows node not responding after period of time

jching jching at bbn.com
Tue Oct 26 22:03:27 BST 2010

I have a strange problem with SGE 6.2u5. A Windows node works fine at first, but after a random interval (anywhere from a few hours to a few days) it goes into the au state, even though sge_execd and qloadsensor.exe are both still running on the machine. If I start sge_execd again on the Windows node, which leaves duplicate sge_execd and qloadsensor.exe processes running, the au state goes away. Any thoughts on what might cause this, and how I can debug it on the client side?

I have checked all the logs, on both the qmaster and the client, and found nothing suspicious.

Thanks for any direction!

