[GE users] sgemaster fails to start

Arne Brutschy arne.brutschy at ulb.ac.be
Thu Jun 12 13:51:08 BST 2008


On Thu, 2008-06-12 at 08:44 -0400, Chris Dagdigian wrote:
> I seem to recall that the GDI messages passed via internal SGE  
> communication include a version field that changes between SGE  
> revisions. Is it possible that you have a mixture of old and new SGE  
> binaries - perhaps some "old" sge_execd's trying to talk to a "new"  
> qmaster daemon?

Nope, that is rather unlikely. I'm using Rocks Clusters 4.2.1, which is
CentOS 4 with sge-V60u8-1. Although I update the base packages of the
head node on a regular basis, the SGE packages should never change.

And indeed, the binaries do not seem to have been modified lately:

ll /opt/gridengine/bin/lx26-x86/
total 21248
-rwxr-xr-x  1 root root 1249630 Sep 25  2006 qacct
-rwxr-xr-x  1 root root 1108174 Sep 25  2006 qalter
-rwxr-xr-x  1 root root 1282002 Sep 25  2006 qconf
-rwxr-xr-x  1 root root  757798 Sep 25  2006 qdel
lrwxrwxrwx  1 root root       6 Feb 14  2007 qhold -> qalter
-rwxr-xr-x  1 root root 1283209 Sep 25  2006 qhost
lrwxrwxrwx  1 root root       3 Feb 14  2007 qlogin -> qsh
-rwxr-xr-x  1 root root  140258 Sep 25  2006 qmake
-rwxr-xr-x  1 root root  771376 Sep 25  2006 qmod
-rwxr-xr-x  1 root root 2754916 Sep 25  2006 qmon
-rwxr-xr-x  1 root root  707147 Sep 25  2006 qping
lrwxrwxrwx  1 root root       6 Feb 14  2007 qresub -> qalter
lrwxrwxrwx  1 root root       6 Feb 14  2007 qrls -> qalter
lrwxrwxrwx  1 root root       3 Feb 14  2007 qrsh -> qsh
lrwxrwxrwx  1 root root       5 Feb 14  2007 qselect -> qstat
-rwxr-xr-x  1 root root 1185111 Sep 25  2006 qsh
-rwxr-xr-x  1 root root 1352792 Sep 25  2006 qstat
-rwxr-xr-x  1 root root    1653 Jun  6 18:01 qsub
-rwxr-xr-x  1 root root 1376682 Sep 25  2006 qtcsh
-rwxr-xr-x  1 root root  158055 Sep 25  2006 sge_coshepherd
-rwxr-xr-x  1 root root 1384300 Sep 25  2006 sge_execd
-r-s--x--x  1 root root  733095 Sep 25  2006 sgepasswd
-rwxr-xr-x  1 root root 1763831 Sep 25  2006 sge_qmaster
-rwxr-xr-x  1 root root 1483082 Sep 25  2006 sge_schedd
-rwxr-xr-x  1 root root 1068887 Sep 25  2006 sge_shadowd
-rwxr-xr-x  1 root root 1075388 Sep 25  2006 sge_shepherd
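
Rather than scanning the dates by eye, the same check could probably be scripted. The following is only a rough sketch: the helper name list_modified_after is made up for illustration, and the reference timestamp (the day after the Sep 25 2006 date shown above) is an assumption about when the packages were installed. It relies only on touch -t and find -newer, and it skips symlinks because of -type f.

```shell
# Hypothetical helper (not part of SGE): print regular files under DIR
# whose modification time is later than STAMP.
# STAMP uses the touch -t format [[CC]YY]MMDDhhmm, e.g. 200609260000.
list_modified_after() {
    dir=$1
    stamp=$2
    # Temporary reference file carrying the cut-off timestamp.
    ref=$(mktemp) || return 1
    touch -t "$stamp" "$ref"
    # -newer compares each file's mtime against the reference file.
    find "$dir" -type f -newer "$ref" -print
    rm -f "$ref"
}

# Example call (assumed cut-off date, path taken from the listing above):
# list_modified_after /opt/gridengine/bin/lx26-x86 200609260000
```

Any file this prints was changed after the cut-off and would be worth a closer look; an empty result would suggest the binaries are still the ones that were originally installed.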

Is this version number something like a hash, so that a mismatch would
indicate a broken or damaged binary?

Thanks for the reply!
Arne


