[GE users] yet another commlib error

Michael Green mishagreen at gmail.com
Tue Dec 6 08:43:40 GMT 2005


SLES9SP1
N1GE 6U6
1 master <-NAT-> 8 nodes
$SGE_ROOT=/srv/N1GE on a physical shared file system (GPFS) on an IBM FAStT700 SAN.

Yesterday I had IBM staff over here servicing the storage. I cleanly
unmounted file systems and shut down all machines before they put
their hands on it.

After they finished I booted the systems. Everything went without a
hitch, except that SGE refused to start.

On the master:
<code>
gene1:/srv/N1GE/default/spool/qmaster # /etc/init.d/sgemaster start
   starting sge_qmaster

sge_qmaster didn't start!
Please check the messages file

   starting sge_schedd
error: commlib error: can't connect to service (Connection refused)
error: getting configuration: unable to contact qmaster using port 536 on host "gene1.weizmann.ac.il"
error: can't get configuration from qmaster -- backgrounding
</code>

<log>
gene1:/srv/N1GE/default/spool/qmaster # tail -f messages
12/06/2005 10:24:54|qmaster|gene1|E|missing configuration attribute "hostname"
12/06/2005 10:24:54|qmaster|gene1|E|cannot recreate queue all.q from disk because of unknown host g1.biocl.weizmann.ac.il
12/06/2005 10:24:54|qmaster|gene1|I|read job database with 1 entries in 0 seconds
12/06/2005 10:24:54|qmaster|gene1|E|cqueue_list_locate_qinstance("all.q@g3.biocl.weizmann.ac.il"): cqueue == NULL("all.q", "g3.biocl.weizmann.ac.il", 1, 0)
12/06/2005 10:24:54|qmaster|gene1|E|can't find queue "all.q@g3.biocl.weizmann.ac.il" referenced in job 27
</log>
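
The "unknown host" entry suggests one thing worth checking before digging into the spool files: whether the master can still resolve the node names after the reboot. The following is only a rough sketch of that check, not a diagnosis. The function names unknown_hosts and check_hosts are made up for illustration, and it assumes each log entry sits on a single line in the real messages file (the excerpt above is wrapped by the mail client).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Grid Engine.

# Print each distinct host that qmaster reported as unknown.
# Assumes one log entry per line in the messages file.
unknown_hosts() {
    sed -n 's/.*because of unknown host \([^ ]*\).*/\1/p' "$1" | sort -u
}

# Read host names on stdin and report whether the local resolver knows them.
# getent consults the sources listed in /etc/nsswitch.conf (files, DNS, ...).
check_hosts() {
    while read -r host; do
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "resolves: $host"
        else
            echo "UNKNOWN:  $host"
        fi
    done
}

# Example, using the spool path from the post:
# unknown_hosts /srv/N1GE/default/spool/qmaster/messages | check_hosts
```

If a name shows up as UNKNOWN here, the problem may lie in /etc/hosts or DNS on the master rather than in the Grid Engine configuration itself.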

qmaster complains about a missing "hostname" attribute, but which file is
supposed to contain it? Grepping the default/ directory turns up quite a
few files containing 'hostname'.
Also, the line with 'cqueue_list_locate_qinstance': does it check the
cqueues/all.q file?
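
To narrow the grep down, a sketch like the one below lists only the lines where "hostname" is the attribute name at the start of a line, instead of every mention of the word. It assumes classic (flat-file) spooling, where objects are stored as text files of "attribute value" lines; with Berkeley DB spooling there are no such text files to search. The function name hostname_lines is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of Grid Engine.

# Recursively list lines whose first word is the attribute "hostname",
# printed as file:line-number:content so the value can be inspected.
hostname_lines() {
    grep -rn '^hostname[[:space:]]' "$1"
}

# Example, using the spool path from the post:
# hostname_lines /srv/N1GE/default/spool/qmaster
```

A file that ought to carry the attribute but is missing from this output, or is listed with an unexpected value, would be a candidate for what the qmaster message refers to.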

Please help!
--
Warm regards,
Michael Green
