[GE users] Problems installing 6.0u1

Chris Dagdigian dag at sonsorol.org
Tue Oct 5 22:11:40 BST 2004


These sorts of errors are almost always caused by configuration problems 
relating to hostnames, hostname resolution and DNS configuration. SGE is 
very sensitive to these sorts of things.

In particular, your server named "master.medusa.scorec.rpi.edu" should 
resolve in your campus DNS for both forward (name to address) and 
reverse (address to name) queries.
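
A quick way to test this from the master and from a compute node is a 
small script along these lines. It is only a sketch: check_dns is a 
made-up helper name, and the two resolver arguments exist just so the 
lookup functions can be swapped out; by default it uses the system 
resolver via the standard socket module.

```python
import socket

def check_dns(hostname, forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name) for hostname; None marks a failed lookup."""
    try:
        # Forward query: name -> IPv4 address
        ip = forward(hostname)
    except OSError:
        return None, None
    try:
        # Reverse query: address -> primary name
        reverse_name = reverse(ip)[0]
    except OSError:
        reverse_name = None
    return ip, reverse_name
```

Calling check_dns("master.medusa.scorec.rpi.edu") on each machine should 
give an address, and the reverse name should match the name you expect 
SGE to use. A None in either position points at the DNS or /etc/hosts 
setup on that machine.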

The hostname of the acting qmaster is written to the file 
$SGE_ROOT/<cell>/common/act_qmaster.
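
To see which name was actually recorded, something like the following 
would do. This is a sketch: read_act_qmaster is a hypothetical helper, 
and the cell name "default" is an assumption, so pass your own cell 
name if it differs.

```python
import os

def read_act_qmaster(sge_root, cell="default"):
    """Return the hostname stored in <sge_root>/<cell>/common/act_qmaster."""
    path = os.path.join(sge_root, cell, "common", "act_qmaster")
    with open(path) as f:
        # The file holds just the hostname; strip the trailing newline.
        return f.read().strip()
```

For example, read_act_qmaster(os.environ["SGE_ROOT"]) returns the name 
that every other daemon and client will try to contact.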

Your compute nodes on the private network will read that file and try to 
contact the hostname listed. If that hostname is the public name and is 
unreachable via the private network, you are going to have issues. The 
fix for that situation is to create the file 
$SGE_ROOT/<cell>/common/host_aliases with an entry for 
"master.medusa.scorec.rpi.edu <internal-hostname>" -- this will let the 
private nodes contact the correct IP address.
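
To illustrate how such a line is read: each line of host_aliases is a 
whitespace-separated list of names, and the first name on the line is 
the one all the others are mapped to. The sketch below mimics that 
mapping. parse_host_aliases is a made-up helper, not part of SGE, and 
"master-internal" is only a placeholder for your real internal name.

```python
def parse_host_aliases(text):
    """Map every name on a host_aliases line to the first name on that line."""
    aliases = {}
    for line in text.splitlines():
        names = line.split()
        if not names:
            continue  # skip blank lines
        unique = names[0]
        for name in names:
            aliases[name] = unique
    return aliases

# Placeholder entry; substitute the real internal hostname.
example = "master.medusa.scorec.rpi.edu master-internal\n"
mapping = parse_host_aliases(example)
```

With that entry, both names end up referring to the same host as far 
as SGE is concerned, which is the point of the alias file.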


If the problem is not caused by hostname/DNS issues, it could also be 
something simple, such as a firewall blocking the qmaster TCP port 
(536 in your error message).
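
A plain TCP connect test from a compute node helps tell the two cases 
apart. The sketch below uses only the standard socket module; 
port_reachable is a hypothetical helper name and the 3 second timeout 
is an arbitrary choice.

```python
import socket

def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False
```

Running port_reachable("master.medusa.scorec.rpi.edu", 536) while 
sge_qmaster is up should return True. Note that False only says the 
connection could not be made; it cannot distinguish a firewall from a 
daemon that is simply not running, so check that the qmaster process 
exists first.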

-Chris





Christophe Dupre wrote:

> I am trying to install 6.0u1 on a cluster running RHEL 3.0. The master
> node is attached to a private network with the compute nodes, and a public
> network for end-users access.
> When I try to start the daemons:
> bash-2.05b# /etc/init.d/sgemaster  start
>    starting sge_qmaster
> 
> sge_qmaster didn't start!
> Please check the messages file
> 
>    starting sge_schedd
> error: getting configuration: unable to contact qmaster using port 536 on
> host "master.medusa.scorec.rpi.edu"
> can't get configuration from qmaster -- waiting ...
> can't get configuration from qmaster -- waiting ...
> can't get configuration from qmaster -- waiting ...
> error: can't get configuration from qmaster -- backgrounding
> 
> but the messages file is empty.
> 
> I ran the install_qmaster script and followed the instructions, but the
> script failed while trying to start the qmaster daemon.
> 
> 
> 
> 
> --
> Christophe Dupre
> System Administrator, Scientific Computation Research Center
> Rensselaer Polytechnic Institute
> Troy, NY        USA
> Phone: (518) 276-2578  -  Fax: (518) 276-4886
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

-- 
Chris Dagdigian, <dag at sonsorol.org>
BioTeam  - Independent life science IT & informatics consulting
Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag  Web: http://bioteam.net
