[GE users] I need help badly.

Trey trey at hyper.com
Wed Feb 28 21:18:02 GMT 2007


OK, this worked. However, whenever I try to install the execution hosts, it 
can't contact the qmaster.  I tracked it down to the service not 
running.  When I start up rcsge it loads everything except that it fails at 
localhost.q. I cannot locate localhost.q to fix it.


[root@centos1 common]# ./rcsge
    starting sge_qmaster
Reading in complexes:
         Complex "host".
         Complex "queue".
Reading in execution hosts.
Reading in administrative hosts.
Reading in submit hosts.
Reading in queues:
         Queue "localhost.q".
error: can't resolve hostname "localhost.localdomain"
critical error: setup failed
    starting sge_schedd
critical error: scheduler already running
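
For anyone hitting the same `can't resolve hostname "localhost.localdomain"` error: one quick check, offered as a sketch rather than a confirmed fix, is whether the name is listed in /etc/hosts at all. The helper name `hosts_has_name` is made up for illustration, and it only inspects the file; it does not exercise DNS or NIS lookups.

```shell
#!/bin/sh
# hosts_has_name FILE NAME   (hypothetical helper, not part of Grid Engine)
# Exit 0 if NAME appears as a hostname or alias in a hosts-format FILE.
hosts_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                      # strip trailing comments
        { for (i = 2; i <= NF; i++)             # field 1 is the IP address
              if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Check the name from the error message and this machine's own node name.
for n in localhost.localdomain "$(uname -n)"; do
    if hosts_has_name /etc/hosts "$n"; then
        echo "$n: listed in /etc/hosts"
    else
        echo "$n: NOT listed in /etc/hosts"
    fi
done
```

If a name is reported as not listed, adding it to the matching line in /etc/hosts (or fixing name resolution some other way) would be the next thing to try.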


I also get another error that I can't seem to track down:


[root@centos1 common]# ./rcsge -qmaster
    starting sge_qmaster
starting program: /gridware/sge/bin/lx24-amd64/sge_commd
using service "sge_commd"
bound to port 536
Reading in complexes:
         Complex "host".
         Complex "queue".
Reading in execution hosts.
Reading in administrative hosts.
Reading in submit hosts.
Reading in parallel environments:
         PE "make".
Reading in scheduler configuration
    starting sge_schedd
error: getting configuration: unable to contact qmaster via "" commd - qmaster not enrolled at commd
error: can't get configuration from qmaster -- backgrounding
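
The log above shows sge_commd being started via the service name "sge_commd" and binding to port 536. Assuming that port comes from the services database (/etc/services on a typical Linux box), a small sketch like the following can confirm that each host in the cluster carries the same entry. `service_port` is a hypothetical helper name, and a mismatch is only one possible cause of the "not enrolled at commd" message.

```shell
#!/bin/sh
# service_port FILE NAME   (hypothetical helper, not part of Grid Engine)
# Print the port/protocol of the first entry for NAME in a services-format FILE.
# Exit 1 if there is no such entry.
service_port() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                       # strip trailing comments
        $1 == name { print $2; found = 1; exit } # field 2 looks like 536/tcp
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Run this on the qmaster and on every execution host, then compare the output.
service_port /etc/services sge_commd || echo "sge_commd: no entry in /etc/services"
```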


How can I get this working?


Chris Dagdigian wrote:
> 
> First things first, you are using a really old version of Grid Engine 
> (the 5.3 series ...)
> 
> It would be a very unusual case for any *new* installation to require 
> SGE 5.3.x
> 
> So the first thing you should do is head on over to 
> http://gridengine.sunsource.net and grab the latest version of Grid 
> Engine 6.0 binaries. The latest is 6.0u10.
> 
> Next you may want to take a look at some stuff I wrote a long time ago, 
> it covers some of the pre-install things that can be significant:
> http://gridengine.info/articles/2005/09/29/things-to-think-about-before-installing 
> 
> 
> -Chris
> 
> 
> 
> On Feb 27, 2007, at 11:00 AM, Trey wrote:
> 
>> I need major help.  I am desperately trying to install Grid Engine as a 
>> cluster software package on 3 servers that run CentOS.  I have 
>> unpacked it and added a script, /etc/profile.d/sge.sh.  It adds the 
>> executables to the path and sets a variable SGE_ROOT to /gridware/sge.  When 
>> I try to run a script to add a host I get:
>>
>> [root@centos1 exec_hosts]# qconf -ah centos2.hyper.com
>> critical error: Please set the environment variable SGE_ROOT.
>>
>>
>> Also, when I try to reinstall the qmaster I get all sorts of 
>> "permission denied" errors and "unable to resolve 
>> localhost.localdomain".
>>
>>
>> How can I fix this?
>>
>>
>>
>> settings.sh
>>
>> [root@centos1 common]# less settings.sh
>> SGE_ROOT=/gridware/sge; export SGE_ROOT
>>
>> ARCH=`$SGE_ROOT/util/arch`
>> DEFAULTMANPATH=`$SGE_ROOT/util/arch -m`
>> MANTYPE=`$SGE_ROOT/util/arch -mt`
>>
>> unset SGE_CELL
>> unset COMMD_PORT
>>
>> if [ "$MANPATH" = "" ]; then
>>    MANPATH=$DEFAULTMANPATH
>> fi
>> MANPATH=$SGE_ROOT/$MANTYPE:$MANPATH; export MANPATH
>>
>> PATH=$SGE_ROOT/bin/$ARCH:$PATH; export PATH
>> shlib_path_name=`$SGE_ROOT/util/arch -lib`
>> old_value=`eval echo '$'$shlib_path_name`
>> if [ x$old_value = x ]; then
>>    eval $shlib_path_name=$SGE_ROOT/lib/$ARCH
>> else
>>    eval $shlib_path_name=$SGE_ROOT/lib/$ARCH:$old_value
>> fi
>> export $shlib_path_name
>> unset ARCH DEFAULTMANPATH MANTYPE shlib_path_name
>>
>>
>>
>> sge.sh (a file to make sure the environment is set up at login)
>>
>> [root@centos1 common]# less /etc/profile.d/sge.sh
>> SGE_ROOT=/gridware/sge
>> PATH=$PATH:$SGE_ROOT/bin/lx24-amd64
>> if [ -r $SGE_ROOT/default/common/settings.sh ]; then
>> . $SGE_ROOT/default/common/settings.sh
>> fi
>>
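
One thing worth noting about the sge.sh quoted above: it assigns SGE_ROOT but never exports it, so unless settings.sh actually gets sourced, child processes such as qconf would not see the variable. That may explain the "Please set the environment variable SGE_ROOT" message, though nothing in this thread confirms it. A minimal sketch of a variant that exports explicitly, assuming the /gridware/sge install path and the default cell shown in this thread:

```shell
# /etc/profile.d/sge.sh (sketch): make SGE_ROOT visible to child processes.
# /gridware/sge is the install path used in this thread; adjust as needed.
SGE_ROOT=${SGE_ROOT:-/gridware/sge}
export SGE_ROOT

# Source the generated settings file only if it is readable.
# It sets PATH, MANPATH and the shared library path for the detected architecture.
if [ -r "$SGE_ROOT/default/common/settings.sh" ]; then
    . "$SGE_ROOT/default/common/settings.sh"
fi
```

After logging in again, running `env | grep SGE_ROOT` should show the variable if the export took effect.
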
>> [root@centos1 common]# less act_qmaster
>> centos1.hyper.com
>>
>>
>>
>> Configuration file
>>
>> # Version: 5.3
>> #
>> # DO NOT MODIFY THIS FILE MANUALLY!
>> #
>> conf_version           0
>> qmaster_spool_dir      /gridware/sge/default
>> execd_spool_dir        /gridware/sge/default
>> binary_path            /gridware/sge/bin
>> mailer                 /bin/mail
>> xterm                  /usr/bin/X11/xterm
>> load_sensor            none
>> prolog                 none
>> epilog                 none
>> shell_start_mode       posix_compliant
>> login_shells           sh,ksh,csh,tcsh
>> min_uid                0
>> min_gid                0
>> user_lists             none
>> xuser_lists            none
>> load_report_time       00:00:40
>> stat_log_time          48:00:00
>> max_unheard            00:05:00
>> reschedule_unknown     00:00:00
>> loglevel               log_warning
>> administrator_mail     none
>> set_token_cmd          none
>> pag_cmd                none
>> token_extend_time      none
>> shepherd_cmd           none
>> qmaster_params         none
>> schedd_params          none
>> execd_params           none
>> finished_jobs          100
>> gid_range              20000-20100
>> admin_user             none
>> qlogin_command         telnet
>> qlogin_daemon          /usr/sbin/in.telnetd
>> rlogin_daemon          /usr/sbin/in.rlogind
>> default_domain         root
>> ignore_fqdn            true
>> max_aj_instances       2000
>> max_aj_tasks           75000
>> max_u_jobs             0
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 




