[GE users] Install problem setting up GE 6.1 u2 on Fedora 8

Bruce Rothermal Bruce.Rothermal at Sun.COM
Tue Nov 27 01:14:04 GMT 2007


Found it: bin/sge_qmaster

The strace output shows it trying to bind to address 0.0.0.0, and the 
bind fails with EADDRINUSE:

uname({sys="Linux", node="strat", ...}) = 0
getrlimit(RLIMIT_NOFILE, {rlim_cur=8*1024, rlim_max=8*1024}) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 4
setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(4, {sa_family=AF_INET, sin_port=htons(48620), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
shutdown(4, 2 /* send and receive */)   = -1 ENOTCONN (Transport endpoint is not connected)
close(4)                                = 0
write(2, "    81   9758 46912496256640 ", 29    81   9758 46912496256640 
) = 29
write(2, "    ../daemons/qmaster/sge_qmast"..., 96    ../daemons/qmaster/sge_qmaster_main.c 328 abort qmaster startup due to communication errors
) = 96
open("/tmp/sge_messages", O_WRONLY|O_CREAT|O_APPEND, 0666) = 4
write(4, "11/26/2007 18:10:40|qmaster|stra"..., 86) = 86
close(4)                                = 0
exit_group(1)                           = ?

I'll have to look in the source to find out which function it is using 
to get the address. It is probably a configuration problem. I have 
mostly worked with Solaris for the past 11 years, and Fedora is new to me.
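
Since the bind fails with EADDRINUSE, one way to confirm that something 
is already listening on the qmaster port is to look at the kernel's 
socket table. What follows is only a sketch, assuming Linux with 
/proc/net/tcp (IPv4 only); the function name tcp_listeners is made up 
for illustration. Running netstat -tlnp or lsof -i :48620 as root would 
additionally show the owning process.

```sh
#!/bin/sh
# Print LISTEN sockets bound to a given TCP port, read from /proc/net/tcp.
# Usage: tcp_listeners PORT [FILE]   (FILE defaults to /proc/net/tcp)
# tcp_listeners is a hypothetical helper name, not part of Grid Engine.
tcp_listeners() {
    # /proc/net/tcp stores ports as 4-digit uppercase hex (48620 -> BDEC).
    port_hex=$(printf '%04X' "$1")
    # Field 2 is the local "HEXIP:HEXPORT"; field 4 is the state (0A = LISTEN);
    # field 10 is the socket inode.
    awk -v p="$port_hex" '
        NR > 1 {
            split($2, a, ":")
            if (toupper(a[2]) == p && $4 == "0A") print $2, "inode=" $10
        }' "${2:-/proc/net/tcp}"
}

# Example: tcp_listeners 48620
```

An empty result would suggest nothing is listening on that port over 
IPv4, in which case the cause probably lies elsewhere.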

Bruce

Daniel Templeton wrote:
> Nope.  You should source the dl.[c]sh file and then run "dl 1".  Or, 
> instead, you could set the SGE_ND env var to 1 in addition to 
> SGE_DEBUG_LEVEL.  The dl script is easier to deal with, though.
>
> Daniel
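
The manual alternative Daniel describes can be sketched as a small 
wrapper that sets both variables for a single command. This is an 
illustration only: run_with_sge_debug is a made-up name, the 
SGE_DEBUG_LEVEL value is the one from my earlier mail, and I am assuming 
SGE_ND is what keeps the daemon attached so the output reaches the 
terminal.

```sh
#!/bin/sh
# Run one command with SGE debug output enabled for that command only.
# run_with_sge_debug is a hypothetical helper, not part of Grid Engine.
run_with_sge_debug() {
    # Both values come from this thread; env scopes them to the child process.
    env SGE_DEBUG_LEVEL="2 0 0 0 0 0 0 0" SGE_ND=1 "$@"
}

# Example (binary path assumed from the /sge/bin/lx24-amd64 paths in the trace):
# run_with_sge_debug /sge/bin/lx24-amd64/sge_qmaster
```

Scoping the variables to one command avoids leaving debug mode switched 
on in the login shell.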
>
> Bruce Rothermal wrote:
>> Thanks Daniel
>>
>> I set the env var SGE_DEBUG_LEVEL="2 0 0 0 0 0 0 0"; export 
>> SGE_DEBUG_LEVEL
>>
>> Then I ran the install (install_qmaster).
>>
>> Everything runs the same with no debug output to the terminal. Is it 
>> supposed to go to some log file or the terminal?
>>
>> Bruce
>>
>>
>> Daniel Templeton wrote:
>>> When I have that problem, it's normally because the qmaster's port 
>>> is taken by some other application.  You can see what's going on by 
>>> setting debug level 1.  See:
>>>
>>> http://blogs.sun.com/templedf/entry/using_debugging_output
>>>
>>> for details on setting debug levels.
>>>
>>> Daniel
>>>
>>> Bruce Rothermal wrote:
>>>> I'm running through the install process and I seem to be hanging at 
>>>> this point:
>>>>
>>>>> Grid Engine qmaster and scheduler startup
>>>>> -----------------------------------------
>>>>>
>>>>> Starting qmaster and scheduler daemon. Please wait ...
>>>>>    starting sge_qmaster
>>>>
>>>> I am using all default parameters in the install script, except that 
>>>> I am installing as root with sge_qmaster 48620/tcp and sge_execd 
>>>> 48621/tcp. Does anybody have suggestions for figuring out what is 
>>>> hung here? I think it is at the point where qmaster is being 
>>>> started. I've changed the script default/common/sgemaster right 
>>>> after it is created to print out debug info, and it shows I'm 
>>>> looping at this point in the script:
>>>>> + masterhost=strat
>>>>> ++ expr 9 + 1
>>>>> + loop=10
>>>>> + '[' false = false -a 10 -ne 30 ']'
>>>>> + /sge/bin/lx24-amd64/qping -info strat 48620 qmaster 1
>>>>> + '[' 1 = 0 ']'
>>>>> + sleep 2
>>>>> ++ cat /sge/default/common/act_qmaster
>>>>> + masterhost=strat
>>>>> ++ expr 10 + 1
>>>>> + loop=11
>>>>> + '[' false = false -a 11 -ne 30 ']'
>>>>> + /sge/bin/lx24-amd64/qping -info strat 48620 qmaster 1
>>>>> + '[' 1 = 0 ']'
>>>>> + sleep 2
>>>>> ++ cat /sge/default/common/act_qmaster
>>>>> + masterhost=strat
>>>>> ++ expr 11 + 1
>>>>> + loop=12
>>>> I've attached an strace of the command that is looping, 
>>>> /sge/bin/lx24-amd64/qping -info strat 48620 qmaster 1, and it shows 
>>>> the command timing out.
>>>>
>>>> Does anyone know which process it is trying to ping so I can trace 
>>>> it, or have any other ideas?
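
For reference, the retry pattern visible in the trace above (probe with 
qping, sleep 2, give up when the counter reaches 30) can be sketched 
roughly as follows. This is reconstructed from the trace output only and 
is not the actual sgemaster code; wait_for is a made-up name.

```sh
#!/bin/sh
# Retry a command up to MAX times, sleeping DELAY seconds after each failure.
# Usage: wait_for MAX DELAY command [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_for() {
    max=$1
    delay=$2
    shift 2
    tries=0
    while [ "$tries" -lt "$max" ]; do
        "$@" && return 0
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1
}

# Example, using the probe command from the trace:
# wait_for 30 2 /sge/bin/lx24-amd64/qping -info strat 48620 qmaster 1
```

With a limit of 30 and a 2 second sleep, a qmaster that never comes up 
holds the script for at least a minute before it gives up, which matches 
the apparent hang.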
>>>>
>>>> Thanks for any help
>>>>
>>>> Bruce Rothermal
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
