[GE users] sge v6.0u3 update from v6.0u1

Viktor Oudovenko udo at physics.rutgers.edu
Mon Mar 21 22:43:42 GMT 2005


SuSE Linux 7.3 with a new kernel:

rupc-cs04b:/opt/SGE/default/common # uname -a
Linux rupc-cs04b 2.4.28 #9 SMP Wed Dec 8 14:52:03 EST 2004 i686 unknown

Port 536 is not in use. I just made a fresh installation of 6.0u4 and it
worked, but the previous one, which I want to keep because all the hosts and
settings are defined there, does not want to start.

You know, the key word here is crash. Something was written somewhere during
the crash, and now qmaster does not want to start. It is not a problem of busy
ports; the problem is that the master itself does not start!
Any help and ideas are welcome! I am really running out of time.
Best,
v


> -----Original Message-----
> From: Ovid Jacob [mailto:ovid.jacob at sun.com] 
> Sent: Monday, March 21, 2005 17:20
> To: users at gridengine.sunsource.net
> Cc: Ovid.Jacob at sun.com
> Subject: Re: [GE users] sge v6.0u3 update from v6.0u1
> 
> 
> Viktor,
> 
> What OS are you running?
> 
> Check that port 536 is not used by some other process:
> 
> grep 536 /etc/services
> 
> If you get a non-empty string, try changing the ports to 
> something like
> 
> sge_qmaster 836/tcp #SGE_PORT
> sge_execd 837/tcp #SGE_PORT
> 
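A note on the check suggested above: `grep 536 /etc/services` also matches any line that merely contains the digits 536 (5360, 1536 and so on), and /etc/services only lists registered service names, not what is actually listening. A minimal sketch of an exact-port lookup follows. The function name services_on_port is made up for illustration, and it assumes the usual "name port/proto" layout of /etc/services.

```shell
# List entries of an /etc/services-style file whose port number is exactly $1.
# $2 is the file to scan (defaults to /etc/services).
# services_on_port is a hypothetical helper name, not part of SGE.
services_on_port() {
    port="$1"
    file="${2:-/etc/services}"
    awk -v port="$port" '
        /^[ \t]*#/ { next }              # skip comment lines
        NF >= 2 {
            split($2, parts, "/")        # second field looks like 536/tcp
            if (parts[1] == port) print $1, $2
        }
    ' "$file"
}

# Example invocation:
#   services_on_port 536
```

An empty result only means no service name is registered for that port; it does not prove that nothing is bound to it.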
> 
> Viktor Oudovenko wrote:
> > Hi, guys,
> > 
> > Has anybody run into this problem?
> > 
> > rupc-cs04b:/opt/SGE/default/spool/qmaster # /etc/init.d/sgemaster start
> >    starting sge_qmaster
> >    starting sge_schedd
> > error: commlib error: got read error (closing connection)
> > error: commlib error: can't connect to service (socket error errno=111)
> > error: getting configuration: unable to contact qmaster using port 536 on host "rupc-cs04b"
> > can't get configuration from qmaster -- waiting ...
> > error: can't connect to service
> > can't get configuration from qmaster -- waiting ...
> > error: can't connect to service
> > can't get configuration from qmaster -- waiting ...
> > error: can't connect to service
> > error: can't get configuration from qmaster -- backgrounding
> > 
> > 
> > After a server crash, the SGE 6.0u1 qmaster did not want to start.
> > I upgraded 6.0u1 to 6.0u3 and got the messages above.
> > 
> > 
> > In qmaster messages I have:
> > 
> > 
> > rupc-cs04b:/opt/SGE/default/spool/qmaster # more messages
> > 03/21/2005 15:56:47|qmaster|rupc-cs04b|E|wrong cull version, read 0x00000000, but expected actual version 0x10020000
> > 03/21/2005 15:56:47|qmaster|rupc-cs04b|E|error in init_packbuffer: wrong cull version
> > rupc-cs04b:/opt/SGE/default/spool/qmaster #
> > 
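The "read 0x00000000" in the messages above suggests that qmaster read zeros where it expected a version number. One possible way to narrow this down, assuming the crash left empty or zero-filled files in the spool area, is sketched below. The function name find_zeroed_files is hypothetical, and treating the first four bytes of a file as the place to look is only a heuristic; it does not parse the real SGE spool format and would not see records stored inside Berkeley DB files.

```shell
# Print regular files under directory $1 that are empty, or whose first
# four bytes are all zero. Heuristic only: assumes a crash may have left
# truncated or zero-filled files; it does not understand SGE spool data.
find_zeroed_files() {
    dir="$1"
    find "$dir" -type f | while IFS= read -r f; do
        if [ ! -s "$f" ]; then
            echo "empty: $f"
        elif [ "$(head -c 4 "$f" | od -An -tx1 | tr -d ' \n')" = "00000000" ]; then
            echo "zero header: $f"
        fi
    done
}

# Example invocation (path taken from the prompt shown above):
#   find_zeroed_files /opt/SGE/default/spool/qmaster
```

Anything it reports is only a candidate for closer inspection; copying the whole spool directory aside before touching any file seems prudent.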
> > 
> > Any ideas how to fix this? It is VERY urgent! Please help!
> > Thank you, everybody, for your attention and help!
> > 
> > Best,
> > viktor
> > 
> > 
> > 
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> 
> -- 
> 
> 
> take care,
> ovid
> 
> ----------------------------------------------------------------
> 	         "Your Windows system is my other computer."
>                             Grid Engineering
> 
> http://namefinder.sfbay.sun.com/NameFinder?view=sunEmployees&nfquery=ovid+jacob
>                           http://tent.sfbay:88/
>                           http://www.mishkan.com
>                           ovid.jacob at sun.com
>                           x84774 (650.786.4774)
> -----------------------------------------------------------------





---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net

