[GE users] shadow master can not start sge_schedd

A listner gg3796 at yahoo.com
Fri Mar 23 00:04:27 GMT 2007



Hi Rayson,
Thanks for the reply.
I checked and, as you can see below, it started sge_qmaster as
well as sge_schedd, but then both died.

Sorry about the long "log".

rgds,
 

Output of shadow master ps:
************************
sh-3.00# ps -ef|grep sge
gridadm  17523     1  0 14:22 ?        00:00:00 /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_shadowd
root     18654     1  0 18:45 ?        00:00:00 /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_qmaster
root     18657  5279  0 18:45 pts/1    00:00:00 grep sge
sh-3.00# ps -ef|grep sge
root     17523     1  0 14:22 ?        00:00:00 /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_shadowd
root     18659 17523  0 18:45 ?        00:00:00 /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_schedd
root     18660 18659  0 18:45 ?        00:00:00 [sge_schedd] <defunct>
gridadm  18661     1  0 18:45 ?        00:00:00 /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_schedd
root     18664  5279  0 18:45 pts/1    00:00:00 grep sge
sh-3.00# ps -ef|grep sge
gridadm  17523     1  0 14:22 ?        00:00:00 /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_shadowd
root     18667  5279  0 18:45 pts/1    00:00:00 grep sge
sh-3.00# ps -ef|grep sge
gridadm  17523     1  0 14:22 ?        00:00:00 /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_shadowd
root     18670  5279  0 18:46 pts/1    00:00:00 grep sge

************************************
# tail messages_shadowd.aus-node0
*************************************
03/22/2007 14:26:50|shadowd|aus-node0|W|starting program: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_qmaster
03/22/2007 14:26:55|shadowd|aus-node0|W|starting program: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_schedd
03/22/2007 14:27:02|shadowd|aus-node0|C|couldn't start process: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_schedd
03/22/2007 14:27:02|shadowd|aus-node0|E|can't start qmaster
03/22/2007 18:45:10|shadowd|aus-node0|E|could not get environment variable SGE_QMASTER_PORT or service "sge_qmaster"
03/22/2007 18:45:10|shadowd|aus-node0|W|using cached "sge_qmaster" port value 4891
03/22/2007 18:45:10|shadowd|aus-node0|W|starting program: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_qmaster
03/22/2007 18:45:15|shadowd|aus-node0|W|starting program: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_schedd
03/22/2007 18:45:22|shadowd|aus-node0|C|couldn't start process: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_schedd
03/22/2007 18:45:22|shadowd|aus-node0|E|can't start qmaster

***********************************
# tail messages_qmaster.aus-node0
**********************************
error: could not get environment variable SGE_QMASTER_PORT or service "sge_qmaster"
daemonize error: child exited before sending daemonize state
######################
 03/22/2007 18:45:10
######################
######################
 03/22/2007 18:45:15
######################
error: could not get environment variable SGE_QMASTER_PORT or service "sge_qmaster"
daemonize error: child exited before sending daemonize state
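
The messages above say that neither the SGE_QMASTER_PORT environment variable nor an "sge_qmaster" service entry could be found when the daemon was started. A minimal sketch of how that lookup could be checked by hand on the shadow host, assuming `getent` and `awk` are available there. The function name `resolve_qmaster_port` is hypothetical and not part of Grid Engine, and the lookup order simply follows the order in which the error message names the two sources:

```shell
# Hypothetical helper (not part of Grid Engine): print the port that a
# lookup of SGE_QMASTER_PORT / service "sge_qmaster" would yield, or fail.
resolve_qmaster_port() {
    # 1. Use the environment variable if it is set and non-empty.
    if [ -n "${SGE_QMASTER_PORT:-}" ]; then
        printf '%s\n' "$SGE_QMASTER_PORT"
        return 0
    fi
    # 2. Otherwise ask the services database (/etc/services, NIS, ...).
    entry=$(getent services sge_qmaster 2>/dev/null) || return 1
    # getent prints a line of the form "name  port/proto"; keep the port.
    printf '%s\n' "$entry" | awk '{ split($2, a, "/"); print a[1] }'
}

if port=$(resolve_qmaster_port); then
    echo "sge_qmaster port resolves to $port"
else
    echo "neither SGE_QMASTER_PORT nor a sge_qmaster service entry was found" >&2
fi
```

Running this in the same environment that starts sge_shadowd (same user, same shell setup) would show whether the daemons inherit a usable port setting or only the cached value mentioned in the shadowd log.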


--- Rayson Ho <rayrayson at gmail.com> wrote:

> When you check with ps on the shadow machine, do you
> see the qmaster
> process running??
> 
> Rayson
> 
> 
> 
> On 3/22/07, A listner <gg3796 at yahoo.com> wrote:
> >
> >
> > All,
> > I have a qmaster (Solaris) and I created a shadow master (RHEL4).
> > When I kill the daemons on the master server, the shadow does
> > not take over. It has the following errors.
> >
> > Thanks for the help.
> >
> >
> >
> >
> > service "sge_qmaster"
> > 03/22/2007 14:22:50|shadowd|aus-node0|I|starting up 6.0u9
> > 03/22/2007 14:26:50|shadowd|aus-node0|E|commlib error: can't connect to service (Connection refused)
> > 03/22/2007 14:26:50|shadowd|aus-node0|W|starting program: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_qmaster
> > 03/22/2007 14:26:55|shadowd|aus-node0|W|starting program: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_schedd
> > 03/22/2007 14:27:02|shadowd|aus-node0|C|couldn't start process: /tools/sge_austin/n1ge-6_0u9/bin/lx24-x86/sge_schedd
> > 03/22/2007 14:27:02|shadowd|aus-node0|E|can't start qmaster



 

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



