[GE users] sge_execd problems

Rayson Ho rayrayson at gmail.com
Fri Oct 17 19:15:21 BST 2008


Should be available from the "spooling_method" parameter -- see the
bootstrap(5) manpage.
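
A minimal sketch of that check, assuming the bootstrap file sits in the usual
place ($SGE_ROOT/$SGE_CELL/common/bootstrap, with the cell name falling back
to "default"). The helper name spooling_method is only illustrative and is
not part of SGE:

```shell
# Hypothetical helper: print the spooling method recorded in a bootstrap file.
# Assumption: the file is at $SGE_ROOT/$SGE_CELL/common/bootstrap unless a
# path is passed as the first argument.
spooling_method() {
    bootstrap_file="${1:-${SGE_ROOT}/${SGE_CELL:-default}/common/bootstrap}"
    # bootstrap lines are "name value" pairs; print the value for spooling_method
    awk '$1 == "spooling_method" { print $2 }' "$bootstrap_file"
}
```

Running spooling_method on the qmaster host should print either "classic" or
"berkeleydb", which tells you which kind of spool files to inspect.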

No matter which spooling method is used, it looks like your qmaster
machine crashed recently, and you will need to fix the corrupted
configuration files.

Rayson


On 10/17/08, Mag Gam <magawake at gmail.com> wrote:
> I don't know...how can I check?
>
> On Fri, Oct 17, 2008 at 2:00 PM, Rayson Ho <rayrayson at gmail.com> wrote:
> > On 10/17/08, Mag Gam <magawake at gmail.com> wrote:
> >> 10/17/2008 13:53:53|qmaster|master01.engrMec.unc.edu|E|cqueue_list_locate_qinstance("(null)@(null)"):
> >> cqueue == NULL("(null)", "(null)", 1, 0
> >> 10/17/2008 13:53:53|qmaster|master01.engrMec.unc.edu|E|writing job
> >> finish information: can't locate queue "(null)@(null)"
> >
> > Looks like your SGE configuration is corrupted! Are you using Berkeley
> > DB spooling or classic spooling?
> >
> > Rayson
> >
> >
> >
> >> 10/17/2008 13:53:53|qmaster|master01.engrMec.unc.edu|W|job 5014.1
> >> failed on host <unknown host> before writing exit_status because:
> >> shepherd exited with exit status 19
> >> 10/17/2008 13:53:53|qmaster|master01.engrMec.unc.edu|C|!!!!!!!!!! got
> >> NULL element for QU_rerun !!!!!!!!!!
> >>
> >>
> >>
> >> On Fri, Oct 17, 2008 at 1:32 PM, Rayson Ho <rayrayson at gmail.com> wrote:
> >> > Looks like a network resolution/connection problem... Are you able to
> >> > connect to the master from the command line, like:
> >> >
> >> > % telnet master01.engrMec.unc.edu 536
> >> >
> >> > Rayson
> >> >
> >> >
> >> > On 10/17/08, Mag Gam <magawake at gmail.com> wrote:
> >> >> I have the sgemaster running on our head node and on the clients I
> >> >> am able to start up sge_execd
> >> >>
> >> >> I see sge_execd process running on the client.
> >> >>
> >> >> But when I do
> >> >>
> >> >> $ qhost
> >> >> error: commlib error: can't connect to service (Connection refused)
> >> >> error: unable to contact qmaster using port 536 on host
> >> >> "master01.engrMec.unc.edu"
> >> >>
> >> >>
> >> >> When I start up the client I see no change in the messages file
> >> >> either. Has anyone seen this before? Using GE 6.1u5
> >> >>
> >> >> TIA
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >> >> For additional commands, e-mail: users-help at gridengine.sunsource.net
> >> >>
> >> >>




