[GE users] Unknown host

Robert White alphamonk at gmail.com
Wed Dec 19 23:21:35 GMT 2007


I ran qacct -j <jobid> for each of the failed jobs, and one system is common
to all of the output. I have pulled it out of the grid to check it. I can run
Grid Engine's gethostbyname utility on this system without error. Does anyone
know of a test I can run on this system to see what its problem is?

Bob
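
One possible check for the kind of test asked about above, offered only as a
sketch: compare forward and reverse name resolution for each hostname on the
suspect machine. The function name check_host and the resolver parameter are
made up for illustration and are not part of Grid Engine. The script uses the
operating system's resolver through Python's socket module, so it does not
reproduce any name handling that Grid Engine does on its own; a clean result
here does not rule out a Grid Engine side problem.

```python
import socket


def check_host(name, resolver=socket):
    """Report forward and reverse resolution for one hostname.

    `resolver` is a hypothetical hook (defaults to the socket module) so the
    lookups can be replaced when no network is available.
    """
    result = {"name": name, "addresses": [], "reverse": {}, "errors": []}

    # Forward lookup: name -> list of IPv4 addresses.
    try:
        _, _, addresses = resolver.gethostbyname_ex(name)
    except OSError as exc:  # socket.gaierror and socket.herror are OSErrors
        result["errors"].append(f"forward lookup failed: {exc}")
        return result
    result["addresses"] = list(addresses)

    # Reverse lookup: each address should map back to the same name.
    for addr in addresses:
        try:
            primary, aliases, _ = resolver.gethostbyaddr(addr)
        except OSError as exc:
            result["errors"].append(f"reverse lookup of {addr} failed: {exc}")
            continue
        result["reverse"][addr] = primary
        known = {n.lower() for n in [primary, *aliases]}
        if name.lower() not in known:
            result["errors"].append(
                f"{addr} maps back to {primary}, not {name}"
            )
    return result
```

Calling check_host("quark.serverengines.com") on the suspect machine and on a
machine that works, then comparing the "errors" lists, may show whether the
two disagree about a name or an address.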

On Dec 19, 2007 5:01 PM, Robert White <alphamonk at gmail.com> wrote:

> Do you know of a way I can tell which system was the master that needed to
> communicate with the slave nodes? I am running qacct -j 2136232 -o -h. Will
> this output let me find out which node was the master? Are there any other
> commands that could provide the information I seek?
>
> Bob
>
>   On Dec 19, 2007 9:32 AM, Rayson Ho <rayrayson at gmail.com> wrote:
>
> > The first task of the MPI job needs to connect to the rest of the
> > nodes that are allocated to the MPI job to start the slave tasks -- so
> > maybe 1 or 2 hosts could not resolve some hostnames properly, and
> > that's why the job runs fine if the first task does not land on those
> > hosts??
> >
> > Rayson
> >
> >
> >
> > On Dec 18, 2007 11:44 AM, Robert White <alphamonk at gmail.com> wrote:
> > > Yes, the jobs that give this error are parallel jobs.
> > >
> > >
> > > On Dec 17, 2007 6:21 PM, Rayson Ho <rayrayson at gmail.com> wrote:
> > >
> > > > Is it a parallel job??
> > > >
> > > > Rayson
> > > >
> > > > On Dec 17, 2007 11:10 AM, Robert White <alphamonk at gmail.com> wrote:
> > > > > My users receive this error message from time to time, and I was
> > > > > wondering if anyone knows the cause or possible cause of the
> > > > > message. I am able to resolve the hostname from all computers using
> > > > > nslookup and Grid Engine's gethostbyname utility. I am not able to
> > > > > resolve this name with "execd@" in front of the hostname.
> > > > >
> > > > > error: executing task of job 2134745 failed: failed sending task to execd@quark.serverengines.com : UNKNOWN_HOST
> > > > >
> > > > > Any help would be greatly appreciated.
> > > > >
> > > > > Bob
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > > > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > > >
> > > >
> > >
> > >
> >
>


