[GE users] Re: submit jobs to specific hosts

Nano Surbakti nano.surbakti at gmail.com
Thu Aug 31 12:37:25 BST 2006



Hi,

The NFS problem is solved.

But I still have a problem:
I can qping from the exec host to the execd daemon on the local machine
and on the master machine, and to the qmaster daemon on the master machine.
But I can't qping from the master machine to the execd daemon on the exec
machine.

There is a firewall between those machines. I guess the firewall blocks
the incoming port on the exec machines, so I will have to find alternate
ports.
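
To confirm which direction is blocked before changing ports, a plain TCP
connect test from each side can help. This is only a rough sketch: the
function name tcp_port_open is my own, and the host name and port in the
usage comment are placeholders (6445 is a commonly used sge_execd port,
but the real value comes from SGE_EXECD_PORT or /etc/services on your
cell). It only checks that the port accepts a TCP connection, not that a
Grid Engine daemon is answering.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A False result means the connection was refused, timed out, or the
    host could not be resolved, which is what a firewall dropping or
    rejecting the port would look like from the caller's side.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example usage (placeholder host and port, adjust to your setup):
#   tcp_port_open("exec-node", 6445)
```

Running it on the master against the exec host's execd port, and on the
exec host against the master's ports, should show the same asymmetry
that qping reports if the firewall is the cause.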

--
Nano Surbakti


On 8/30/06, Nano Surbakti <nano.surbakti at gmail.com> wrote:
> sge_execd is still running
> I checked the messages file in sge-root/default/spool/qmaster:
>
> 08/31/2006 12:09:39|qmaster|elka-70|I|read job database with 0 entries
> in 0 seconds
> 08/31/2006 12:09:39|qmaster|elka-70|I|qmaster hard descriptor limit is
> set to 8192
> 08/31/2006 12:09:39|qmaster|elka-70|I|qmaster soft descriptor limit is
> set to 8192
> 08/31/2006 12:09:39|qmaster|elka-70|I|qmaster will use max. 8172 file
> descriptors for communication
> 08/31/2006 12:09:39|qmaster|elka-70|I|qmaster will accept max. 99
> dynamic event clients
> 08/31/2006 12:09:39|qmaster|elka-70|I|starting up 6.0u8
>
> Some more information:
>
> I can't do rsh from master host (named elka-70) to the exec host, so I
> can't do the job submit test in "N1 Grid Engine 6 Installation Guide"
> (817_6118.pdf, page 86).
>
> When you said "the sge_execd spool directory for the node", I think I
> don't have it. On the exec node, $SGE_ROOT is not owned by
> sgeadmin. The permissions look like this:
>
> drwxr-xr-x  18 507      root        4096 Aug 18 00:26 sge-root
>
> Perhaps I need to change something in the NFS settings? The sgeadmin
> account exists on both the exec node and the master node, with the same
> password. Do I need to set up NIS to sync the account?
>
> There are so many new things for me here. Sorry if I don't
> understand many of them.
>
> Thanks for your help.
>
> --
> Nano Surbakti
>
> On 8/30/06, Sean Dilda <sean at duke.edu> wrote:
> > Nano Surbakti wrote:
> > > On 8/29/06, Davide Cittaro <davide.cittaro at ifom-ieo-campus.it> wrote:
> > >>
> > >> Sorry for the stupid question, but is sge_execd running on nodes?
> > >
> > > Yup.. it's running. I have tested it to run a simple job.
> > >
> >
> > It sounds like sge_execd was communicating with sge_qmaster, then
> > stopped.  Please check that sge_execd is still running.  Please look in
> > the 'messages' files in the sge_qmaster spool directory, and the
> > sge_execd spool directory for the node.  Those files often have useful
> > information when something goes wrong.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> >
> >
>




