[GE users] install_execd fails because it can't find queue master

Rayson Ho rayrayson at gmail.com
Fri Sep 21 22:03:18 BST 2007


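Is the $SGE_ROOT directory (/gridware/sge) shared between tuba and
userver69, for example over NFS? The "cell directory ... doesn't exist"
error suggests that userver69 cannot see the "default" cell directory
that install_qmaster created on tuba.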
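
One quick way to check this on the execution host is sketched below. It
is only a sketch under some assumptions: check_cell_dir is a made-up
helper name, not part of Grid Engine, and it assumes the usual cell
layout in which install_qmaster writes the master's host name to
<cell>/common/act_qmaster.

```shell
# Hypothetical helper (not a Grid Engine command): report whether the
# cell directory created by install_qmaster is visible on this host.
# Usage: check_cell_dir <sge_root> [cell]
check_cell_dir() {
    sge_root=$1
    cell=${2:-default}          # "default" is the cell name used in this thread
    cell_dir="$sge_root/$cell"

    # This is the condition behind qconf's "cell directory ... doesn't exist".
    if [ ! -d "$cell_dir" ]; then
        echo "missing: $cell_dir"
        return 1
    fi

    # Assumption: a finished qmaster install leaves common/act_qmaster here.
    if [ -r "$cell_dir/common/act_qmaster" ]; then
        echo "ok: qmaster for cell $cell is $(cat "$cell_dir/common/act_qmaster")"
    else
        echo "incomplete: $cell_dir exists but common/act_qmaster is not readable"
        return 1
    fi
}
```

Running "check_cell_dir /gridware/sge" on userver69 should print a
"missing:" line if the directory is not shared, which would match the
qconf error quoted below. Running the same command on tuba should print
the "ok:" line if the qmaster installation completed.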

Rayson



On 9/21/07, skip at pobox.com <skip at pobox.com> wrote:
> With a few fits and starts I eventually got the queue master installed and
> running on one host (tuba):
>
>     % pgrep -fl sge
>     20264 /gridware/sge/bin/sol-x86/sge_schedd
>     20258 /gridware/sge/bin/sol-x86/sge_qmaster
>
> I then went to another host (userver69), made sure I had appropriate entries
> in /etc/services and ran install_execd.  It said:
>
>     If you haven't installed the Grid Engine qmaster host yet, you must execute
>     this step (with >install_qmaster<) prior the execution host installation.
>
>     For a sucessfull installation you need a running Grid Engine qmaster. It is
>     also neccesary that this host is an administrative host.
>
>     You can verify your current list of administrative hosts with
>     the command:
>
>        # qconf -sh
>
>     You can add an administrative host with the command:
>
>        # qconf -ah <hostname>
>
> So I executed:
>
>     % SGE_ROOT=/gridware/sge /gridware/sge/bin/sol-x86/qconf -sh
>     error: cell directory "/gridware/sge/default" doesn't exist
>
> Hmmm...  I tried continuing anyway:
>
>     Grid Engine cells
>     -----------------
>
>     Please enter cell name which you used for the qmaster
>     installation or press <RETURN> to use [default] >>
>
> "default" is the cell name so I accepted it.  However, it complained:
>
>     Obviously there was no qmaster installation yet!
>     Call >install_qmaster<
>     on the machine which shall run the Grid Engine qmaster
>
> and exited.  Now I was under the impression that I was supposed to run
> install_qmaster on the admin host and install_execd on all execution hosts.
> The two hosts have the same services entries:
>
>     % ssh tuba egrep sge /etc/services
>     sge_qmaster     6444/tcp
>     sge_execd       6445/tcp
>     % ssh userver69 egrep sge /etc/services
>     sge_qmaster     6444/tcp
>     sge_execd       6445/tcp
>
> I'm obviously missing something, but what?
>
> Thanks,
>
> --
> Skip Montanaro - skip at pobox.com - http://www.webfast.com/~skip/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
