[GE users] Unable install execd

anthony_whelan anthony.tux at gmail.com
Wed Jan 27 10:40:15 GMT 2010


Hi Marco,

Thank you, you are correct, it is due to mismatching versions. I am now upgrading the grid nodes to the latest version. I am fully confident that this is the cause. If not, I'm sure I'll be mailing you back again.

Thanks,

Anthony

On Wed, Jan 27, 2010 at 7:55 AM, dom <marco.donauer at sun.com> wrote:
Hi,

Could you check the version of all of your binaries, please?
Which GE version are you using? It looks like you are mixing different GE versions.
For example, if you run a qmaster at 62u3 and call a qconf of version 62u2, you will get the same error message.

Mixing GE versions is not allowed. There might be some combinations of GE versions where it happens to work, but it is not supposed to work.

You can get the version of the client binaries with qconf -help, and the version of the qmaster is shown in the qmaster messages file. It should be one of the first entries.
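
The check described above can be scripted. What follows is only a minimal sketch: it assumes that the first line printed by "qconf -help" is the version string and that the per-architecture binaries live under $SGE_ROOT/bin/<arch>/. The function names (ge_version, list_versions, versions_match) are made up for this illustration and are not part of Grid Engine.

```sh
#!/bin/sh
# Sketch: show the version reported by every qconf binary under ROOT/bin
# so that a mismatch between architectures or hosts stands out.
# Assumption: the first line of "qconf -help" is the version string.

# ge_version PATH: print the first line of "PATH -help".
ge_version() {
    "$1" -help 2>&1 | head -n 1
}

# list_versions ROOT: print one "arch<TAB>version" line per ROOT/bin/<arch>/qconf.
list_versions() {
    for q in "$1"/bin/*/qconf; do
        [ -x "$q" ] || continue      # skip an unmatched glob or non-executables
        arch=$(basename "$(dirname "$q")")
        printf '%s\t%s\n' "$arch" "$(ge_version "$q")"
    done
}

# versions_match ROOT: succeed only if all binaries report one and the same version.
versions_match() {
    n=$(list_versions "$1" | cut -f 2 | sort -u | wc -l | tr -d ' ')
    [ "$n" -eq 1 ]
}
```

Calling list_versions "$SGE_ROOT" on the qmaster host and on each execution host, and comparing the output with the version line in the qmaster messages file, should show whether any host is running a different release.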

Regards,
Marco



On 01/26/10 18:15, anthony_whelan wrote:
Output from install_execd also contains this:

Grid Engine TCP/IP communication service
----------------------------------------

The port for sge_execd is curently set BOTH as service and by the
shell environment

   SGE_EXECD_PORT = 6445
   sge_execd service set to port 6445

   Currently SGE_EXECD_PORT = 6445 is active!

Hit <RETURN> to continue >>

Could this mean that port 6445 already has something attached to it?

I tried running netstat but was unable to get anything from it.
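
Going by its wording, the installer message above reports that the sge_execd port is defined in two places (the services database and the shell environment) and which value is active. The sketch below compares the two definitions. It assumes an /etc/services-style file with lines such as "sge_execd 6445/tcp"; the function names service_port and check_execd_port are hypothetical. It does not test whether a process is already listening on the port, which still needs netstat or a similar tool.

```sh
#!/bin/sh
# Sketch: compare the sge_execd port from a services file with $SGE_EXECD_PORT.

# service_port NAME [FILE]: print the TCP port registered for NAME.
service_port() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }               # skip comment lines
        $1 == name && $2 ~ /\/tcp$/ {           # e.g. "sge_execd 6445/tcp"
            split($2, a, "/"); print a[1]; exit
        }
    ' "${2:-/etc/services}"
}

# check_execd_port [FILE]: report the port in use, or a conflict between
# the services entry and the environment variable.
check_execd_port() {
    svc=$(service_port sge_execd "$1")
    envport=${SGE_EXECD_PORT:-}
    if [ -z "$svc" ] && [ -z "$envport" ]; then
        echo "sge_execd port is not set anywhere"
        return 2
    elif [ -n "$svc" ] && [ -n "$envport" ] && [ "$svc" != "$envport" ]; then
        echo "conflict: service=$svc environment=$envport"
        return 1
    fi
    echo "sge_execd port: ${envport:-$svc}"
}
```

With both values set to 6445, as in the output above, check_execd_port prints "sge_execd port: 6445" and returns success.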

On Tue, Jan 26, 2010 at 4:02 PM, Anthony <anthony.tux at gmail.com> wrote:
Yes, the 64-bit machine is running a 64-bit OS.
No useful info from the 64-bit qconf other than the expected output.

Anthony


On Tue, Jan 26, 2010 at 3:56 PM, reuti <reuti at staff.uni-marburg.de> wrote:
On 26.01.2010 at 15:05, anthony_whelan wrote:

> Output:
>
>
> Checking hostname resolving
> ---------------------------
>
> Cannot contact qmaster. The command failed:
>
>    ./bin/lx24-amd64/qconf -sh
>
> The error message was:
>
>    ERROR: failed receiving gdi request response for mid=1 (got
> syncron message receive timeout error).
>
> You can fix the problem now or abort the installation  procedure.
> The problem can be:
>
>    - the qmaster is not running
>    - the qmaster host is down
>    - an active firewall blocks your request
>
> Contact qmaster again (y/n) ('n' will abort) [y] >>
>
> the funny thing is that
> $SGE_ROOT/bin/lx24-amd64/qconf -sh fails yet
> $SGE_ROOT/bin/lx24-x86/qconf -sh works

SGE supports heterogeneous clusters by default, so this shouldn't be a
problem; if it is, it's an issue that needs looking into.

Is your 64 bit machine running a 64 bit OS? Is there any useful
output when calling the 64 bit qconf by hand?
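
Calling each architecture's qconf by hand can be done in one go. This is a rough sketch only: it assumes the binaries sit under $SGE_ROOT/bin/<arch>/ as in the paths quoted above, and that qconf exits with a non-zero status when it cannot reach the qmaster. The function name try_qconf_sh is made up for the example.

```sh
#!/bin/sh
# Sketch: run "qconf -sh" with every architecture's binary under ROOT/bin
# and report which ones get an answer from the qmaster.
# Assumption: qconf returns a non-zero exit status on a communication error.

# try_qconf_sh ROOT: print "<arch>: ok" or "<arch>: FAILED: <output>".
try_qconf_sh() {
    for q in "$1"/bin/*/qconf; do
        [ -x "$q" ] || continue      # skip an unmatched glob or non-executables
        arch=$(basename "$(dirname "$q")")
        if out=$("$q" -sh 2>&1); then
            echo "$arch: ok"
        else
            echo "$arch: FAILED: $out"
        fi
    done
}
```

Running try_qconf_sh "$SGE_ROOT" on the execution host puts the lx24-x86 and lx24-amd64 results side by side, together with whatever the failing binary printed.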

-- Reuti


>
>
> On Tue, Jan 26, 2010 at 2:02 PM, Anthony <anthony.tux at gmail.com> wrote:
> Yes, $SGE_ROOT/default/common is the same for all nodes.
>
>
> On Tue, Jan 26, 2010 at 1:56 PM, reuti <reuti at staff.uni-marburg.de> wrote:
> On 26.01.2010 at 14:42, anthony_whelan wrote:
>
> > No, $SGE_ROOT is installed on each node separately.
>
> You copied the $SGE_ROOT/default/common to all nodes?
>
> http://gridengine.sunsource.net/howto/nfsreduce.html
>
> -- Reuti
>
> >
> > On Tue, Jan 26, 2010 at 1:40 PM, reuti <reuti at staff.uni-marburg.de> wrote:
> > On 26.01.2010 at 13:03, anthony_whelan wrote:
> >
> > > My apologies, I forgot to include the output of qconf -sh on the
> > > execution host:
> > >
> > > ERROR: failed receiving gdi request response for mid=1 (got syncron
> > > message receive timeout error).
> > >
> > >
> > > On Tue, Jan 26, 2010 at 12:01 PM, Anthony <anthony.tux at gmail.com> wrote:
> > > Hi Harald,
> > >
> > > I have the same problem as Manjula,
> > >
> > > To answer your questions:
> > >
> > > qconf -sh on qmaster has the execution host listed
> > >
> > > qconf -sh on execution host outputs:
> > >
> > > "ignore_fqdn" is set to true in $SGE_ROOT/$SGE_CELL/common/bootstrap
> >
> > Is the $SGE_ROOT shared between all nodes?
> >
> > -- Reuti
> >
> >
> > > I too am using a 64 bit execution host connecting to a 32 bit qmaster.
> > >
> > > Any advice would be greatly appreciated.
> > >
> > > Anthony
> > >
> > > Hi Manjula,
> > >
> > > please answer these questions:
> > >
> > >
> > > Is the execution host listed in the output of
> > > # qconf -sh
> > > ?
> > >
> > >
> > > What is printed if you execute
> > > # qconf -sh
> > > on the execution host?
> > >
> > >
> > > Is "ignore_fqdn" enabled?
> > > This is defined in the $SGE_ROOT/$SGE_CELL/common/bootstrap file.
> > >
> > > Regards,
> > > Harald
> > >
> > >
> > >
> > > On 05/19/09 13:09, manjula14 wrote:
> > > > Hi,
> > > >
> > > > My SGE master host is a Fedora 8, 32 bit machine which I've configured successfully.
> > > > Now I want to add an execution host which is a Fedora 8, 64 bit machine.
> > > > While installing execd it says the exec host is not an admin host, but I've added it as an admin host using the qconf -ah command.
> > > >
> > > > (The contents of /etc/hosts and /etc/nsswitch.conf are the same on both master and exec host.)
> > > >
> > > > Kindly help to resolve this issue.
> > > >
> > > > Thanks,
> > > >
> > > > ------------------------------------------------------
> > > > http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=197363
> > > >
> > > > To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
> > >
> > >
> > > --
> > > Sun Microsystems GmbH         Harald Pollinger
> > > Dr.-Leo-Ritter-Str. 7         Sun Grid Engine Engineering
> > > D-93049 Regensburg            Phone: +49 (0)941 3075-209 (x60209)
> > > Germany                       Fax: +49 (0)941 3075-222 (x60222)
> > > http://www.sun.com/gridware
> > > mailto:harald.pollinger at sun.com
> > > Registered office:
> > > Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
> > > Munich District Court: HRB 161028
> > > Managing Directors: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
> > > Chairman of the Supervisory Board: Martin Haering
> > >
> >
> > ------------------------------------------------------
> > http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=241099
> >
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=241104
>
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=241125





--

Sun Microsystems GmbH         Marco Donauer
Dr.-Leo-Ritter-Str. 7         SUN Grid Engine Engineering
D-93049 Regensburg            Phone: +49 (0)941 3075-211  (x60211)
Germany                       Fax: +49 (0)941 3075-222  (x60222)

http://www.sun.com/gridware

mailto:marco.donauer at sun.com
Registered office:
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Munich District Court: HRB 161028
Managing Directors: Thomas Schroeder, Wolfgang Engels
Chairman of the Supervisory Board: Martin Haering




