[GE users] sge_execd says it starts but it doesn't start

futurity neil at futurity.co.uk
Tue Apr 27 17:06:45 BST 2010



Hi rems0,

Thank you for your quick reply.

We're really happy with openSUSE 10.3, as we've found it to be bug-free and very stable.  We've had some issues with openSUSE 11.0 and 11.1, which is why we're still using 10.3. As these servers aren't on the internet and we have our own local copy of the update repository, we've found it to be a very nice OS, but I agree there are newer, perhaps better, Linux distros out there.

openSUSE 10.3 worked perfectly for Grid Engine 6.1u3, although this doesn't automatically mean it'll work for 6.2u5.  I thought openSUSE 10.3 met the requirements for Grid Engine 6.2u5.  Is there a known working Linux distribution that is recommended?

Running the following gives:

ldd /rmt/sge62/bin/lx24-x86/sge_execd

        linux-gate.so.1 =>  (0xffffe000)
        libdl.so.2 => /lib/libdl.so.2 (0xb7f80000)
        libm.so.6 => /lib/libm.so.6 (0xb7f5b000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xb7f44000)
        libcore.so => /rmt/sge62/bin/lx24-x86/../../lib/lx24-x86/libcore.so (0xb7f42000)
        libc.so.6 => /lib/libc.so.6 (0xb7e0f000)
        /lib/ld-linux.so.2 (0xb7fa1000)

Unfortunately I don't understand this output.  Is it ok?

sge_execd doesn't appear to have logged anything to /var/log/messages :(

Many thanks for your help.

Neil

On 27 April 2010 16:49, rems0 <Richard.Ems at cape-horn-eng.com> wrote:
Hi Neil,

openSUSE 10.3 is really old and was discontinued on October 31st 2009,
see http://en.opensuse.org/SUSE_Linux_Lifetime#Discontinued_Distributions.


What does " ldd /rmt/sge62/bin/lx24-x86/sge_execd " report ?
Any message in /var/log/messages ?
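
For reading the ldd output, the main thing to look for is any line that says "not found": that is a shared library the dynamic linker could not resolve, and the binary will not load without it. A minimal sketch of that check follows; the helper name missing_libs is made up for illustration and is not part of Grid Engine.

```shell
# Sketch only: list shared libraries that ldd reports as unresolved.
# "missing_libs" is a hypothetical helper name, not a Grid Engine tool.
missing_libs() {
    # ldd marks an unresolved dependency as "libfoo.so => not found";
    # print just the library name (first field) of each such line.
    awk '/not found/ { print $1 }'
}

# Usage on the execution host (path taken from this thread):
#   ldd /rmt/sge62/bin/lx24-x86/sge_execd | missing_libs
```

If this prints nothing, every library was resolved and the problem is probably somewhere else.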

Richard

On 04/27/2010 05:29 PM, futurity wrote:
> Hi,
>
> I'm in the process of installing a new grid with the aim of migrating
> machines from our 61 grid to 62u5.
>
> Unfortunately the sge_execd process doesn't seem to start on our
> execution host machines.
>
> The qmaster installed without any problems (on openSuse 10.3 32bit) and
> when started using "/etc/init.d/sgemaster.p6444 start" the process works
> fine.  qstat, qhost etc all work fine.
>
> The sge_execd installed without any problems (again on openSuse 10.3
> 32bit) and when started using "/etc/init.d/sgeexecd.p6444 start" it says
> it started, but the process just isn't running.  qhost lists the new
> execution host, but with dashes against the new host (not the details as
> expected).
>
> I've even tried running "/rmt/sge62/bin/lx24-x86/sge_execd" as user
> sgeadmin62 (with the correct environment) and no errors are reported,
> but again the process isn't running.
>
> The only non default value used during the sge_execd install was the
> spool directory for which I entered "/local".  I had previously made a
> directory "/local" on the local disk and chmod'ed it to 777 (still owned
> by root).  Again it said this was fine, but sge_execd didn't actually
> make any sub directories or log any messages to files within it (during
> the install stage or while being run).
>
> Any idea what could be going on?  Is there a way to turn on any debug
> for sge_execd so I can see what's going on?
>
> Kind Regards
>
> Neil


--
Richard Ems       mail: Richard.Ems at Cape-Horn-Eng.com

Cape Horn Engineering S.L.
C/ Dr. J.J. Dómine 1, 5º piso
46011 Valencia
Tel : +34 96 3242923 / Fax 924
http://www.cape-horn-eng.com

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=255144
