[GE users] Can't get SGE 6.1u5 to work on Linux/PPC64

Ron Chen ron_chen_123 at yahoo.com
Thu Sep 25 23:43:45 BST 2008


Then it really looks like a communication problem. qhost is really basic (with no complex settings or other kinds of setup needed).

Since you mentioned that TARGET_64BIT is defined, I grepped the source and found a case for the LINUXAMD64 macro but none for TARGET_64BIT. I wonder whether that is right, given that AMD64 is also 64-bit.

So, one last thing that I can think of right now is in common/basis_types.h:

#if defined(FREEBSD) || defined(NETBSD) || defined(LINUXAMD64)
#  define sge_U32CFormat "%u"
#  define sge_U32CLetter "u"
#  define sge_u32c(x)  (unsigned int)(x)

#  define sge_X32CFormat "%x"
#  define sge_x32c(x)  (unsigned int)(x)
#else
...
...

In the code, add a case for "TARGET_64BIT", like:

#if defined(FREEBSD) || defined(NETBSD) || defined(LINUXAMD64) || defined(TARGET_64BIT)

Do an "aimk clean" (since the change is in a header file, the dependency tracking may not detect it) and recompile everything.
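
To make the idea concrete, here is a minimal, self-contained sketch of the same guard pattern. It is not code from the SGE tree: the DEMO_U32_FORMAT and demo_u32c macros, the demo_format_u32 helper, and the contents of the fallback branch are all my own illustrative assumptions (the real #else branch is elided above and is not reproduced here). The only point is that the format string and the cast are chosen together by the platform macros, so a platform missing from the #if line silently gets the other pair.

```c
/* Sketch only: every DEMO_/demo_ name is hypothetical, not an SGE identifier. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Same guard as the proposed change: platforms listed here print a
 * 32-bit value as "unsigned int". */
#if defined(FREEBSD) || defined(NETBSD) || defined(LINUXAMD64) || defined(TARGET_64BIT)
#  define DEMO_U32_FORMAT "%u"
#  define demo_u32c(x) ((unsigned int)(x))
#else
/* Assumed fallback for this sketch: print as "unsigned long". */
#  define DEMO_U32_FORMAT "%lu"
#  define demo_u32c(x) ((unsigned long)(x))
#endif

/* Format a 32-bit value with whichever format/cast pair the guard
 * selected. Returns the number of characters snprintf would write. */
int demo_format_u32(char *buf, size_t size, uint32_t value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, DEMO_U32_FORMAT, demo_u32c(value));
}
```

In this sketch both branches are internally consistent, so the output is the same either way; trouble would only arise if a format from one branch were paired with an argument type from the other. Compiling with -DTARGET_64BIT selects the first branch. Whether this is what actually breaks on ppc64 is something I have not verified.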

 -Ron


--- On Fri, 9/26/08, Nick Tan <nick at wehi.EDU.AU> wrote:
> doing qhost shows:
> 
> bionode01               lx24-amd64      8  0.00    7.8G  122.9M    2.0G      0.0
> bionode34               -               -     -       -       -       -        -
> 
> where bionode01 is an x86_64 node which is working and bionode34 is
> a ppc64 node which isn't working.
> 
> Nick
> 
> Rayson Ho wrote:
> > On 9/25/08, Nick Tan <nick at wehi.edu.au> wrote:
> >> It looks like it can collect the data so would that indicate a
> >> communication error then?
> > 
> > What does qhost show??
> > 
> > Rayson
> > 
> > 
> >> Thanks,
> >>
> >> Nick
> >>
> >>
> >> Chris Dagdigian wrote:
> >>> Hi Nick,
> >>>
> >>> I'm guessing that maybe the PDC part of SGE on your ppc systems is unable
> >>> to poll the apple nodes to get load and state status.
> >>> Can you try the following?
> >>>
> >>> Run the utilbin/loadcheck program on your PPC systems and see what comes
> >>> back?
> >>> Running it on my OS X intel macbook pro returns:
> >>>
> >>>
> >>>> $ /opt/sge/utilbin/darwin-x86/loadcheck
> >>>> arch            darwin-x86
> >>>> num_proc        2
> >>>> load_short      1.35
> >>>> load_medium     1.37
> >>>> load_long       1.39
> >>>> mem_free        2044.082031M
> >>>> swap_free       0.000000M
> >>>> virtual_free    2044.082031M
> >>>> mem_total       4096.000000M
> >>>> swap_total      0.000000M
> >>>> virtual_total   4096.000000M
> >>>> mem_used        2051.917969M
> >>>> swap_used       0.000000M
> >>>> virtual_used    2051.917969M
> >>>> cpu             45.5%
> >>>>
> >>>
> >>> If you can't find the equiv for your PPC/Linux setup then I think that may
> >>> be the issue (SGE is running but can't collect local performance data)
> >>> Regards,
> >>> Chris
> >>>
> >>>
> >>>
> >>>
> >>> On Sep 25, 2008, at 2:26 AM, Nick Tan wrote:
> >>>
> >>>
> >>>> Hi all,
> >>>>
> >>>> I am setting up a cluster with 33 nodes running Linux on x86_64 (SunFire
> >>>> X2100) and 40 nodes running Linux on ppc64 (Apple Xserve G5 cluster node).
> >>>> I am using the precompiled SGE binaries for the x86_64 nodes which are
> >>>> working fine.  I have compiled SGE for the PPC64 nodes.  The x86_64 nodes
> >>>> are running CentOS 5.2 and the PPC64 nodes are running Fedora 9.
> >>>> sge_execd starts on the ppc64 node but I get this in the "qstat -f
> >>>> -explain a" output
> >>>> all.q@bionode34.biocluster     BIP   0/1       -NA-     -NA-          a
> >>>>       error: no complex attribute for threshold np_load_avg
> >>>>
> >>>> What can I do to fix this?  I've searched the mailing list archives but
> >>>> couldn't find anything so I'm hoping someone will be able to help.
> >>>> Thanks,
> >>>>
> >>>> Nick
> >>>>
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >>> For additional commands, e-mail: users-help at gridengine.sunsource.net
> >>>
> >> --
> >> Nick Tan
> >> Unix Systems Manager
> >> The Walter and Eliza Hall Institute
> >> nick at wehi.edu.au
> >>
> >>
> >>
> >>
> > 
> > 
> 
> -- 
> Nick Tan
> Unix Systems Manager
> The Walter and Eliza Hall Institute
> nick at wehi.edu.au
> 


      
