[GE users] Can't get SGE 6.1u5 to work on Linux/PPC64

Ron Chen ron_chen_123 at yahoo.com
Thu Sep 25 23:02:56 BST 2008


Then, can you run:

% scripts/compilearch -t

and

% scripts/compilearch -t lx26-ppc64
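
As background on the attribute named in the quoted error: np_load_avg is, as far as I know, the load average divided by the processor count (num_proc). A minimal sketch of that arithmetic follows, assuming the two-column "name value" layout of the loadcheck output quoted further down. The helper names parse_loadcheck and np_load are hypothetical and not part of SGE, and using load_medium as the figure to normalise is an assumption.

```python
def parse_loadcheck(text):
    """Parse 'name value' lines (loadcheck-style output) into a dict of strings."""
    values = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            values[parts[0]] = parts[1].strip()
    return values


def np_load(values, key="load_medium"):
    """Divide one load figure by num_proc (assumed basis of the np_load_* values)."""
    return float(values[key]) / int(values["num_proc"])


# Sample copied from the darwin-x86 loadcheck output quoted in this thread.
SAMPLE = """\
arch            darwin-x86
num_proc        2
load_short      1.35
load_medium     1.37
load_long       1.39
"""

values = parse_loadcheck(SAMPLE)
print(values["arch"], round(np_load(values), 3))
```

Feeding it the lx26-ppc64 output instead would only confirm that the raw inputs exist on that host. It says nothing about whether the execd actually reports them to the qmaster.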

 -Ron


--- On Fri, 9/26/08, Nick Tan <nick at wehi.EDU.AU> wrote:
> Yes, all machines are running 6.1u5.
> 
> Thanks,
> 
> Nick
> 
> Ron Chen wrote:
> > Are all machines running exactly the same version, namely SGE 6.1u5?
> > 
> >  -Ron
> > 
> > 
> > --- On Fri, 9/26/08, Nick Tan <nick at wehi.EDU.AU> wrote:
> >> I ran utilbin/loadcheck and got this:
> >>
> >> arch            lx26-ppc64
> >> num_proc        2
> >> load_short      0.00
> >> load_medium     0.00
> >> load_long       0.00
> >> mem_free        4130.062500M
> >> swap_free       2047.992188M
> >> virtual_free    6178.054688M
> >> mem_total       4363.109375M
> >> swap_total      2047.992188M
> >> virtual_total   6411.101562M
> >> mem_used        233.046875M
> >> swap_used       0.000000M
> >> virtual_used    233.046875M
> >> cpu             0.3%
> >>
> >> It looks like it can collect the data, so would that indicate a
> >> communication error then?
> >>
> >> Thanks,
> >>
> >> Nick
> >>
> >> Chris Dagdigian wrote:
> >>> Hi Nick,
> >>>
> >>> I'm guessing that maybe the PDC part of SGE on your ppc systems is
> >>> unable to poll the Apple nodes to get load and state status.
> >>> Can you try the following?
> >>>
> >>> Run the utilbin/loadcheck program on your PPC systems and see what
> >>> comes back.
> >>>
> >>> Running it on my OS X Intel MacBook Pro returns:
> >>>
> >>>> $ /opt/sge/utilbin/darwin-x86/loadcheck
> >>>> arch            darwin-x86
> >>>> num_proc        2
> >>>> load_short      1.35
> >>>> load_medium     1.37
> >>>> load_long       1.39
> >>>> mem_free        2044.082031M
> >>>> swap_free       0.000000M
> >>>> virtual_free    2044.082031M
> >>>> mem_total       4096.000000M
> >>>> swap_total      0.000000M
> >>>> virtual_total   4096.000000M
> >>>> mem_used        2051.917969M
> >>>> swap_used       0.000000M
> >>>> virtual_used    2051.917969M
> >>>> cpu             45.5%
> >>>
> >>> If you can't find the equivalent for your PPC/Linux setup then I think
> >>> that may be the issue (SGE is running but can't collect local
> >>> performance data).
> >>> Regards,
> >>> Chris
> >>>
> >>>
> >>>
> >>>
> >>> On Sep 25, 2008, at 2:26 AM, Nick Tan wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> I am setting up a cluster with 33 nodes running Linux on x86_64
> >>>> (SunFire X2100) and 40 nodes running Linux on ppc64 (Apple Xserve G5
> >>>> cluster node).
> >>>>
> >>>> I am using the precompiled SGE binaries for the x86_64 nodes, which are
> >>>> working fine.  I have compiled SGE for the PPC64 nodes.  The x86_64
> >>>> nodes are running CentOS 5.2 and the PPC64 nodes are running Fedora 9.
> >>>>
> >>>> sge_execd starts on the ppc64 node but I get this in the
> >>>> "qstat -f -explain a" output:
> >>>>
> >>>> all.q@bionode34.biocluster     BIP   0/1       -NA-     -NA-          a
> >>>>        error: no complex attribute for threshold np_load_avg
> >>>>
> >>>> What can I do to fix this?  I've searched the mailing list archives
> >>>> but couldn't find anything so I'm hoping someone will be able to help.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Nick
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >>> For additional commands, e-mail: users-help at gridengine.sunsource.net
> >> -- 
> >> Nick Tan
> >> Unix Systems Manager
> >> The Walter and Eliza Hall Institute
> >> nick at wehi.edu.au
> >>
> >>
> >>
> > 
> > 
> >       
> > 
> >
> > 
> 
> -- 
> Nick Tan
> Unix Systems Manager
> The Walter and Eliza Hall Institute
> nick at wehi.edu.au
> 


      




