[GE users] Can't get SGE 6.1u5 to work on Linux/PPC64

Nick Tan nick at wehi.EDU.AU
Thu Sep 25 22:45:58 BST 2008


Hi Chris,

I ran utilbin/loadcheck and got this:

arch            lx26-ppc64
num_proc        2
load_short      0.00
load_medium     0.00
load_long       0.00
mem_free        4130.062500M
swap_free       2047.992188M
virtual_free    6178.054688M
mem_total       4363.109375M
swap_total      2047.992188M
virtual_total   6411.101562M
mem_used        233.046875M
swap_used       0.000000M
virtual_used    233.046875M
cpu             0.3%

It looks like it can collect the data, so would that point to a
communication error instead?
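
For comparing the loadcheck output above with what the scheduler expects, here is a minimal sketch. It assumes np_load_avg is the medium load average divided by num_proc, which is my understanding of the stock complex definition. The function name np_load_from_loadcheck is hypothetical and not part of SGE; it only reads loadcheck-style "name value" lines on stdin.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SGE).
# Reads loadcheck-style "name value" lines on stdin and prints
# load_medium / num_proc with two decimals.
# Assumption: np_load_avg is the medium load average normalised by
# the number of processors.
# Prints "NA" and returns 1 if either input value is missing.
np_load_from_loadcheck() {
    awk '
        $1 == "num_proc"    { n = $2 }
        $1 == "load_medium" { l = $2; seen = 1 }
        END {
            if (n > 0 && seen) {
                printf "%.2f\n", l / n
            } else {
                print "NA"
                exit 1
            }
        }
    '
}

# Example use on an execution host (the path is an assumption based on
# the arch string reported above; adjust to the local install):
#   "$SGE_ROOT/utilbin/lx26-ppc64/loadcheck" | np_load_from_loadcheck
```

If this prints a sensible number on the ppc64 node while qstat still shows -NA- for the queue instance, the local collection is probably fine and the load report is not reaching, or not being accepted by, the qmaster.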

Thanks,

Nick

Chris Dagdigian wrote:
> Hi Nick,
> 
> I'm guessing that the PDC part of SGE on your PPC systems is 
> unable to poll the Apple nodes to get load and state status.
> 
> Can you try the following?
> 
> Run the utilbin/loadcheck program on your PPC systems and see what comes 
> back?
> 
> Running it on my OS X Intel MacBook Pro returns:
> 
>> $ /opt/sge/utilbin/darwin-x86/loadcheck
>> arch            darwin-x86
>> num_proc        2
>> load_short      1.35
>> load_medium     1.37
>> load_long       1.39
>> mem_free        2044.082031M
>> swap_free       0.000000M
>> virtual_free    2044.082031M
>> mem_total       4096.000000M
>> swap_total      0.000000M
>> virtual_total   4096.000000M
>> mem_used        2051.917969M
>> swap_used       0.000000M
>> virtual_used    2051.917969M
>> cpu             45.5%
> 
> 
> If you can't find the equivalent for your PPC/Linux setup then I think that 
> may be the issue (SGE is running but can't collect local performance data).
> 
> Regards,
> Chris
> 
> 
> 
> 
> On Sep 25, 2008, at 2:26 AM, Nick Tan wrote:
> 
>> Hi all,
>>
>> I am setting up a cluster with 33 nodes running Linux on x86_64 
>> (SunFire X2100) and 40 nodes running Linux on ppc64 (Apple Xserve G5 
>> cluster node).
>>
>> I am using the precompiled SGE binaries for the x86_64 nodes which are 
>> working fine.  I have compiled SGE for the PPC64 nodes.  The x86_64 
>> nodes are running CentOS 5.2 and the PPC64 nodes are running Fedora 9.
>>
>> sge_execd starts on the ppc64 node but I get this in the "qstat -f 
>> -explain a" output
>>
>> all.q@bionode34.biocluster       BIP   0/1       -NA-     -NA-          a
>>        error: no complex attribute for threshold np_load_avg
>>
>> What can I do to fix this?  I've searched the mailing list archives 
>> but couldn't find anything so I'm hoping someone will be able to help.
>>
>> Thanks,
>>
>> Nick
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 

-- 
Nick Tan
Unix Systems Manager
The Walter and Eliza Hall Institute
nick at wehi.edu.au

