[GE users] Can't get SGE 6.1u5 to work on Linux/PPC64

Nick Tan nick at wehi.EDU.AU
Thu Sep 25 23:09:12 BST 2008


Hi Ron,

Running both of those commands results in the same output:

TARGET_64BIT
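
For anyone reproducing the loadcheck comparison quoted below, here is a minimal sketch of how the output could be checked by script. It assumes the output is plain whitespace-separated "name value" lines as shown in this thread, and that np_load_avg corresponds to the medium load average divided by num_proc. The helper names parse_loadcheck and normalized_load are hypothetical and not part of SGE.

```python
def parse_loadcheck(text):
    """Parse 'name value' lines from loadcheck-style output into a dict."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only lines that look like a single name/value pair.
        if len(parts) == 2:
            values[parts[0]] = parts[1]
    return values


def normalized_load(values):
    """Return load_medium / num_proc, or None if either value is unusable.

    Assumption: this mirrors how np_load_avg is derived; check against
    your own installation before relying on it.
    """
    try:
        return float(values["load_medium"]) / int(values["num_proc"])
    except (KeyError, ValueError, ZeroDivisionError):
        return None


# Sample taken from the lx26-ppc64 output quoted in this thread.
sample = """\
arch            lx26-ppc64
num_proc        2
load_short      0.00
load_medium     0.00
load_long       0.00
"""

values = parse_loadcheck(sample)
print(values.get("arch"), normalized_load(values))
```

If normalized_load returns None on a host, either num_proc or load_medium is missing from the output, which would point at local data collection rather than communication.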

Nick

Ron Chen wrote:
> Then, can you run:
> 
> % scripts/compilearch -t
> 
> and
> 
> % scripts/compilearch -t lx26-ppc64
> 
>  -Ron
> 
> 
> --- On Fri, 9/26/08, Nick Tan <nick at wehi.EDU.AU> wrote:
>> Yes, all machines are running 6.1u5.
>>
>> Thanks,
>>
>> Nick
>>
>> Ron Chen wrote:
>>> Are all machines running exactly the same version, namely SGE 6.1u5?
>>>  -Ron
>>>
>>>
>>> --- On Fri, 9/26/08, Nick Tan <nick at wehi.EDU.AU> wrote:
>>>> I ran utilbin/loadcheck and got this:
>>>>
>>>> arch            lx26-ppc64
>>>> num_proc        2
>>>> load_short      0.00
>>>> load_medium     0.00
>>>> load_long       0.00
>>>> mem_free        4130.062500M
>>>> swap_free       2047.992188M
>>>> virtual_free    6178.054688M
>>>> mem_total       4363.109375M
>>>> swap_total      2047.992188M
>>>> virtual_total   6411.101562M
>>>> mem_used        233.046875M
>>>> swap_used       0.000000M
>>>> virtual_used    233.046875M
>>>> cpu             0.3%
>>>>
>>>> It looks like it can collect the data so would that indicate a
>>>> communication error then?
>>>>
>>>> Thanks,
>>>>
>>>> Nick
>>>>
>>>> Chris Dagdigian wrote:
>>>>> Hi Nick,
>>>>>
>>>>> I'm guessing that maybe the PDC part of SGE on your PPC systems is
>>>>> unable to poll the Apple nodes to get load and state status.
>>>>>
>>>>> Can you try the following?
>>>>>
>>>>> Run the utilbin/loadcheck program on your PPC systems and see what comes
>>>>> back.
>>>>>
>>>>> Running it on my OS X Intel MacBook Pro returns:
>>>>>> $ /opt/sge/utilbin/darwin-x86/loadcheck
>>>>>> arch            darwin-x86
>>>>>> num_proc        2
>>>>>> load_short      1.35
>>>>>> load_medium     1.37
>>>>>> load_long       1.39
>>>>>> mem_free        2044.082031M
>>>>>> swap_free       0.000000M
>>>>>> virtual_free    2044.082031M
>>>>>> mem_total       4096.000000M
>>>>>> swap_total      0.000000M
>>>>>> virtual_total   4096.000000M
>>>>>> mem_used        2051.917969M
>>>>>> swap_used       0.000000M
>>>>>> virtual_used    2051.917969M
>>>>>> cpu             45.5%
>>>>>
>>>>> If you can't find the equivalent for your PPC/Linux setup, then I think that
>>>>> may be the issue (SGE is running but can't collect local performance data).
>>>>>
>>>>> Regards,
>>>>> Chris
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sep 25, 2008, at 2:26 AM, Nick Tan wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am setting up a cluster with 33 nodes running Linux on x86_64
>>>>>> (SunFire X2100) and 40 nodes running Linux on ppc64 (Apple Xserve G5
>>>>>> cluster node).
>>>>>>
>>>>>> I am using the precompiled SGE binaries for the x86_64 nodes, which are
>>>>>> working fine.  I have compiled SGE for the PPC64 nodes.  The x86_64
>>>>>> nodes are running CentOS 5.2 and the PPC64 nodes are running Fedora 9.
>>>>>>
>>>>>> sge_execd starts on the ppc64 node, but I get this in the
>>>>>> "qstat -f -explain a" output:
>>>>>>
>>>>>> all.q at bionode34.biocluster     BIP   0/1       -NA-     -NA-          a
>>>>>>        error: no complex attribute for threshold np_load_avg
>>>>>>
>>>>>> What can I do to fix this?  I've searched the mailing list archives
>>>>>> but couldn't find anything so I'm hoping someone will be able to help.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Nick
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>> -- 
>>>> Nick Tan
>>>> Unix Systems Manager
>>>> The Walter and Eliza Hall Institute
>>>> nick at wehi.edu.au

-- 
Nick Tan
Unix Systems Manager
The Walter and Eliza Hall Institute
nick at wehi.edu.au

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



