[GE users] Can't get SGE 6.1u5 to work on Linux/PPC64

Nick Tan nick at wehi.EDU.AU
Fri Sep 26 02:53:37 BST 2008


Hi Ron,

I've done as you suggested and recompiled but I am seeing the same 
behaviour as before.

Nick

Ron Chen wrote:
> Then it really looks like a communication problem. qhost is really basic (with no complex settings or other kinds of setup needed).
> 
> As you mentioned that TARGET_64BIT is defined, I grepped the source and found that there is a case for the LINUXAMD64 macro but not for TARGET_64BIT. I am wondering whether that is right, since AMD64 is also 64-bit.
> 
> So, one last thing that I can think of right now is in common/basis_types.h:
> 
> #if defined(FREEBSD) || defined(NETBSD) || defined(LINUXAMD64)
> #  define sge_U32CFormat "%u"
> #  define sge_U32CLetter "u"
> #  define sge_u32c(x)  (unsigned int)(x)
> 
> #  define sge_X32CFormat "%x"
> #  define sge_x32c(x)  (unsigned int)(x)
> #else
> ...
> ...
> 
> In the code,  add a case for "TARGET_64BIT", like:
> 
> #if defined(FREEBSD) || defined(NETBSD) || defined(LINUXAMD64) || defined(TARGET_64BIT)
> 
> Do an "aimk clean" first (since the change is in a header file, the dependency tracking may not pick it up) and then recompile everything.
> 
>  -Ron
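
A minimal sketch of what the suggested change does, for anyone following the thread. This is not the actual SGE header: the demo_* names are hypothetical stand-ins for the sge_* macros quoted above, and the #else branch (elided as "..." in the quoted snippet) is assumed here to use an unsigned long format purely for illustration.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the sge_U32CFormat / sge_u32c pair.
 * The first branch mirrors the condition proposed in the thread,
 * with TARGET_64BIT added to the list.
 */
#if defined(FREEBSD) || defined(NETBSD) || defined(LINUXAMD64) || defined(TARGET_64BIT)
#  define demo_U32CFormat "%u"
#  define demo_u32c(x)  (unsigned int)(x)
#else
/* Assumption: the real #else branch is not shown in the thread. */
#  define demo_U32CFormat "%lu"
#  define demo_u32c(x)  (unsigned long)(x)
#endif

/*
 * Format a 32-bit unsigned value into buf. The format string and the
 * cast always come from the same branch, so they stay matched.
 * Returns the number of characters snprintf reports.
 */
int demo_format_u32(char *buf, size_t len, uint32_t value)
{
    return snprintf(buf, len, demo_U32CFormat, demo_u32c(value));
}
```

Compiling with -DTARGET_64BIT selects the first branch and compiling without it selects the second. A caller passes its own buffer, for example demo_format_u32(buf, sizeof buf, 42). Either way the printed digits are the same; the point of the paired macros is only that the format specifier and the cast agree on the argument width for the platform.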
> 
> 
> --- On Fri, 9/26/08, Nick Tan <nick at wehi.EDU.AU> wrote:
>> doing qhost shows:
>>
>> bionode01               lx24-amd64      8  0.00    7.8G  122.9M    2.0G     0.0
>> bionode34               -               -     -       -       -       -       -
>>
>> where bionode01 is an x86_64 node which is working and bionode34 is
>> a ppc64 node which isn't working.
>>
>> Nick
>>
>> Rayson Ho wrote:
>>> On 9/25/08, Nick Tan <nick at wehi.edu.au> wrote:
>>>> It looks like it can collect the data, so would that indicate a
>>>> communication error then?
>>> What does qhost show??
>>>
>>> Rayson
>>>
>>>
>>>> Thanks,
>>>>
>>>> Nick
>>>>
>>>>
>>>> Chris Dagdigian wrote:
>>>>> Hi Nick,
>>>>>
>>>>> I'm guessing that maybe the PDC part of SGE on your ppc systems is unable
>>>>> to poll the Apple nodes to get load and state status.
>>>>>
>>>>> Can you try the following?
>>>>>
>>>>> Run the utilbin/loadcheck program on your PPC systems and see what comes
>>>>> back.
>>>>>
>>>>> Running it on my OS X Intel MacBook Pro returns:
>>>>>
>>>>>> $ /opt/sge/utilbin/darwin-x86/loadcheck
>>>>>> arch            darwin-x86
>>>>>> num_proc        2
>>>>>> load_short      1.35
>>>>>> load_medium     1.37
>>>>>> load_long       1.39
>>>>>> mem_free        2044.082031M
>>>>>> swap_free       0.000000M
>>>>>> virtual_free    2044.082031M
>>>>>> mem_total       4096.000000M
>>>>>> swap_total      0.000000M
>>>>>> virtual_total   4096.000000M
>>>>>> mem_used        2051.917969M
>>>>>> swap_used       0.000000M
>>>>>> virtual_used    2051.917969M
>>>>>> cpu             45.5%
>>>>>>
>>>>> If you can't find the equivalent for your PPC/Linux setup then I think that
>>>>> may be the issue (SGE is running but can't collect local performance data).
>>>>>
>>>>> Regards,
>>>>> Chris
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sep 25, 2008, at 2:26 AM, Nick Tan wrote:
>>>>>
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am setting up a cluster with 33 nodes running Linux on x86_64 (SunFire
>>>>>> X2100) and 40 nodes running Linux on ppc64 (Apple Xserve G5 cluster node).
>>>>>>
>>>>>> I am using the precompiled SGE binaries for the x86_64 nodes which are
>>>>>> working fine.  I have compiled SGE for the PPC64 nodes.  The x86_64 nodes
>>>>>> are running CentOS 5.2 and the PPC64 nodes are running Fedora 9.
>>>>>>
>>>>>> sge_execd starts on the ppc64 node but I get this in the "qstat -f
>>>>>> -explain a" output
>>>>>>
>>>>>> all.q at bionode34.biocluster     BIP   0/1       -NA-     -NA-          a
>>>>>>       error: no complex attribute for threshold np_load_avg
>>>>>>
>>>>>> What can I do to fix this?  I've searched the mailing list archives but
>>>>>> couldn't find anything so I'm hoping someone will be able to help.
>>>>>> Thanks,
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>> --
>>>> Nick Tan
>>>> Unix Systems Manager
>>>> The Walter and Eliza Hall Institute
>>>> nick at wehi.edu.au

-- 
Nick Tan
Unix Systems Manager
The Walter and Eliza Hall Institute
nick at wehi.edu.au

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



