[GE users] Sgemaster won't start after upgrading

reuti reuti at staff.uni-marburg.de
Mon Feb 22 22:04:16 GMT 2010


On 22.02.2010 at 22:54, heywood wrote:

> Looks to me like the util/arch script uses uname to get the lx26 (kernel is
> 2.6.*), not by looking for the directory lx26* (or lx24*).
>
> If so, the question is why it returned lx24 for 6.2u3 (if it did on this 2.6
> system).

The current script uses:

    case $osrelease in
    2.[46].*)
       # retrieve os release. We use 2.4 on kernel 2.6 machines, unless
       # we have binaries installed that have been built for 2.6
       case $osrelease in
       2.4.*)
          lxrelease=24
          ;;
       2.6.*)
          ROOT_DIR=`dirname $0`/..
          if [ "$SGE_ROOT" != "" -a -d "$SGE_ROOT/bin/lx26-${lxmachine}" ] ; then
             lxrelease=26
          elif [ "$SGE_ROOT" = "" -a -d "$ROOT_DIR/bin/lx26-${lxmachine}" ] ; then
             lxrelease=26
          else
             lxrelease=24
          fi
          ;;
       esac
...
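
To see how that branch behaves with and without an lx26-* entry, here is a minimal sketch that can be run outside a real installation. The function name guess_lxrelease and the throwaway directory are made up for illustration, and the two SGE_ROOT/ROOT_DIR branches of the real script are collapsed into one explicit root argument, so this is not the shipped util/arch:

```shell
#!/bin/sh
# Hypothetical helper, NOT part of SGE: mirrors the 2.4/2.6 decision quoted above.
#   $1 = kernel release (e.g. the output of `uname -r`)
#   $2 = machine suffix (e.g. amd64)
#   $3 = directory to treat as the SGE root (assumption: replaces SGE_ROOT/ROOT_DIR)
guess_lxrelease() {
    osrelease=$1
    lxmachine=$2
    root=$3
    case $osrelease in
    2.4.*)
        echo 24
        ;;
    2.6.*)
        # test -d follows symlinks, so a link lx26-amd64 -> lx24-amd64
        # is enough to make this branch answer 26.
        if [ -d "$root/bin/lx26-$lxmachine" ]; then
            echo 26
        else
            echo 24
        fi
        ;;
    *)
        echo unknown
        ;;
    esac
}

# Demo in a scratch directory laid out like the courtesy binaries.
demo=$(mktemp -d)
mkdir -p "$demo/bin/lx24-amd64"
guess_lxrelease 2.6.9-42.0.3.ELsmp amd64 "$demo"    # prints 24
ln -s lx24-amd64 "$demo/bin/lx26-amd64"
guess_lxrelease 2.6.9-42.0.3.ELsmp amd64 "$demo"    # prints 26
rm -rf "$demo"
```

With only the lx24-amd64 directory present the sketch answers 24 on a 2.6 kernel; once the symlink exists it answers 26, which matches what the symlink workaround in this thread produced.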

Did you also install the updated common package?
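
One rough way to check which arch script is really in place is to look for the directory test in it. This is only a sketch: the function name arch_has_dir_check is made up, and it assumes that util/arch comes from the common package and that an updated script contains the literal string bin/lx26- as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical check, NOT an SGE tool: succeeds if the given arch script
# contains the lx26 directory test shown in the excerpt.
arch_has_dir_check() {
    grep -q 'bin/lx26-' "$1"
}

# Falls back to a path that cannot exist when SGE_ROOT is unset.
script="${SGE_ROOT:-/nonexistent}/util/arch"
if [ -r "$script" ]; then
    if arch_has_dir_check "$script"; then
        echo "$script tests for a bin/lx26-* directory"
    else
        echo "$script has no bin/lx26-* test; it may be a different version"
    fi
else
    echo "cannot read $script; is SGE_ROOT set?"
fi
```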

-- Reuti


> But things are working so I'm OK.
>
> Todd
>
>
> On 2/22/10 4:35 PM, "reuti" <reuti at staff.uni-marburg.de> wrote:
>
>> On 22.02.2010 at 22:18, heywood wrote:
>>
>>> No, it isn't hard coded. It returns lx26, while the directories are
>>> named lx24...
>>
>> The current version of the arch script checks whether there is a
>> directory lx26-... As you created links to the dirs, it will answer
>> with lx26... But without the links, it should fall back to the default
>> lx24...
>>
>> So the question remains why the current version of the script
>> answered lx26... although there were no links or dirs in the beginning.
>>
>> -- Reuti
>>
>>
>>> [root@bhmnode2 ~]# $SGE_ROOT/util/arch
>>> lx26-amd64
>>> [root@bhmnode2 ~]# uname -a
>>> Linux bhmnode2.cshl.edu 2.6.9-42.0.3.ELsmp #1 SMP Mon Sep 25 17:24:31 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux
>>> [root@bhmnode2 ~]# ls -l $SGE_ROOT/bin
>>> total 96
>>> drwxr-xr-x  2 root root 4096 Feb 22 10:55 lx24-amd64
>>> lrwxrwxrwx  1 root root   10 Feb 22 11:06 lx26-amd64 -> lx24-amd64
>>> [root@bhmnode2 ~]#
>>>
>>> (I defined that symlink to get things running this morning)
>>>
>>>
>>> On 2/22/10 4:03 PM, "reuti" <reuti at staff.uni-marburg.de> wrote:
>>>
>>>> On 22.02.2010 at 18:54, heywood wrote:
>>>>
>>>>> No, we have not compiled SGE, but have used courtesy binaries all along.
>>>>>
>>>>> The /etc/init.d/{sgemaster,sgeexecd} scripts (which are from installing
>>>>> 6.2u3 last summer) are looking for lx26-*. But the utilbin and bin
>>>>> directory names are lx24-*.
>>>>
>>>> You mean it's hardcoded in the script? AFAIK it always used the arch
>>>> script in $SGE_ROOT/util/arch by default to determine the platform
>>>> it's running on. This should also return lx24-amd64 on your system.
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>> Todd
>>>>>
>>>>>
>>>>> On 2/22/10 12:41 PM, "reuti" <reuti at staff.uni-marburg.de> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 22.02.2010 at 17:28, heywood wrote:
>>>>>>
>>>>>>> Well. For some reason the directory in $SGE_ROOT/utilbin and
>>>>>>> $SGE_ROOT/bin was "lx24-amd64", and the script was looking for
>>>>>>> "lx26-amd64". We are running kernel 2.6 and always have so I don't
>>>>>>> know where that lx24* directory name came from.
>>>>>>
>>>>>> The lx24-* name refers to the minimum kernel supported by the provided
>>>>>> binaries, which will also work fine under kernel 2.6. But when you
>>>>>> build SGE on your own on a 2.6 system, the created directories will be
>>>>>> named according to the kernel version it found, i.e. you get lx26-*.
>>>>>> Did you compile it on your own?
>>>>>>
>>>>>> -- Reuti
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Anyways I just created a symlink lx26-amd64 -> lx24-amd64, and SGE
>>>>>>> started up.
>>>>>>>
>>>>>>> Really weird.
>>>>>>>
>>>>>>> Todd
>>>>>>>
>>>>>>>
>>>>>>> On 2/22/10 10:32 AM, "heywood" <heywood at cshl.edu> wrote:
>>>>>>>
>>>>>>>> No I did not.
>>>>>>>>
>>>>>>>> I followed the patch instructions. I renamed the sge_shepherd
>>>>>>>> with "mv" and unpacked these tar.gz files:
>>>>>>>>
>>>>>>>>  ge-6.2u5-bin-lx24-amd64.tar.gz
>>>>>>>>  ge-6.2u5-common.tar.gz
>>>>>>>>  hedeby-1.0u5-core.tar.gz
>>>>>>>>
>>>>>>>> Then I tried restarting qmaster
>>>>>>>>
>>>>>>>> Todd
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/22/10 10:25 AM, "craffi" <dag at sonsorol.org> wrote:
>>>>>>>>
>>>>>>>>> The "can't find path" error is significant. Did you (or the init
>>>>>>>>> script) source or run the settings.sh|csh files to set up the SGE
>>>>>>>>> environment before trying to restart the qmaster?
>>>>>>>>>
>>>>>>>>> -Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> heywood wrote:
>>>>>>>>>> I upgraded from 6.2u3 to 6.2u5, and now sgemaster will not start:
>>>>>>>>>>
>>>>>>>>>> [root@bhmnode2 sge]# /etc/init.d/sgemaster.bh
>>>>>>>>>> can't determine path to Grid Engine utility binaries
>>>>>>>>>> [root@bhmnode2 sge]#
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------
>>>>>>>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=245435
>>>>>>>>>
>>>>>>>>> To unsubscribe from this discussion, e-mail:
>>>>>>>>> [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=245486

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


