[GE users] Sgemaster won't start after upgrading

heywood heywood at cshl.edu
Mon Feb 22 22:09:05 GMT 2010


But of course the old 6.2u3 arch script is now returning lx26 since I have
that symlink defined!
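
For anyone following along, the effect of the symlink can be reproduced outside a real installation. This is only a minimal sketch, assuming the 2.6.* branch quoted further down is the logic in play. The helper name pick_lxrelease and the temporary directory layout are made up for illustration and are not part of SGE, and the sketch skips the SGE_ROOT-unset case that the real script handles via ROOT_DIR.

```shell
#!/bin/sh
# Hypothetical helper mirroring the 2.6.* branch of the quoted arch script:
# report 26 only if a bin/lx26-<machine> directory exists, otherwise 24.
pick_lxrelease() {
    root=$1        # stand-in for $SGE_ROOT
    lxmachine=$2   # e.g. amd64
    if [ -d "$root/bin/lx26-$lxmachine" ]; then
        echo 26
    else
        echo 24
    fi
}

# Throwaway layout that looks like the courtesy binaries: only lx24-amd64.
demo=$(mktemp -d)
mkdir -p "$demo/bin/lx24-amd64"
echo "before symlink: lx$(pick_lxrelease "$demo" amd64)-amd64"

# "test -d" follows symlinks, so the link alone is enough to flip the answer.
ln -s lx24-amd64 "$demo/bin/lx26-amd64"
echo "after symlink:  lx$(pick_lxrelease "$demo" amd64)-amd64"

rm -rf "$demo"
```

The first line printed should name lx24-amd64 and the second lx26-amd64, which matches what I see from the old script now that the link is in place.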

There are significant diffs between the u3 and u5 arch scripts though.
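
Since the two script versions can disagree, one check that may be worth running after an upgrade is whether the directory named by the arch script actually exists under both bin and utilbin. The following is a rough sketch, not how the init scripts do it. The function name check_arch_dirs is hypothetical, and it assumes the layout shown later in this thread ($SGE_ROOT/bin/<arch> and $SGE_ROOT/utilbin/<arch>).

```shell
#!/bin/sh
# Hypothetical check: do bin/<arch> and utilbin/<arch> exist under the root?
# Prints each missing path and returns non-zero if any is missing.
check_arch_dirs() {
    root=$1
    arch=$2
    missing=0
    for sub in bin utilbin; do
        if [ ! -d "$root/$sub/$arch" ]; then
            echo "missing: $root/$sub/$arch"
            missing=1
        fi
    done
    return $missing
}

# Usage against a real installation; skipped when SGE_ROOT is not set
# or the arch script is not executable.
if [ -n "${SGE_ROOT:-}" ] && [ -x "$SGE_ROOT/util/arch" ]; then
    check_arch_dirs "$SGE_ROOT" "$("$SGE_ROOT/util/arch")"
fi
```

On a host where the arch script reports lx26-amd64 but only lx24-amd64 directories were unpacked, this would list both lx26-amd64 paths as missing.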


On 2/22/10 5:04 PM, "heywood" <heywood at cshl.edu> wrote:

> Yes, I used the updated common (6.2u5). I just tested the old 6.2u3 "arch"
> script and it returns lx26.
> 
> Here are the relevant diffs between 6.2u3 "arch" and 6.2u5 "arch":
> 
> <    2.[46].*)
> <       # retrieve os release. We use 2.4 on kernel 2.6 machines, unless
> <       # we have binaries installed that have been built for 2.6
> <       case $osrelease in
> <       2.4.*) 
> <          lxrelease=24
> <          ;;
> <       2.6.*) 
> <          ROOT_DIR=`dirname $0`/..
> <          if [ "$SGE_ROOT" != "" -a -d "$SGE_ROOT/bin/lx26-${lxmachine}" ] ; then
> <             lxrelease=26
> <          elif [ "$SGE_ROOT" = "" -a -d "$ROOT_DIR/bin/lx26-${lxmachine}" ] ; then
> <             lxrelease=26
> <          else
> <             lxrelease=24
> <          fi
> <          ;;
> <       esac
> < 
> <       # verify the GNU C lib version
> <       # For an alternative means to determine GNU C lib version see
> <       # http://www.gnu.org/software/libc/FAQ.html#s-4.9
> ---
>>    2.2.*)
>>       lxrelease=22
>>       ;;
>>    2.4.*)
> 
> 
> 
> 
> On 2/22/10 5:04 PM, "reuti" <reuti at staff.uni-marburg.de> wrote:
> 
>> On 22.02.2010 at 22:54, heywood wrote:
>> 
>>> Looks to me like the util/arch script uses uname to get the lx26 (kernel is
>>> 2.6.*), not by looking for the directory lx26* (or lx24*).
>>> 
>>> If so, the question is why it returned lx24 for 6.2u3 (if it did on this 2.6
>>> system).
>> 
>> The current script uses:
>> 
>>     case $osrelease in
>>     2.[46].*)
>>        # retrieve os release. We use 2.4 on kernel 2.6 machines, unless
>>        # we have binaries installed that have been built for 2.6
>>        case $osrelease in
>>        2.4.*)
>>           lxrelease=24
>>           ;;
>>        2.6.*)
>>           ROOT_DIR=`dirname $0`/..
>>           if [ "$SGE_ROOT" != "" -a -d "$SGE_ROOT/bin/lx26-${lxmachine}" ] ; then
>>              lxrelease=26
>>           elif [ "$SGE_ROOT" = "" -a -d "$ROOT_DIR/bin/lx26-${lxmachine}" ] ; then
>>              lxrelease=26
>>           else
>>              lxrelease=24
>>           fi
>>           ;;
>>        esac
>> ...
>> 
>> Did you also install the updated common package?
>> 
>> -- Reuti
>> 
>> 
>>> But things are working so I'm OK.
>>> 
>>> Todd
>>> 
>>> 
>>> On 2/22/10 4:35 PM, "reuti" <reuti at staff.uni-marburg.de> wrote:
>>> 
>>>> On 22.02.2010 at 22:18, heywood wrote:
>>>> 
>>>>> No, it isn't hard coded. It returns lx26, while the directories are named
>>>>> lx24...
>>>> 
>>>> The current version of the arch script checks whether there is a
>>>> directory lx26-... As you created links to the dirs, it will answer
>>>> with lx26... But w/o the links, it should fall back to the default
>>>> lx24...
>>>> 
>>>> So, the question remains why the current version of the script
>>>> answered lx26... although there were no links or dirs in the
>>>> beginning.
>>>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>>> [root at bhmnode2 ~]# $SGE_ROOT/util/arch
>>>>> lx26-amd64
>>>>> [root at bhmnode2 ~]# uname -a
>>>>> Linux bhmnode2.cshl.edu 2.6.9-42.0.3.ELsmp #1 SMP Mon Sep 25 17:24:31 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux
>>>>> [root at bhmnode2 ~]# ls -l $SGE_ROOT/bin
>>>>> total 96
>>>>> drwxr-xr-x  2 root root 4096 Feb 22 10:55 lx24-amd64
>>>>> lrwxrwxrwx  1 root root   10 Feb 22 11:06 lx26-amd64 -> lx24-amd64
>>>>> [root at bhmnode2 ~]#
>>>>> 
>>>>> (I defined that symlink to get things running this morning)
>>>>> 
>>>>> 
>>>>> On 2/22/10 4:03 PM, "reuti" <reuti at staff.uni-marburg.de> wrote:
>>>>> 
>>>>>> On 22.02.2010 at 18:54, heywood wrote:
>>>>>> 
>>>>>>> No, we have not compiled SGE, but have used courtesy binaries all along.
>>>>>>> 
>>>>>>> The /etc/init.d/{sgemaster,sgeexecd} scripts (which are from installing
>>>>>>> 6.2u3 last summer) are looking for lx26-*. But the utilbin and bin directory
>>>>>>> names are lx24-*.
>>>>>> 
>>>>>> You mean it's hardcoded in the script? AFAIK it always used the arch
>>>>>> script in $SGE_ROOT/util/arch by default to determine the platform
>>>>>> it's running on. This should also return lx24-amd64 on your system.
>>>>>> 
>>>>>> -- Reuti
>>>>>> 
>>>>>> 
>>>>>>> Todd
>>>>>>> 
>>>>>>> 
>>>>>>> On 2/22/10 12:41 PM, "reuti" <reuti at staff.uni-marburg.de> wrote:
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> On 22.02.2010 at 17:28, heywood wrote:
>>>>>>>> 
>>>>>>>>> Well. For some reason the directory in $SGE_ROOT/utilbin and $SGE_ROOT/bin
>>>>>>>>> was "lx24-amd64", and the script was looking for "lx26-amd64". We are
>>>>>>>>> running kernel 2.6 and always have so I don't know where that lx24*
>>>>>>>>> directory name came from.
>>>>>>>> 
>>>>>>>> The lx24-* is the minimum kernel supported by the provided binaries,
>>>>>>>> and they will also work fine under kernel 2.6. But when you build SGE on
>>>>>>>> your own on a 2.6 system, the created directories will be named
>>>>>>>> according to the version it found, i.e. you get lx26-*. Did you
>>>>>>>> compile it on your own?
>>>>>>>> 
>>>>>>>> -- Reuti
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Anyways I just created a symlink lx26-amd64 -> lx24-amd64, and SGE started
>>>>>>>>> up.
>>>>>>>>> 
>>>>>>>>> Really weird.
>>>>>>>>> 
>>>>>>>>> Todd
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 2/22/10 10:32 AM, "heywood" <heywood at cshl.edu> wrote:
>>>>>>>>> 
>>>>>>>>>> No I did not.
>>>>>>>>>> 
>>>>>>>>>> I followed the patch instructions. I renamed the sge_shepherd with "mv" and
>>>>>>>>>> unpacked these tar.gz files:
>>>>>>>>>> 
>>>>>>>>>>  ge-6.2u5-bin-lx24-amd64.tar.gz
>>>>>>>>>>  ge-6.2u5-common.tar.gz
>>>>>>>>>>  hedeby-1.0u5-core.tar.gz
>>>>>>>>>> 
>>>>>>>>>> Then I tried restarting qmaster
>>>>>>>>>> 
>>>>>>>>>> Todd
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 2/22/10 10:25 AM, "craffi" <dag at sonsorol.org> wrote:
>>>>>>>>>> 
>>>>>>>>>>> The "can't find path" error is significant. Did you (or the init script)
>>>>>>>>>>> source or run the settings.sh|csh files to set up the SGE environment
>>>>>>>>>>> before trying to restart the qmaster?
>>>>>>>>>>> 
>>>>>>>>>>> -Chris
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> heywood wrote:
>>>>>>>>>>>> I upgraded from 6.2u3 to 6.2u5, and now sgemaster will not start:
>>>>>>>>>>>> 
>>>>>>>>>>>> [root at bhmnode2 sge]# /etc/init.d/sgemaster.bh
>>>>>>>>>>>> can't determine path to Grid Engine utility binaries
>>>>>>>>>>>> [root at bhmnode2 sge]#
>>>>>>>>>>> 
>>>>>>>>>>> ------------------------------------------------------
>>>>>>>>>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=245435
>>>>>>>>>>> 
>>>>>>>>>>> To unsubscribe from this discussion, e-mail:
>>>>>>>>>>> [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=245490

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


