[GE users] Minor upgrade from 6.2 to 6.2u2

matbradford matthew.bradford at eds.com
Fri Mar 6 12:41:49 GMT 2009


Andy,

I'm possibly doing something stupid...

I've shut down the cluster, including all daemons.
I added the patches using the tar.gz method for lx24-amd64 and common,
but when I attempt to restart sgemaster it fails, and I get the
following message in the messages file:

<date> main|<host>|ClSetUlong: wrong type for field CE_Consumable (lBoolT)

I'm running on Suse Enterprise 10 on Xeon.

Any ideas?

Cheers,

Mat

>-----Original Message-----
>From: andy [mailto:andy.schwierskott at sun.com]
>Sent: 06 March 2009 08:54
>To: users at gridengine.sunsource.net
>Subject: Re: [GE users] Minor upgrade from 6.2 to 6.2u2
>
>Hi Mat,
>
>> What is the upgrade procedure for moving from SGE 6.2 to SGE6.2u2?
>>
>> Do we need to follow the full upgrade procedure as defined in the
>> documentation as if we were moving from 6.1 to 6.2 or is there a
>> simplified process?
>
>
>the spool file formats haven't changed, so running and pending jobs can
>continue to stay in the system. Basically it's the usual thing you have to
>ensure: don't overwrite the binary of a running daemon/process.
>
>See the long version below (taken from the patch installation
>instructions).
>The note about parallel jobs is probably over-cautious. If you also
>rename the qrsh/rsh/rshd/qrsh_starter binaries, I believe a running
>parallel job which has already started its parallel tasks would not be
>affected by an upgrade.
>
>
>Special Install Instructions:
>-----------------------------
>
>   Content
>   -------
>   Patch Installation
>      Stopping the Sun Grid Engine cluster to prevent start of new jobs
>      Shutting down the Sun Grid Engine daemons
>      Installing the patch and restarting the software
>   New functionality delivered with SGE 6.2 Update 2
>
>
>   Patch Installation
>   ------------------
>
>   These installation instructions assume that you are running a homogeneous
>   Sun Grid Engine cluster (called "the software") where all hosts share the
>   same directory for the binaries. If you are running the software in a
>   heterogeneous environment (mix of different binary architectures), you
>   need to apply the patch installation for all binary architectures as well
>   as the "common" and "arco" packages. See the patch matrix above for
>   details about the available patches.
>
>   If you upgrade from a previous version of Sun Grid Engine (for example
>   6.0), please perform the steps described in the Sun Grid Engine
>   documentation
>   (http://wikis.sun.com/display/gridengine62u2/Upgrading).
>
>   If you installed the software on local filesystems, you need to install
>   all relevant patches on all hosts where you installed the software
>   locally.
>
>   By default, there should be no running jobs when the patch is installed.
>   There may be pending batch jobs, but no pending interactive jobs (qrsh,
>   qmake, qsh, qtcsh, qlogin).
>
>   It is possible to install the patch with running batch jobs. To avoid a
>   failure of the active 'sge_shepherd' binary, it is necessary to move the
>   old shepherd binary (and copy it back prior to the installation of the
>   patch).
>
>   You cannot install the patch with running interactive jobs, 'qmake' jobs
>   or with running parallel jobs which use the tight integration support
>   (control_slaves=true is set in the PE configuration).
>
>   A. Stopping the Sun Grid Engine cluster to prevent start of new jobs
>   --------------------------------------------------------------------
>
>   Disable all queues so that no new jobs are started:
>
>      # qmod -d '*'
>
>   Optional (only needed if there are running jobs which should continue to
>   run when the patch is installed):
>
>      # cd $SGE_ROOT/bin
>      # mv <arch>/sge_shepherd <arch>/sge_shepherd.sge62
>      # cp <arch>/sge_shepherd.sge62 <arch>/sge_shepherd
>
>   It is important that the binary is moved with the "mv" command. It should
>   not be copied because this could cause the crash of an active shepherd
>   process which is currently running a job when the patch is installed.
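
The difference between "mv" and "cp" here can be seen from inode numbers: a rename within one directory keeps the same inode, so a process already executing that file is left alone, while the following "cp" creates a new file at the original path. A minimal sketch that rehearses the sequence on a scratch file rather than the real binary, assuming a shell with `mktemp -d`, `ls -i` and `awk` available; the function names `inode_of` and `demo_mv_then_cp` are hypothetical and not part of the patch instructions:

```shell
#!/bin/sh
# Print the inode number of a file (hypothetical helper for this demo).
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# Rehearse the mv + cp sequence on a scratch file instead of the real
# sge_shepherd binary. Prints three inode numbers on one line:
# original file, renamed file, fresh copy at the original path.
demo_mv_then_cp() {
    dir=$(mktemp -d) || return 1
    echo "old binary" > "$dir/sge_shepherd"
    before=$(inode_of "$dir/sge_shepherd")

    # Rename: same inode, new name.
    mv "$dir/sge_shepherd" "$dir/sge_shepherd.sge62"
    after_mv=$(inode_of "$dir/sge_shepherd.sge62")

    # Copy back: a new file (new inode) now sits at the original path.
    cp "$dir/sge_shepherd.sge62" "$dir/sge_shepherd"
    after_cp=$(inode_of "$dir/sge_shepherd")

    rm -rf "$dir"
    echo "$before $after_mv $after_cp"
}
```

Calling `demo_mv_then_cp` should print the same number twice followed by a different one: the renamed file is still the original, and only the copy at the old path is new.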
>
>   B. Shutting down the Sun Grid Engine daemons
>   --------------------------------------------
>
>   You need to shut down (and restart) the qmaster and scheduler daemon and
>   all running execution daemons.
>
>   Shut down the execution daemons first. Log in to each of your execution
>   hosts and stop the execution daemon:
>
>      # $SGE_ROOT/$SGE_CELL/common/sgeexecd softstop
>
>   Then log in to your qmaster machine and stop qmaster and scheduler:
>
>      # $SGE_ROOT/$SGE_CELL/common/sgemaster stop
>
>   Now verify with the 'ps' command that all Sun Grid Engine daemons on all
>   hosts are stopped. If you decided to rename the 'sge_shepherd' binary so
>   that running jobs can continue to run during the patch installation, you
>   must not kill the 'sge_shepherd' processes.
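
The 'ps' check could be scripted roughly as follows. This is only a sketch under some assumptions: it relies on `ps -e -o comm=` printing bare command names (as it does on Linux), the function names are hypothetical, and `sge_schedd` is listed only in case a separate scheduler process exists on the host. `sge_shepherd` is deliberately not matched, since it has to keep running if jobs are meant to survive the patch.

```shell
#!/bin/sh
# Filter a list of process names (one per line on stdin) down to the
# Grid Engine daemons that should be gone before the patch is installed.
# sge_shepherd is intentionally not in the pattern.
remaining_sge_daemons() {
    grep -E '^(sge_qmaster|sge_execd|sge_schedd)$'
}

# Run on each host: returns 0 only if none of the daemons is left.
check_host_stopped() {
    if ps -e -o comm= | remaining_sge_daemons; then
        echo "Grid Engine daemons still running on $(hostname)" >&2
        return 1
    fi
    echo "no Grid Engine daemons running on $(hostname)"
}
```

Running `check_host_stopped` on the qmaster machine and on every execution host gives a quick yes/no answer before unpacking the patch.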
>
>   C. Installing the patch and restarting the software
>   ---------------------------------------------------
>
>   Now install the patch with "patchadd" or by unpacking the 'tar.gz'
>   files included in this patch as outlined above.
>
>      Restarting the software
>      -----------------------
>
>      If you have configured ARCo, you must first complete steps 1 and 2
>      from the section "Stopping the Accounting and Reporting Console" from
>      the ARCo patch before restarting the qmaster.
>
>      Please log in to your qmaster machine and execution hosts and enter:
>
>         # $SGE_ROOT/$SGE_CELL/common/sgemaster
>         # $SGE_ROOT/$SGE_CELL/common/sgeexecd
>
>      After restarting the software, you may again enable your queues:
>
>         # qmod -e '*'
>
>      If you renamed the shepherd binary, you may safely delete the old
>      binary when all jobs which were running prior to the patch
>      installation have finished.
>
>
>Regards,
>Andy
>
>------------------------------------------------------
>http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=122031
>
>To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=122182

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


