[GE users] How to get shutdown/startup working on Redhat/SUSE

Dan Gruhn Dan.Gruhn at Group-W-Inc.com
Fri Mar 25 16:44:12 GMT 2005


I and others on this list have had problems for some time getting Grid
Engine to shut down properly when one of our systems goes down.  This
leads to problems on restarting the system because the qmaster thinks
there is already a system running by the given name.

We are running Fedora Core 1, but some comments found in another script
lead me to believe that this affects Redhat and SUSE as well.  Perhaps
someone else can verify this.

It turns out that for a kill script (e.g. /etc/rc6.d/K??<cmnd>) to be
run by Redhat/SUSE, there must be a corresponding
/var/lock/subsys/<cmnd> or /var/lock/subsys/<cmnd>.init file.  The
/etc/rc script, which is run to change run levels, specifically checks
for the existence of this file before executing the kill script.
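
To make the check concrete, here is a rough sketch of the test as I
understand it.  This is a simplified paraphrase, not the actual text of
the rc script; the function name would_run_kill_script and the LOCK_DIR
variable are made up for illustration.

```shell
#!/bin/sh
# Sketch only: a paraphrase of the lock-file check described above.
# LOCK_DIR is a hypothetical override so the check can be tried safely.
LOCK_DIR=${LOCK_DIR:-/var/lock/subsys}

# $1 = path of a kill script, e.g. /etc/rc6.d/K02sgeexecd
# Returns 0 if the kill script would be run, non-zero if it is skipped.
would_run_kill_script() {
    script=$(basename "$1")
    subsys=${script#K??}    # strip the "K" and the two-digit priority
    [ -f "$LOCK_DIR/$subsys" ] || [ -f "$LOCK_DIR/$subsys.init" ]
}
```

With no /var/lock/subsys/sgeexecd present, a call such as
would_run_kill_script /etc/rc6.d/K02sgeexecd fails, which matches the
behaviour we were seeing: the kill script is silently skipped.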

So, we have added the following to the startup section of sgeexecd:

   # Make lock for RedHat / SuSE
   if test -w /var/lock/subsys; then
      touch /var/lock/subsys/sgeexecd
   fi

And to the shutdown section we have added the following:

   # Delete lock for RedHat / SuSE
   if test -f /var/lock/subsys/sgeexecd; then
      rm -f /var/lock/subsys/sgeexecd
   fi

The same type of thing should be done for sgemaster.
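
One possible way to share the logic between both scripts is a pair of
small helpers.  This is only a sketch: the names make_lock and
remove_lock and the LOCK_DIR variable are my own and are not part of
the Grid Engine scripts.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not taken from the Grid Engine scripts.
# LOCK_DIR defaults to the RedHat / SuSE location.
LOCK_DIR=${LOCK_DIR:-/var/lock/subsys}

# $1 = service name (e.g. sgeexecd or sgemaster)
make_lock() {
    # Only create the lock where the lock directory exists and is writable
    if test -w "$LOCK_DIR"; then
        touch "$LOCK_DIR/$1"
    fi
}

# $1 = service name
remove_lock() {
    if test -f "$LOCK_DIR/$1"; then
        rm -f "$LOCK_DIR/$1"
    fi
}
```

The startup section of sgemaster would then call "make_lock sgemaster"
and the shutdown section "remove_lock sgemaster".  On systems without
/var/lock/subsys, both calls do nothing.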

I'm not set up to make such a change in the Grid Engine code, but
perhaps someone else who has permission plus a greater knowledge of
different types of systems could put this in properly.

Also, to use the chkconfig utility, you need to add a comment line
similar to:

# chkconfig: 35 91 02

We've put it right after the Default-Stop comment line.

This says to run in run levels 3 and 5, to start up at priority 91, and
to shut down at priority 02.  Our shutdown/startup cycles now seem to
work quite well, which is great because I was manually restarting Grid
Engine each time a system got rebooted.
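
To show how the three fields are read, here is a small sketch that
prints the links I would expect such a line to produce: a start link in
each listed run level and a kill link in every other one.  The function
chkconfig_links is made up for illustration and only prints names.  It
does not call chkconfig or touch /etc.

```shell
#!/bin/sh
# Sketch only: prints the links implied by "# chkconfig: LEVELS START STOP".
# $1 = service name, $2 = run levels, $3 = start priority, $4 = stop priority
chkconfig_links() {
    for level in 0 1 2 3 4 5 6; do
        case "$2" in
            *"$level"*) echo "/etc/rc$level.d/S$3$1" ;;   # listed level: start
            *)          echo "/etc/rc$level.d/K$4$1" ;;   # any other level: kill
        esac
    done
}

chkconfig_links sgeexecd 35 91 02
```

For the line above this lists S91sgeexecd under rc3.d and rc5.d, and
K02sgeexecd under the other run levels, including rc6.d, which is the
kill script that runs on reboot.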

