[GE users] Grid NFS problem

fansn fansn at hotmail.com
Mon Oct 11 11:04:06 BST 2010

Many thanks Reuti, you are very helpful.

-----Original Message-----
From: reuti [mailto:reuti at staff.uni-marburg.de] 
Sent: 11 October 2010 10:43
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Grid NFS problem


Am 11.10.2010 um 10:32 schrieb fansn:

> I exported an NFS share on the grid master server, which is mounted by the other exec hosts. I rebooted the grid master to apply a patch last Friday. I made a mistake -- I

Do you mean a patch to the OS or to SGE? If it's an OS patch, the qmaster machine can safely be rebooted and jobs should continue to run.

> didn't shut down all the sge_execd daemons when I rebooted the grid master server and all the exec hosts were locked up. The fstab on exec hosts looks like this:
> grid:/usr/local/grid         /usr/local/grid         nfs     rw,noatime,hard,intr,bg         0 0

Does "noatime" still have such a big impact nowadays that it's worth specifying there? Anyway, you may need "fsid=1004" or similar in the /etc/exports file when using the kernel NFS server (a unique value for each exported file system) to allow the clients to survive a reboot of the file server. This is independent of whether this machine is also the qmaster.
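
As a rough sketch only, such an entry in /etc/exports on the file server might look like the line below. The path is taken from your fstab; the client specification "*" and the options other than fsid are assumptions for illustration and should be adapted to your site:

```
# /etc/exports on the NFS server (sketch; "*" and rw,sync,no_subtree_check are assumed)
# fsid must be unique for each exported file system
/usr/local/grid    *(rw,sync,no_subtree_check,fsid=1004)
```

After editing the file, "exportfs -ra" on the server re-exports the entries.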

-- Reuti

> I think if the grid master server crashed, the lock-up would happen again. Is there any way to prevent this other than installing the software on a separate NFS file server?
> Thanks for your help.
> Yours sincerely,
> Sinong Fan


To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

