[GE users] Grid NFS problem

reuti reuti at staff.uni-marburg.de
Mon Oct 11 10:42:39 BST 2010

On 11.10.2010 at 10:32, fansn wrote:

> I exported an NFS share on the grid master server, which is mounted by the other exec hosts. I rebooted the grid master to apply a patch last Friday. I made a mistake -- I

Do you mean a patch to the OS or to SGE? If it's an OS patch, the qmaster machine can safely be rebooted and jobs should continue to run.

> didn't shut down all the sge_execd daemons when I rebooted the grid master server and all the exec hosts were locked up. The fstab on exec hosts looks like this:
> grid:/usr/local/grid         /usr/local/grid         nfs     rw,noatime,hard,intr,bg         0 0

Does "noatime" have such a big impact nowadays that it's worth mentioning there? Anyway, you may need "fsid=1004" or similar in the /etc/exports file with the kernel NFS server (a unique value for each exported file system) to allow the clients to survive a reboot of the file server (this is independent of whether this machine is also the qmaster or not).
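
As a rough sketch, such an entry could look like the following. The export path is taken from your fstab and the fsid value is the one mentioned above; the client network and the remaining options (rw, sync, no_subtree_check) are only placeholders assumed for illustration, so adjust them to your setup:

```
# /etc/exports on the file server (sketch only)
# 192.168.1.0/24 and rw,sync,no_subtree_check are assumed placeholders;
# fsid must be unique for each exported file system.
/usr/local/grid   192.168.1.0/24(rw,sync,no_subtree_check,fsid=1004)
```

After editing the file, running "exportfs -ra" on the server should re-read it without restarting the NFS server.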

-- Reuti

> I think if the grid master server crashed, the lockup would happen again. Is there any way to prevent this other than installing the software on a separate NFS file server?
> Thanks for your help.
> Yours sincerely,
> Sinong Fan

