[GE users] Grid NFS problem
reuti at staff.uni-marburg.de
Mon Oct 11 10:42:39 BST 2010
On 11.10.2010 at 10:32, fansn wrote:
> I exported an NFS share on the grid master server, which is mounted by the other exec hosts. I rebooted the grid master to apply a patch last Friday. I made a mistake -- I
Do you mean a patch to the OS or to SGE? If it's an OS patch, the qmaster machine can safely be rebooted and jobs should continue to run.
> didn't shut down all the sge_execd daemons when I rebooted the grid master server and all the exec hosts were locked up. The fstab on exec hosts looks like this:
> grid:/usr/local/grid /usr/local/grid nfs rw,noatime,hard,intr,bg 0 0
Does "noatime" still have such a big impact nowadays that it's worth specifying there? Anyway, you may need "fsid=1004" or similar in the /etc/exports file with kernel NFS (a unique value for each exported file system) to allow the clients to survive a reboot of the file server. This is independent of whether this machine is also the qmaster.
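A minimal sketch of such an entry in /etc/exports on the file server, assuming the kernel NFS server on Linux. The client subnet 192.168.1.0/24 and the options other than fsid are placeholders that are not taken from your setup, so adjust them to your site:

```
# /etc/exports on the NFS server (here: the machine "grid")
# fsid must be unique per exported file system; 1004 is just an example value.
# 192.168.1.0/24 is a hypothetical client subnet.
/usr/local/grid  192.168.1.0/24(rw,sync,no_subtree_check,fsid=1004)
```

After editing the file, running "exportfs -ra" on the server should re-export everything with the new options.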
> I think if the grid master server crashed, the lock up would happen again. Is there any solution to prevent this happening other than installing the software on a separate NFS file server?
> Thanks for your help.
> Yours sincerely,
> Sinong Fan