[GE users] only 1 set of Qs will run

Harry Mangalam harry.mangalam at uci.edu
Thu Nov 27 16:08:18 GMT 2008


Thanks for the suggestions, John,

On Wednesday 26 November 2008, John Hearns wrote:
> Harry,
> on each type of node run 'mount' and look at exactly how the
> SGE_ROOT filesystem is mounted.

On every node, the (anonymized) mount line is:

mount_host:/share/sge62 on /sge62 type nfs (rw,nosuid,intr,hard,timeo=20,rsize=8192,wsize=8192,addr=x.x.x.x)
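To compare the mount across node types without eyeballing it, the line can be split into fields. This is only a minimal sketch: parse_mount_line is a hypothetical helper, not part of Grid Engine, and it assumes the Linux `mount` output format shown above.

```python
import re

# Linux `mount` output: "<source> on <mountpoint> type <fstype> (<options>)"
MOUNT_RE = re.compile(r"^(\S+) on (\S+) type (\S+)\s+\(([^)]*)\)$")


def parse_mount_line(line):
    """Split one line of `mount` output into its parts (hypothetical helper)."""
    # Collapse whitespace so a line wrapped by a mail client still matches.
    normalized = " ".join(line.split())
    match = MOUNT_RE.match(normalized)
    if match is None:
        raise ValueError("unrecognized mount line: %r" % line)
    source, mountpoint, fstype, raw_options = match.groups()
    options = {}
    for option in raw_options.split(","):
        key, _, value = option.partition("=")
        # Flags such as "rw" carry no value; store them as True.
        options[key] = value if value else True
    return {
        "source": source,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options,
    }


if __name__ == "__main__":
    sample = ("mount_host:/share/sge62 on /sge62 type nfs "
              "(rw,nosuid,intr,hard,timeo=20,rsize=8192,wsize=8192,addr=x.x.x.x)")
    info = parse_mount_line(sample)
    print(info["fstype"], info["options"]["timeo"])
```

Running this on the output from each node type and comparing the resulting dictionaries would show any difference in mount options between them.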

> On the AMD64 nodes, consider using local spooling - you can find
> out how to do that in the NFS_Reduce HOWTO. Basically create a
> directory called (let's say)
> /var/spool/sge on the remote host, make it owned by the sge user.
> Define sge_execd_spool = /var/spool/sge  for that machine. Restart
> sgeexecd.

I had already done this (it was probably lost in the long initial
post); that is why I restarted sge_execd on the nodes and the qmaster
daemon. Afterwards the NFS errors stopped, though jobs still cannot
run on those Qs. All AMD nodes are now spooling to a local dir
(/tmp/sge_execspool/).
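One way to double-check that a spool directory really sits on a local filesystem is to look it up in the mount table. The sketch below is an assumption-laden illustration: fstype_for_path is a hypothetical helper, it expects text in the Linux /proc/mounts format, and the sample mount table (including the ext3 root) is made up for the example.

```python
import os.path


def fstype_for_path(path, mounts_text):
    """Return the filesystem type of the mount that holds `path`.

    `mounts_text` is expected in /proc/mounts format:
    "<device> <mountpoint> <fstype> <options> <dump> <pass>" per line.
    Returns None if no mount point contains the path.
    """
    path = os.path.normpath(path)
    best = None  # (mountpoint, fstype) of the longest matching mount point
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mountpoint, fstype = fields[1], fields[2]
        # Compare on whole path components so /tmp does not match /tmpfoo.
        prefix = mountpoint.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            if best is None or len(mountpoint) >= len(best[0]):
                best = (mountpoint, fstype)
    return best[1] if best else None


if __name__ == "__main__":
    # Hypothetical mount table; on a real node, read /proc/mounts instead.
    sample_mounts = (
        "/dev/sda1 / ext3 rw 0 0\n"
        "mount_host:/share/sge62 /sge62 nfs rw,nosuid 0 0\n"
    )
    print(fstype_for_path("/tmp/sge_execspool/", sample_mounts))
    print(fstype_for_path("/sge62/default/spool", sample_mounts))
```

On a node, passing open("/proc/mounts").read() as the second argument should report a non-NFS type for the local spool directory if local spooling is set up as intended.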



-- 
Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, 
UC Irvine 92697  949 824-0084(o), 949 285-4487(c)
---
Good judgment comes from experience; 
Experience comes from bad judgment. [F. Brooks.]

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=90107
