[GE users] NFS write errors on nodes not accepting jobs

tmac tmacmd at gmail.com
Wed Dec 19 15:31:26 GMT 2007


What are your NFS mount options, and which OS is running on the
NFS client that is having problems?
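
For reference, a rough sketch of how that information could be gathered on the affected node. It assumes a Solaris or Linux client where `nfsstat` is available; the function name `collect_nfs_info` is only illustrative, and any command that is missing is skipped rather than treated as an error.

```shell
# Sketch: collect OS details and NFS mount options on an NFS client.
# Assumption: run on the node that logs the NFS write errors.
collect_nfs_info() {
    echo "== OS =="
    uname -a
    # /etc/release is present on Solaris; skipped where it does not exist.
    [ -r /etc/release ] && cat /etc/release

    echo "== NFS mounts and options =="
    if command -v nfsstat >/dev/null 2>&1; then
        # -m lists each NFS mount with the options actually in effect.
        nfsstat -m || echo "nfsstat -m failed"
    else
        echo "nfsstat not found"
    fi
    return 0
}

collect_nfs_info
```

Posting this output for one of the nodes in the E state, together with the relevant lines from its mount configuration, should be enough to answer both questions.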

--tmac

On Dec 16, 2007 5:52 PM, FL <lengyel at gmail.com> wrote:

> Here is a problem with a Sun Solaris cluster, on which two nodes stopped
> accepting jobs.
>
> > There is now a problem with two cluster nodes. They seem to be up and
> > running, but they won't accept any jobs from the queue.
> >
> > ...
> >
> > ----------------------------------------------------------------------------
> > all.q@compute-1-10             BIP   0/4       0.00     sol-amd64     E
> > ----------------------------------------------------------------------------
> > all.q@compute-1-11             BIP   4/4       3.00     sol-amd64
> >  337201 0.50442 run_QA1    alex         r     11/25/2007 00:03:47     1
> >  337204 0.50442 run_QA4    alex         r     11/25/2007 00:06:02     1
> >  347212 0.50070 reduce.m23 wbackes      r     12/14/2007 08:12:32     1
> >  349855 0.51000 gridMathem jeff         r     11/30/2007 13:53:02     1
> > ----------------------------------------------------------------------------
> > all.q@compute-1-12             BIP   0/4       0.00     sol-amd64     E
> > ----------------------------------------------------------------------------
> > ...
> >
> > I tried to reboot one of the nodes, but that did not solve the problem.
> > I saw the following messages:
> >
> > NFS write error on host n1sm: I/O error.
> > (file handle: 154000a 2 a a0fb6 52e56c84 a a0fb6 52e56c84 0)
> > NFS write error on host n1sm: I/O error.
> > (file handle: 154000a 2 a a0fb6 52e56c84 a a0fb6 52e56c84 0)
> > NFS write error on host n1sm: I/O error.
> > (file handle: 154000a 2 a a0fb6 52e56c84 a a0fb6 52e56c84 0)
> > NFS write error on host n1sm: I/O error.
> > (file handle: 154000a 2 a a0fb6 52e56c84 a a0fb6 52e56c84 0)
> >
> >


-- 
--tmac

RedHat Certified Engineer #804006984323821 (RHEL4)
RedHat Certified Engineer #805007643429572 (RHEL5)

Principal Consultant, RABA Technologies


