[GE users] NFS write errors on nodes not accepting jobs

FL lengyel at gmail.com
Sun Dec 16 22:52:50 GMT 2007


Here is a problem with a Sun Solaris cluster, on which two nodes stopped
accepting jobs.

> There is now a problem with 2 cluster nodes. They seem to be up and
> running, but they won't accept any jobs from the queue.
>
> ...
> ----------------------------------------------------------------------------
> all.q@compute-1-10             BIP   0/4       0.00     sol-amd64     E
> ----------------------------------------------------------------------------
> all.q@compute-1-11             BIP   4/4       3.00     sol-amd64
>  337201 0.50442 run_QA1    alex         r     11/25/2007 00:03:47     1
>  337204 0.50442 run_QA4    alex         r     11/25/2007 00:06:02     1
>  347212 0.50070 reduce.m23 wbackes      r     12/14/2007 08:12:32     1
>  349855 0.51000 gridMathem jeff         r     11/30/2007 13:53:02     1
> ----------------------------------------------------------------------------
> all.q@compute-1-12             BIP   0/4       0.00     sol-amd64     E
> ----------------------------------------------------------------------------
> ...
>
> I tried to reboot one of the nodes, but it did not solve the problem.
> I saw the following messages:
>
> NFS write error on host n1sm: I/O error.
> (file handle: 154000a 2 a a0fb6 52e56c84 a a0fb6 52e56c84 0)
> NFS write error on host n1sm: I/O error.
> (file handle: 154000a 2 a a0fb6 52e56c84 a a0fb6 52e56c84 0)
> NFS write error on host n1sm: I/O error.
> (file handle: 154000a 2 a a0fb6 52e56c84 a a0fb6 52e56c84 0)
> NFS write error on host n1sm: I/O error.
> (file handle: 154000a 2 a a0fb6 52e56c84 a a0fb6 52e56c84 0)
>
>
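
For anyone who wants to pick out the affected queue instances from a longer
listing, here is a minimal sketch. It assumes the text is in the same format
as the listing quoted above (five columns per queue line, plus an optional
sixth state column). The function name queues_in_error and the SAMPLE text
are made up for illustration and are not part of Grid Engine.

```python
# Hypothetical sample, shaped like the queue listing quoted above.
SAMPLE = """\
all.q@compute-1-10             BIP   0/4       0.00     sol-amd64     E
----------------------------------------------------------------------------
all.q@compute-1-11             BIP   4/4       3.00     sol-amd64
 337201 0.50442 run_QA1    alex         r     11/25/2007 00:03:47     1
----------------------------------------------------------------------------
all.q@compute-1-12             BIP   0/4       0.00     sol-amd64     E
"""


def queues_in_error(listing_text):
    """Return the queue instances whose state column contains 'E'."""
    flagged = []
    for line in listing_text.splitlines():
        fields = line.split()
        # Queue lines have 5 columns, or 6 when a state is shown.
        # Job lines (8 columns here) and separator lines are skipped.
        if len(fields) == 6 and "@" in fields[0] and "E" in fields[5]:
            flagged.append(fields[0])
    return flagged


if __name__ == "__main__":
    for name in queues_in_error(SAMPLE):
        print(name)
```

On the sample, this reports the queue instances on compute-1-10 and
compute-1-12, the two nodes that are not accepting jobs.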
