[GE users] deleting large numbers of jobs

tmac tmacmd at gmail.com
Thu Apr 24 15:49:31 BST 2008


SGE 6.0u7 all around
Master/shadows RHEL4u2
BDB via RPC on Solaris 10

When we try to delete a large number of jobs (anything more than just
a couple hundred), the master stops responding. Sometimes it comes
back, sometimes not.

This morning, we deleted 330+ array jobs. The master hung. We waited 4
minutes and qstat/qmon were still not responding.
The master itself seemed OK.

The service was restarted on the master/slaves.

Does anyone have any idea what might be going on?

-- 
--tmac

RedHat Certified Engineer #804006984323821 (RHEL4)
RedHat Certified Engineer #805007643429572 (RHEL5)

Principal Consultant
