[GE users] 6.2 all queues dropped

Olesen, Mark Mark.Olesen at emcontechnologies.com
Fri Nov 14 14:18:04 GMT 2008


> Today I came back after 4 weeks off and found one test job left which
> was never scheduled.
> 
> qstat -j on this job gave me the scheduling info "All queues dropped
> because of overload or full".
> qstat -f didn't list anything.
> 
> Restarting the whole cluster didn't change anything.
> 
> I had to remove the hostgroup in the hostlist configuration entry of
> every single clusterqueue and put it back in again to get everything
> working again. Would there have been an easier way to do so? Or is
> this a problem of my configuration?

Did you check (and clear) the error states on the various queues?
If one of the many other jobs hit something nasty (e.g., a failed licence
check in the prolog), it can put the queue into an 'E' (error) state, which
prevents anything from being scheduled there until the state is cleared.
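
As a rough sketch of that check, assuming the usual `qstat -f` layout in
which the state is the sixth whitespace-separated column of each
queue-instance line: the helper name `queues_in_error` is made up for this
example, and the commented commands should be run as a Grid Engine manager
or operator.

```shell
#!/bin/sh
# Hypothetical helper: print queue instances whose state column contains 'E'.
# Reads `qstat -f` output on stdin. Assumes queue-instance lines look like
# "queue@host qtype slots load_avg arch [states]", i.e. the state (when
# present) is field 6; job lines and separators are skipped because their
# first field has no '@'.
queues_in_error() {
    awk '$1 ~ /@/ && NF >= 6 && $6 ~ /E/ { print $1 }'
}

# Possible use on a live cluster:
#   qstat -f -explain E          # show the reason for each error state
#   qstat -f | queues_in_error   # list only the affected queue instances
#   qmod -cq '*'                 # clear the error state on all queue instances
```

Clearing the state only helps once the underlying cause (for example the
failing prolog) is fixed; otherwise the next job is likely to put the queue
back into 'E'.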

/mark