[GE users] Problem with clearing a suspended status from a queue instance.

reuti reuti at staff.uni-marburg.de
Wed Oct 20 16:48:56 BST 2010


On 20.10.2010 at 16:11, mad wrote:

> 
> On Wed, Oct 20, 2010 at 9:24 AM, reuti <reuti at staff.uni-marburg.de> wrote:
> Hi,
> 
> On 20.10.2010 at 14:49, mad wrote:
> 
> > I have tried to remove the Ss status from the queue instance by using qmon and clicking on force and resume. That does not change the status. I rebooted the host on which the queue instance resides; that did not resume the queue either.
> >
> > There are no jobs in the queue instance.
> >
> > This queue instance in the het queue is part of a subordinate queue het-2hr.
> >
> > [root@ted g03]# qmod -usq het@compute-0-31
> > [root@ted g03]# qmod -usq het@compute-0-31.local
> > [root@ted g03]# qmod -usq het-2hr@compute-0-31.local
> > Queue instance "het-2hr@compute-0-31.local" is already in the specified state: unsuspended
> > [root@ted g03]# qmod -usq het@compute-0-31.local
> >
> > The queue instance het@compute-0-31.local still shows a status of Ss in qmon.
> >
> > I am running ROCKS 5.0, CentOS with kernel 2.6.18-53.1.14.el5, Grid Engine 6.1u4.
> 
> I think an uppercase S means subordinated. Is there anything running in a superordinated queue?
> 
> -- Reuti
> 
> het is the queue including compute-0-30 through compute-0-33 which is subordinate to het-24hr and het-2hr.

Subordinated by slot? There is an issue where the last slot won't get un-suspended again and stays in "S".


> 
> het-24hr is a queue containing compute-0-30.  There is currently a job taking all slots in this queue.
> 
> het-2hr is a queue containing compute-0-31 and compute-0-32.  There are no jobs in this queue.
> 
> compute-0-31 is the instance showing the Ss status. 
> 
> CLUSTER QUEUE                   CQLOAD   USED  AVAIL  TOTAL aoACDS  cdsuE  
> -------------------------------------------------------------------------------
> het                               0.20      0     24     32      0      8 
> het-24hr                          0.78      8      0      8      0      0 
> het-2hr                           0.00      0     16     16      0      0 

This is strange: why does the "het" queue show its 8 slots not in "S" but in the error state "E"? Does:

$ qstat -f

show an error for this queue on certain machines?
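
As a rough sketch of how to spot such queues from the summary alone, the snippet below filters the cluster queue table quoted above for rows with a non-zero count in the last column ("cdsuE"). The sample text is copied from this thread; the function name flag_queues is made up for illustration, and the sketch assumes the column layout shown here, with "cdsuE" as the last column and no spaces in queue names.

```shell
#!/bin/sh
# Sample cluster queue summary, copied from the table quoted in this thread.
summary='CLUSTER QUEUE                   CQLOAD   USED  AVAIL  TOTAL aoACDS  cdsuE
-------------------------------------------------------------------------------
het                               0.20      0     24     32      0      8
het-24hr                          0.78      8      0      8      0      0
het-2hr                           0.00      0     16     16      0      0'

# flag_queues (hypothetical helper): read the summary on stdin, skip the
# header and separator lines, and print "<queue> <count>" for every row
# whose last column (cdsuE) is greater than zero.
flag_queues() {
    awk 'NR > 2 && $NF > 0 { print $1 " " $NF }'
}

printf '%s\n' "$summary" | flag_queues
```

On a live cluster the same filter could presumably be fed from `qstat -g c | flag_queues`. A non-zero count there only says that some slots are in one of the c, d, s, u or E states; `qstat -f` is still needed to see which state applies on which host.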

--- Reuti



------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=288654

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


