[GE users] subordinate queue that stays suspended

Reuti reuti at staff.uni-marburg.de
Thu May 11 20:51:58 BST 2006


On 11.05.2006 at 17:31, Bill Knebel wrote:

> Reuti,
>
> The nodes are dual-CPU Xeons. In all.q each node (14, 15, and 16)
> is defined as having 2 slots, and bootstrap.q defines the same
> number of slots for these nodes. We only want one job running per
> CPU on each node at any one time. In the bootstrap.q configuration
> (qconf), the "subordinate_list" entry is all.q. That is the extent
> of the subordinate queue setup.
>
> Our goal is to suspend the use of nodes 14, 15, and 16 in all.q  
> when jobs are submitted to the bootstrap.q. The bootstrap.q is  
> configured to only use nodes 14, 15, and 16.

But this way you could get three jobs on a node: one in bootstrap.q
and two in all.q, because all.q is suspended only once both slots in
bootstrap.q are in use. Anyway, that isn't your issue. Do qstat -f
and qhost -q show anything still running on these machines?
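
To illustrate the slot arithmetic above, here is a minimal Python sketch. It is only a simplified model, not Grid Engine code: the function names are made up for illustration, and it assumes all.q is already full on the host while bootstrap.q fills up. As far as I recall from queue_conf(5), a subordinate_list entry without a threshold suspends the subordinate queue only when all slots of the superordinate queue are in use, while an entry such as all.q=1 would suspend it at the first job; please treat the second case as an assumption to check against your version.

```python
def all_q_suspended(bootstrap_used: int, threshold: int) -> bool:
    """Model: all.q on a host is suspended once bootstrap.q uses
    at least `threshold` slots on that host."""
    return bootstrap_used >= threshold


def max_running_jobs(slots: int, threshold: int) -> int:
    """Worst-case number of jobs running at once on one host.

    Assumption: all.q already holds `slots` running jobs whenever
    it is not suspended, and bootstrap.q fills up one job at a time.
    """
    worst = 0
    for bootstrap_used in range(slots + 1):
        if all_q_suspended(bootstrap_used, threshold):
            all_q_running = 0
        else:
            all_q_running = slots
        worst = max(worst, bootstrap_used + all_q_running)
    return worst


# No threshold given: all.q is suspended only when bootstrap.q is full (2 of 2).
print(max_running_jobs(slots=2, threshold=2))  # 3

# Hypothetical alternative (subordinate_list all.q=1): suspend at the first job.
print(max_running_jobs(slots=2, threshold=1))  # 2
```

In this model a threshold of 1 keeps a dual-CPU node at two running jobs or fewer, at the cost of leaving one CPU idle while only a single bootstrap.q job is on the node.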

-- Reuti


> Bill
>
> Reuti wrote:
>
>> Hi,
>>
>> On 11.05.2006 at 14:41, Bill Knebel wrote:
>>
>>> I have three nodes out of six in all.q that are subordinate to
>>> bootstrap.q. When jobs are completed on bootstrap.q, the three
>>> subordinate nodes in all.q remain in the capital "S" state and do
>>> not accept jobs. I have tried forcing an unsuspend on all.q, but
>>> Grid Engine says the queue is not in the suspend state. This is a
>>> relatively recent occurrence. In the past, when jobs completed on
>>> bootstrap.q, the three affected nodes in all.q returned to the
>>> normal state and began accepting and running jobs. Any ideas?
>>
>>
>> Are these dual-CPU nodes? How many slots are defined for each
>> queue, and how many are in use? What are the detailed settings for
>> the subordinate queue and the subordination?
>>
>> -- Reuti
>>
>>> Bill
>>>
>>> -- 
>>> Bill Knebel, PharmD, Ph.D.
>>> Principal Scientist
>>> Metrum Research Group
>>> 2 Tunxis Road
>>> Suite 112
>>> Tariffville, CT 06081
>>> email: billk at metrumrg.com
>>> tel: (860) 930-1370
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>
>




