[GE users] Temporarily removing jobs from the queues

dangruhn Dan.Gruhn at groupw.com
Tue Jul 7 13:57:04 BST 2009


Margaret,

mad wrote:
> I need to free up some slots on our system.  One user has submitted  
> two jobs which are taking up all the resources.   I would like to  
> "suspend" one of her jobs to allow use of the cluster by other users.
>
>
> I have tried suspend and hold through qmon.  However, the slots are  
> still occupied.
>
>   qstat -g c
> CLUSTER QUEUE                   CQLOAD   USED  AVAIL  TOTAL aoACDS  cdsuE
> -------------------------------------------------------------------------------
> all.q                             0.98     72      0     72      0      0
>
>
> and I cannot qlogin
>
>   qlogin
> Your job 13522 ("QLOGIN") has been submitted
> waiting for interactive job to be scheduled ...timeout (4 s) expired while waiting on socket fd 4
>
>
> Your "qlogin" request could not be scheduled, try again later.
>
> I do not want to kill the job.  How can I free up some of the slots?
>   
One possibility is to either suspend or hold the job (I can't remember 
which one works best) and then restart it.  This will put the job back 
in the pending state, but it won't be eligible for execution until the 
suspend/hold is released.

The downside is that the job will start over from scratch.  Is that 
okay, or is that what you meant by saying you don't want to kill the 
job?
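
As a rough sketch of the sequence above (not tested on your cluster), 
assuming "hold" means qhold, "restart" means rescheduling with qmod -rj, 
the job is rerunnable, and you have manager or operator rights.  The 
function names and the DRY_RUN switch are made up for illustration:

```shell
#!/bin/sh
# Sketch only: hold a running job, then reschedule it so that it goes back
# to the pending list and frees its slots. Assumes the job is rerunnable and
# that the caller has Grid Engine manager/operator rights.

# run: hypothetical helper; prints the command when DRY_RUN=1, else runs it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# hold_and_requeue JOB_ID: place the hold first, so the job is not
# dispatched again as soon as it is back in the pending list.
hold_and_requeue() {
    run qhold "$1" &&
    run qmod -rj "$1"
}

# release_job JOB_ID: remove the hold once the slots can be given back.
release_job() {
    run qrls "$1"
}
```

With DRY_RUN=1 set, "hold_and_requeue 13512" (the job ID from the qstat 
output quoted below) only prints the two commands, which is a safe way 
to check them first.  "release_job 13512" later makes the job eligible 
for scheduling again.
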
> Also how do I hold the user's jobs waiting on the queue so that I can  
> release them in a manner that keeps some of the slots open for other  
> users?
>
> ----------------------------------------------------------------------------
> all.q@compute-0-8.local        BIP   4/4       4.00     lx26-amd64
>    13512 0.25000 user1_SOLVER user1        s     07/06/2009 21:08:09     4
> ----------------------------------------------------------------------------
>
> Although this job is "suspended", it is still running on compute-0-8  
> and taking up four CPUs.
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=206003
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>   

-- 
Dan Gruhn
Group W Inc.
8315 Lee Hwy, Suite 303
Fairfax, VA, 22031
PH: (703) 752-5831
FX: (703) 752-5851

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=206005
