[GE users] Fwd: subnode with empty slots but jobs in queue

reuti reuti at staff.uni-marburg.de
Mon Dec 6 18:17:40 GMT 2010


On 06.12.2010 at 19:14, jlforrest wrote:

> On 12/6/2010 10:04 AM, reuti wrote:
> 
>>> Right now compute-0-8 is down, although qstat still shows
>>> some jobs for it. (Why would this happen?)
>> 
>> SGE assumes a network problem and therefore keeps the jobs listed. You will have to use `qdel -f ...` to get rid of these jobs.
> 
> I've now done that.
> 
>>> The qstat output for compute-0-7 shows
>>> 
>>> all.q@compute-0-7.local        BIP   0/48/48        29.05    lx26-amd64
>> 
>> So, all 48 out of 48 seem to be used up.
>> 
>>> and then it shows 48 serial jobs underneath! Yet, ssh-ing to
>>> compute-0-7 and running ps clearly only shows 29 jobs running
>> 
>> What is `qstat -g t -l h=compute-0-7.local -s r` showing?
> 
> It shows nothing. But, it also shows nothing for the
> nodes that are working correctly, e.g. consider compute-0-0
> whose status is shown as
> 
> compute-0-0    lx26-amd64    4  4.97    7.8G  831.5M   11.7G   75.7M
> 
> Running 'qstat -g t -l h=compute-0-0 -s r' results in
> no output. Is this correct?

No, that is not correct. I forgot to mention that you also need -u "*", so that qstat lists the jobs of all users and not only your own.
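
Put together, the full query might look like the sketch below. The function name running_jobs_on_host is only an illustrative wrapper and not part of SGE; the qstat options are the ones named in this thread. The quotes around the asterisk keep the shell from expanding it into file names.

```shell
# Hypothetical helper: list running jobs of all users on one host.
# -g t            show one line per task/slot
# -u "*"          include the jobs of every user, not only your own
# -l h=<host>     restrict the listing to the given host
# -s r            show only jobs in the running state
running_jobs_on_host() {
    host="$1"
    qstat -g t -u "*" -l h="$host" -s r
}

# Only call qstat where it is actually installed.
if command -v qstat >/dev/null 2>&1; then
    running_jobs_on_host compute-0-7.local
else
    echo "qstat not found; run this on an SGE submit or admin host" >&2
fi
```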

-- Reuti
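
For comparing what qstat reports against what ps shows on the node, the slots field of the queue line quoted above can be split into its parts. This is a minimal sketch: slots_summary is a made-up name, and it assumes the three numbers mean reserved/used/total, which is how the column header of "qstat -f" labels them on SGE 6.2.

```shell
# Hypothetical helper: split a slots field such as "0/48/48".
# Assumption: the order is reserved/used/total.
slots_summary() {
    # IFS=/ applies only to this read, so the field is split on "/".
    IFS=/ read -r resv used total <<EOF
$1
EOF
    echo "reserved=$resv used=$used total=$total unused=$((total - used))"
}

slots_summary "0/48/48"
```

With the value from this thread, the used count equals the total, which matches the reading that all 48 slots are taken as far as SGE is concerned, whatever ps shows on the node.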


> Cordially,
> 
> -- 
> Jon Forrest
> Research Computing Support
> College of Chemistry
> 173 Tan Hall
> University of California Berkeley
> Berkeley, CA
> 94720-1460
> 510-643-1032
> jlforrest at berkeley.edu

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=302523

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


