[GE users] Fwd: subnode with empty slots but jobs in queue

jlforrest jlforrest at berkeley.edu
Mon Dec 6 18:14:38 GMT 2010

On 12/6/2010 10:04 AM, reuti wrote:

>> Right now compute-0-8 is down, although qstat still shows
>> some jobs for it. (Why would this happen?)
> SGE assumes some network problems. You will have to use `qdel -f ...` to get rid of these jobs.

I've now done that.

>> The qstat output for compute-0-7 shows
>> all.q@compute-0-7.local        BIP   0/48/48        29.05    lx26-amd64
> So, all 48 out of 48 seem to be used up.
>> and then it shows 48 serial jobs underneath! Yet, ssh-ing to
>> compute-0-7 and running ps clearly only shows 29 jobs running
> What is `qstat -g t -l h=compute-0-7.local -s r` showing?

It shows nothing. But it also shows nothing for the
nodes that are working correctly. For example, consider
compute-0-0, whose status is shown as

compute-0-0    lx26-amd64    4  4.97    7.8G  831.5M   11.7G   75.7M

Running 'qstat -g t -l h=compute-0-0 -s' results in
no output. Is this correct?
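
For anyone comparing the slot counts with what ps reports, here is a minimal Python sketch that splits the queue-instance line quoted above into its fields. It assumes the column order is queue name, queue type, reserved/used/total slots, load average, architecture, and that the archive's " at " stands for "@" in the queue-instance name. The function name `parse_queue_instance` is made up for illustration and is not part of SGE.

```python
def parse_queue_instance(line):
    """Split one queue-instance line from qstat into named fields.

    Assumed column order (not verified against every SGE version):
    queuename, qtype, resv/used/tot., load_avg, arch
    """
    fields = line.split()
    reserved, used, total = (int(n) for n in fields[2].split("/"))
    try:
        load_avg = float(fields[3])
    except ValueError:
        # The load column may hold a non-numeric placeholder for an
        # unreachable host; keep it as None in that case.
        load_avg = None
    return {
        "queue": fields[0],
        "qtype": fields[1],
        "reserved": reserved,
        "used": used,
        "total": total,
        "load_avg": load_avg,
        "arch": fields[4],
    }


# The line reported for compute-0-7, with "@" restored (assumption).
line = "all.q@compute-0-7.local        BIP   0/48/48        29.05    lx26-amd64"
info = parse_queue_instance(line)

# Slots SGE still considers available; reservations are ignored here.
free_slots = info["total"] - info["used"]
print(info["queue"], "used:", info["used"], "of", info["total"], "free:", free_slots)
```

For this line the used and total counts are both 48, so SGE sees no free slots, while the load average of 29.05 is close to the 29 processes that ps shows on the node.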


Jon Forrest
Research Computing Support
College of Chemistry
173 Tan Hall
University of California Berkeley
Berkeley, CA
jlforrest at berkeley.edu

