[GE users] Jobs being suspended incorrectly

opoplawski orion at cora.nwra.com
Fri Apr 16 22:23:02 BST 2010


On 04/07/2010 08:29 AM, reuti wrote:
>
> can you please post the queue definitions.

Okay, here we go:

$ qstat -u \* | grep apapane
   16680 0.56000 run_cora.c dombroski    S     04/16/2010 15:01:34 mpi@apapane.cora.nwra.com          8

$ qstat -f | grep apapane
admin.q@apapane.cora.nwra.com  BIPC  0/0/1          0.03     lx26-amd64
ivm.q@apapane.cora.nwra.com    BIPC  0/0/4          0.03     lx26-amd64
compute.q@apapane.cora.nwra.co BIPC  0/4/4          0.03     lx26-amd64    S
mpi@apapane.cora.nwra.com      PC    0/4/4          0.03     lx26-amd64
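
To make the slot counts and state flags above easier to compare, here is a minimal Python sketch that splits `qstat -f` queue lines into fields. It assumes the usual column order (queue name, qtype, resv/used/total slots, load average, architecture, optional states) and that queue instance names are written with `@` as qstat prints them. The function and variable names are made up for illustration.

```python
# Sample queue lines as printed by `qstat -f | grep apapane`.
SAMPLE = """\
admin.q@apapane.cora.nwra.com  BIPC  0/0/1          0.03     lx26-amd64
ivm.q@apapane.cora.nwra.com    BIPC  0/0/4          0.03     lx26-amd64
compute.q@apapane.cora.nwra.co BIPC  0/4/4          0.03     lx26-amd64    S
mpi@apapane.cora.nwra.com      PC    0/4/4          0.03     lx26-amd64
"""


def parse_queue_line(line):
    """Split one queue line into named fields (hypothetical helper)."""
    fields = line.split()
    name, qtype, slots, load_avg, arch = fields[:5]
    resv, used, total = (int(n) for n in slots.split("/"))
    return {
        "name": name,
        "qtype": qtype,
        "resv": resv,
        "used": used,
        "total": total,
        "load_avg": load_avg,  # kept as text; may not always be numeric
        "arch": arch,
        # The states column is absent when no state flag is set.
        "states": fields[5] if len(fields) > 5 else "",
    }


parsed = [parse_queue_line(l) for l in SAMPLE.splitlines() if l.strip()]

# Queue instances flagged S that still report used slots.
flagged = [q for q in parsed if "S" in q["states"] and q["used"] > 0]

for q in flagged:
    print(q["name"], "used", q["used"], "of", q["total"], "states", q["states"])
```

With the four lines above, only the compute.q instance is flagged.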

Why does compute.q@apapane show 4 slots in use?
Why is the job in state S (suspended) when it is running in the mpi queue?
This looks like a bug to me.

Here are the queue definitions:

[orion@orca trunk]$ qconf -sq ivm.q
qname                 ivm.q
hostlist              apapane.cora.nwra.com
seq_no                0
load_thresholds       np_load_avg=1
suspend_thresholds    np_load_avg=1.05
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             simple
pe_list               make mpi mpirr smp
rerun                 FALSE
slots                 4
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            alexand
user_lists            ivm orion
xuser_lists           NONE
subordinate_list      compute.q=3, mpi=1
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY
[orion@orca trunk]$ qconf -sq compute.q
qname                 compute.q
hostlist              @compute
seq_no                5
load_thresholds       NONE,[@interactive=np_load_short=1]
suspend_thresholds    NONE,[@interactive=np_load_short=1.05]
nsuspend              1
suspend_interval      00:03:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             simple
pe_list               make mpi mpirr smp
rerun                 TRUE
slots                 4,[@dualproc=2],[@octproc=8]
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY
[orion@orca trunk]$ qconf -sq mpi
qname                 mpi
hostlist              @mpi
seq_no                5,[@dualproc=9]
load_thresholds       NONE,[@interactive=np_load_short=1]
suspend_thresholds    NONE,[@interactive=np_load_short=1.05]
nsuspend              1
suspend_interval      00:03:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 NONE
ckpt_list             simple
pe_list               make mpi mpirr smp
rerun                 TRUE
slots                 4,[@dualproc=2],[@octproc=8]
tmpdir                /tmp
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      all.q=1, compute.q=1
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY
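
For reference, the `subordinate_list` entries in the three definitions above can be evaluated with a small Python sketch. It assumes the semantics described in queue_conf(5) as I understand them: a subordinate queue instance on the same host is suspended once the number of used slots in the superordinate queue reaches the given threshold, and an entry without a threshold triggers only when all slots are filled. The helper names and the per-host slot numbers (taken from the `qstat -f` output for apapane) are illustrative only, and this is not the scheduler's actual implementation.

```python
def parse_subordinate_list(value):
    """Parse a value such as 'compute.q=3, mpi=1' into {queue: threshold}."""
    if value.strip().upper() == "NONE":
        return {}
    result = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, threshold = item.partition("=")
        # None means "no threshold given" (assumed: suspend when queue is full).
        result[name.strip()] = int(threshold) if threshold else None
    return result


def suspended_queues(subordinate_lists, used_slots, total_slots):
    """Return the sorted names of queues that would be suspended on one host."""
    suspended = set()
    for queue, subs in subordinate_lists.items():
        used = used_slots.get(queue, 0)
        for sub, threshold in subs.items():
            limit = threshold if threshold is not None else total_slots[queue]
            if used >= limit:
                suspended.add(sub)
    return sorted(suspended)


# subordinate_list values copied from the qconf -sq output above.
subordinate_lists = {
    "ivm.q": parse_subordinate_list("compute.q=3, mpi=1"),
    "compute.q": parse_subordinate_list("NONE"),
    "mpi": parse_subordinate_list("all.q=1, compute.q=1"),
}

# Used and total slots on apapane, read from the qstat -f output.
used_slots = {"ivm.q": 0, "compute.q": 4, "mpi": 4}
total_slots = {"ivm.q": 4, "compute.q": 4, "mpi": 4}

result = suspended_queues(subordinate_lists, used_slots, total_slots)
print(result)
```

Under these assumptions, any used slot in mpi on a host is enough to suspend compute.q on that host, while ivm.q with 0 used slots suspends nothing.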


-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                  orion at cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=253745
