[GE users] SGE 6.2: jobs queued indefinitely

Bart Willems b-willems at northwestern.edu
Tue Sep 23 15:33:08 BST 2008


Hi Lubos,

the job stayed in the qw state, but there were no errors/warnings.
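
For anyone repeating this check, the state can also be read out of the qstat listing mechanically instead of by eye. This is only a rough sketch: job_state is a made-up helper name, and it assumes the default qstat column layout, where the state is the fifth whitespace-separated field.

```shell
# job_state (hypothetical helper): read `qstat` output on stdin and print
# the state column (e.g. qw, r, Eqw) of the row whose job-ID matches $1.
# Assumes the default qstat layout: job-ID prior name user state ...
# Prints nothing if the job ID is not listed.
job_state() {
    awk -v id="$1" '$1 == id { print $5; exit }'
}

# Usage (needs a working SGE client environment):
#   qstat -u "$USER" | job_state 46003
```

An empty result would mean the job is no longer listed, i.e. it has finished or was deleted.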

This is the content of the bootstrap file:

# cat bootstrap
# Version: 6.2
#
admin_user              sge
default_domain          none
ignore_fqdn             true
spooling_method         classic
spooling_lib            libspoolc
spooling_params         /opt/gridengine/default/common;/opt/gridengine/default/spool/qmaster
binary_path             /opt/gridengine/bin
qmaster_spool_dir       /opt/gridengine/default/spool/qmaster
security_mode           none
listener_threads        2
worker_threads          2
scheduler_threads       1
jvm_threads             0

And this is the output from qhost -q:

# qhost -q
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
compute-0-0             lx26-amd64      4  0.00    7.8G   77.6M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-1             lx26-amd64      4  0.00    7.8G  105.7M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-10            lx26-amd64      4  0.00    7.8G  114.4M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-11            lx26-amd64      4  0.00    7.8G  104.0M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-12            lx26-amd64      4  0.00    7.8G  101.1M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-13            lx26-amd64      4  0.00    7.8G  184.6M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-14            lx26-amd64      4  0.00    7.8G  100.6M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-15            lx26-amd64      4  0.00    7.8G  187.1M  996.2M   40.0K
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-16            lx26-amd64      4  0.00    7.8G  100.8M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-17            lx26-amd64      4  0.00    7.8G  105.7M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-18            lx26-amd64      4  0.00    7.8G  107.7M  996.2M   92.0K
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-19            lx26-amd64      4  0.00    7.8G   85.5M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-2             lx26-amd64      4  0.00    7.8G  101.5M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-20            lx26-amd64      4  0.00    7.8G  188.9M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-21            lx26-amd64      4  0.00    7.8G  168.2M  996.2M   52.0K
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-22            lx26-amd64      4  0.00    7.8G  101.5M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-23            lx26-amd64      4  0.00    7.8G   99.2M  996.2M   92.0K
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-24            lx26-amd64      4  0.00    7.8G  111.7M  996.2M   92.0K
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-25            lx26-amd64      4  0.00    7.8G  113.0M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-26            lx26-amd64      4  0.01    7.8G  101.3M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-27            lx26-amd64      4  0.00    7.8G  116.5M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-28            lx26-amd64      4  0.00    7.8G   96.7M  996.2M   92.0K
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-29            lx26-amd64      4  0.00    7.8G  117.0M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-3             lx26-amd64      4  0.00    7.8G  101.9M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-30            lx26-amd64      4  0.00    7.8G  101.4M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-31            lx26-amd64      4  0.05    7.8G  104.2M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-32            lx26-amd64      4  0.00    7.8G  101.3M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-33            lx26-amd64      4  0.00    7.8G  103.7M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-34            lx26-amd64      4  0.00    7.8G  100.9M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-35            lx26-amd64      4  0.00    7.8G  106.1M  996.2M   40.0K
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-36            lx26-amd64      4  0.00    7.8G  100.3M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-37            lx26-amd64      4  0.00    7.8G  106.1M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-38            lx26-amd64      4     -    7.8G       -  996.2M       -
   conference.q         BIC   0/4      u
   debug.q              BI    0/4      u
   longserial.q         BIC   0/4      aAu
   shortserial.q        BIC   0/4      au
compute-0-39            lx26-amd64      4     -    7.8G       -  996.2M       -
   conference.q         BIC   0/4      u
   debug.q              BI    0/4      u
   longserial.q         BIC   1/4      aAu
   shortserial.q        BIC   0/4      au
compute-0-4             lx26-amd64      4  0.00    7.8G  105.9M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-40            lx26-amd64      4  0.00    7.8G  104.5M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-41            lx26-amd64      4  0.00    7.8G  192.0M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-42            lx26-amd64      4  0.00    7.8G  105.4M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-43            lx26-amd64      4  0.00    7.8G  106.2M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-44            lx26-amd64      4  0.00    7.8G  102.1M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-45            lx26-amd64      4  0.00    7.8G  106.0M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-47            lx26-amd64      8  0.00   15.7G  179.1M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-48            lx26-amd64      8  0.00   15.7G  117.8M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-49            lx26-amd64      8  0.01   15.7G  107.4M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-5             lx26-amd64      4  0.00    7.8G  100.7M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-50            lx26-amd64      8  0.00   15.7G  148.1M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-51            lx26-amd64      8  0.00   15.7G  109.9M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-52            lx26-amd64      8  0.00   15.7G  109.7M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-53            lx26-amd64      8  0.00   15.7G  123.4M  996.2M   32.0K
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-54            lx26-amd64      8  0.00   15.7G  107.2M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-55            lx26-amd64      8  0.01   15.7G  136.2M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-56            lx26-amd64      8  0.00   15.7G  107.3M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-57            lx26-amd64      8  0.00   15.7G  126.8M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-58            lx26-amd64      8  0.00   15.7G  106.4M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-59            lx26-amd64      8  0.01   15.7G  107.6M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-6             lx26-amd64      4  0.00    7.8G  102.3M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-60            lx26-amd64      8  0.00   15.7G  107.1M  996.2M     0.0
   shortparallel.q      PC    0/8
compute-0-61            lx26-amd64      8  0.00   15.7G  106.7M  996.2M     0.0
   shortparallel.q      PC    0/8
compute-0-62            lx26-amd64      8  0.02   15.7G  107.2M  996.2M     0.0
   shortparallel.q      PC    0/8
compute-0-63            lx26-amd64      8  0.00   15.7G  116.4M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-64            lx26-amd64      8  0.00   15.7G  109.4M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-65            lx26-amd64      8  0.02   15.7G  114.2M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-66            lx26-amd64      8  0.00   15.7G  116.0M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-67            lx26-amd64      8  0.01   15.7G  137.6M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-68            lx26-amd64      8  0.01   15.7G  137.5M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-69            lx26-amd64      8  0.02   15.7G  126.7M  996.2M  120.0K
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-7             lx26-amd64      4  0.02    7.8G  111.6M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-70            lx26-amd64      8  0.00   15.7G  123.7M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-71            lx26-amd64      8  0.01   15.7G  122.6M  996.2M  120.0K
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-72            lx26-amd64      8  0.00   15.7G  123.5M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-73            lx26-amd64      8  0.01   15.7G  140.8M  996.2M  120.0K
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-74            lx26-amd64      8  0.00   15.7G  128.1M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-75            lx26-amd64      8  0.00   15.7G  107.1M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-76            lx26-amd64      8  0.00   15.7G  110.8M  996.2M     0.0
   conference.q         BIC   0/8
   debug.q              BI    0/8
   longserial.q         BIC   0/8
   shortserial.q        BIC   0/8
compute-0-77            lx26-amd64     16     -   31.4G       -  996.2M       -
   conference.q         BIC   0/16     u
   debug.q              BI    0/16     u
   longserial.q         BIC   0/16     aAu
   shortserial.q        BIC   0/16     auE
compute-0-8             lx26-amd64      4  0.00    7.8G  184.0M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4
compute-0-9             lx26-amd64      4  0.00    7.8G  100.1M  996.2M     0.0
   conference.q         BIC   0/4
   debug.q              BI    0/4
   longserial.q         BIC   0/4
   shortserial.q        BIC   0/4


Compute nodes 38, 39, and 77 are currently down.
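
The down nodes are the ones whose queue instances carry a "u" (unknown) flag in the last column of the listing. As a rough sketch for pulling them out of a long listing, something like the following should work; down_hosts is a hypothetical helper name, and it assumes the layout shown above (host rows unindented, queue rows indented, state flags in the fourth field of a queue row).

```shell
# down_hosts (hypothetical helper): read `qhost -q` output on stdin and
# print each host that has at least one queue instance in state "u".
# Assumes host rows start in column 0 and queue rows are indented, with
# fields: queue name, type, slots, optional state flags.
down_hosts() {
    awk '
        /^[^ ]/ { host = $1; next }   # unindented row: remember current host
        NF >= 4 && $4 ~ /u/ && !seen[host]++ { print host }
    '
}

# Usage (needs a working SGE client environment):
#   qhost -q | down_hosts
```

On the listing above this would be expected to report compute-0-38, compute-0-39 and compute-0-77.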

Thanks,
Bart


> Hi Bart,
> can you try the following?
>
> Run these commands:
> qsub -b y sleep 5
> qconf -tsm
> qstat
> sleep 5
> qstat
>
> Did the job stay in the qw state? Were there any error/warning messages?
> If there were, try qping on the master host and the execd hosts. Does it work?
>
> Please attach the bootstrap file and qhost -q output.
>
> Lubos.
>
>
> On 09/23/08 15:37, Bart Willems wrote:
>> Hi All,
>>
>> we have just upgraded from SGE 6.1u4 to SGE 6.2. All backed-up
>> configuration settings were restored successfully, but we are having
>> problems getting jobs to run. In particular, submitted jobs remain in the
>> queued state even with the cluster empty:
>>
>> $ qstat -u bart
>> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
>> -----------------------------------------------------------------------------------------------------------------
>>   46003 0.00000 submit_hel bart         qw    09/23/2008 08:25:02                                    1
>>
>>
>> Using qstat -j to get some more info starts off with a gdi error message:
>>
>> $ qstat -j 46003
>> error: can't unpack gdi request
>> error: error unpacking gdi request: bad argument
>> failed receiving gdi request
>> ==============================================================
>> job_number:                 46003
>> exec_file:                  job_scripts/46003
>> submission_time:            Tue Sep 23 08:25:02 2008
>> owner:                      bart
>> uid:                        505
>> group:                      bart
>> gid:                        505
>> sge_o_home:                 /home/bart
>> sge_o_log_name:             bart
>> sge_o_path:                 /export/apps/sm/bin:/opt/gridengine/bin/lx26-amd64:/opt/nwu/bin:/export/apps/mpich2/bin:/usr/kerberos/bin:/opt/gridengine/bin/lx26-amd64:/usr/java/jdk1.5.0_10/bin:/export/apps/condor/bin:/export/apps/condor/sbin:/opt/atipa/acms/bin:/opt/atipa/acms/lib:/usr/local/bin:/bin:/usr/bin:/opt/Bio/ncbi/bin:/opt/Bio/mpiblast/bin/:/opt/Bio/hmmer/bin:/opt/Bio/EMBOSS/bin:/opt/Bio/clustalw/bin:/opt/Bio/t_coffee/bin:/opt/Bio/phylip/exe:/opt/Bio/mrbayes:/opt/Bio/fasta:/opt/Bio/glimmer/bin://opt/Bio/glimmer/scripts:/opt/Bio/gromacs/bin:/opt/eclipse:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/maven/bin:/opt/openmpi/bin/:/opt/pathscale/bin:/opt/rocks/bin:/opt/rocks/sbin:/home/bart/bin
>> sge_o_shell:                /bin/bash
>> sge_o_workdir:              /bigdisk/bart/test
>> sge_o_host:                 fugu
>> account:                    sge
>> cwd:                        /bigdisk/bart/test
>> merge:                      y
>> hard resource_list:         h_cpu=36000
>> mail_list:                  bart at fugu.local
>> notify:                     FALSE
>> job_name:                   submit_helloworld_short.sh
>> jobshare:                   0
>> shell_list:                 /bin/bash
>> env_list:
>> script_file:                submit_helloworld_short.sh
>>
>>
>> So there is no info on why the job won't run, even though job scheduling
>> info is set to true in qmon. But I don't see the associated variable in
>> the output of qconf -sconf:
>>
>> # qconf -sconf
>> global:
>> execd_spool_dir              /opt/gridengine/default/spool
>> mailer                       /bin/mail
>> xterm                        /usr/bin/X11/xterm
>> load_sensor                  none
>> prolog                       none
>> epilog                       none
>> shell_start_mode             posix_compliant
>> login_shells                 sh,ksh,csh,tcsh
>> min_uid                      0
>> min_gid                      0
>> user_lists                   none
>> xuser_lists                  none
>> projects                     none
>> xprojects                    none
>> enforce_project              false
>> enforce_user                 auto
>> load_report_time             00:00:40
>> max_unheard                  00:05:00
>> reschedule_unknown           00:00:00
>> loglevel                     log_warning
>> administrator_mail           none
>> set_token_cmd                none
>> pag_cmd                      none
>> token_extend_time            none
>> shepherd_cmd                 none
>> qmaster_params               none
>> execd_params                 none
>> reporting_params             accounting=true reporting=true \
>>                              flush_time=00:00:15 joblog=true sharelog=00:00:00
>> finished_jobs                100
>> gid_range                    20000-20100
>> qlogin_command               /opt/gridengine/bin/rocks-qlogin.sh
>> rsh_command                  /usr/bin/ssh
>> rlogin_command               /usr/bin/ssh
>> rsh_daemon                   /usr/sbin/sshd -i -o Protocol=2
>> qlogin_daemon                /usr/sbin/sshd -i -o Protocol=2
>> rlogin_daemon                /usr/sbin/sshd -i -o Protocol=2
>> max_aj_instances             2000
>> max_aj_tasks                 75000
>> max_u_jobs                   0
>> max_jobs                     0
>> auto_user_oticket            0
>> auto_user_fshare             1000
>> auto_user_default_project    none
>> auto_user_delete_time        86400
>> delegated_file_staging       false
>> qrsh_command                 /usr/bin/ssh
>> rsh_command                  /usr/bin/ssh
>> rlogin_command               /usr/bin/ssh
>> rsh_daemon                   /usr/sbin/sshd
>> qrsh_daemon                  /usr/sbin/sshd
>> reprioritize                 0
>>
>>
>> The output of qstat -g c (some nodes are down so AVAIL < TOTAL)
>>
>> # qstat -g c
>> CLUSTER QUEUE                   CQLOAD   USED  AVAIL  TOTAL aoACDS  cdsuE
>> -------------------------------------------------------------------------------
>> conference.q                      0.00      0    392    416      0     24
>> debug.q                           0.00      0    392    416      0     24
>> longserial.q                      0.00      1    392    416      0     24
>> shortparallel.q                   0.00      0     24     24      0      0
>> shortserial.q                     0.00      0    392    416      0     24
>>
>>
>> I also checked that /opt/gridengine/bin/lx26-amd64/sge_execd is running on
>> the compute nodes.
>>
>> In case it helps: we also seem to have retained jobs that used
>> checkpointing and were running before the upgrade. These are now also in
>> the queued state.
>>
>> Any help would be most appreciated.
>>
>> Thanks,
>> Bart
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>
>
>


