[GE users] All queues dropped because of overload or full

McCalla, Mac macmccalla at hess.com
Wed Dec 12 16:40:20 GMT 2007

Hi Alexandre,

What does "qhost -q -h <hostname>" show you? Perhaps all available slots on your hosts are already consumed.
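
For a rough cross-check of how many slots the array job itself is holding, the "usage" lines in the "qstat -j" output quoted below can be counted. A minimal sketch in Python, assuming each "usage N:" line corresponds to one task that currently occupies a slot (the helper name running_tasks is hypothetical, and the sample text is abbreviated from the quoted output):

```python
import re

def running_tasks(qstat_j_text):
    """Return the sorted task IDs that have a 'usage N:' line."""
    pattern = re.compile(r"^usage\s+(\d+):", re.MULTILINE)
    return sorted(int(m.group(1)) for m in pattern.finditer(qstat_j_text))

# Abbreviated sample; paste the full "qstat -j <jobid>" output here instead.
sample = """\
job-array tasks:            1-24:1
usage    2:                 cpu=20:06:52, mem=3467.62004 GBs
usage    3:                 cpu=07:04:53, mem=1250.25426 GBs
usage   10:                 cpu=20:09:24, mem=3475.70453 GBs
"""

active = running_tasks(sample)
# Task range 1-24 is taken from the "job-array tasks" line above.
missing = [t for t in range(1, 25) if t not in active]
print(len(active), "tasks report usage;", len(missing), "do not")
```

Applied to the full output quoted below, 21 of the 24 tasks report usage and tasks 1, 9 and 17 do not, which would fit with those tasks still waiting for a free slot.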


Mac McCalla 

From: Alexandre Racine [mailto:Alexandre.Racine at mhicc.org] 
Sent: Wednesday, December 12, 2007 9:37 AM
To: users at gridengine.sunsource.net
Subject: [GE users] All queues dropped because of overload or full

Hi all, I am getting this message: "All queues dropped because of overload or full".

Looking at "top", the processors are busy and there is plenty of memory available, and qhost looks fine, so I don't see why I get this. The only thing that looks off is the mem field in "qstat -j", whose values seem impossibly large. Or is it the total amount of memory that has been used over time (allocated and freed)? The only references I found in the archives are from 2004. Thanks.

More details:

$ qhost
global                  -               -     -       -       -       -       -
server1                 lx24-amd64     16 14.26   30.4G    2.6G    1.9G     0.0
server2                 lx24-amd64      8  8.39   15.7G    6.8G    1.9G     0.0
server3                 lx24-amd64      8  8.12   14.6G  577.7M    2.0G     0.0

$ qstat -j 139
script_file:                script.sh
job-array tasks:            1-24:1
usage    2:                 cpu=20:06:52, mem=3467.62004 GBs, io=0.00000, vmem=60.770M, maxvmem=62.227M
usage    3:                 cpu=07:04:53, mem=1250.25426 GBs, io=0.00000, vmem=62.016M, maxvmem=63.266M
usage    4:                 cpu=07:02:46, mem=1247.70159 GBs, io=0.00000, vmem=62.156M, maxvmem=63.492M
usage    5:                 cpu=07:04:38, mem=1249.53348 GBs, io=0.00000, vmem=62.008M, maxvmem=63.316M
usage    6:                 cpu=16:03:12, mem=2834.15624 GBs, io=0.00000, vmem=62.113M, maxvmem=63.023M
usage    7:                 cpu=15:17:48, mem=2707.94392 GBs, io=0.00000, vmem=62.156M, maxvmem=62.578M
usage    8:                 cpu=07:02:46, mem=1247.34336 GBs, io=0.00000, vmem=62.148M, maxvmem=63.484M
usage   10:                 cpu=20:09:24, mem=3475.70453 GBs, io=0.00000, vmem=60.832M, maxvmem=62.266M
usage   11:                 cpu=14:32:42, mem=2568.06738 GBs, io=0.00000, vmem=62.016M, maxvmem=63.016M
usage   12:                 cpu=07:14:50, mem=1283.31948 GBs, io=0.00000, vmem=62.156M, maxvmem=63.504M
usage   13:                 cpu=07:15:51, mem=1282.46496 GBs, io=0.00000, vmem=62.012M, maxvmem=63.359M
usage   14:                 cpu=15:56:08, mem=2813.61103 GBs, io=0.00000, vmem=62.125M, maxvmem=63.430M
usage   15:                 cpu=14:38:33, mem=2592.12483 GBs, io=0.00000, vmem=62.156M, maxvmem=63.312M
usage   16:                 cpu=07:17:23, mem=1290.37961 GBs, io=0.00000, vmem=62.137M, maxvmem=63.574M
usage   18:                 cpu=20:09:23, mem=3482.93681 GBs, io=0.00000, vmem=60.832M, maxvmem=62.289M
usage   19:                 cpu=14:26:19, mem=2549.31135 GBs, io=0.00000, vmem=62.016M, maxvmem=63.324M
usage   20:                 cpu=07:22:26, mem=1305.89071 GBs, io=0.00000, vmem=62.160M, maxvmem=63.617M
usage   21:                 cpu=07:23:30, mem=1304.96487 GBs, io=0.00000, vmem=62.004M, maxvmem=63.328M
usage   22:                 cpu=15:08:08, mem=2672.33798 GBs, io=0.00000, vmem=62.117M, maxvmem=63.551M
usage   23:                 cpu=14:23:55, mem=2548.95621 GBs, io=0.00000, vmem=62.148M, maxvmem=63.609M
usage   24:                 cpu=15:04:51, mem=2669.49002 GBs, io=0.00000, vmem=62.246M, maxvmem=63.523M
scheduling info:            queue instance "all.q@oregano.statgen.local" dropped because it is full
                            queue instance "all.q@wasabi01.statgen.local" dropped because it is full
                            queue instance "all.q@PAPRIKA" dropped because it is full
                            All queues dropped because of overload or full
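
On the mem field: the unit printed is "GBs", and the Grid Engine accounting documentation describes mem as integral memory usage in gigabyte-seconds of CPU time, not a peak value or a count of bytes allocated. Under that reading, dividing mem by the CPU time gives a rough average footprint. A sketch in Python (the function names are illustrative and not part of Grid Engine; the 1024 factor for GB to MB is an assumption):

```python
def cpu_seconds(cpu):
    """Convert 'HH:MM:SS' (or 'D:HH:MM:SS') into seconds."""
    weights = [1, 60, 3600, 86400]  # seconds, minutes, hours, days
    fields = [int(f) for f in reversed(cpu.split(":"))]
    return sum(f * w for f, w in zip(fields, weights))

def average_mem_mb(mem_gb_seconds, cpu):
    """Average memory in MB implied by a GB-seconds integral and CPU time."""
    return mem_gb_seconds / cpu_seconds(cpu) * 1024

# Values taken from the "usage    2:" line above.
print(round(average_mem_mb(3467.62004, "20:06:52"), 1))
```

For task 2 this works out to about 49 MB, the same order of magnitude as the reported vmem=60.770M, so the large mem values look like accumulated usage over roughly 20 hours of CPU time rather than an impossible amount of memory.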

Alexandre Racine
Projets spéciaux
514-461-1300 poste 3304
alexandre.racine at mhicc.org

