No subject


Wed Jan 12 20:38:46 GMT 2011


01/08/2009 11:26:37|listen|vi64-x4150c-sca11|E|commlib error: got read error (closing "vi64-x4150c-sca11/qstat/112")

However, when I submitted another job, both jobs were scheduled and ran to completion.

It's weird, but later I submitted another job from the qmaster host, and this
time the first 8 tasks were scheduled and executed.
However, after that, no further tasks were scheduled:

[root@vi64-x4150c-sca11 qmaster]# qsub -b y -o /dev/null -j y -w e -js 0 -t 1:32 sleep 30
Your job-array 31.1-32:1 ("sleep") has been submitted

[root@vi64-x4150c-sca11 qmaster]# qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@v4u-m3000a-sca11         BIP   0/8/8          0.00     sol-sparc64
    31 0.55500 sleep      root         r     01/08/2009 12:20:02     1 1
    31 0.55500 sleep      root         r     01/08/2009 12:20:02     1 2
    31 0.55500 sleep      root         t     01/08/2009 12:20:02     1 3
    31 0.55500 sleep      root         t     01/08/2009 12:20:02     1 4
    31 0.55500 sleep      root         t     01/08/2009 12:20:02     1 5
    31 0.55500 sleep      root         t     01/08/2009 12:20:02     1 6
    31 0.55500 sleep      root         t     01/08/2009 12:20:02     1 7
    31 0.55500 sleep      root         t     01/08/2009 12:20:02     1 8

############################################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
    31 0.00000 sleep      root         qw    01/08/2009 12:19:58     1 9-32:1


I waited 2 minutes and ran qstat again, but its output showed that no further
tasks had been executed.

[root@vi64-x4150c-sca11 qmaster]# sleep 120; qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@v4u-m3000a-sca11         BIP   0/0/8          0.00     sol-sparc64

############################################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
    31 0.00000 sleep      root         qw    01/08/2009 12:19:58     1 9-32:1

# qstat -j 31
<snip>
job-array tasks:            1-32:1
scheduling info:            queue instance "all.q@v4u-m3000a-sca11" dropped because it is full
                            All queues dropped because of overload or full


The scheduler thinks that the queue is full although the slots are empty.
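
The mismatch can be read directly off the qstat -f queue lines. Below is a
minimal sketch, not a definitive check. It assumes real (un-obfuscated)
qstat -f output, where the queue instance is a single token such as
all.q@host and the third column has the form resv/used/tot. The helper name
slot_usage is hypothetical.

```shell
# Sketch: print used and total slots for each queue instance found in
# `qstat -f` output read from stdin.
# Assumption: queue lines start with a token containing "@" and carry
# resv/used/tot in the third column; job lines and banners are skipped.
slot_usage() {
    awk '$1 ~ /@/ && $3 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ {
        split($3, s, "/")                     # s[1]=resv, s[2]=used, s[3]=tot
        printf "%s used=%d total=%d\n", $1, s[2], s[3]
    }'
}

# Usage on a live cluster (not executed here):
#   qstat -f | slot_usage
```

If this reports used=0 for a queue instance while qstat -j still says that
instance was "dropped because it is full", that is the behaviour described
above.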

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=36&dsMessageId=97760
