[GE users] Re: [GE users] Stuck in pending

markdenni at comcast.net
Tue Dec 6 22:55:48 GMT 2005


Mac -
Here are the outputs:

 % qhost -q -j -h rs101
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
rs101                   lx24-amd64      4  0.00   15.3G  268.6M   31.2G     0.0
   all.q                BIP   0/4      
% qstat -g c -qs d
 % 

-------------- Original message -------------- 

Hi Mark,
    This is normal qstat -j output: it tells you which queue instances are not selectable, not which ones are. What does the output
of "qhost -q -j -h rs101" show? What about "qstat -g c -qs d" (is the queue disabled or in Error state, by chance)?

mac mccalla 




From: markdenni at comcast.net [mailto:markdenni at comcast.net] 
Sent: Tuesday, December 06, 2005 3:45 PM
To: users at gridengine.sunsource.net
Subject: [GE users] Re: [GE users] Stuck in pending



The version of SGE is: 6.0u6
When sge_execd is not running on rs101, the output of qstat -j says that
queue instance rs101... is temporarily not available. But when I start sge_execd again,
queue instance rs101 is not mentioned at all. This is a mystery to me.

In Host Configuration -> execution host, I have set Access to test (of which I am the only member), XAccess to NONE, PROJECTS to NONE, and XProjects to all the projects the other users use, so that their jobs won't run on this server. Perhaps I am not doing this correctly.

When sge_execd is not running on rs101, the output of qstat -j is:
 % qstat -j 129791
job_number:                 129791
exec_file:                  job_scripts/129791
submission_time:            Thu Dec  1 19:21:33 2005
owner:                      markd
uid:                        1605
group:                      iosd
gid:                        1000
account:                    sge
mail_options:               n   
mail_list:                  markd at rs100.sjs.agilent.com
notify:                     FALSE
job_name:                   newtest
priority:                   1
jobshare:                   0
script_file:                newtest
verify_suitable_queues:     3
version:                    7
scheduling info:            queue instance "all.q at rsl019.sjs.agilent.com" dropped because it is temporarily not available
                            queue instance "all.q at rs101.sjs.agilent.com" dropped because it is temporarily not available
                            queue instance "all.q at rsl020.sjs.agilent.com" dropped because it is full
                            queue instance "all.q at rsl022.sjs.agilent.com" dropped because it is full
                            has no permission for host "global"

When sgeexecd is running on rs101, the rs101 queue instance isn't even mentioned in the qstat -j output:
 % qstat -j 129791
job_number:                 129791
exec_file:                  job_scripts/129791
submission_time:            Thu Dec  1 19:21:33 2005
owner:                      markd
uid:                        1605
group:                      iosd
gid:                        1000
account:                    sge
mail_options:               n   
mail_list:                  markd at rs100.sjs.agilent.com
notify:                     FALSE
job_name:                   newtest
priority:                   1
jobshare:                   0
script_file:                newtest
verify_suitable_queues:     3
version:                    7
scheduling info:            queue instance "all.q at rsl019.sjs.agilent.com" dropped because it is temporarily not available
                            queue instance "all.q at rsl020.sjs.agilent.com" dropped because it is full
                            queue instance "all.q at rsl022.sjs.agilent.com" dropped because it is full
                            has no permission for host "global"
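
To make the difference between the two listings easier to see, here is a minimal Python sketch that extracts the "dropped because" lines from saved qstat -j text and reports which queue instances disappear from the list once the execution daemon is back up. The function names and the sample strings are illustrative assumptions, not part of SGE; the samples are trimmed copies of the scheduling info above and keep the archive's " at " spelling of the queue instance names.

```python
import re

# Matches scheduling-info lines such as:
#   queue instance "all.q at rs101.sjs.agilent.com" dropped because it is full
DROPPED = re.compile(r'queue instance "([^"]+)" dropped because it is (.+)')


def dropped_instances(qstat_j_text):
    """Map each dropped queue instance to the reason the scheduler reported."""
    result = {}
    for line in qstat_j_text.splitlines():
        match = DROPPED.search(line)
        if match:
            result[match.group(1)] = match.group(2).strip()
    return result


def no_longer_dropped(before, after):
    """Queue instances listed as dropped in `before` but absent from `after`."""
    return sorted(set(dropped_instances(before)) - set(dropped_instances(after)))


# Trimmed copies of the two scheduling-info sections shown above.
execd_down = '''
queue instance "all.q at rsl019.sjs.agilent.com" dropped because it is temporarily not available
queue instance "all.q at rs101.sjs.agilent.com" dropped because it is temporarily not available
queue instance "all.q at rsl020.sjs.agilent.com" dropped because it is full
queue instance "all.q at rsl022.sjs.agilent.com" dropped because it is full
has no permission for host "global"
'''

execd_up = '''
queue instance "all.q at rsl019.sjs.agilent.com" dropped because it is temporarily not available
queue instance "all.q at rsl020.sjs.agilent.com" dropped because it is full
queue instance "all.q at rsl022.sjs.agilent.com" dropped because it is full
has no permission for host "global"
'''

print(no_longer_dropped(execd_down, execd_up))
# prints ['all.q at rs101.sjs.agilent.com']
```

The only queue instance that leaves the dropped list is the one on rs101. The 'has no permission for host "global"' line appears in both listings and is deliberately not matched by the pattern, since it does not name a queue instance.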

Thanks in advance for any suggestions.

- Mark


