[GE users] SGE jobs stuck in pending state

emallove ethan.mallove at sun.com
Fri Jul 24 17:13:25 BST 2009


On Fri, Jul/24/2009 11:26:58AM, craffi wrote:
> Does the output of "qstat -f" really not show you the state of your  
> queues and queue instances and only shows the pending jobs?

Correct. The qstat output is below, verbatim. qstat prints the same
information on the qmaster node as on my one other (non-qmaster) node,
which I assume should always be the case. Interestingly, jobs
submitted as "root" produce the same "Unable to run job" warning but
then do not appear in the qstat -f output. For example, job 9 below
never shows up in qstat:

  $ sudo qsub /home/em162155/tmp/hostname.sh
  Unable to run job: warning: root your job is not allowed to run in any queue
  Your job 9 ("hostname.sh") has been submitted.
  Exiting.
  $ qstat -f

  ############################################################################
   - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
  ############################################################################
        1 0.75000 hostname.s em162155     qw    07/15/2009 16:11:46     1
        2 0.74958 hostname.s em162155     qw    07/15/2009 16:21:29     1
        3 0.74955 hostname.s em162155     qw    07/15/2009 16:22:19     1
        4 0.74944 hostname.s em162155     qw    07/15/2009 16:24:47     1
        5 0.74912 hostname.s em162155     qw    07/15/2009 16:32:08     1
        6 0.74911 hostname.s em162155     qw    07/15/2009 16:32:23     1
        8 0.25000 hostname.s em162155     qw    07/23/2009 17:43:42     1
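
One possible explanation for the missing job 9, which I have not
confirmed on this cluster: as far as I know, qstat in 6.x lists only
the jobs of the user who invokes it unless -u is given, so a job owned
by root would be hidden from em162155. The sketch below is only a
helper for checking that; pending_jobs is a made-up name, and it
assumes the column layout shown above (job ID, priority, name, owner,
state, date, time, slots).

```shell
# Print "job-ID owner" for every pending (qw) job found in qstat
# output read from stdin. Assumes the column layout shown above;
# pending_jobs is a hypothetical helper name, not part of Grid Engine.
pending_jobs() {
  awk '$1 ~ /^[0-9]+$/ && $5 == "qw" { print $1, $4 }'
}
```

Running it as `qstat -f -u '*' | pending_jobs` should list pending
jobs for all owners, so a job submitted as root would appear there if
it was really accepted.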

By contrast, job 10, submitted as my regular user (no sudo), *does*
show up in qstat:

  $ qsub /home/em162155/tmp/hostname.sh
  Unable to run job: warning: em162155 your job is not allowed to run in any queue
  Your job 10 ("hostname.sh") has been submitted.
  Exiting.
  $ qstat -f

  ############################################################################
   - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
  ############################################################################
        1 0.75000 hostname.s em162155     qw    07/15/2009 16:11:46     1
        2 0.74962 hostname.s em162155     qw    07/15/2009 16:21:29     1
        3 0.74959 hostname.s em162155     qw    07/15/2009 16:22:19     1
        4 0.74949 hostname.s em162155     qw    07/15/2009 16:24:47     1
        5 0.74920 hostname.s em162155     qw    07/15/2009 16:32:08     1
        6 0.74919 hostname.s em162155     qw    07/15/2009 16:32:23     1
        8 0.29364 hostname.s em162155     qw    07/23/2009 17:43:42     1
       10 0.25000 hostname.s em162155     qw    07/24/2009 12:14:14     1

  $ qconf |& head -1
  GE 6.2u3
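
Since qstat -f above prints only the pending section and no queue
instance lines, one thing worth ruling out is that no queue instances
exist at all. A minimal sketch of that check follows; it assumes that
queue instance lines in qstat -f output start with a name of the form
queue@host, and has_queue_instances is a made-up helper name.

```shell
# Succeed (exit 0) if the qstat -f output on stdin contains at least
# one queue instance line, i.e. a line whose first field looks like
# queue@host. has_queue_instances is a hypothetical helper name.
has_queue_instances() {
  awk '$1 ~ /@/ { found = 1 } END { exit (found ? 0 : 1) }'
}
```

For example, `qstat -f | has_queue_instances || echo "no queue
instances listed"`. If nothing is listed, `qconf -sql` (cluster
queues) and `qhost` (execution hosts) show what is configured, and
`qstat -j 10` prints the scheduler's view of a single job.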

-Ethan


------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=209355

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


