[GE users] Fwd: subnode with empty slots but jobs in queue

jlforrest jlforrest at berkeley.edu
Mon Dec 6 18:55:29 GMT 2010


On 12/6/2010 10:17 AM, reuti wrote:

>> Running 'qstat -g t -l h=compute-0-0 -s' results in
>> no output. Is this correct?
>
> No, I forgot to mention -u "*" in addition to get the list of all users' jobs.

No problem. At least it wasn't me screwing up. The output
is below.

I think I have an idea of what might be causing this.
compute-0-7 crashed last week, I think on 12/02/2010, and
I brought it back up soon afterwards. So the jobs that show
a submit/start time before 12/02/2010 are not really running
there. I counted and there are 19 of them. These, plus the 29
that really are running, add up to 48, which is the number of
cores.

So the real question is: why did these jobs remain
visible to SGE after compute-0-7 was rebooted?
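
The count by hand can be cross-checked with a short script. What follows is only a sketch, assuming qstat rows shaped like the ones in this message (job ID first, then priority, with the submit/start timestamp as MM/DD/YYYY HH:MM:SS). The names ROW, SAMPLE, CUTOFF and split_by_cutoff are made up for illustration, and the cutoff of 12/02/2010 is the crash date as best remembered, not a confirmed value.

```python
import re
from datetime import datetime

# One job row: job-ID, priority, then (somewhere later) the submit/start timestamp.
ROW = re.compile(
    r"^\s*(\d+)\s+\S+\s+.*?(\d{2}/\d{2}/\d{4})\s+(\d{2}:\d{2}:\d{2})"
)

def split_by_cutoff(qstat_text, cutoff):
    """Return (before, on_or_after) lists of job IDs, split by start time."""
    before, on_or_after = [], []
    for line in qstat_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, separator, or wrapped continuation line
        started = datetime.strptime(
            m.group(2) + " " + m.group(3), "%m/%d/%Y %H:%M:%S"
        )
        target = before if started < cutoff else on_or_after
        target.append(int(m.group(1)))
    return before, on_or_after

# Four rows copied from the listing in this message, as a small sample.
SAMPLE = """\
job-ID  prior   name       user         state submit/start at     queue
    6954 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    5874 0.55500 Job descri wendy        r     11/30/2010 14:38:49 all.q@compute-0-7.local        MASTER
    3003 0.55500 QQQ        mforrest     r     10/29/2010 11:33:56 all.q@compute-0-7.local        MASTER
    6088 0.55500 Job descri maximoff     r     12/01/2010 11:35:19 all.q@compute-0-7.local        MASTER
"""

# Assumed crash date (midnight at the start of 12/02/2010).
CUTOFF = datetime(2010, 12, 2)

before, on_or_after = split_by_cutoff(SAMPLE, CUTOFF)
print(len(before), "started before the cutoff:", before)
print(len(on_or_after), "started on or after the cutoff:", on_or_after)
```

Lines that do not begin with a job ID are skipped, so the same function should also cope with output where each row has been wrapped onto two lines.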

job-ID  prior   name       user         state submit/start at     queue                          master ja-task-ID
------------------------------------------------------------------------------------------------------------------
    6954 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    5874 0.55500 Job descri wendy        r     11/30/2010 14:38:49 all.q@compute-0-7.local        MASTER
    6959 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    5228 0.55500 Job descri maximoff     r     11/23/2010 15:22:34 all.q@compute-0-7.local        MASTER
    6980 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6969 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6088 0.55500 Job descri maximoff     r     12/01/2010 11:35:19 all.q@compute-0-7.local        MASTER
    6965 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6973 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6977 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    5873 0.55500 Job descri wendy        r     11/30/2010 14:37:34 all.q@compute-0-7.local        MASTER
    5225 0.55500 Job descri maximoff     r     11/23/2010 15:14:34 all.q@compute-0-7.local        MASTER
    6093 0.55500 Job descri maximoff     r     12/01/2010 11:37:04 all.q@compute-0-7.local        MASTER
    5224 0.55500 Job descri maximoff     r     11/23/2010 15:13:04 all.q@compute-0-7.local        MASTER
    6962 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6970 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6091 0.55500 Job descri maximoff     r     12/01/2010 11:36:19 all.q@compute-0-7.local        MASTER
    6979 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6967 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6971 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6957 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6956 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6961 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6098 0.55500 Job descri maximoff     r     12/01/2010 11:41:49 all.q@compute-0-7.local        MASTER
    6096 0.55500 Job descri maximoff     r     12/01/2010 11:40:19 all.q@compute-0-7.local        MASTER
    6084 0.55500 Job descri maximoff     r     12/01/2010 11:11:34 all.q@compute-0-7.local        MASTER
    6090 0.55500 Job descri maximoff     r     12/01/2010 11:36:04 all.q@compute-0-7.local        MASTER
    5226 0.55500 Job descri maximoff     r     11/23/2010 15:17:04 all.q@compute-0-7.local        MASTER
    6978 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    3003 0.55500 QQQ        mforrest     r     10/29/2010 11:33:56 all.q@compute-0-7.local        MASTER
    6960 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6958 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6085 0.55500 Job descri maximoff     r     12/01/2010 11:11:49 all.q@compute-0-7.local        MASTER
    6087 0.55500 Job descri maximoff     r     12/01/2010 11:34:49 all.q@compute-0-7.local        MASTER
    5230 0.55500 Job descri maximoff     r     11/23/2010 15:28:04 all.q@compute-0-7.local        MASTER
    6089 0.55500 Job descri maximoff     r     12/01/2010 11:35:34 all.q@compute-0-7.local        MASTER
    6099 0.55500 Job descri maximoff     r     12/01/2010 11:42:34 all.q@compute-0-7.local        MASTER
    6981 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6955 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6974 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6982 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6963 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6964 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6966 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6972 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6976 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6975 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
    6968 0.55500 T.1.0.N.11 an           r     12/06/2010 09:07:19 all.q@compute-0-7.local        MASTER
