[GE users] Script to show pending reasons in SGE

craffi dag at sonsorol.org
Tue May 26 16:31:03 BST 2009


Writing your own tool that processes the output of "qstat -xml -j
<jobID>" would be valuable. If you go down this route, please consider
sharing your work with the community. I know of several "private"
scripts that do this, but few are publicly available.
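
As a rough starting point, here is a minimal Python sketch of such a filter. It is not one of the private scripts mentioned above. It assumes the plain-text "qstat -j" scheduling info format shown in the quoted message below, with one 'queue instance "..." dropped because it is ...' entry per line, rather than the XML output. The function names summarize, format_summary and qstat_output are made up for illustration.

```python
import re
import subprocess
from collections import Counter

# One scheduling-info entry, e.g.
#   queue instance "batch.q@moo102.bar.org" dropped because it is full
# The archived form "batch.q at moo102.bar.org" is accepted as well.
ENTRY = re.compile(
    r'queue instance "(?P<queue>[^"@\s]+)(?:@| at )(?P<host>[^"\s]+)" '
    r'dropped because it is (?P<reason>.+)'
)


def summarize(text):
    """Count dropped queue instances per (queue, reason) pair."""
    counts = Counter()
    for line in text.splitlines():
        match = ENTRY.search(line.strip())
        if match:
            counts[(match.group("queue"), match.group("reason"))] += 1
    return counts


def format_summary(counts):
    """Return one readable line per (queue, reason), most frequent first."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        f"{queue}: {n} queue instance(s) {reason}"
        for (queue, reason), n in ordered
    ]


def qstat_output(job_id):
    """Run 'qstat -j <job_id>' and return its standard output as text."""
    result = subprocess.run(
        ["qstat", "-j", str(job_id)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For the job in the quoted message, print("\n".join(format_summary(summarize(qstat_output("307472"))))) would collapse the per-host lines into one line per queue and reason. The sketch only covers the "dropped because" messages; license and permission messages would need their own patterns.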

-Chris


On May 26, 2009, at 2:26 AM, pto wrote:

> Dear all
>
> One of the annoying things about SGE 6.x is the way it reports the
> reasons for pending jobs in an SGE queue.
> Assume that job 307472 is not running, and I want to know why. Then
> I use qstat:
>
> $ qstat -j 307472
> ==============================================================
> job_number:                 307472
> exec_file:                  job_scripts/307472
> submission_time:            Mon May 25 13:33:58 2009
> owner:                      joe
> uid:                        38
> group:                      users
> <cut one billion irrelevant lines>
> script_file:                netbatch/joe_sim001_28743/9/nb_sim_worker.sh
> context:                    JOB_NAME=foo
> usage    1:                 cpu=00:00:00, mem=0.00000 GBs, io=0.00000, vmem=N/A, maxvmem=N/A
> scheduling info:            queue instance "rush.q@moo165.bar.org" dropped because it is temporarily not available
>                             queue instance "batch.q@moo165.bar.org" dropped because it is temporarily not available
>                             queue instance "batch.q@moo167.bar.org" dropped because it is disabled
>                             queue instance "interactive.q@moo099.bar.org" dropped because it is disabled
>                             queue instance "interactive.q@moo100.bar.org" dropped because it is disabled
>                             queue instance "batch.q@moo102.bar.org" dropped because it is full
>                             queue instance "batch.q@moo107.bar.org" dropped because it is full
>                             queue instance "batch.q@moo109.bar.org" dropped because it is full
>                             queue instance "batch.q@moo112.bar.org" dropped because it is full
>                             queue instance "batch.q@moo113.bar.org" dropped because it is full
>                             queue instance "batch.q@moo116.bar.org" dropped because it is full
>                             queue instance "batch.q@moo118.bar.org" dropped because it is full
>                             queue instance "batch.q@moo126.bar.org" dropped because it is full
>                             queue instance "batch.q@moo128.bar.org" dropped because it is full
>                             queue instance "batch.q@moo141.bar.org" dropped because it is full
>                             queue instance "batch.q@moo143.bar.org" dropped because it is full
>                             queue instance "batch.q@moo149.bar.org" dropped because it is full
>                             queue instance "batch.q@moo152.bar.org" dropped because it is full
>                             queue instance "batch.q@moo153.bar.org" dropped because it is full
>                             queue instance "batch.q@moo157.bar.org" dropped because it is full
>                            <and it continues....>
>
> With 10000 CPUs this is a horrible interface :-(
>
> Has any of you written a script that filters this output and gives
> clear messages such as
> * The grid is fully loaded - no free CPU resources
> * Your job is not running since you require license foo=1 and the
> available number is zero
> * You are not allowed to run
>
> I.e. something MUCH simpler. Before writing such a parser myself, I
> guess most of you have faced the same problem, so it has most likely
> been solved by some of you already.
> Am I right?
>
> Best
>
> -- 
> Peter Toft <pto at linuxbog.dk>
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=198904
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=198992



