[GE users] slots and more - what are they?

Daniel Templeton Dan.Templeton at Sun.COM
Mon Jul 30 20:56:03 BST 2007


Alexandre,

What does qstat -j tell you for the jobs which are in the qw state?
Odds are there aren't enough resources to schedule the remaining jobs,
possibly because of the job_load_adjustments setting in the scheduler
configuration.
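
A minimal sketch of that check, assuming the SGE binaries are on the PATH
and using pending job 109789 from the listing below as the example ID. The
helper name show_load_adjust is made up for illustration; qstat -j and
qconf -ssconf are the standard commands. If schedd_job_info is enabled, the
qstat -j output should include a "scheduling info" section giving the
reason the job was not dispatched.

```shell
#!/bin/sh
# Hypothetical helper: keep only the load-adjustment lines of the
# scheduler configuration read from stdin.
show_load_adjust() {
    grep -E '^(job_load_adjustments|load_adjustment_decay_time)[[:space:]]'
}

# Example job ID taken from the pending list; pass another as $1.
JOB_ID=${1:-109789}

# Only call the SGE tools when they are installed.
if command -v qstat >/dev/null 2>&1 && command -v qconf >/dev/null 2>&1; then
    # Why is this job still waiting?
    qstat -j "$JOB_ID"
    # Which load adjustments does the scheduler apply per dispatched job?
    qconf -ssconf | show_load_adjust
fi
```

If job_load_adjustments lists np_load_avg, each freshly started job
temporarily raises the host's apparent load, which can push it past the
queue's load_thresholds (np_load_avg=1.75 here) before all slots fill.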

Daniel

Alexandre Racine wrote:
>
> I have created this nice recursive program to test the scheduling of
> SGE, and I can't really grasp the concept of slots: I edit the default
> queue to add more processors, but I always have a maximum of 7 out of
> 10 processors used. Why?
>
> So I edit the queue...
>  #/usr/local/sge/sge-root/bin/lx24-x86/qconf -mq all.q
>
> I put 3 slots (with one the result is the same) and 10 processors on
> my servers TORQUE2 and torque3 (never mind the names here):
> qname                 all.q
> hostlist              @allhosts
> seq_no                0
> load_thresholds       np_load_avg=1.75
> [...]
> slots                 3,[TORQUE1.statgen.local=1],[TORQUE2.statgen.local=10], \
>                       [torque3.statgen.local=10]
>
>
> Then I launch my recursive program to test the queue, and it never goes
> over 7 tasks on TORQUE2. Does the number of slots affect anything? The
> results are the same if there is only 1 slot.
>
>
> queuename                      qtype used/tot. load_avg arch          states
> ----------------------------------------------------------------------------
> all.q@TORQUE1.statgen.local       BIP   1/1       0.00     lx24-x86
>  109777 0.55500 LanceRecur sgeadmin     r     07/30/2007 15:38:58     1
> ----------------------------------------------------------------------------
> all.q@TORQUE2.statgen.local       BIP   7/10      0.00     lx24-x86
>  109778 0.55500 LanceRecur sgeadmin     r     07/30/2007 15:38:58     1
>  109780 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>  109781 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>  109783 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>  109784 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>  109786 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>  109787 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
> ----------------------------------------------------------------------------
> all.q@torque3.statgen.local       BIP   4/10      0.14     lx24-x86
>  109779 0.55500 LanceRecur sgeadmin     r     07/30/2007 15:38:58     1
>  109782 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>  109785 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>  109788 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>  109789 0.55500 LanceRecur sgeadmin     qw    07/30/2007 15:38:45     1
>  109790 0.55500 LanceRecur sgeadmin     qw    07/30/2007 15:38:45     1
>
>
>
> Thanks.
>
> Alexandre Racine
> Special Projects
>
