[GE users] slots and more - what are they?

Reuti reuti at staff.uni-marburg.de
Wed Aug 1 12:38:25 BST 2007



On 31.07.2007 at 16:37, Alexandre Racine wrote:

> Here are the results of those commands.
>
> $ /usr/local/sge/sge-root/bin/lx24-x86/qstat -f
> queuename                      qtype used/tot. load_avg arch          states
> ----------------------------------------------------------------------------
> all.q@TORQUE1.statgen.local    BIP   0/1       0.00     lx24-x86
> ----------------------------------------------------------------------------
> all.q@TORQUE2.statgen.local    BIP   0/10      0.00     lx24-x86
> ----------------------------------------------------------------------------
> all.q@torque3.statgen.local    BIP   0/10      0.00     lx24-x86
>
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>  109828 0.00000 LanceRecur sgeadmin     qw    07/31/2007 10:31:21     1
>  109829 0.00000 LanceRecur sgeadmin     qw    07/31/2007 10:31:21     1
>  109830 0.00000 LanceRecur sgeadmin     qw    07/31/2007 10:31:21     1
>
> $ /usr/local/sge/sge-root/bin/lx24-x86/qstat -j 109828
> ==============================================================
> job_number:                 109828
> exec_file:                  job_scripts/109828
> submission_time:            Tue Jul 31 10:31:21 2007
> owner:                      sgeadmin
> uid:                        20100
> group:                      sgeadmin
> gid:                        20100
> sge_o_home:                 /home/sgeadmin
> sge_o_log_name:             sgeadmin
> sge_o_path:                 /tmp/109827.1.all.q:/usr/local/bin:/bin:/usr/bin
> sge_o_shell:                /bin/bash
> sge_o_workdir:              /home/sgeadmin/alextest/tmp
> sge_o_host:                 TORQUE1
> account:                    sge
> cwd:                        /home/sgeadmin/alextest/tmp
> path_aliases:               /tmp_mnt/ * * /
> stderr_path_list:           /home/sgeadmin/alextest/tmp
> mail_options:               abe
> mail_list:                  alexandre.racine at mhicc.org
> notify:                     FALSE
> job_name:                   LanceRecursif.sh
> stdout_path_list:           /home/sgeadmin/alextest/tmp
> jobshare:                   0
> shell_list:                 /bin/bash
> env_list:
> job_args:                   2,SGE
> script_file:                /home/sgeadmin/alextest/LanceRecursif.sh
> scheduling info:            queue instance "all.q@TORQUE1.statgen.local" dropped because it is full
>
>
> The weird part is that the queue is empty...

You mean, the job will not start in any of the queue instances at all?

-- Reuti
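
To compare the used/tot. column across queue instances without reading the listing by eye, the qstat -f output can be tallied. This is only a minimal sketch in Python: the function name slot_usage is made up for illustration, and it assumes the two-field used/tot. layout and the queue@host naming shown in the listings in this thread (other versions may print the column differently).

```python
import re

# Matches queue-instance lines such as
# "all.q@TORQUE2.statgen.local    BIP   7/10      0.00     lx24-x86".
# Job lines start with a numeric job id and contain no "@", so they are skipped.
QUEUE_LINE = re.compile(r"^(\S+@\S+)\s+\S+\s+(\d+)/(\d+)\s")

def slot_usage(qstat_text):
    """Return {queue_instance: (used, total)} parsed from `qstat -f` text."""
    usage = {}
    for line in qstat_text.splitlines():
        match = QUEUE_LINE.match(line.strip())
        if match:
            name, used, total = match.groups()
            usage[name] = (int(used), int(total))
    return usage

# Sample lines copied from the listing in the original question.
sample = """\
all.q@TORQUE1.statgen.local    BIP   1/1       0.00     lx24-x86
 109777 0.55500 LanceRecur sgeadmin     r     07/30/2007 15:38:58     1
all.q@TORQUE2.statgen.local    BIP   7/10      0.00     lx24-x86
all.q@torque3.statgen.local    BIP   4/10      0.14     lx24-x86
"""

for name, (used, total) in slot_usage(sample).items():
    print(f"{name}: {used} of {total} slots in use, {total - used} free")
```

In practice the text would come from running qstat -f and capturing its output rather than from a pasted sample.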


>
>
>
>
> Alexandre Racine
> Projets spéciaux
>
>
>
> -----Original Message-----
> From: Daniel Templeton [mailto:Dan.Templeton at Sun.COM]
> Sent: Mon 2007-07-30 15:56
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] slots and more - what are they?
>
> Alexandre,
>
> What does qstat -j tell you for the jobs which are in the qw state?
> Odds are there aren't enough resources to schedule the remaining jobs,
> possibly because of the job_load_adjustment settings.
>
> Daniel
>
> Alexandre Racine wrote:
>>
>> I have created this nice recursive program to test the scheduling of
>> SGE, and I can't really grasp the concept of slots: I edited the
>> default queue to give it more processors, but at most 7 of the 10
>> processors are ever used. Why?
>>
>> So I edited the queue...
>>  # /usr/local/sge/sge-root/bin/lx24-x86/qconf -mq all.q
>>
>> I set 3 slots (with 1 the result is the same) and 10 processors for
>> my servers TORQUE2 and torque3 (never mind the names here):
>> qname                 all.q
>> hostlist              @allhosts
>> seq_no                0
>> load_thresholds       np_load_avg=1.75
>> [...]
>> slots                 3,[TORQUE1.statgen.local=1],[TORQUE2.statgen.local=10], \
>>                       [torque3.statgen.local=10]
>>
>>
>> Then I launch my recursive program to test the queue, and it never goes
>> over 7 tasks on TORQUE2. Does the number of slots affect anything? The
>> results are the same if there is only 1 slot...
>>
>>
>> queuename                      qtype used/tot. load_avg arch          states
>> ----------------------------------------------------------------------------
>> all.q@TORQUE1.statgen.local    BIP   1/1       0.00     lx24-x86
>>  109777 0.55500 LanceRecur sgeadmin     r     07/30/2007 15:38:58     1
>> ----------------------------------------------------------------------------
>> all.q@TORQUE2.statgen.local    BIP   7/10      0.00     lx24-x86
>>  109778 0.55500 LanceRecur sgeadmin     r     07/30/2007 15:38:58     1
>>  109780 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>>  109781 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>>  109783 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>>  109784 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>>  109786 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>>  109787 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>> ----------------------------------------------------------------------------
>> all.q@torque3.statgen.local    BIP   4/10      0.14     lx24-x86
>>  109779 0.55500 LanceRecur sgeadmin     r     07/30/2007 15:38:58     1
>>  109782 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>>  109785 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>>  109788 0.55500 LanceRecur sgeadmin     t     07/30/2007 15:38:58     1
>>
>> ############################################################################
>>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
>> ############################################################################
>>  109789 0.55500 LanceRecur sgeadmin     qw    07/30/2007 15:38:45     1
>>  109790 0.55500 LanceRecur sgeadmin     qw    07/30/2007 15:38:45     1
>>
>>
>>
>> Thanks.
>>
>> Alexandre Racine
>> Projets spéciaux
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
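
On the question of why changing the leading slots value from 3 to 1 makes no difference: in the cluster queue syntax the first value is the default, and a bracketed [host=value] entry overrides it for that host. Since all three hosts in the quoted configuration have their own entry, the default is never consulted. The following is a minimal sketch in Python that only mimics that lookup on the quoted slots line; parse_slots and slots_for are hypothetical helper names, not part of SGE, and the exact-string host matching is a simplifying assumption.

```python
import re

def parse_slots(value):
    """Split a slots value into (default, {host: slots}) without contacting SGE."""
    # Drop continuation backslashes and all whitespace first.
    compact = re.sub(r"\s+", "", value.replace("\\", ""))
    default_text, _, rest = compact.partition(",")
    overrides = {host: int(count)
                 for host, count in re.findall(r"\[([^=\]]+)=(\d+)\]", rest)}
    return int(default_text), overrides

def slots_for(host, default, overrides):
    """A bracketed per-host entry wins; otherwise the leading default applies."""
    return overrides.get(host, default)

# Per-host part of the slots line quoted in the original question.
PER_HOST = ("[TORQUE1.statgen.local=1],[TORQUE2.statgen.local=10], \\\n"
            "                      [torque3.statgen.local=10]")

# Try both leading values mentioned in the question (3 and 1).
for leading in ("3", "1"):
    default, overrides = parse_slots(leading + "," + PER_HOST)
    resolved = {host: slots_for(host, default, overrides) for host in overrides}
    print(f"default={default}: {resolved}")
```

Both iterations resolve to the same per-host slot counts, which is consistent with the observation that the leading number did not change the behaviour. The limit of 7 running tasks on TORQUE2 therefore has to come from something other than the slot count.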
