[GE users] question/problem with queue assignments and PE jobs

templedf dan.templeton at sun.com
Tue May 5 16:25:51 BST 2009


Well, let's start with the obvious.  Is the standard_pe PE listed in the 
primary queue's pe_list?
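
One way to check, as a rough sketch: `qconf -sq primary` prints the queue configuration, including its pe_list line. The helper below, `pe_in_queue`, is a hypothetical name of my own and not part of SGE; it only parses that text from stdin, and it assumes the plain, unwrapped form of the pe_list line.

```shell
# pe_in_queue PE_NAME
# Reads a queue configuration (the text printed by `qconf -sq <queue>`)
# on stdin and exits 0 if PE_NAME appears in its pe_list line, 1 otherwise.
# Assumptions of this sketch: the pe_list value sits on a single line, and
# hostgroup-specific overrides such as [@group=pe_name] are not examined.
pe_in_queue() {
    awk -v pe="$1" '
        $1 == "pe_list" {
            gsub(/,/, " ")              # accept comma- or space-separated entries
            for (i = 2; i <= NF; i++)
                if ($i == pe) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

On a live cluster this would be used as `qconf -sq primary | pe_in_queue standard_pe && echo listed || echo "not listed"`. If the PE turns out to be missing, `qconf -aattr queue pe_list standard_pe primary` should add it to the queue.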

Daniel

cjf001 wrote:
> Guys -
>
> I've got a question / problem regarding PE vs non-PE jobs. Here's
> what I'm observing, in as concise a description as possible.... :)
>
> SGE6.2u2, 2 cluster queues named "primary" and "secondary" (catchy, eh ?)
> The scheduler config has been changed to sort by sequence number, and
> the primary queue has been assigned a seq_number of 50, and the
> secondary queue a seq_number of 75. The primary queue allows users
> only from the "test" group (which I'm in) - the secondary queue allows
> all users.
>
> I use this command several times to submit non-PE jobs:
>
> qsub -shell no -cwd -V -b y -p -512 -q primary@@Mech_sim,secondary@@* /appl/sun/grid_engine/site_PCSRL/scripts/start_xfdtd_701
>
> and the jobs go to the primary queue on the machines in the "Mech_sim"
> host group, as expected (and desired !). There are only 5 machines in
> the "Mech_sim" host group, so after the 20th time I submit this job
> (there are 4 cores per machine), the jobs start going to the secondary
> queue (on other machines).  Voilà - just what I expected, and just
> what I wanted.
>
> cjf001@lxdel01# qstat
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>      187 100.00000 xfdtd      cjf001       r     05/05/2009 09:51:12 primary@lxdel21.srl.css.mot.co     1
> cjf001@lxdel01#
>
>
> Now, I submit the following command (after killing all the jobs I
> previously ran) to submit a PE job:
>
> qsub -shell no -cwd -V -b y -p -512 -pe standard_pe 4 -q primary@@Mech_sim,secondary@@* /appl/sun/grid_engine/site_PCSRL/scripts/start_xfdtd_701 4
>
> (all the same except for the "-pe standard_pe 4" addition). This
> job gets assigned to the secondary queue on one of the non-Mech_sim
> machines --
>
> cjf001@lxdel01# qstat
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>      186 100.00000 xfdtd      cjf001       r     05/05/2009 09:48:42 secondary@lxdel13.srl.css.mot.     4
> cjf001@lxdel01#
>
> -- which is NOT expected or desired ! Why the heck doesn't it go to the
> primary queue until those slots are filled, like it does when submitting non-PE jobs ?
>
>
> Ok, I know this is a fairly detailed question, and I've probably left something
> out that's important, but for the life of me I can't figure out why SGE is
> doing this - is there something basic I'm missing, or doing wrong ?
>
>      Thanks for any ideas/advice !
>
>         John

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=190758
