[GE users] question/problem with queue assignments and PE jobs

cjf001 john.foley at motorola.com
Tue May 5 15:56:06 BST 2009


Guys -

I've got a question / problem regarding PE vs non-PE jobs. Here's
what I'm observing, described as concisely as I can... :)

SGE 6.2u2, with 2 cluster queues named "primary" and "secondary" (catchy, eh?).
The scheduler config has been changed to sort by sequence number
(queue_sort_method seqno); the primary queue has been assigned a seq_no
of 50, and the secondary queue a seq_no of 75. The primary queue allows
users only from the "test" group (which I'm in); the secondary queue
allows all users.

I use this command several times to submit non-PE jobs:

qsub -shell no -cwd -V -b y -p -512 -q primary@@Mech_sim,secondary@@* /appl/sun/grid_engine/site_PCSRL/scripts/start_xfdtd_701

and the jobs go to the primary queue on the machines in the "Mech_sim"
host group, as expected (and desired!). There are only 5 machines in
the "Mech_sim" host group with 4 cores each, so after the 20th
submission the jobs start going to the secondary queue (on other
machines). Voilà - just what I expected, and just what I wanted.
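
To make the slot arithmetic concrete, here is a toy shell sketch of the spill-over described above. It is not how the SGE scheduler is implemented; it only assumes one slot per core and that queues fill in sequence-number order while earlier jobs are still running. The function name expected_queue is made up for illustration.

```shell
#!/bin/sh
# Toy model only: reproduces the arithmetic from the post, not SGE internals.
HOSTS_IN_MECH_SIM=5   # machines in the "Mech_sim" host group (from the post)
SLOTS_PER_HOST=4      # assumption: one slot per core, 4 cores per machine

primary_slots=$((HOSTS_IN_MECH_SIM * SLOTS_PER_HOST))

# expected_queue N: print the queue the Nth single-slot job should land in,
# assuming all earlier jobs are still running and primary fills first.
expected_queue() {
    if [ "$1" -le "$primary_slots" ]; then
        echo primary
    else
        echo secondary
    fi
}

expected_queue 20   # last job that still fits in primary
expected_queue 21   # first job that spills over to secondary
```

With these numbers the primary queue holds 20 single-slot jobs, so the 21st is the first one expected in secondary.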

cjf001@lxdel01# qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
     187 100.00000 xfdtd      cjf001       r     05/05/2009 09:51:12 primary@lxdel21.srl.css.mot.co     1
cjf001@lxdel01#


Now, I submit the following command (after killing all the jobs I
previously ran) to submit a PE job:

qsub -shell no -cwd -V -b y -p -512 -pe standard_pe 4 -q primary@@Mech_sim,secondary@@* /appl/sun/grid_engine/site_PCSRL/scripts/start_xfdtd_701 4

(all the same except for the "-pe standard_pe 4" addition). This
job gets assigned to the secondary queue on one of the non-Mech_sim
machines --

cjf001@lxdel01# qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
     186 100.00000 xfdtd      cjf001       r     05/05/2009 09:48:42 secondary@lxdel13.srl.css.mot.     4
cjf001@lxdel01#

-- which is NOT expected or desired! Why the heck doesn't it go to the
primary queue until those slots are filled, like it does when
submitting non-PE jobs?


OK, I know this is a fairly detailed question, and I've probably left something
out that's important, but for the life of me I can't figure out why SGE is
doing this - is there something basic I'm missing, or doing wrong?
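
In case it helps anyone compare settings, below is a rough sketch of how the configuration that might matter here could be dumped. It assumes qconf is on the PATH of an SGE submit or admin host. The wrapper name show_pe_queue_config is made up for illustration, and I am only guessing that these are the relevant settings.

```shell
#!/bin/sh
# Sketch only: prints settings that might matter for PE scheduling.
# Queue and PE names are passed in; nothing here changes the configuration.
show_pe_queue_config() {
    queue=$1
    pe=$2
    if ! command -v qconf >/dev/null 2>&1; then
        echo "qconf not found; run this on an SGE submit or admin host" >&2
        return 1
    fi
    # Scheduler-wide queue sort order (expecting "seqno" in this setup).
    qconf -ssconf | grep queue_sort_method
    # Per-queue sequence number, slot count, attached PEs and access lists.
    qconf -sq "$queue" | grep -E '^(seq_no|slots|pe_list|user_lists)'
    # The PE definition itself, including its allocation_rule.
    qconf -sp "$pe"
}

# Example invocations, using the names from this post:
#   show_pe_queue_config primary standard_pe
#   show_pe_queue_config secondary standard_pe
```

Comparing the output for both queues side by side should show whether they differ in anything besides the sequence number and the user list.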

     Thanks for any ideas/advice !

        John



-- 
###########################################################################
# John Foley                          # Location:  IL93-E1-21S            #
# IT & Systems Administration         # Maildrop:  IL93-E1-35O            #
# Antenna & Mechanical Simulation Grp #    Email: john.foley at motorola.com #
# Motorola, Inc. -  Mobile Devices    #    Phone: (847) 523-8719          #
# 600 North US Highway 45             #      Fax: (847) 523-5767          #
# Libertyville, IL. 60048  (USA)      #     Cell: (847) 460-8719          #
###########################################################################
                 (this email sent using Mozilla on Windows)
