[GE users] Priority Queues Not Working

dagru d.gruber at sun.com
Thu May 13 18:34:13 BST 2010


In SGE 6.2u5, slotwise subordination works only for serial jobs.
For parallel jobs the slave tasks would be signalled, and a parallel
job on a queue instance is counted as 1 slot regardless of the
number of slots it actually uses.
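
As an illustration (my understanding of the 6.2u5 accounting, not
verified against the source): with

  subordinate_list      slots=8(second.q:0:lr)

on first.q, a host running an 8-slot MPI job in second.q plus one
serial job in first.q is counted as 1 + 1 = 2 occupied slots on that
queue instance, well under the threshold of 8, so nothing is
suspended even though the host's 8 cores are already busy.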

Daniel

On Thursday, 13.05.2010 at 07:35 -0700, gracklewolf wrote:
> > On 06.05.2010 at 22:25, gracklewolf wrote:
> > 
> > > Yes, I believe it is tightly integrated.  We use OpenMPI 1.4, and our "orte" parallel environment is configured as follows:
> > 
> > Was OpenMPI compiled with SGE support?
> 
> Yes.  
> $ ompi_info | grep gridengine
>                  MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.4)
> 
> > 
> > What behavior do you notice, in detail? Your description was a little bit vague.
> 
> We've got a cluster of 50 servers, each with 8 cores, and the following three queues set up in slotwise subordination:
> 
> qname                 first.q
> hostlist              @nodes
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:01:00
> priority              0
> min_cpu_interval      00:01:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               make orte mpi
> rerun                 TRUE
> slots                 8
> tmpdir                /tmp
> shell                 /bin/csh
> prolog                NONE
> epilog                NONE
> shell_start_mode      posix_compliant
> starter_method        NONE
> suspend_method        SIGTSTP
> resume_method         SIGCONT
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            ezdockusers
> xuser_lists           NONE
> subordinate_list      slots=8(second.q:0:lr)
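> 
> (If I read the queue_conf(5) syntax right, slots=8(second.q:0:lr)
> means: once more than 8 slots are in use on a host, suspend jobs in
> second.q, longest-running first; the :sr variant on second.q below
> suspends the shortest-running job in third.q instead.)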
> 
> qname                 second.q
> hostlist              @nodes
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:01:00
> priority              0
> min_cpu_interval      00:01:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               basic make mpi orte
> rerun                 FALSE
> slots                 8
> tmpdir                /tmp
> shell                 /bin/csh
> shell_start_mode      posix_compliant
> starter_method        NONE
> suspend_method        SIGTSTP
> resume_method         SIGCONT
> terminate_method      NONE
> notify                00:00:60
> subordinate_list      slots=8(third.q:1:sr)
> 
> qname                 third.q
> hostlist              @nodes
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              0
> suspend_interval      00:01:00
> priority              0
> min_cpu_interval      00:01:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               make basic orte mpi
> rerun                 TRUE
> slots                 8
> tmpdir                /tmp
> shell                 /bin/csh
> shell_start_mode      posix_compliant
> starter_method        NONE
> suspend_method        SIGTSTP
> resume_method         SIGCONT
> terminate_method      NONE
> notify                00:00:60
> 
> 
> If we submit several 60-slot OpenMPI jobs (-pe orte) to third.q, enough that five or more jobs use up all the slots in the cluster, and then submit any job to first.q or second.q, none of the OpenMPI jobs are suspended.
> 
> I've done some testing, and it appears that while SGE treats the OpenMPI job as taking 60 slots, it only counts as 1 with respect to the slotwise subordination, and so the OpenMPI job is not suspended.  As a result, our nodes end up oversubscribed.
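> 
> A quick way to reproduce (the job script names are placeholders):
> 
> $ qsub -q third.q -pe orte 60 mpi_job.sh   # repeat until the 400 slots are full
> $ qsub -q first.q serial_job.sh
> $ qstat -f                                 # the MPI jobs stay in state 'r', never 'S'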
> 
> 
> > 
> > -- Reuti
> > 
> > PS: Please quote the original mail to which you reply; there is a button for it.  It's not easy to keep track when you follow more than one thread.
> > 
> > 
> > > 
> > > pe_name            orte
> > > slots              999
> > > user_lists         NONE
> > > xuser_lists        NONE
> > > start_proc_args    /bin/true
> > > stop_proc_args     /bin/true
> > > allocation_rule    $fill_up
> > > control_slaves     TRUE
> > > job_is_first_task  FALSE
> > > urgency_slots      min
> > > accounting_summary TRUE
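> > > 
> > > (My understanding: control_slaves TRUE lets SGE start and signal the
> > > slave tasks itself via qrsh -inherit, and job_is_first_task FALSE
> > > means the job script is not counted as one of the parallel tasks;
> > > these are the usual settings for a tight OpenMPI integration.  The
> > > job script then only needs, e.g.,
> > > 
> > > $ mpirun -np $NSLOTS ./my_mpi_app
> > > 
> > > where my_mpi_app is a placeholder; SGE-aware OpenMPI reads the
> > > granted host list from the PE_HOSTFILE itself.)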
> > > 
> > > 
> > > SGE appears to be aware of how many slots are taken by each OpenMPI job.
> > > 
