[GE users] Queue subordination and custom complexes

Reuti reuti at staff.uni-marburg.de
Tue Apr 1 22:08:28 BST 2008


On 01.04.2008 at 22:45, Roberta Gigon wrote:
> Hi,
>
> Perhaps this will help clarify:
>
> I submit a job to webmi_low.q requesting both slots using the  
> whole_node pe.  It runs.
>
> [root@bear ~]$ qsub -q webmi_low.q@bear1.cl.slb.com -pe whole_node 2 /opt/sge/examples/jobs/simple2.sh
> Your job 3976 ("simple2.sh") has been submitted
> [root@bear ~]$ qstat
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>    3976 0.55500 simple2.sh root         r     04/01/2008 16:31:10 webmi_low.q@bear1.cl.slb.com       2
>
> Then, I submit another job requesting both slots.  It goes into qw  
> mode as expected.  This is good!
>
> [root@bear ~]$ qsub -q webmi_low.q@bear1.cl.slb.com -pe whole_node 2 /opt/sge/examples/jobs/simple2.sh
> Your job 3977 ("simple2.sh") has been submitted
> [root@bear ~]$ qstat
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>    3976 0.55500 simple2.sh root         r     04/01/2008 16:31:10 webmi_low.q@bear1.cl.slb.com       2
>    3977 0.00000 simple2.sh root         qw    04/01/2008 16:31:21                                    2
>
> Here is where things go badly:  I submit a job into nuclear_hi.q  
> which is supposed to suspend jobs in webmi_low.q.  Instead, it goes  
> into "qw".
>
> [root@bear ~]$ qsub -q nuclear_hi.q@bear1.cl.slb.com /opt/sge/examples/jobs/simple2.sh
> Your job 3979 ("simple2.sh") has been submitted
> [root@bear ~]$ qstat
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>    3976 0.60500 simple2.sh root         r     04/01/2008 16:31:10 webmi_low.q@bear1.cl.slb.com       2
>    3977 0.60500 simple2.sh root         qw    04/01/2008 16:31:21                                    2
>    3979 0.50500 simple2.sh root         qw    04/01/2008 16:37:43                                    1

Aha, so it never starts at all. Can you check with "qstat -j 3979"  
for the reason?
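
(In case "qstat -j" prints no scheduling messages at all: the scheduler
only records the reason while "schedd_job_info" is enabled. A minimal
sketch of how that is typically switched on, assuming stock SGE tools:

    qconf -msconf      # opens the scheduler configuration in an editor;
                       # set:  schedd_job_info   true
    qstat -j 3979      # the "scheduling info:" lines should then name
                       # what keeps the job in "qw"

Admins sometimes leave schedd_job_info off because it adds overhead on
busy clusters, so it may need to be re-enabled first.)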

-- Reuti


>
> However...
> If I submit two jobs without the -pe flag:
>
> [root@bear ~]$ qsub -q webmi_low.q@bear1.cl.slb.com /opt/sge/examples/jobs/simple2.sh
> Your job 3980 ("simple2.sh") has been submitted
> [root@bear ~]$ qsub -q webmi_low.q@bear1.cl.slb.com /opt/sge/examples/jobs/simple2.sh
> Your job 3981 ("simple2.sh") has been submitted
> [root@bear ~]$ qstat
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>    3980 0.55500 simple2.sh root         r     04/01/2008 16:40:55 webmi_low.q@bear1.cl.slb.com       1
>    3981 0.55500 simple2.sh root         r     04/01/2008 16:40:55 webmi_low.q@bear1.cl.slb.com       1
>
> And then submit a job into the other queue, the subordination works.
>
> [root@bear ~]$ qsub -q nuclear_hi.q@bear1.cl.slb.com /opt/sge/examples/jobs/simple2.sh
> Your job 3982 ("simple2.sh") has been submitted
>
> [root@bear ~]$ qstat
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>    3982 0.55500 simple2.sh root         r     04/01/2008 16:42:35 nuclear_hi.q@bear1.cl.slb.com      1
>    3980 0.55500 simple2.sh root         S     04/01/2008 16:40:55 webmi_low.q@bear1.cl.slb.com       1
>    3981 0.55500 simple2.sh root         S     04/01/2008 16:40:55 webmi_low.q@bear1.cl.slb.com       1
>
>
> Any help you can provide is greatly appreciated!
>
>
> Roberta M. Gigon
> Schlumberger-Doll Research
> One Hampshire Street, MD-B253
> Cambridge, MA 02139
> 617.768.2099 - phone
> 617.768.2381 - fax
>
> This message is considered Schlumberger CONFIDENTIAL.  Please treat  
> the information contained herein accordingly.
>
>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Tuesday, April 01, 2008 4:04 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Queue subordination and custom complexes
>
> On 01.04.2008 at 19:49, Roberta Gigon wrote:
>> I tried this and what I discovered is when I submit a job into
>> queue1 with the -pe flag giving me exclusive use of both slots and
>> then submit another job (with or without the -pe flag) into queue2,
>> the job in queue1 never gets suspended.
>>
>> If, alternatively, I submit two independent jobs into queue1 and
>> then submit a job into queue2, the job suspension works as expected.
>
> What do you mean in detail: the state in qstat does not change to
> suspended, or the PE application is not being suspended according to
> top and/or ps?
>
> -- Reuti
>
>
>> Any ideas what is going on here?
>>
>> Thanks,
>> Roberta
>>
>>
>> -----Original Message-----
>> From: Reuti [mailto:reuti at Staff.Uni-Marburg.DE]
>> Sent: Tuesday, April 01, 2008 5:56 AM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Queue subordination and custom complexes
>>
>> http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=24049
>>
>> On 31.03.2008 at 21:56, Roberta Gigon wrote:
>>> I have a similar situation and am running into difficulties.
>>>
>>> I have queue1 consisting of nodes with two processors.
>>> I have queue2 consisting of the same nodes, but this queue is
>>> subordinate to queue1.
>>>
>>> I have User A who wants both processors on a node or none at all
>>> and submits into queue2.
>>> I have User B who wants only one processor per job and submits into
>>> queue1.
>>>
>>> So... I have User A submit into queue2 using a PE I set up
>>> (whole_node).  His job runs and does indeed take up both slots in
>>> that queue.  When User B submits into queue1, his job also runs.
>>> However, the behavior we are looking for is that User A's job gets
>>> suspended and User B's runs.
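>>>
>>> (For illustration only -- the actual whole_node PE isn't shown in this
>>> thread, but a PE that hands a job all slots of a single node typically
>>> looks something like the following ("qconf -sp whole_node"); the slot
>>> count and dummy start/stop procedures are just placeholders:
>>>
>>>    pe_name            whole_node
>>>    slots              999
>>>    user_lists         NONE
>>>    xuser_lists        NONE
>>>    start_proc_args    /bin/true
>>>    stop_proc_args     /bin/true
>>>    allocation_rule    $pe_slots
>>>    control_slaves     FALSE
>>>    job_is_first_task  TRUE
>>>
>>> "allocation_rule $pe_slots" is what keeps all requested slots on one
>>> host, and the PE also has to appear in the queue's pe_list.)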
>>>
>>> Next, I tried this: I set up a consumable complex called bearprocs
>>> and set it to 2 on each host.  Then I had User A submit into queue2
>>> using -l bearprocs=2.  This worked fine and gave User A exclusive
>>> use of both processors on the node.  However, now when User B
>>> submits into queue1, the job remains pending and does not suspend
>>> User A's job, presumably because the scheduler checks for the
>>> availability of the consumable bearprocs before looking at
>>> subordination.
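>>>
>>> (For reference, a host consumable like this is usually set up along
>>> these lines; the shortcut "bp" and the exact values are only a sketch
>>> of the common pattern, not the actual configuration used here:
>>>
>>>    # qconf -mc  -- add a row to the complex list:
>>>    #name      shortcut  type  relop  requestable  consumable  default  urgency
>>>    bearprocs  bp        INT   <=     YES          YES         0        0
>>>
>>>    # qconf -me <hostname>  -- and on each execution host:
>>>    complex_values    bearprocs=2
>>>
>>> Because the consumable is booked per host, a job holding bearprocs=2
>>> leaves nothing for jobs from the other queue on that host.)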
>>>
>>> I see the suggestion below from Reuti to attach the complex to the
>>> queue.  Will this solve my problem as well?  If so, do I need to
>>> add it to both queue1 and queue2?  If so, how should User B submit
>>> their job -- -l bearprocs=1?  No -l option?
>>>
>>> Thanks,
>>> Roberta
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>>> Sent: Monday, March 31, 2008 1:37 PM
>>> To: users at gridengine.sunsource.net
>>> Subject: Re: [GE users] Queue subordination and custom complexes
>>>
>>> Hi,
>>>
>>> On 31.03.2008 at 18:46, David Olbersen wrote:
>>>> I have the following configuration in my lab cluster:
>>>>
>>>> Q1 runs on machines #1, #2, and #3.
>>>> Q2 runs on the same machines.
>>>> Q2 is configured to have Q1 as a subordinate.
>>>> All machines have 2GB of RAM.
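>>>>
>>>> (For reference, that subordination is a single queue attribute; a
>>>> minimal sketch, assuming the standard syntax:
>>>>
>>>>    # qconf -mq Q2  -- in Q2's configuration:
>>>>    subordinate_list    Q1=1
>>>>
>>>> With the "=1" threshold, Q1 is suspended on a host as soon as one slot
>>>> of Q2 is occupied there; without a threshold, Q1 is only suspended
>>>> once all of Q2's slots on that host are in use.)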
>>>>
>>>> If I submit 3 jobs to Q1 and 3 to Q2, the expected results are
>>>> given: jobs start in Q1 (submitted first) then get suspended while
>>>> jobs in Q2 run.
>>>>
>>>> Awesome.
>>>>
>>>> Next I try specifying hard resource requirements by adding "-hard -
>>>> l mem_free=1.5G" to each job. This still ends up working out,
>>>> probably because the jobs don't actually consume 1.5G of memory.
>>>> The jobs are simple things that drive up CPU utilization by dd'ing
>>>> from /dev/urandom out to /dev/null.
>>>>
>>>> Next, to further replicate my production environment I add a custom
>>>> complex named "cores" that gets set on a per-host basis to the
>>>> number of CPUs the machine has. Please note that we're not using
>>>> "num_proc" because we want some jobs to use fractions of a CPU and
>>>> num_proc is an INT.
>>>>
>>>> So each job will take up 1 "core" and each host has 1 "core".
>>>> With this setup the jobs in Q1 run, and the jobs in Q2 wait. No
>>>> suspension happens at all. Is this because the host resource is
>>>> actually being consumed? Is there any way to get around this?
>>>
>>> Yes, you can check the remaining amount of this complex with
>>> "qhost -F cores", or also per job with "qstat -j <jobid>" (when
>>> "schedd_job_info true" is set in the scheduler configuration). Be
>>> aware that only complete queues can be suspended, not just some of
>>> their slots.
>>>
>>> What you can do: attach the resource to the queues, not to the host.
>>> Then every queue supplies the specified amount per node on its own.
>>>
>>> (Sidenote: to avoid requesting the resource all the time and
>>> specifying the correct queue in addition, you could also have two
>>> resources, cores1 and cores2. Attach cores1 to Q1 and cores2 to Q2;
>>> "qsub -l cores2=1" will then also select the Q2 queue.)
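>>>
>>> (A minimal sketch of that idea, with made-up slot counts and the
>>> example job from this thread; cores1/cores2 still have to be defined
>>> as consumables via "qconf -mc" first:
>>>
>>>    # qconf -mq Q1:   complex_values   cores1=2
>>>    # qconf -mq Q2:   complex_values   cores2=2
>>>
>>>    qsub -l cores1=1 simple2.sh    # can only be satisfied by Q1
>>>    qsub -l cores2=1 simple2.sh    # can only be satisfied by Q2
>>>
>>> Since each queue now books its own consumable, a full Q1 no longer
>>> blocks Q2 from starting jobs, and the subordination can trigger.)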
>>>
>>> -- Reuti
>>
>>
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



