[GE users] Queue subordination and custom complexes

Reuti reuti at Staff.Uni-Marburg.DE
Tue Apr 1 10:55:46 BST 2008


http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=24049

On 31.03.2008 at 21:56, Roberta Gigon wrote:
> I have a similar situation and am running into difficulties.
>
> I have queue1 consisting of nodes with two processors.
> I have queue2 consisting of the same nodes, but this queue is  
> subordinate to queue1.
>
> I have User A who wants both processors on a node or none at all  
> and submits into queue2.
> I have User B who wants only one processor per job and submits into  
> queue1.
>
> So... I have User A submit into queue2 using a PE I set up  
> (whole_node).  His job runs and does indeed take up both slots in  
> that queue.  When User B submits into queue1, his job also runs.   
> However, the behavior we are looking for is that User A's job  
> should suspend and User B's should run.
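
By default a subordinated queue is suspended only once all slots of the
superior queue on that host are in use, so with 2 slots in queue1 a
single job from User B would not trigger the suspension. A minimal
sketch of a slot threshold (assuming queue1 has 2 slots per host) that
suspends queue2 as soon as one queue1 slot is busy:

  # in "qconf -mq queue1":
  subordinate_list      queue2=1   # suspend queue2 once 1 slot is busy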
>
> Next, I tried this: I set up a consumable complex called bearprocs  
> and set it to 2 on each host.  Then I had User A submit into queue2  
> using -l bearprocs=2.  This worked fine and gave User A exclusive  
> use of both processors on the node.  However, now when User B  
> submits into queue1, the job remains pending and does not suspend  
> User A's job, presumably because the scheduler checks for the  
> availability of the consumable bearprocs before looking at  
> subordination.
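
For reference, that setup would look roughly like this (a sketch;
"node01" and "job_a.sh" are placeholders):

  # one line in "qconf -mc" defines the consumable:
  #name       shortcut  type  relop  requestable  consumable  default  urgency
  bearprocs   bp        INT   <=     YES          YES         0        0

  # attach 2 units to each execution host:
  qconf -me node01      # set: complex_values bearprocs=2

  # User A then requests both units:
  qsub -q queue2 -l bearprocs=2 job_a.sh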
>
> I see the suggestion below from Reuti to attach the complex to the  
> queue.  Will this solve my problem as well?  If so, do I need to  
> add it to both queue1 and queue2?  And how should User B then  
> submit their job: with -l bearprocs=1, or with no -l option at all?
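
For comparison, attaching the consumable at queue level is the same
complex_values attribute, only in the queue configuration (a sketch;
whether it also fixes the subordination interaction is the open
question above):

  qconf -mq queue1      # set: complex_values bearprocs=2
  qconf -mq queue2      # set: complex_values bearprocs=2

  # each queue then accounts its own 2 units per host; User B would
  # request one of queue1's units explicitly:
  qsub -q queue1 -l bearprocs=1 job_b.sh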
>
> Thanks,
> Roberta
>
>
> ---------------------------------------------------------------------
> Roberta M. Gigon
> Schlumberger-Doll Research
> One Hampshire Street, MD-B253
> Cambridge, MA 02139
> 617.768.2099 - phone
> 617.768.2381 - fax
>
>
>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Monday, March 31, 2008 1:37 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Queue subordination and custom complexes
>
> Hi,
>
> On 31.03.2008 at 18:46, David Olbersen wrote:
>> I have the following configuration in my lab cluster:
>>
>> Q1 runs on machines #1, #2, and #3.
>> Q2 runs on the same machines.
>> Q2 is configured to have Q1 as a subordinate.
>> All machines have 2GB of RAM.
>>
>> If I submit 3 jobs to Q1 and 3 to Q2, I get the expected result:
>> jobs start in Q1 (they were submitted first) and then get suspended
>> while the jobs in Q2 run.
>>
>> Awesome.
>>
>> Next I try specifying hard resource requirements by adding "-hard -
>> l mem_free=1.5G" to each job. This still ends up working out,
>> probably because the jobs don't actually consume 1.5G of memory.
>> The jobs are simple things that drive up CPU utilization by dd'ing
>> from /dev/urandom out to /dev/null.
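
A sketch of such a test job ("burn.sh" is a made-up name):

  $ cat burn.sh
  #!/bin/sh
  # pure CPU load, negligible memory use
  dd if=/dev/urandom of=/dev/null

  $ qsub -q Q1 -hard -l mem_free=1.5G burn.sh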
>>
>> Next, to further replicate my production environment I add a custom
>> complex named "cores" that gets set on a per-host basis to the
>> number of CPUs the machine has. Please note that we're not using
>> "num_proc" because we want some jobs to use fractions of a CPU and
>> num_proc is an INT.
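
A sketch of such a definition; DOUBLE allows fractional requests where
the built-in num_proc (an INT) would not ("machine1" is a placeholder):

  # one line in "qconf -mc":
  #name   shortcut  type    relop  requestable  consumable  default  urgency
  cores   cores     DOUBLE  <=     YES          YES         0        0

  # per execution host, e.g. a 2-CPU machine:
  qconf -me machine1    # set: complex_values cores=2

  # a job that only needs half a CPU:
  qsub -l cores=0.5 job.sh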
>>
>> So each job requests, and takes up, 1 "core".
>> With this setup the jobs in Q1 run, and the jobs in Q2 wait. No
>> suspension happens at all. Is this because the host resource is
>> actually being consumed? Is there any way to get around this?
>
> Yes, you can check the remaining amount of this complex with "qhost -
> F cores", or per job with "qstat -j <jobid>" (when "schedd_job_info
> true" is set in the scheduler configuration). Be aware that only
> complete queues can be suspended, not just some of their slots.
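
Spelled out (a sketch):

  qhost -F cores          # remaining "cores" per host
  qconf -msconf           # set: schedd_job_info true
  qstat -j <jobid>        # then lists why the job is still pending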
>
> What you can do: attach the resource to the queues, not to the host.
> Then each queue supplies the specified amount per node on its own.
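
Roughly, reusing the names from above (a sketch):

  # remove "cores" from the hosts ...
  qconf -me machine1      # delete cores from complex_values

  # ... and let each queue provide it per node instead:
  qconf -mq Q1            # set: complex_values cores=2
  qconf -mq Q2            # set: complex_values cores=2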
>
> (Side note: to avoid having to request the resource and specify the
> correct queue every time, you could also define two resources, cores1
> and cores2, and attach cores1 to Q1 and cores2 to Q2. "qsub -l
> cores2=1" will then automatically select the Q2 queue.)
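
A sketch of that variant:

  qconf -mq Q1            # set: complex_values cores1=2
  qconf -mq Q2            # set: complex_values cores2=2

  # only Q2 offers cores2, so the request alone selects the queue:
  qsub -l cores2=1 job.sh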
>
> -- Reuti


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



