[GE users] post task command

reuti reuti at staff.uni-marburg.de
Fri Nov 14 00:07:18 GMT 2008


Am 14.11.2008 um 00:53 schrieb jpolk:

> mmmm, no...lol, I don't think that's what I want to do, lol....
>
> perhaps I should have said: "...reassigning its machines to another  
> pool"...
>
> It did dawn on me...that I think I can achieve the goal...by assigning
> "priority"...

Aha, in this case I would suggest looking into the urgency policy,  
http://www.sun.com/blueprints/1005/819-4325.pdf on page 9 ff. You  
could then attach boolean complexes to the machines in pool B and  
pool C respectively.
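
A minimal sketch of what this could look like (the names "pool_b"/  
"pool_c", the shortcuts and the urgency values are just placeholders  
here; you would edit them with `qconf -mc` resp. `qconf -me <host>`,  
see `man complex`):

$ qconf -sc
#name    shortcut  type  relop  requestable  consumable  default  urgency
pool_b   pb        BOOL  ==     YES          NO          0        1000
pool_c   pc        BOOL  ==     YES          NO          0        500

$ qconf -se node-b01
...
complex_values pool_b=TRUE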

Instead of submitting into a specific queue or requesting nodes by  
name, you can submit plain jobs without any resource requests; for  
pool B you just need "qsub -l pool_b ..." which will apply the  
urgency in addition to limiting the job to the nodes with this  
complex attached. In fact, in this setup you only need one queue, as  
you can attach the complexes to hosts/hostgroups within one and the  
same queue.
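
For the single-queue variant the attachment could e.g. be done per  
hostgroup directly in the queue definition (the queue name "all.q"  
and the hostgroups "@pool_b_hgrp"/"@pool_c_hgrp" are assumptions  
here):

$ qconf -sq all.q
...
complex_values NONE,[@pool_b_hgrp=pool_b=TRUE],[@pool_c_hgrp=pool_c=TRUE]

A plain "qsub -l pool_b script.sh" will then get the urgency boost  
and at the same time be restricted to the pool B hosts.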

-- Reuti


> Job A running in Pool-A, where all 30 machines are assigned....default
> priority
> Job B running in Pool-B, where 3 machines are assigned; these 3 are a
>     subset of Pool-A, but are not used by Job-A because this job has a
>     higher priority.  When Job-B finishes in the middle of the night,
>     Job-A will start to use them.
>
> Thanks!
> -Jim
>
>
>
> reuti wrote:
>> Am 13.11.2008 um 20:44 schrieb jpolk:
>>
>>
>>> hmmm....let me research that a bit to see if it will work...
>>>
>>> Ideally, what I need is a "post command" that once a job-task#
>>> is finished it issue a "qalter" command, reassigning its machines
>>> to another job-task#...
>>>
>>
>> But selecting a new job for an idling machine is the duty of SGE. You
>> would like to bypass SGE's scheduler and decide on your own which job
>> should run where? Do the job_tasks depend on each other?
>>
>> -- Reuti
>>
>>
>>
>>> Thanks!
>>> -Jim
>>>
>>>
>>> reuti wrote:
>>>
>>>> Am 13.11.2008 um 02:15 schrieb jpolk:
>>>>
>>>>
>>>>
>>>>> Hi Reuti,....Thanks for your response....
>>>>>
>>>>> hmm,..I will search for more information on a queue "epilogue"....
>>>>> (what a great term, btw)
>>>>>
>>>>> As far as our setup,...
>>>>>
>>>>>
>>>>>
>>>>>> Did I get your setup correct: you have e.g. 16 machines in pool A,
>>>>>> 8 in pool B and 4 in pool C, while jobs are only allowed to start
>>>>>> in pool B or C during the day. During the night, after they are
>>>>>> drained, you would like to have a setup with 28 nodes just in
>>>>>> pool A?
>>>>>>
>>>>>>
>>>>> That's close....we have a simple setup really....but in the  
>>>>> example
>>>>> described, all the pools are available/dedicated for rendering,
>>>>> so in the beginning, all pools are running concurrently.
>>>>> Overnight, "B" and "C" run dry,etc...
>>>>>
>>>>>
>>>> This sounds more like a setup for a calendar. In the general queue
>>>> you will have to define:
>>>>
>>>> $ qconf -sq pool_a.q
>>>> ...
>>>> calendar NONE,[@pool_b_hgrp=on_during_night],
>>>> [@pool_c_hgrp=on_during_night]
>>>>
>>>> For the two other queues:
>>>>
>>>> $ qconf -sq pool_b.q
>>>> ...
>>>> calendar off_during_night
>>>>
>>>> =====================
>>>>
>>>> Calendars you can define with e.g. `qconf -acal on_during_night`;
>>>> see `man calendar_conf` for some examples. You could also name the
>>>> other one "on_during_day" if you prefer.
>>>>
>>>> =====================
>>>>
>>>> To avoid oversubscription of the nodes, as there might still be some
>>>> jobs in pool B/C during the state change (until they are drained),
>>>> you will either need to limit the total number of slots per exechost
>>>> with "complex_values slots=8", or define a resource quota (RQS) for
>>>> it.
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>
>>>>
>>>>>> Are these 16+8+4 machines bound to dedicated nodes, or is it more
>>>>>> like having this amount of slots for each job type?
>>>>>>
>>>>>>
>>>>> mmm, I believe the former....each of our rendering machines has 8
>>>>> cores,
>>>>> split into 4 threads each, so two jobs can run simultaneously per
>>>>> machine
>>>>> typically, though we can modify that if need be.
>>>>>
>>>>> Thanks again!
>>>>> -Jim
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> reuti wrote:
>>>>>
>>>>>
>>>>>> Hi Jim,
>>>>>>
>>>>>> Am 12.11.2008 um 22:06 schrieb jpolk:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Just recently, we successfully implemented a strategy to  
>>>>>>> allocate
>>>>>>> all
>>>>>>> our rendering
>>>>>>> machines into three "pools",...such that jobs from Project A  
>>>>>>> only
>>>>>>> run in
>>>>>>> Pool A,
>>>>>>> and so forth.  We do this by specifying a hostname list  
>>>>>>> either at
>>>>>>> submission time
>>>>>>> with "qsub", or later with "qalter".
>>>>>>>
>>>>>>> We have a little conundrum however ;-)....
>>>>>>>
>>>>>>> During the wee hours of the night, the job(s, 'cause sometimes
>>>>>>> it's
>>>>>>> only
>>>>>>> one long job)
>>>>>>> in Pool-B and Pool-C finish and so the machines in those  
>>>>>>> pools go
>>>>>>> idle.
>>>>>>> We'd like to
>>>>>>> have a way to automatically reallocate the machines from those
>>>>>>> idle
>>>>>>> pools back into
>>>>>>> Pool-A which is still running (in our example).
>>>>>>>
>>>>>>> I see that there is a way to send mail to users when a jobtask
>>>>>>> finishes.
>>>>>>> I wonder if there is a similar way to execute a command
>>>>>>> (probably a
>>>>>>> "qalter" command)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> in principle it could be done in a queue epilog, meaning to scan
>>>>>> the waiting jobs and change their resource requests. But I'm not
>>>>>> sure whether this is the easiest approach.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> when a jobtask finishes which would make the adjustment. I spent
>>>>>>> some time looking through the various syntax of "qalter", but I
>>>>>>> couldn't find anything that appeared to do the trick.
>>>>>>>
>>>>>>> Does anybody have any experience with this? or might suggest
>>>>>>> another
>>>>>>> alternative method?
>>>>>>> Any and all input very much welcomed.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Did I get your setup correct: you have e.g. 16 machines in pool A,
>>>>>> 8 in pool B and 4 in pool C, while jobs are only allowed to start
>>>>>> in pool B or C during the day. During the night, after they are
>>>>>> drained, you would like to have a setup with 28 nodes just in
>>>>>> pool A?
>>>>>>
>>>>>> Are these 16+8+4 machines bound to dedicated nodes, or is it more
>>>>>> like having this amount of slots for each job type?
>>>>>>
>>>>>> -- Reuti
>>>>>>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88714

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


