[GE users] question on slotwise preemption configuration

malcolmdavis malcolm.davis at bms.com
Thu Jul 22 22:16:11 BST 2010


A slightly related question:

We abandoned slotwise preemption because of what sounds like the 6.2u5 bug, but before we upgrade to 6.2u6 and try again, there is a deeper configuration issue we need to resolve.  The slotwise preemption examples all assume a downward-branching tree, i.e. queue A can preempt queue B and queue C.  What we want is the inverse: we have two separate queues, A and B, with different resource requirements, and we want jobs from either one to be able to preempt jobs from queue C.

This raises two questions.  First, what is the syntax for specifying that arrangement?  Second, can it be made to work in conjunction with host-based slot limits?  We need the host-based slot limits to keep queues A and B from oversubscribing the system, but at least in 6.2u5, if C was filling all of the slots on a host, none of the A or B jobs would ever get assigned to that host, so none of the C jobs would ever get preempted.  If we played games to make sure that C could only take one slot fewer than A and B on the host, one A/B job would get dispatched to the host and one C job would get suspended, but the suspension didn't free the slot, so a second A/B job would never get dispatched.

The problem is that the preemption is designed to be a follow-on consequence of a dispatch.  I would suggest that the preemption should be part of the dispatch itself, i.e. the scheduler should realize it can dispatch a job to the host because there is a preemptable job currently running there.  And of course, preempted jobs shouldn't be counted against a host's slot limit.
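To make the difference concrete, here is a toy model in Python of the two accounting policies. It is only a sketch of the behaviour described in this message, not of the actual SGE scheduler: the `Host` class, the function names, and the default limit of 4 slots are all made up for illustration, and the only things taken from the situation above are the queue names A, B and C.

```python
from dataclasses import dataclass, field

PREEMPTING = {"A", "B"}  # queues whose jobs may preempt (names from this message)
PREEMPTABLE = "C"        # queue whose jobs may be suspended


@dataclass
class Host:
    """Hypothetical stand-in for one execution host with a host-based slot limit."""
    slot_limit: int
    running: list = field(default_factory=list)    # queue name of each running job
    suspended: list = field(default_factory=list)  # queue name of each suspended job


def suspend_one(host):
    """Suspend one running job from the preemptable queue, if there is one."""
    if PREEMPTABLE in host.running:
        host.running.remove(PREEMPTABLE)
        host.suspended.append(PREEMPTABLE)
        return True
    return False


def dispatch_follow_on(host, queue):
    """Behaviour as observed in 6.2u5: suspension follows a successful dispatch,
    and suspended jobs still count against the host slot limit."""
    if len(host.running) + len(host.suspended) >= host.slot_limit:
        return False
    host.running.append(queue)
    if queue in PREEMPTING:
        suspend_one(host)
    return True


def dispatch_with_preemption(host, queue):
    """Suggested behaviour: preemption is part of the dispatch decision,
    and suspended jobs do not count against the host slot limit."""
    if len(host.running) >= host.slot_limit:
        # The host is full: only a preempting job may proceed, and only
        # if a running C job can be suspended to make room for it.
        if queue not in PREEMPTING or not suspend_one(host):
            return False
    host.running.append(queue)
    return True


def count_dispatched(dispatch, c_jobs, ab_jobs, slot_limit=4):
    """Pre-fill a host with C jobs, offer A jobs, return how many A jobs ran."""
    host = Host(slot_limit, running=[PREEMPTABLE] * c_jobs)
    return sum(dispatch(host, "A") for _ in range(ab_jobs))
```

With a 4-slot host, `count_dispatched(dispatch_follow_on, 4, 4)` gives 0 (C fills the host, so nothing is ever preempted), and `count_dispatched(dispatch_follow_on, 3, 4)` gives 1 (one A job runs, one C job is suspended, and the host then counts as full). Under the suggested policy, `count_dispatched(dispatch_with_preemption, 4, 4)` gives 4.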

So the real question is: does 6.2u6 solve this problem?  Or will we still need to forgo preemption?

Malcolm

>-----Original Message-----
>From: cjf001 [mailto:john.foley at motorola.com]
>Sent: Thursday, July 22, 2010 4:46 PM
>To: users at gridengine.sunsource.net
>Subject: Re: [GE users] question on slotwise preemption configuration
>
>Stephen -
>
>looks like a winner ! I'll give it a try, and send you a separate email
>regarding the qmaster binary.
>
>   Thanks,
>
>      John
>
>
>stephendennis wrote:
>> Hi John
>>
>> We work around the necessity of providing the slot counts with syntax and
>> configuration like this.
>>
>> subordinate_list slots=8(low:2:sr), \
>>    [@4_slot=slots=4(low:0:sr)], \
>>    [@8_slot=slots=8(low:1:sr)]
>>
>> As you can see we set host groups for each range of slot counts.
>>
>> There is a bug in 6.2u5 which leaves jobs in suspension sometimes
>> http://gridengine.sunsource.net/issues/show_bug.cgi?id=3233
>>
>> SGE Engineering tells me that it is fixed in 6.2u6
>>
>> We have a patched version of the lx24-amd64 qmaster for version 6.2u5 which
>> addresses the issue, which I can provide if you contact me off list.
>>
>> Thanks
>> Stephen
>> ________________________________________
>> From: cjf001 [john.foley at motorola.com]
>> Sent: Thursday, July 22, 2010 3:47 PM
>> To: users at gridengine.sunsource.net
>> Subject: [GE users] question on slotwise preemption configuration
>>
>> Guys -
>>
>> I have a question regarding slotwise preemption. (I did a
>> search of the archive and only found 2 entries with this topic,
>> so I don't think this has been covered... would be nice if the
>> searcher could search the bodies of the posts, and not just the
>> subjects....)
>>
>> Anyway, I have 2 cluster queues - "primary" and "secondary".
>> "secondary" is subordinate to "primary". It was this way before
>> I upgraded to v6.2u5 from v6.2u2; it still works as before -
>> that is, if I fill the slots on a host with jobs in the secondary
>> queue, and then submit a job into the primary queue, all the
>> jobs in the secondary queue are suspended until the primary job
>> finishes. As expected.
>>
>> So, I would like to take advantage of the slotwise preemption feature.
>> After reading the page at:
>>
>>
>> http://wikis.sun.com/display/gridengine62u6/How+To+Use+Slotwise+Preemption
>>
>> and the queue_conf man page, I'm left wondering if this is going to
>> work for me, because.....
>>
>> ...the descriptions show a syntax of (for my case, of my "primary"
>> queue) :
>>
>> "subordinate_list slots=2(secondary)"
>>
>> where the "2" is the total number of slots on the host. And that's
>> my problem - my queues ("primary" and "secondary") are across all
>> my hosts - and some of them have 2 slots, and some have 4 slots, and
>> some even have 8 slots.
>>
>> So, is there a way to specify my "primary" queue config that will
>> work for all the hosts, regardless of how many slots they have ?
>>
>> (SGE knows how many slots there are on each host anyway, so I'm wondering
>> why we should even need to specify this number ?...)
>>
>> What am I missing ?
>>
>>       Thanks !
>>
>>          John
>>
>>
>>
>> --
>> ###########################################################################
>> # John Foley                          # Location:  IL93-E1-21S            #
>> # IT & Systems Administration         # Maildrop:  IL93-E1-35O            #
>> # Antenna & Mechanical Simulation Grp #    Email: john.foley at motorola.com #
>> # Motorola, Inc. -  Mobile Devices    #    Phone: (847) 523-8719          #
>> # 600 North US Highway 45             #      Fax: (847) 523-5767          #
>> # Libertyville, IL. 60048  (USA)      #     Cell: (847) 460-8719          #
>> ###########################################################################
>>                 (this email sent using SeaMonkey on Windows)
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=269789
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=269792
>
>
>
>------------------------------------------------------
>http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=269798

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=269799
