Opened 16 years ago
Last modified 10 years ago
#201 new enhancement
IZ1273: consumable not working as suspend thresholds
Reported by: | andy | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | sge | Version: | 6.0 |
Severity: | | Keywords: | scheduling |
Cc: |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=1273]
Issue #: 1273
Platform: All
Reporter: andy (andy)
Component: gridengine
OS: All
Subcomponent: scheduling
Version: 6.0
CC: reuti, uddeborg
Status: NEW
Priority: P3
Resolution:
Issue type: ENHANCEMENT
Target milestone: ---
Assigned to: andreas (andreas)
QA Contact: andreas
URL:
Summary: consumable not working as suspend thresholds
Status whiteboard:
Attachments:
Issue 1273 blocks:
Votes for issue 1273:
Opened: Tue Sep 14 03:42:00 -0700 2004
------------------------

I came across a weird behavior: in 6.0 (and 5.3p6) using consumables as load thresholds is supported, but it does not work for suspend thresholds. This makes the following setup impossible; I call it "slot preemption".

Grid Engine does not allow limiting the number of slots per host *and* using suspend on subordinate together (if the host is full, Grid Engine cannot schedule a job to it, even if doing so would suspend the "child" queue). The following setup would be an elegant workaround for this problem:

1. A consumable attribute "nslots" is defined. On the host level the total number of "nslots" is defined; it would typically be the number of CPUs on that host.

2. All jobs except those running in the "low priority queue" request the "nslots" resource:

    qsub -l nslots=1 ...

3. Jobs submitted to the "low priority queue" do not request "nslots", but the queue would be configured as follows:

    slots               <ncpus>
    load_thresholds     <whatever_is_required>
    suspend_thresholds  nslots=0

In this setup a job that is started in the higher priority queue on that host would suspend the low priority queue, with the effect that no new jobs are started and running jobs are suspended.

The scheme works nicely for "load_thresholds", but it does not work for "suspend_thresholds".
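The intended "slot preemption" behavior can be illustrated with a small toy model. This is only a sketch of how the requested setup is meant to behave, not Grid Engine code: the `Host` class and its methods are hypothetical, and the rule "the threshold trips once the free amount of nslots drops to the configured value" is an assumed reading of `nslots=0`.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy model of one execution host (hypothetical, not Grid Engine code)."""
    nslots_total: int                               # capacity of the "nslots" consumable
    high_jobs: list = field(default_factory=list)   # jobs submitted with -l nslots=1
    low_jobs: list = field(default_factory=list)    # jobs in the low priority queue

    def nslots_free(self) -> int:
        # Each high priority job debits one unit of the consumable.
        return self.nslots_total - len(self.high_jobs)

    def can_start_high(self) -> bool:
        # A job requesting nslots=1 fits only while one unit is still free.
        return self.nslots_free() >= 1

    def low_queue_suspended(self, threshold: int = 0) -> bool:
        # Assumption: the suspend threshold trips once the free amount
        # of the consumable has dropped to the configured value.
        return self.nslots_free() <= threshold


host = Host(nslots_total=2)            # e.g. a host with two CPUs
host.low_jobs += ["low-1", "low-2"]    # low priority jobs do not request nslots

states = [host.low_queue_suspended()]
for name in ("high-1", "high-2"):
    if host.can_start_high():
        host.high_jobs.append(name)
    states.append(host.low_queue_suspended())

print(states)  # suspension state before and after each high priority job
```

In this model the low priority queue keeps running until the high priority jobs have used up every unit of "nslots", and only then is it suspended. Because the low priority jobs never debit the consumable, the host can still accept high priority work while they are running.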
------- Additional comments from sgrell Thu Nov 4 03:24:18 -0700 2004 -------
The implementation should also take into account that two jobs could be started in one scheduler run, one of which would be suspended right away because of the other. That is already an issue, but with proper load and suspend threshold settings it is very unlikely to happen. We also want the suspend threshold to be evaluated whenever a job is dispatched.

Stephan

------- Additional comments from templedf Mon Dec 6 01:47:03 -0700 2004 -------
Looks easy enough to fix. The question, though, is what the correct behavior is. Here's what the /id/ command does:

    % id gidtest
    uid=60003(gidtest) gid=60003

It simply ignored the name. I don't think we have that option. I would assume that if the gid can't be resolved into a name, the gid should be the name, i.e. in the case above, "60003" would be stored as the group name. Another alternative would be to just name the group "UNKNOWN". My only issue with that option is how to tell two unknown groups apart if they're both called "UNKNOWN".

Comments? If no one voices an opinion by tomorrow, I will use the gid as the group name.

------- Additional comments from templedf Mon Dec 6 01:48:09 -0700 2004 -------
Oops! Wrong Issue!

------- Additional comments from sgrell Tue Dec 6 08:19:14 -0700 2005 -------
Changed the Subcomponent.

Stephan

------- Additional comments from sgrell Mon Dec 12 03:25:26 -0700 2005 -------
This describes an RFE.

Stephan

------- Additional comments from reuti Thu Oct 23 02:45:50 -0700 2008 -------
Adding myself as cc.
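Stephan's point about two jobs being started in the same scheduler run can be sketched with another toy model. Everything here is hypothetical (`scheduler_run`, its parameters, and the "high"/"low" job kinds are illustrative names, not the scheduler's real interface), and it assumes the suspend threshold trips when no "nslots" are left.

```python
def scheduler_run(pending, nslots_total, recheck_after_dispatch):
    """Toy dispatch loop for one host (hypothetical, not Grid Engine code).

    pending: list of (job_name, kind) tuples, kind is "high" or "low".
    Returns (started, suspended_at_once).
    """
    free = nslots_total
    # Threshold state as seen at the start of the scheduler run.
    suspended_view = free <= 0
    started = []
    for name, kind in pending:
        if kind == "high":
            if free >= 1:
                free -= 1                      # debit the consumable
                started.append(name)
                if recheck_after_dispatch:
                    # Re-evaluate the suspend threshold after every dispatch.
                    suspended_view = free <= 0
        elif not suspended_view:
            # Low priority jobs start only while the queue looks unsuspended.
            started.append(name)
    # Low priority jobs started in this run that the final state suspends.
    low_started = [n for n, k in pending if k == "low" and n in started]
    suspended_at_once = low_started if free <= 0 else []
    return started, suspended_at_once


pending = [("high-1", "high"), ("low-1", "low")]
stale = scheduler_run(pending, nslots_total=1, recheck_after_dispatch=False)
fresh = scheduler_run(pending, nslots_total=1, recheck_after_dispatch=True)
print(stale)
print(fresh)
```

With the stale view both jobs are started and the low priority job is suspended immediately; re-evaluating the threshold after each dispatch keeps the low priority job pending instead. The sketch does not cover the case where the low priority job is dispatched before the high priority one in the same run.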