[GE users] Re: [GE users] resource reservation not working

matbradford matthew.bradford at eds.com
Mon Aug 24 10:56:46 BST 2009

>> Hello, Grid Engineers.
>> I think I was mistaken about reservation not working --- It just doesn't
>> work the way I thought it would. What I expected was that, as resources
>> (slots) came free, the scheduler would set them aside for the reserving job
>> until it had accumulated enough to run it. Instead what happens is that the
>> scheduler picks an arbitrary list of nodes that may *or may not* have free
>> slots, and sets those slots aside as they come free. If slots come free
>> that are *not* on this preselected list, they are cheerfully assigned to
>> other jobs, even those of lower priority than the reserving job.
>Could it be that these other nodes were actually not suited for this
>large parallel job? Reason could be 'oneper' is not contained in
>"pe_list" for the corresponding queue instances, reason could be
>resource requests with your job that can be satisfied only at a subset
>of the queue instances, reason could be load thresholds with these queue
>instances etc.
>> The indirect evidence of this was right there in the monitor file
>> ($SGE_ROOT/$SGE_CELL/common/schedule) when I had it turned on, but
>> I looked right past it: The reserving job has a list of queue instances
>> associated with it:
>> 3568:1:RESERVING:1190724115:660:P:oneper:slots:20.000000
>> 3568:1:RESERVING:1190724115:660:Q:all.q@cl023:slots:1.000000
>> 3568:1:RESERVING:1190724115:660:Q:all.q@cl026:slots:1.000000
>> ...and the list never changes! I suspect now that what happened earlier was
>> that a node *not* on the reserved list came free, and the job I thought was
>> violating the reservation policy was scheduled there. That's certainly what
>> happened with some jobs that were scheduled last night.
>> I suppose there ought to be a request-for-enhancement about this: If
>> the scheduler were smart enough to glom resources *as they became available*,
>> rather than preselecting them (who knows how?), then reservation would
>> probably be a more effective function.
>Jobs' resource reservations are done anew with each scheduling
>interval. So actually the RFE is already implemented ;-)


Was this issue ever actually resolved? We are seeing the same behaviour.

A user has submitted a job requiring 16 slots. The job is sitting at the top of the pending queue, and the scheduler has reserved 16 slots for it. At that point only 3 nodes were free; the scheduler correctly selected those free nodes plus 13 other nodes for the reservation. No small jobs were in the pending queue at the time. Subsequently, other smaller jobs were submitted without a reservation request, as they each required only 1 node. As running jobs finished, I would have expected the scheduler to begin reserving the freed-up nodes for the 16-slot job at the top of the pending queue. This isn't the case: the reserved list of nodes isn't changing, and the smaller jobs are still hopping over the large job.
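
In case it helps anyone compare notes, here is a minimal Python sketch of one way to check whether the reserved list really stays the same from one scheduling run to the next. It assumes the monitor file uses the nine-field, colon-separated record layout seen in the quoted lines (job ID, task ID, state, start time, duration, level, object, resource, utilization), and that a line of colons separates one scheduling run from the next; please check both against your own schedule file. The function names are hypothetical, and the second run in the sample data is made up for illustration.

```python
# Assumed marker between scheduling runs in the monitor file.
RUN_SEPARATOR = "::::::::"


def reserved_instances_per_run(lines, job_id):
    """Return one set of reserved queue instances per scheduling run.

    Runs in which the job holds no queue-level reservation are skipped.
    """
    runs = [set()]
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(RUN_SEPARATOR):
            runs.append(set())  # start collecting the next run
            continue
        fields = line.split(":")
        if len(fields) != 9:
            continue  # not a record in the assumed layout
        jid, _task, state, _start, _dur, level, obj, resource, _util = fields
        if (jid == str(job_id) and state == "RESERVING"
                and level == "Q" and resource == "slots"):
            runs[-1].add(obj)
    return [r for r in runs if r]


def reservation_changes(runs):
    """For each pair of consecutive runs, return (added, removed) instances."""
    return [(cur - prev, prev - cur) for prev, cur in zip(runs, runs[1:])]


# Illustrative data: the first run mirrors the quoted records,
# the second run (and its timestamp) is invented.
sample = """\
3568:1:RESERVING:1190724115:660:P:oneper:slots:20.000000
3568:1:RESERVING:1190724115:660:Q:all.q@cl023:slots:1.000000
3568:1:RESERVING:1190724115:660:Q:all.q@cl026:slots:1.000000
::::::::
3568:1:RESERVING:1190724130:660:P:oneper:slots:20.000000
3568:1:RESERVING:1190724130:660:Q:all.q@cl023:slots:1.000000
3568:1:RESERVING:1190724130:660:Q:all.q@cl026:slots:1.000000
::::::::
""".splitlines()

if __name__ == "__main__":
    runs = reserved_instances_per_run(sample, 3568)
    for added, removed in reservation_changes(runs):
        print("added:", sorted(added), "removed:", sorted(removed))
```

To run it against a real file, pass the lines of $SGE_ROOT/$SGE_CELL/common/schedule in place of the sample. Empty "added" and "removed" sets for every pair of runs would match what we are seeing, i.e. a reserved list that never changes.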

Any thoughts?


Mat Bradford
