Opened 8 years ago

Last modified 7 years ago

#1354 new enhancement

Job reservation with wildcards in PE names doesn't work correctly

Reported by: dlove
Owned by:
Priority: normal
Milestone:
Component: sge
Version: 8.0.0d
Severity: minor
Keywords:
Cc:

Description

From https://github.com/gridengine/gridengine/issues/3

Hi,

If you submit jobs with reservation switched on and wildcards in PE names, the scheduler doesn't correctly reserve resources:

qsub -pe 'pe*' 64 -R y -l h_rt=1:00:00

The reservation process considers just one PE (the first matching one?) for all jobs that request wildcard PEs, rather than all available matching PEs.

Cheers,
Andreas


Hi Andreas,

The wildcard PE request selects one PE, and this PE defines a set of queue instances that might be used for job execution. A reservation is done on a fixed set of queue instances that meet the resource requirements of the parallel job.

The reserved resources for a job can be found in the file $SGE_ROOT/$SGE_CELL/common/schedule after 'MONITOR=1' has been added to 'params' in the scheduler configuration. There you will see that the wildcard is gone and that a subset of the matching queue instances is reserved.

Doing this differently would make it necessary to reserve resources multiple times with the current scheduler implementation.

In my opinion this is more of an RFE than a bug.

Cheers,

Ernst
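
The check Ernst describes can be sketched in shell. This is a minimal sketch, not part of the original reply: the helper name list_reservations is hypothetical, and it assumes the schedule file holds colon-separated records of the form jobid:taskid:state:start_time:duration:level:object:resource:utilization, with RESERVING as the state of reservation entries. Verify that layout against your own schedule file before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: list reservation records from a schedule file.
# Assumed record layout (check against your installation):
#   jobid:taskid:state:start_time:duration:level:object:resource:utilization
# Prints: jobid taskid level object resource utilization
list_reservations() {
    awk -F: '$3 == "RESERVING" { print $1, $2, $6, $7, $8, $9 }' "$1"
}
```

After adding MONITOR=1 to params (for example with `qconf -msconf`), it could be called as `list_reservations "$SGE_ROOT/$SGE_CELL/common/schedule"`. The object column should then show the concrete PE and queue instances chosen for the job instead of the wildcard.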


Hi Ernst,

ok, if that's the case, it probably needs to be documented somewhere. The current behaviour isn't what users would expect, is it?

Hi again,

The real problem is that all jobs requesting a wildcard PE get a reservation on the same PE. If the PE were chosen randomly, everything would be more or less fine again.

We are currently implementing this by using a JSV.

Cheers,
Andreas
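
The random choice Andreas mentions could look roughly like the following. This is a sketch under stated assumptions, not the JSV he actually wrote: pick_pe is a hypothetical helper that takes a wildcard pattern and a list of PE names, keeps the names that match, and prints one of them at random. It relies on PE names containing no whitespace or glob characters.

```shell
#!/bin/sh
# Hypothetical helper: pick one PE at random among the names that match
# a wildcard pattern. Usage: pick_pe PATTERN NAME...
# Returns 1 and prints nothing if no name matches.
pick_pe() {
    pattern=$1
    shift
    matches=""
    count=0
    for name in "$@"; do
        # An unquoted $pattern in a case label is matched as a glob.
        case $name in
            $pattern)
                matches="$matches $name"
                count=$((count + 1))
                ;;
        esac
    done
    [ "$count" -gt 0 ] || return 1
    # Random index in 1..count. The seed mixes the time of day with the
    # shell's PID so that concurrent submissions are less likely to agree.
    index=$(awk -v n="$count" -v pid="$$" \
        'BEGIN { srand(); t = srand(); srand(t + pid); print int(rand() * n) + 1 }')
    # Select the index-th match (word splitting is intended here).
    set -- $matches
    shift $((index - 1))
    printf '%s\n' "$1"
}
```

In a shell JSV, the requested pattern could be read with `jsv_get_param pe_name` and the result written back with `jsv_set_param pe_name` (both from the jsv_include.sh helpers shipped with Grid Engine), with the candidate names taken from `qconf -spl`. Note that this only spreads reservations across PEs; it does not make the scheduler consider all matching PEs for a single job.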

Change History (1)

comment:1 Changed 7 years ago by dlove

  • Version changed from 8.0.0a to 8.0.0d

sge_select_parallel_environment says:

When scheduling a reservation we search for the earliest assignment for each
PE and then choose that one that finally gets us the maximum number of slots.

Check whether that's actually happening.
