Opened 6 years ago
Closed 6 years ago
#1511 closed defect (fixed)
execd does not remember core binding assignments across restart
Reported by: | markdixon | Owned by: | Dave Love <d.love@…> |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | sge | Version: | 8.1.6 |
Severity: | minor | Keywords: | |
Cc: |
Description
Hi,
At the moment, the execd decides which cores to bind when a job requests core binding.
If the execd is restarted without killing running jobs, it forgets which cores it has assigned to which job. This means that it can assign the same cores to a new job before the old job has freed them.
The core binding information is held on the execd in the execd_spool_dir, so it should be possible to read it on startup.
Alternatively, moving the core binding decisions into the qmaster would fix both this and #1479, but is obviously a much bigger job.
Mark
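To illustrate the first suggestion above (re-reading the binding information from the spool directory on startup), here is a minimal sketch in Python rather than the execd's own code. The on-disk layout is an assumption made purely for illustration: a hypothetical `active_jobs/<job_id>/binding.json` file holding `{"cores": [...]}`. The real spool format is not described in this ticket, and the function names `load_bound_cores` and `pick_free_cores` are likewise hypothetical.

```python
import json
from pathlib import Path


def load_bound_cores(spool_dir):
    """Rebuild {job_id: set of core ids} from spooled binding records.

    ASSUMPTION: each running job has a file
    <spool_dir>/active_jobs/<job_id>/binding.json containing {"cores": [...]}.
    This layout is hypothetical and is not the real execd spool format.
    """
    bound = {}
    for record in Path(spool_dir).glob("active_jobs/*/binding.json"):
        job_id = record.parent.name
        try:
            cores = json.loads(record.read_text())["cores"]
        except (OSError, ValueError, KeyError, TypeError):
            # Skip unreadable or malformed records instead of failing startup.
            continue
        bound[job_id] = set(cores)
    return bound


def pick_free_cores(all_cores, bound, wanted):
    """Return `wanted` core ids not held by any known job, or None if too few are free."""
    in_use = set().union(*bound.values())
    free = [core for core in sorted(all_cores) if core not in in_use]
    if len(free) < wanted:
        return None
    return free[:wanted]
```

With the table rebuilt at startup, a new job asking for two cores on a four-core host where job 101 still holds cores 0 and 1 would be offered cores 2 and 3 instead of the ones already in use.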
Change History (2)
comment:1 Changed 6 years ago by markdixon
- Version changed from 8.1.7 to 8.1.6
comment:2 Changed 6 years ago by Dave Love <d.love@…>
- Owner set to Dave Love <d.love@…>
- Resolution set to fixed
- Status changed from new to closed
In 4794/sge: