Custom Query (431 matches)
Results (121 - 123 of 431)
Ticket | Resolution | Summary | Owner | Reporter |
---|---|---|---|---|
#1583 | fixed | Project object usage in spool should only be updated if it has changed | Mark Dixon <m.c.dixon@…> | markdixon |
Description |
From the commit:
Commit prepared against 8.1.9.
Note that usage stored in the spool can still end up considerably out of date due to #1554. |
|||
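The idea behind #1583, writing a project's usage to the spool only when it differs from what was last written, can be sketched in C as below. This is a minimal illustration and not the committed patch: `usage_t`, `project_t` and `spool_project_if_changed` are hypothetical names, and the real spool write is replaced by a struct copy and a counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a usage record; not a Grid Engine type. */
typedef struct {
    double cpu;
    double mem;
    double io;
} usage_t;

/* Hypothetical project object: live usage plus the copy last spooled. */
typedef struct {
    usage_t current;
    usage_t spooled;
    int     spool_writes;   /* counts simulated spool writes */
} project_t;

/* Field-by-field comparison of two usage records. */
static bool usage_equal(const usage_t *a, const usage_t *b)
{
    return a->cpu == b->cpu && a->mem == b->mem && a->io == b->io;
}

/* Write to the (simulated) spool only if usage has changed.
 * Returns true if a write was performed. */
static bool spool_project_if_changed(project_t *p)
{
    if (usage_equal(&p->current, &p->spooled)) {
        return false;               /* nothing changed: skip the write */
    }
    p->spooled = p->current;        /* stand-in for the real spool write */
    p->spool_writes++;
    return true;
}
```

Calling `spool_project_if_changed` repeatedly without touching `current` performs at most one write, which is the behaviour the ticket summary asks for.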
#1480 | fixed | Prevent root-owned files in execd active_job spool area | markdixon | |
Description |
The new cgroup/cpuset code uses a couple of routines for switching effective uid/gid which appear to be causing some problems. Some of the side symptoms include the following files in the execd spool sometimes being owned by root:
That last entry is a directory created for a SLAVE task. It being root-owned can cause jobs to fail with a "can't open pid file" error message.
The execd appears to have the correct euid/egid when entering the cgroup code, so I have removed the offending function calls. I don't know if there's a good reason for them that I've not noticed in limited testing.
Potential patch attached.
Cheers,
Mark
--
Mark Dixon                     Email    : m.c.dixon@…
HPC/Grid Systems Support       Tel (int): 35429
Information Systems Services   Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
0001-Prevent-root-owned-files-in-execd-active_job-spool-a.patch |
|||
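To confirm the symptom described in #1480 on a given node, the owner of a spool entry can be compared against the expected uid. The following is a rough diagnostic sketch using POSIX `stat()`; `owned_by` is a hypothetical helper, not a Grid Engine function, and the path and uid to check are left to the caller.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: report whether `path` is owned by `uid`.
 * Returns 1 if it is, 0 if someone else owns it, -1 if stat() fails. */
static int owned_by(const char *path, uid_t uid)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        return -1;                  /* missing or inaccessible path */
    }
    return st.st_uid == uid ? 1 : 0;
}
```

A return value of 0 for an entry under the active_jobs area, when called with the uid the execd is expected to create files as, would match the root-owned files reported in the ticket.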
#1483 | fixed | Prevent cgroup/cpuset code from killing shepherd at job end | Mark Dixon <m.c.dixon@…> | markdixon |
Description |
Prevent cgroup/cpuset code from killing shepherd at job end
When the execd_params option USE_CGROUPS is enabled, the cgroup/cpuset cleanup code checks for and kills processes related to the job. This includes the shepherd, which triggers the job cleanup signal handler. However, as the execd also kills the shepherd elsewhere, the job cleanup code can end up being run twice.
This has been seen to be a problem when the node running the job master qrsh's back into itself. In that case, the most obvious symptoms are:
(interpreted as -/- in qacct)
Suggested patch to skip the shepherd is attached.
All the best,
Mark
--
Mark Dixon                     Email    : m.c.dixon@…
HPC/Grid Systems Support       Tel (int): 35429
Information Systems Services   Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
0001-Prevent-cgroup-cpuset-code-from-killing-shepherd-at-.patch |
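The fix suggested in #1483, leaving the shepherd out of the set of processes the cgroup/cpuset cleanup kills, can be sketched in C as a simple filter. This is an illustration under assumptions and not the attached patch: `select_kill_targets` is a hypothetical name, and the list of pids is assumed to have already been read from the job's cgroup.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: copy every pid except the shepherd's into `out`.
 * `out` must have room for `n` entries. Returns the number of pids kept. */
static size_t select_kill_targets(const pid_t *pids, size_t n,
                                  pid_t shepherd_pid, pid_t *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (pids[i] == shepherd_pid) {
            continue;               /* the execd handles the shepherd itself */
        }
        out[kept++] = pids[i];
    }
    return kept;
}
```

The cleanup code would then signal only the pids in `out`, so the shepherd's cleanup handler is triggered once, by the execd's own kill.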