Opened 5 years ago

Closed 5 years ago

#1551 closed defect (fixed)

Spool not flushed at qmaster exit

Reported by: markdixon
Owned by: Mark Dixon <m.c.dixon@…>
Priority: normal
Milestone:
Component: sge
Version: 8.1.8
Severity: minor
Keywords: spool


The qmaster takes great pains to flush out all the spool objects when it does a normal exit. This is important because gridengine also tries hard to rate-limit the frequency of object updates, so without that final flush some little-used objects can stay permanently out of date in the spool.

However, it also takes great pains NOT to flush the spool objects if it notices that another qmaster has fiddled with the files - to avoid file corruption.

Unfortunately, an "if" test is reversed, so the qmaster only actually flushes in the one case where file corruption is most likely (another qmaster has touched the files) and skips the flush on a normal exit.

Patch follows to correct this, prepared against 8.1.8.

As this code path has very rarely been used, probably worth testing it a bit before putting into production!
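
For anyone reading along without the source open, the reversed test can be sketched roughly as below. This is only an illustration, not the attached patch and not the real qmaster code: `spool_object`, `final_spool_allowed`, `final_spool_allowed_buggy`, `final_spool` and the `shadow_took_over` flag are all made-up names, and tracking a per-object `dirty` flag is an assumption made to keep the example small.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a spooled object; not a real gridengine type. */
struct spool_object {
    bool dirty;   /* changed in memory but not yet written to the spool */
};

/* Reversed test as described in this ticket: flush only after a takeover. */
bool final_spool_allowed_buggy(bool shadow_took_over)
{
    return shadow_took_over;
}

/* Intended test: flush on a normal exit, never after a takeover. */
bool final_spool_allowed(bool shadow_took_over)
{
    return !shadow_took_over;
}

/* Flush every dirty object if allowed; returns how many were flushed. */
size_t final_spool(struct spool_object *objs, size_t n, bool shadow_took_over)
{
    size_t flushed = 0;

    if (!final_spool_allowed(shadow_took_over))
        return 0;   /* another qmaster owns the files: leave them alone */

    for (size_t i = 0; i < n; i++) {
        if (objs[i].dirty) {
            objs[i].dirty = false;   /* real code would write to the spool here */
            flushed++;
        }
    }
    return flushed;
}
```

With the intended test, `final_spool(objs, n, false)` writes out the stale objects on a normal exit and `final_spool(objs, n, true)` touches nothing. Swapping in `final_spool_allowed_buggy` gives the behaviour reported here.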

Attachments (1)

0001-Fix-1551-Fix-do-do-not-final-spool-at-qmaster-shutdo.patch (1.3 KB) - added by markdixon 5 years ago.


Change History (2)

comment:1 Changed 5 years ago by Mark Dixon <m.c.dixon@…>

  • Owner set to Mark Dixon <m.c.dixon@…>
  • Resolution set to fixed
  • Status changed from new to closed

In 4839/sge:

Fix #1551 Fix do/do not final spool at qmaster shutdown
The qmaster is supposed to do a final spool of all objects at shutdown,
unless it is a shutdown because it has detected that a shadow qmaster
has taken over (e.g. by examining act_qmaster).

Unfortunately, the sense of the test was reversed and the final spool was
only done if an active shadow master was detected. This patch restores
the correct sense of the test.
