Opened 12 years ago
Last modified 9 years ago
#489 new defect
IZ2493: Qmaster restart takes long time after short duration maintenance shutdown
Reported by: | andreas | Owned by: | |
---|---|---|---|
Priority: | low | Milestone: | |
Component: | sge | Version: | 6.1AR_snapshot3_2 |
Severity: | | Keywords: | qmaster |
Cc: |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2493]
Issue #: 2493
Platform: All
Reporter: andreas (andreas)
Component: gridengine
OS: All
Subcomponent: qmaster
Version: 6.1AR_snapshot3_2
CC: bbarth
Status: NEW
Priority: P4
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: ernst (ernst)
QA Contact: ernst
URL:
Summary: Qmaster restart takes long time after short duration maintenance shutdown
Status whiteboard:
Attachments:
Issue 2493 blocks:
Votes for issue 2493:
Opened: Fri Feb 15 09:32:00 -0700 2008

------------------------

DESCRIPTION:

If qmaster is shut down for a short time and then restarted, many execd load reports can queue up in qmaster until it has passed the startup phase. For example, in a cluster with a large number of nodes, 159060 messages were observed in qmaster's incoming buffer with qping -info. Qmaster did eventually come up, but because of the queued messages it took additional time until qmaster became available for processing user requests such as qstat.

------- Additional comments from andreas Fri Feb 15 10:31:43 -0700 2008 -------

Fixed version.

------- Additional comments from andreas Fri Feb 15 12:27:37 -0700 2008 -------

A related issue is #2483.

------- Additional comments from andreas Thu Feb 21 05:33:57 -0700 2008 -------

Related #2500 will address the issue of messages queueing up.

------- Additional comments from crei Fri Apr 18 02:43:18 -0700 2008 -------

The real problem is that reloading the spooled jobs takes so long. Changing message acceptance at qmaster startup will produce other problems (possible takeover by the shadow daemon because shadowd thinks qmaster is down).
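To get a feel for how quickly such a backlog can build up, here is a rough back-of-the-envelope sketch. It is not taken from the Grid Engine sources. It assumes a simplified model in which every execd sends one load report per fixed interval and qmaster processes none of them until its startup phase ends. The function names and all numbers (host count, startup duration, report interval, drain rate) are hypothetical and chosen only for illustration.

```python
def estimate_backlog(num_hosts, startup_seconds, report_interval_seconds):
    """Estimate load reports queued while qmaster is still starting up.

    Simplified model (an assumption, not measured behaviour): each host
    sends one report every `report_interval_seconds`, and none are
    processed during the `startup_seconds` of the startup phase.
    """
    if report_interval_seconds <= 0:
        raise ValueError("report_interval_seconds must be positive")
    reports_per_host = startup_seconds // report_interval_seconds
    return num_hosts * reports_per_host


def seconds_to_drain(backlog, processed_per_second):
    """Estimate the extra delay before qmaster has worked off the backlog."""
    if processed_per_second <= 0:
        raise ValueError("processed_per_second must be positive")
    return backlog / processed_per_second


if __name__ == "__main__":
    # Hypothetical cluster: 2000 hosts, 10 minutes of startup (e.g. spent
    # reloading spooled jobs), one load report every 40 seconds per host.
    backlog = estimate_backlog(2000, 600, 40)
    print("queued load reports:", backlog)
    # Hypothetical processing rate of 500 reports per second.
    print("extra seconds to drain:", seconds_to_drain(backlog, 500))
```

Under this model the backlog grows linearly with the length of the startup phase, which fits the last comment above: shortening the reload of the spooled jobs shrinks the backlog without changing when qmaster starts accepting messages.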