Opened 12 years ago
Last modified 10 years ago
#591 new defect
IZ2776: restore fails when spooling with bdb rpc server and qmaster has a local spool directory
| Reported by: | joga | Owned by: | |
|---|---|---|---|
| Priority: | lowest | Milestone: | |
| Component: | sge | Version: | 6.0 |
| Severity: | | Keywords: | install |
| Cc: | | | |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2776]
Issue #: 2776
Platform: All
Reporter: joga (joga)
Component: gridengine
OS: All
Subcomponent: install
Version: 6.0
CC: None defined
Status: NEW
Priority: P5
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: dom (dom)
QA Contact: dom
URL:
* Summary: restore fails when spooling with bdb rpc server and qmaster has a local spool directory
Status whiteboard:
Attachments:
Issue 2776 blocks:
Votes for issue 2776:
Opened: Tue Nov 4 04:11:00 -0700 2008
------------------------

Setup: the cluster spools via a BDB RPC server, qmaster has its spool directory on a local filesystem, and the BDB server host != qmaster host.

Backup with inst_sge -bup is started on the BDB server host. It doesn't report an error, but the backup is incomplete (it does not contain the qmaster spooling data).

Restore with inst_sge -rst (on the BDB server host) fails when trying to create the qmaster spool directory. Should it succeed, qmaster (on a different host) will not see the restored directory.

Making this a P5, as the scenario is not really realistic: the reason for using the BDB RPC server is to have failover of sge_qmaster via sge_shadowd, but failover with sge_shadowd doesn't work with a local qmaster spool directory.

Evaluation:
Backup / restore must be started on the BDB server host (as it usually has a local directory for the spooling data), but the BDB server host cannot access the local qmaster spool directory on the qmaster host.

Suggested Fix:
Make the backup / restore a two-step process, as is already done in the qmaster installation: backup / restore is started on the qmaster host, where it can back up / restore the qmaster data. When it comes to backing up / restoring the BDB data, the user is asked to start an inst_sge -bup/-rst -db on the BDB server host. The backup directory must be on NFS (the same one that is used for the qmaster backup). In case of the backup, the backup process on the qmaster host can verify that the BDB data is available in the backup directory once the user acknowledges having done the BDB backup.

Work Around:
Do not use such a setup - it does not really make sense.
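
The verification step proposed under Suggested Fix (the qmaster-side backup checks that the BDB data has arrived in the shared backup directory) could look roughly like the sketch below. It is written in Python purely for illustration. inst_sge itself is a shell script, and the function name missing_bdb_files, the example file name "sge", and the directory layout are all assumptions, not taken from the actual installer.

```python
import os


def missing_bdb_files(backup_dir, expected):
    """Return the names from `expected` that are absent or empty in backup_dir.

    `expected` is a list of file names the BDB backup step is assumed to
    produce. The real names written by inst_sge are not known here, so the
    caller has to supply them.
    """
    missing = []
    for name in expected:
        path = os.path.join(backup_dir, name)
        # A zero-length file is treated as missing, since an interrupted
        # copy onto NFS could leave an empty placeholder behind.
        if not (os.path.isfile(path) and os.path.getsize(path) > 0):
            missing.append(name)
    return missing


if __name__ == "__main__":
    import sys

    # Hypothetical usage: verify.py <backup_dir> <file> [<file> ...]
    if len(sys.argv) < 3:
        sys.exit("usage: verify.py <backup_dir> <file> [<file> ...]")
    backup_dir, names = sys.argv[1], sys.argv[2:]
    absent = missing_bdb_files(backup_dir, names)
    if absent:
        sys.exit("BDB backup incomplete, missing: " + ", ".join(absent))
    print("BDB data found in " + backup_dir)
```

The qmaster-side backup would run a check like this only after the user confirms that the BDB backup on the BDB server host has finished, and would refuse to report success while anything is missing.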