Opened 12 years ago
Last modified 10 years ago
#917 new defect
IZ641: GE Adapter must reject unresolvable host resources
Reported by: | rhierlmeier | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | hedeby | Version: | 10_BETA1 |
Severity: | | Keywords: | gridengine_adapter |
Cc: |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=641]
Issue #: 641
Platform: All
Reporter: rhierlmeier (rhierlmeier)
Component: hedeby
OS: All
Subcomponent: gridengine_adapter
Version: 10_BETA1
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: DEFECT
Target milestone: 1.0u5next
Assigned to: rhierlmeier (rhierlmeier)
QA Contact: rhierlmeier
URL:
Summary: GE Adapter must reject unresolvable host resources
Status whiteboard:
Attachments:
Issue 641 blocks:
Opened: Fri Apr 17 06:34:00 -0700 2009

------------------------

Description

The problem can occur if the GE Adapter is running on a different host than qmaster. If a host resource that is not resolvable on the qmaster host is assigned to the GE service, the resource goes into ERROR state. The following error message can be found in the log file:

04/17/2009 14:44:19|33|.service.impl.ge.InstallationSequence.execute|W|Install execd on host domU-12-31-39-00-49-B7: step 'Make admin host' failed: Can not make host domU-12-31-39-00-49-B7 to admin host: jgdi error: Exception thrown in operation addAdminHost

In such a situation the GE Adapter should simply reject the resource. The 'Make admin host' step is the first step of the installation, and if it fails the resource has not been modified, so the resource can be safely rejected.

Evaluation:

The problem can only occur in a non-recommended setup (GE Adapter and qmaster not running on the same host).

Suggested Fix/Work Around

There is no known workaround. If an unresolvable resource is assigned to the GE service, the resource goes into ERROR state. To solve the problem, the administrator must shut down the GE service and delete the file

<local spool dir>/spool/<name of ge service>/<name of unresolvable resource>.srf

After a restart of the GE service the resource has disappeared.

Analysis

The problem is in the class ExecdInstallerBase$MakeAdminHostStep: it throws an InstallException instead of an InstallationNotPossibleException.
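
The change described in the Analysis can be illustrated with a minimal, self-contained sketch. It is not the Hedeby source. The two exception classes reuse the names from the ticket, but their constructors and the subclass relationship between them are assumptions. QmasterConnection, Outcome, AdapterSketch and the simplified MakeAdminHostStep are hypothetical stand-ins invented for illustration. The idea shown is only this: a failure in the first installation step is reported with the "not possible" exception type, so the caller can reject the resource instead of putting it into ERROR state.

```java
// Hypothetical stand-ins: the real Hedeby classes and signatures are not shown in this ticket.
class InstallException extends Exception {
    InstallException(String message, Throwable cause) { super(message, cause); }
}

// Assumption: signals that nothing was modified, so the resource can be rejected.
// Modeling it as a subclass of InstallException is also an assumption.
class InstallationNotPossibleException extends InstallException {
    InstallationNotPossibleException(String message, Throwable cause) { super(message, cause); }
}

// Hypothetical abstraction of the call that registers an admin host on qmaster.
interface QmasterConnection {
    void addAdminHost(String host) throws Exception;
}

// Simplified stand-in for the 'Make admin host' step.
class MakeAdminHostStep {
    private final QmasterConnection qmaster;

    MakeAdminHostStep(QmasterConnection qmaster) { this.qmaster = qmaster; }

    void execute(String host) throws InstallException {
        try {
            qmaster.addAdminHost(host);
        } catch (Exception e) {
            // First step of the installation: the resource is still untouched,
            // so report "not possible" rather than a generic install failure.
            throw new InstallationNotPossibleException(
                    "Can not make host " + host + " an admin host", e);
        }
    }
}

enum Outcome { ASSIGNED, REJECTED, ERROR }

class AdapterSketch {
    // Maps the exception type to what happens to the resource.
    static Outcome install(MakeAdminHostStep step, String host) {
        try {
            step.execute(host);
            return Outcome.ASSIGNED;
        } catch (InstallationNotPossibleException e) {
            return Outcome.REJECTED;   // resource unmodified, safe to reject
        } catch (InstallException e) {
            return Outcome.ERROR;      // resource may be partially installed
        }
    }

    public static void main(String[] args) {
        // Simulates a host that qmaster cannot resolve.
        MakeAdminHostStep failing = new MakeAdminHostStep(
                h -> { throw new java.net.UnknownHostException(h); });
        // Simulates a host that qmaster accepts.
        MakeAdminHostStep working = new MakeAdminHostStep(h -> { });

        System.out.println(install(failing, "domU-12-31-39-00-49-B7"));
        System.out.println(install(working, "some-resolvable-host"));
    }
}
```

In this sketch only the exception type decides between rejecting the resource and moving it to ERROR state, which is why the fix itself is small.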

How to test

o Set up a system where the GE adapter is not running on the qmaster host
o On the host where the GE adapter is running, define a dummy host in /etc/hosts
o Add the resource which represents the dummy host to the system
o Assign the resource to the GE adapter
o It should be rejected

ATC: 0.5 PD
ETC: 3 PD

The fix itself is easy; however, the testsuite test can be tricky.

------- Additional comments from rhierlmeier Wed Nov 25 07:21:12 -0700 2009 -------

Milestone changed

------- Additional comments from torsten Thu Nov 26 08:42:20 -0700 2009 -------

Starting from Hedeby 1.0u5 there is a different workaround available: to get rid of the resource in ERROR state, simply execute:

sdmadm purge_resource -r <resource_name_or_id>

No need to delete a file or to shut down the GE service.