Opened 10 years ago

Last modified 9 years ago

#917 new defect

IZ641: GE Adapter must reject unresolvable host resources

Reported by: rhierlmeier Owned by:
Priority: normal Milestone:
Component: hedeby Version: 10_BETA1
Severity: Keywords: gridengine_adapter
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=641]

        Issue #:      641                      Platform:     All         Reporter: rhierlmeier (rhierlmeier)
       Component:     hedeby                      OS:        All
     Subcomponent:    gridengine_adapter       Version:      10_BETA1       CC:    None defined
        Status:       NEW                      Priority:     P3
      Resolution:                             Issue type:    DEFECT
                                           Target milestone: 1.0u5next
      Assigned to:    rhierlmeier (rhierlmeier)
      QA Contact:     rhierlmeier
          URL:
       * Summary:     GE Adapter must reject unresolvable host resources
   Status whiteboard:
      Attachments:


     Issue 641 blocks:


   Opened: Fri Apr 17 06:34:00 -0700 2009 
------------------------


   Description

   The problem can occur if the GE Adapter is running on a different host than qmaster.
   If a host resource that is not resolvable on the qmaster host is assigned to the
   GE service, the resource goes into ERROR state. The following error message can be
   found in the log file:

   04/17/2009 14:44:19|33|.service.impl.ge.InstallationSequence.execute|W|Install
   execd on host domU-12-31-39-00-49-B7: step 'Make admin host' failed: Can not
   make host domU-12-31-39-00-49-B7 to admin host: jgdi error: Exception thrown in
   operation addAdminHost

   In such a situation the GE Adapter should simply reject the resource. The 'Make
   admin host' step is the first step of the installation, and if it fails the
   resource has not been modified, so the resource can be safely rejected.

   Evaluation:

   The problem can only occur in a setup that is not recommended (GE Adapter and
   qmaster not running on the same host).


   Suggested Fix/Work Around

   No workaround is known. If an unresolvable resource is assigned to the GE service,
   the resource goes into ERROR state. To solve the problem, the administrator must
   shut down the GE service and delete the file

      <local spool dir>/spool/<name of ge service>/<name of unresolvable resource>.srf

   After a restart of the GE service the resource has disappeared.
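
   The manual cleanup can be sketched as a small Java helper. It only builds the
   path from the pattern above and deletes the file if it exists. The class name
   StaleResourceCleanup and its methods are hypothetical and not part of Hedeby.
   The sketch assumes that the GE service has already been shut down.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hedeby. Run it only while the GE service is down.
public class StaleResourceCleanup {

    // Builds <local spool dir>/spool/<name of ge service>/<name of resource>.srf
    static Path resourceFile(Path localSpoolDir, String serviceName, String resourceName) {
        return localSpoolDir.resolve("spool").resolve(serviceName).resolve(resourceName + ".srf");
    }

    // Deletes the resource file; returns false if there was no such file.
    static boolean deleteResourceFile(Path localSpoolDir, String serviceName, String resourceName)
            throws IOException {
        return Files.deleteIfExists(resourceFile(localSpoolDir, serviceName, resourceName));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: StaleResourceCleanup <local spool dir> <ge service> <resource>");
            return;
        }
        boolean deleted = deleteResourceFile(Paths.get(args[0]), args[1], args[2]);
        System.out.println(deleted ? "resource file deleted" : "no resource file found");
    }
}
```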


   Analysis


   The problem is in the class ExecdInstallerBase$MakeAdminHostStep: it throws
   an InstallException instead of an InstallationNotPossibleException.
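
   A minimal sketch of the intended behaviour follows. It is not the Hedeby source.
   The two exception classes are simplified stand-ins that only share their names
   with the real ones. The Qmaster interface, the assign method and the
   ASSIGNED/REJECTED results are hypothetical names used for illustration. The
   assumption is that an InstallationNotPossibleException signals that the resource
   is untouched and can be rejected.

```java
import java.util.Set;

public class MakeAdminHostSketch {

    /** Stand-in: thrown when the installation cannot start; the resource is unmodified. */
    static class InstallationNotPossibleException extends Exception {
        InstallationNotPossibleException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in: thrown when a later step fails and the resource may be modified. */
    static class InstallException extends Exception {
        InstallException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical view of the qmaster connection. */
    interface Qmaster {
        void addAdminHost(String host) throws Exception;
    }

    /** First installation step: a failure here means nothing has been changed yet. */
    static void makeAdminHost(Qmaster qmaster, String host)
            throws InstallationNotPossibleException {
        try {
            qmaster.addAdminHost(host);
        } catch (Exception e) {
            // Report "not possible" instead of a generic install failure.
            throw new InstallationNotPossibleException(
                    "Can not make host " + host + " to admin host: " + e.getMessage(), e);
        }
    }

    /** Hypothetical caller: rejects the resource instead of putting it into ERROR state. */
    static String assign(Qmaster qmaster, String host) {
        try {
            makeAdminHost(qmaster, host);
            return "ASSIGNED";
        } catch (InstallationNotPossibleException e) {
            return "REJECTED";
        }
    }

    public static void main(String[] args) {
        Set<String> resolvable = Set.of("hostA");
        Qmaster qmaster = host -> {
            if (!resolvable.contains(host)) {
                throw new Exception("host " + host + " is not resolvable");
            }
        };
        System.out.println(assign(qmaster, "hostA"));
        System.out.println(assign(qmaster, "domU-12-31-39-00-49-B7"));
    }
}
```

   In this sketch the unresolvable host ends up rejected, while a failure in a
   later step would still be reported with an InstallException.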


   How to test

   o Set up a system where the GE adapter is not running on the qmaster host
   o On the host where the GE adapter is running, define a dummy host in /etc/hosts
   o Add the resource which represents the dummy host to the system
   o Assign the resource to the GE adapter
   o It should be rejected


   ATC:  0.5 PD
   ETC:  3 PD

   The fix itself is easy; however, the testsuite test can be tricky.
               ------- Additional comments from rhierlmeier Wed Nov 25 07:21:12 -0700 2009 -------
   Milestone changed
               ------- Additional comments from torsten Thu Nov 26 08:42:20 -0700 2009 -------
   Starting from Hedeby 1.0u5 there is a different workaround available:

   To get rid of the resource in ERROR state, simply execute:

   sdmadm purge_resource -r <resource_name_or_id>

   There is no need to delete a file or to shut down the GE service.
