Opened 50 years ago

Last modified 9 years ago

#903 new defect

IZ599: NPE while executing show_blacklist command

Reported by: torsten Owned by:
Priority: low Milestone:
Component: hedeby Version: 1.0
Severity: Keywords: cli
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=599]

        Issue #:      599           Platform:     All         Reporter: torsten (torsten)
       Component:     hedeby           OS:        All
     Subcomponent:    cli           Version:      1.0            CC:    None defined
        Status:       STARTED       Priority:     P4
      Resolution:                  Issue type:    DEFECT
                                Target milestone: 1.0u5next
      Assigned to:    adoerr (adoerr)
      QA Contact:     adoerr
          URL:
       * Summary:     NPE while executing show_blacklist command
   Status whiteboard:
      Attachments:


     Issue 599 blocks:


   Opened: Thu Nov 6 00:42:00 -0700 2008 
------------------------


   Description:
   The show_blacklist (sb) command produces a NullPointerException when the
   resource_provider component is STOPPED:

   % sdmadm show_blacklist
   Error:
   java.lang.NullPointerException
           at com.sun.grid.grm.cli.cmd.monitoring.ShowBlackListCliCommand$TableModel.<init>(ShowBlackListCliCommand.java:97)
           at com.sun.grid.grm.cli.cmd.monitoring.ShowBlackListCliCommand.execute(ShowBlackListCliCommand.java:75)
           at com.sun.grid.grm.cli.AbstractCli.run(AbstractCli.java:278)
           at com.sun.grid.grm.cli.SdmAdm.main(SdmAdm.java:160)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
           at java.lang.reflect.Method.invoke(Method.java:585)
           at com.sun.grid.grm.util.MainWrapper$SystemRunThread.run(MainWrapper.java:434)


   Evaluation:
   The problem is rated as a P4 defect. An NPE should never be visible to the
   user, but the command would fail in this situation anyway; a fix would only
   make the error message more helpful. As the resource provider is almost
   always running in an SDM system, this issue occurs very rarely and is easy to
   work around.


   Suggested fix:
   The system should issue an error message instead of the NPE. The error message
   should be the same as when the complete rp_vm, in which the resource provider
   is situated, is STOPPED:
   % sdmadm show_blacklist
   Error: Resource Provider component cannot be found. It is not started or there
   is a connection problem.


   Work around:
   Start the resource provider with 'sdmadm suc -c resource_provider' on the
   correct host.


   Analysis:
   As the stack trace indicates, the NPE occurs in the constructor of
   ShowBlackListCliCommand$TableModel: the call to entry.getResourceIds() returns
   null.

   The underlying problem is in ShowBlackListCommand#execute(). The
   ResourceProvider fetched in line 73 is null only when the whole rp_vm is not
   running. Otherwise a proxy to RP is returned, and that proxy throws a
   GrmRemoteException as soon as a method is called on it while RP is not
   active. The if check should therefore be extended to make sure that RP is
   really available, something like this:

     try {
         if (ret == null || !ret.getState().equals(ComponentState.STARTED)) {
             // RP not known to ComponentService or RP not running => return error at once
             throw new GrmException("ShowBlackListCommand.rp.notfound", BUNDLE_NAME);
         }
     } catch (GrmRemoteException ex) {
         // we got a Proxy to RP but we cannot call any methods on it <=> RP not active
         throw new GrmException("ShowBlackListCommand.rp.notfound", ex, BUNDLE_NAME);
     }
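
   The same guard can be sketched in a self-contained form. Everything below is
   a hypothetical stand-in: the real ResourceProvider, ComponentState,
   GrmException and GrmRemoteException classes in Hedeby have other signatures
   (for example the bundle name argument), and requireStarted() is only an
   illustrative helper name, not existing code.

```java
// Hypothetical stand-ins for the Hedeby types named in this issue.
enum ComponentState { STARTED, STOPPED }

class GrmException extends Exception {
    GrmException(String message) { super(message); }
    GrmException(String message, Throwable cause) { super(message, cause); }
}

// Assumption: modelled as an independent checked exception for simplicity.
class GrmRemoteException extends Exception {
    GrmRemoteException(String message) { super(message); }
}

interface ResourceProvider {
    ComponentState getState() throws GrmRemoteException;
}

class RpGuard {
    static final String NOT_FOUND = "ShowBlackListCommand.rp.notfound";

    /** Returns rp if it can be used, otherwise throws the "not found" error. */
    static ResourceProvider requireStarted(ResourceProvider rp) throws GrmException {
        try {
            // rp == null: the whole rp_vm is down.
            // state != STARTED: the proxy answers, but the component is stopped.
            if (rp == null || rp.getState() != ComponentState.STARTED) {
                throw new GrmException(NOT_FOUND);
            }
            return rp;
        } catch (GrmRemoteException ex) {
            // A proxy was returned, but no call can be made on it.
            throw new GrmException(NOT_FOUND, ex);
        }
    }
}
```

   With such a guard, all three failure modes (no RP reference, RP not STARTED,
   unreachable proxy) end in the same user-visible error instead of an NPE.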

   Even with the check at the beginning, the connection to RP can still get
   lost during the course of the ShowBlackListCommand#execute() method.
   Specifically, in getBlackListedResourceIds() the call
   ret.getBlackList(serviceName) can throw a GrmRemoteException. In this case,
   the ShowBlackListResult should be constructed with an empty collection
   instead of a null pointer as the second parameter, e.g.:

     try {
         List<ResourceId> resourceids = ret.getBlackList(serviceName);
         return new ShowBlackListResult(serviceName, resourceids);
     } catch (Exception e) {
         return new ShowBlackListResult(serviceName, Collections.<ResourceId>emptyList(), e);
     }
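
   To illustrate why the empty collection matters, here is a minimal
   self-contained sketch. BlackListResult, BlackListSource and BlackListLookup
   are hypothetical stand-ins for ShowBlackListResult and the remote call, and
   plain strings replace ResourceId; the null check in the constructor is an
   extra safety net assumed here, not something taken from the Hedeby sources.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ShowBlackListResult.
final class BlackListResult {
    private final String serviceName;
    private final List<String> resourceIds;
    private final Exception error; // null when the lookup succeeded

    BlackListResult(String serviceName, List<String> resourceIds, Exception error) {
        this.serviceName = serviceName;
        // Never store null: consumers iterate over this list without a check.
        this.resourceIds = (resourceIds == null)
                ? Collections.<String>emptyList() : resourceIds;
        this.error = error;
    }

    String getServiceName() { return serviceName; }
    List<String> getResourceIds() { return resourceIds; }
    Exception getError() { return error; }
}

// Hypothetical stand-in for the remote call that may fail.
interface BlackListSource {
    List<String> getBlackList(String serviceName) throws Exception;
}

final class BlackListLookup {
    static BlackListResult lookup(BlackListSource source, String serviceName) {
        try {
            return new BlackListResult(serviceName, source.getBlackList(serviceName), null);
        } catch (Exception e) {
            // Keep the error, but hand out an empty list rather than null.
            return new BlackListResult(serviceName, Collections.<String>emptyList(), e);
        }
    }

    // Mimics a table model that walks the IDs; safe even for failed lookups.
    static int countRows(BlackListResult result) {
        int rows = 0;
        for (String ignored : result.getResourceIds()) {
            rows++;
        }
        return rows;
    }
}
```

   A consumer such as the table model can then iterate over getResourceIds()
   unconditionally and report getError() separately.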


   How to test:
   A TS test should be written to test the behavior of the show_blacklist command.
   This should at least include the following scenarios:
   - resource_provider is STARTED, rp_vm is STARTED
   - resource_provider is STOPPED, rp_vm is STARTED
   - rp_vm is STOPPED


   ETC:
   1.5 PD
               ------- Additional comments from aja Tue Nov 3 09:56:16 -0700 2009 -------
   accepted
               ------- Additional comments from rhierlmeier Wed Nov 25 07:21:12 -0700 2009 -------
   Milestone changed
