Opened 11 years ago

Last modified 9 years ago

#595 new defect

IZ2786: Clients return unpredictable error message if qmaster is not available

Reported by: crei Owned by:
Priority: low Milestone:
Component: sge Version: current
Severity: Keywords: communication
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2786]

        Issue #:      2786                Platform:     All         Reporter: crei (crei)
       Component:     gridengine             OS:        All
     Subcomponent:    communication       Version:      current        CC:    None defined
        Status:       NEW                 Priority:     P4
      Resolution:                        Issue type:    DEFECT
                                      Target milestone: Maintrunk
      Assigned to:    crei (crei)
      QA Contact:     crei
          URL:
       * Summary:     Clients return unpredictable error message if qmaster is not available
   Status whiteboard:
      Attachments:

     Issue 2786 blocks:
   Votes for issue 2786:


   Opened: Wed Nov 12 06:28:00 -0700 2008 
------------------------


There is a timing condition in commlib that leads to different return codes for
the same error. These error codes are interpreted by the clients, so two
different kinds of messages are printed.

To reproduce, kill your qmaster and execute a command that connects to the
qmaster. The error message that appears is either:
  Unable to run job: unable to send message to qmaster using port 32206 on host
"xxx": got send error
or:
  Unable to run job: unable to contact qmaster using port 32206 on host "xxx".

The testsuite test client_setup depends on the correct (the second) error
message and now fails intermittently.
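
Until the return code is made deterministic, a check could accept either
message as "qmaster not reachable". The following is only a sketch in Python,
not the testsuite's actual code: the function name is_qmaster_unreachable and
the regular expressions are assumptions derived from the two messages quoted
above.

```python
import re

# Patterns derived from the two messages quoted in this ticket.
# They are illustrative assumptions, not taken from the Grid Engine sources.
UNREACHABLE_PATTERNS = [
    re.compile(r'unable to send message to qmaster using port \d+ '
               r'on host "[^"]*": got send error'),
    re.compile(r'unable to contact qmaster using port \d+ on host "[^"]*"'),
]


def is_qmaster_unreachable(output):
    """Return True if the client output matches either known message."""
    # Collapse line breaks and runs of spaces so wrapped output still matches.
    text = " ".join(output.split())
    return any(p.search(text) for p in UNREACHABLE_PATTERNS)
```

This only hides the symptom on the checking side; the clients would still
print different messages for the same failure.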

   ------- Additional comments from crei Wed Nov 12 06:29:43 -0700 2008 -------
target milestone

   ------- Additional comments from pollinger Fri Feb 13 03:37:50 -0700 2009 -------
The error message can also be
"commlib error: can't connect to service (Connection refused)"
or
"commlib error: got select error (Connection refused)"
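
The two commlib messages above differ only in the stage that reports
"Connection refused". One plausible illustration, which this ticket does not
confirm as the commlib mechanism: with a non-blocking TCP socket, a refused
connection may be reported directly by connect() or only later, once select()
returns and SO_ERROR is read. The Python sketch below shows that general
socket behaviour; find_closed_port and probe are hypothetical helper names.

```python
import errno
import select
import socket


def find_closed_port():
    """Return a local TCP port that very likely has no listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def probe(host, port, timeout=2.0):
    """Return (stage, err): the stage at which the connect result surfaced."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setblocking(False)
        rc = s.connect_ex((host, port))
        if rc == 0:
            return ("connected", 0)
        if rc not in (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EAGAIN):
            # The failure was reported by connect() itself.
            return ("connect", rc)
        # Connect is still pending: wait for the socket to resolve.
        _, writable, failed = select.select([], [s], [s], timeout)
        if not writable and not failed:
            return ("timeout", errno.ETIMEDOUT)
        err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return ("select", err) if err else ("connected", 0)


if __name__ == "__main__":
    stage, err = probe("127.0.0.1", find_closed_port())
    print(stage, errno.errorcode.get(err, err))
```

Which stage reports the error depends on the platform and on timing, so a
client that maps each stage to its own message will print different text for
the same underlying failure.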

