[GE issues] [Issue 3178] New - qsub -sync/DRMAA clients does not return correct exit status during qmaster failover

ravinallan ravichandra.nallan at sun.com
Fri Nov 13 13:49:40 GMT 2009


http://gridengine.sunsource.net/issues/show_bug.cgi?id=3178
                 Issue #|3178
                 Summary|qsub -sync/DRMAA clients does not return correct exit 
                        |status during qmaster failover
               Component|gridengine
                 Version|6.2
                Platform|Sun
                     URL|
              OS/Version|All
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P3
            Subcomponent|communication
             Assigned to|ravinallan
             Reported by|ravinallan






------- Additional comments from ravinallan at sunsource.net Fri Nov 13 05:49:38 -0800 2009 -------
qsub -sync/DRMAA jobs do not report exit status during a qmaster failover

% qsub -sync yes $SGE_ROOT/examples/jobs/sleeper.sh
Your job 259 ("Sleeper") has been submitted
The qmaster has gone down.  Waiting to reconnect.error: do not accept new event clients. Qmaster is going down
error: commlib error: can't connect to service (Connection refused)
error: unable to contact qmaster using port 7480 on host "xyz"
error: unable to contact qmaster using port 7480 on host "xyz"
...
Reconnected to qmaster.
No information available on job 259's exit status.

The problem is a race over which client reconnects first once the qmaster is available again: the execd or the
qsub -sync client. If the execd connects first, it reports the job's exit status and the job is removed before
the qsub client can reconnect and retrieve that status.
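
The ordering dependence described above can be illustrated with a small toy model. This is only a sketch and not Grid Engine code: ToyQmaster, client_connect, execd_connect and simulate are hypothetical names, and the model assumes that a qsub -sync client learns the exit status only if it is registered for the job at the moment the execd reports it.

```python
# Toy model of the reconnect race. Not Grid Engine code; all names are hypothetical.

class ToyQmaster:
    def __init__(self):
        self.jobs = set()    # job ids the qmaster still knows about
        self.watchers = {}   # job id -> callbacks of registered sync clients

    def submit(self, job_id):
        self.jobs.add(job_id)

    def client_connect(self, job_id, on_exit):
        # A sync client can only register for a job that still exists.
        if job_id not in self.jobs:
            return False
        self.watchers.setdefault(job_id, []).append(on_exit)
        return True

    def execd_connect(self, job_id, exit_status):
        # The execd reports the final status: registered clients are notified,
        # then the finished job is removed.
        for callback in self.watchers.pop(job_id, []):
            callback(exit_status)
        self.jobs.discard(job_id)


def simulate(order, job_id=259, exit_status=3):
    """Return the exit status the sync client sees, or None if it sees nothing."""
    qmaster = ToyQmaster()
    qmaster.submit(job_id)
    seen = []
    for who in order:
        if who == "execd":
            qmaster.execd_connect(job_id, exit_status)
        elif who == "client":
            qmaster.client_connect(job_id, seen.append)
    return seen[0] if seen else None


if __name__ == "__main__":
    print("client first:", simulate(["client", "execd"]))
    print("execd first: ", simulate(["execd", "client"]))
```

In this model the client obtains the status only when it reconnects before the execd. With the opposite order the job is already gone when the client arrives, which corresponds to the "No information available" message in the transcript.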

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=36&dsMessageId=226665
