Opened 10 years ago

Last modified 9 years ago

#719 new defect

IZ3137: qrsh jobs mysteriously hang

Reported by: petrik
Owned by:
Priority: normal
Milestone:
Component: sge
Version: 6.2u3
Severity:
Keywords: execution
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=3137]

Issue #: 3137
Platform: All
Reporter: petrik (petrik)
Component: gridengine
OS: All
Subcomponent: execution
Version: 6.2u3
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: pollinger (pollinger)
QA Contact: pollinger
URL:
Summary: qrsh jobs mysteriously hang
Status whiteboard:
Attachments:

Issue 3137 blocks:
Votes for issue 3137:


   Opened: Thu Sep 24 08:27:00 -0700 2009 
------------------------


qrsh jobs intermittently fail to launch properly in a SUSE 9.0 environment.

The process launched by qrsh hangs in a futex wait:

# strace -p 13035
Process 13035 attached - interrupt to quit
futex(0x7fbfffe860, FUTEX_WAIT, 1, NULL

The sge_shepherd process is still lingering in the ps output:

# ps -ef | grep 191718

sgeadmin 13035  8540  0 Jun29 ?        00:00:00 sge_shepherd-191718 -bg
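
For anyone trying to reproduce this, the two manual checks above can be combined into a small helper. This is only a sketch: the function name find_shepherd_pids is hypothetical, and it assumes the `ps -ef` column layout shown above (PID in column 2, command name in column 8) and the sge_shepherd-<jobid> process naming seen in this report.

```shell
# Hypothetical helper (not part of Grid Engine): read `ps -ef` output on
# stdin and print the PID of every sge_shepherd process for the given job ID.
# Assumes the column layout shown in this ticket: PID in field 2, command
# name in field 8.
find_shepherd_pids() {
    job_id="$1"
    awk -v name="sge_shepherd-${job_id}" '$8 == name { print $2 }'
}

# Example use (commented out so that sourcing this file has no side effects):
#   ps -ef | find_shepherd_pids 191718 | while read -r pid; do
#       strace -p "$pid"    # shows the system call the process is blocked in
#   done
```

Matching on the exact command name avoids picking up the grep process itself, which a plain `ps -ef | grep 191718` can include.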

Change History (0)
