[GE issues] [Issue 3265] New - array jobs with PE and dependencies killing qmaster

kisielk kamil at zymeworks.com
Mon Apr 26 17:41:08 BST 2010


http://gridengine.sunsource.net/issues/show_bug.cgi?id=3265
                 Issue #|3265
                 Summary|array jobs with PE and dependencies killing qmaster
               Component|gridengine
                 Version|6.2u5
                Platform|All
                     URL|
              OS/Version|All
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P3
            Subcomponent|qmaster
             Assigned to|ernst
             Reported by|kisielk






------- Additional comments from kisielk at sunsource.net Mon Apr 26 09:41:05 -0700 2010 -------
I'm able to reproduce this rather consistently in my 6.2u5 install.

If an array job that uses a PE is submitted, and other jobs depend on it, the qmaster process will crash while the tasks in the array job are completing.
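The scenario above can be sketched as a pair of qsub submissions. This is only an illustration: the report does not include the actual command lines. The PE name, slot count, task range, job names, and script names below are all hypothetical, and the use of -hold_jid (rather than -hold_jid_ad) for the dependency is an assumption. The helper only builds the commands; it does not submit anything.

```python
import shlex


def build_repro_commands(pe_name, slots=2, tasks="1-10",
                         array_script="array_task.sh",
                         dependent_script="dependent.sh",
                         array_name="repro_array"):
    """Build two qsub argument lists: an array job in a PE, and a job held on it.

    All names and values are placeholders; the original report does not
    state which PE, task range, or scripts were used.
    """
    # Array job (-t) that requests a parallel environment (-pe).
    array_cmd = ["qsub", "-N", array_name, "-t", tasks,
                 "-pe", pe_name, str(slots), array_script]
    # Second job that waits for the array job to finish (-hold_jid).
    dependent_cmd = ["qsub", "-N", array_name + "_dep",
                     "-hold_jid", array_name, dependent_script]
    return array_cmd, dependent_cmd


if __name__ == "__main__":
    # Dry run: print the command lines instead of executing them.
    for cmd in build_repro_commands("mpi"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing rather than executing is deliberate, since submitting these jobs on an affected 6.2u5 install may bring down the qmaster.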

The messages log shows:

04/26/2010 09:27:57|worker|master|C|!!!!!!!!!! JB_ja_tasks not found in element !!!!!!!!!!
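To check whether a qmaster messages file contains this kind of entry, a small filter such as the following may help. It is a minimal sketch that assumes each line has five pipe-separated fields (timestamp, thread, host, severity, message), as in the line quoted above, and that a severity of C marks a critical entry. The function and type names are made up for this example.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: datetime
    thread: str
    host: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one messages-file line; return None if it does not match."""
    # Split on the first four pipes only, so the message may contain "|".
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    try:
        # Assumed timestamp layout: MM/DD/YYYY HH:MM:SS.
        ts = datetime.strptime(parts[0], "%m/%d/%Y %H:%M:%S")
    except ValueError:
        return None
    return LogEntry(ts, parts[1], parts[2], parts[3], parts[4])


def critical_entries(lines):
    """Return the parsed entries whose severity field is 'C'."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and e.level == "C"]
```

Calling critical_entries on an open file object for the messages file would return the matching entries in file order.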

Restarting the qmaster just causes it to crash again. Sometimes there is enough time for me to fire off a qdel, but other times I have to manually delete the job directory in the qmaster spool.

I have a copy of the spool directory of a job that exhibits this behaviour if that would help in diagnosing the problem.

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=36&dsMessageId=255013
