[GE issues] [Issue 2838] New - enforce limit option might fail when execd for a slave or master parallel task is restarted

crei crei at sun.com
Wed Dec 17 12:32:20 GMT 2008


http://gridengine.sunsource.net/issues/show_bug.cgi?id=2838
                 Issue #|2838
                 Summary|enforce limit option might fail when execd for a slave
                        | or master parallel task is restarted
               Component|gridengine
                 Version|6.2u1
                Platform|All
                     URL|
              OS/Version|All
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P3
            Subcomponent|kernel
             Assigned to|ernst
             Reported by|crei






------- Additional comments from crei at sunsource.net Wed Dec 17 04:32:15 -0800 2008 -------
The enforce limit settings might not result in the job being deleted when some
tasks of a parallel job run on an execd that was shut down and restarted.

This scenario is taken from a testsuite test:

- set up a queue for all hosts with an h_rt limit (30 seconds)
- use qmaster_params
ENABLE_ENFORCE_MASTER_LIMIT=true,ENABLE_FORCED_QDEL_IF_UNKNOWN=true
- submit a tightly integrated parallel job that runs longer than 30 seconds (e.g.
120 seconds)
- wait until all tasks are running
- shut down all execds on which parts of the PE job are running
- start the execd of a slave or master task again

The job should be terminated before its normal runtime has ended, but this is
not the case.
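
The expected outcome of the scenario above can be written down as a small
check. This is only an illustrative sketch, not testsuite or Grid Engine code:
the names Scenario and limit_enforced are hypothetical, and the only values
taken from the report are the 30 second h_rt limit and the 120 second job
runtime. It assumes the observed runtime of the job has been measured by some
other means.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Values from the report; the class itself is a hypothetical helper."""
    h_rt_limit_s: int = 30     # h_rt limit configured on the queue
    job_runtime_s: int = 120   # normal runtime of the parallel job


def limit_enforced(observed_runtime_s: float, scenario: Scenario) -> bool:
    """Return True if the job ended before its normal runtime.

    The report only states that the job "should be terminated before the
    normal runtime has ended", so that is the criterion used here.
    """
    return observed_runtime_s < scenario.job_runtime_s


if __name__ == "__main__":
    s = Scenario()
    # Job killed some time after the 30 s limit: limit was enforced.
    print(limit_enforced(45.0, s))
    # Job ran its full 120 s: this is the failure described in the issue.
    print(limit_enforced(120.0, s))
```

With this criterion, a job that runs for its full 120 seconds after the execd
restart counts as the failure described in this issue.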

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=36&dsMessageId=92952
