[GE users] Job Restart on App Failure

yooniverse yoon.s.chung at chase.com
Wed Mar 24 17:10:15 GMT 2010



Hi,

I have a request from a user who would like to know if there is a way to have a job restarted automatically if the app it executes exits for any reason other than a client-side termination of the job (qdel, breaking a qsub -sync, etc.) or an exit code of 0.  The app is not very reliable, but it must continue to run without intervention.

I know that SGE can restart a job if the execd terminates abnormally (e.g., server crash), but was wondering if there is an interesting way to make SGE behave this way without having to customize his submission script to have some kind of conditional logic to resubmit.
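
For context, the kind of script-side logic in question could be sketched roughly as below. This is only an illustration, not a tested recipe: it assumes the commonly documented Grid Engine convention that a job script exiting with status 99 is put back in the queue (which in turn assumes rescheduling is not forbidden in the cluster configuration). The function name wrapper_status and the variable RESCHEDULE_STATUS are made up for this example, and the application command is whatever gets passed to the wrapper as arguments.

```shell
#!/bin/sh
# Hypothetical wrapper around an unreliable application.
# Assumption: Grid Engine requeues a job whose script exits with status 99.

RESCHEDULE_STATUS=99

# Map the application's exit status to the status the job script returns:
# 0 stays 0 (job finished), anything else becomes the requeue status.
wrapper_status() {
    if [ "$1" -eq 0 ]; then
        echo 0
    else
        echo "$RESCHEDULE_STATUS"
    fi
}

# Run the application only when a command was supplied as arguments.
if [ "$#" -gt 0 ]; then
    "$@"
    app_status=$?
    exit "$(wrapper_status "$app_status")"
fi
```

A qdel would normally kill the wrapper itself rather than let it reach the exit line, so a client-side termination should not trigger a requeue, but that behaviour is worth verifying on the cluster in question.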

Any thoughts?

Thanks,
Yoon



