[GE users] Job Restart on App Failure

yooniverse yoon.s.chung at chase.com
Thu Mar 25 14:13:13 GMT 2010



I'll suggest it and see if that's possible.  Thanks.

-----Original Message-----
From: rayson [mailto:rayrayson at gmail.com] 
Sent: Wednesday, March 24, 2010 1:17 PM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Job Restart on App Failure

You can exit the job with code 99:

http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/htmlman/htmlman8/sge_shepherd.html?pathrev=V62u5_TAG
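As a rough sketch only, the job script could wrap the app so that any non-zero exit status is turned into 99, which the sge_shepherd man page linked above describes as the status that gets the job requeued. The function name run_or_reschedule is a made-up helper, not part of SGE, and the sketch assumes that rescheduling on 99 has not been disabled at your site. It does not try to tell a qdel apart from a crash; that relies on the job script itself being killed before it can return anything.

```shell
#!/bin/sh
# Hypothetical wrapper for an SGE job script (names are illustrative).
# Exit status 99 is the value the sge_shepherd man page documents as
# "requeue this job"; everything else here is an assumption.
RESCHEDULE_CODE=99

# Run the given command; return 0 if it succeeded, otherwise return
# RESCHEDULE_CODE so the job script can ask to be rescheduled.
run_or_reschedule() {
    "$@"
    app_status=$?
    if [ "$app_status" -eq 0 ]; then
        return 0
    fi
    echo "app exited with status $app_status, requesting reschedule" >&2
    return "$RESCHEDULE_CODE"
}
```

The job script would then end with something like "run_or_reschedule ./my_app" followed by "exit $?", where ./my_app stands in for the real application. Note that an app which always fails would be requeued indefinitely, so some kind of retry limit may be worth adding.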

Rayson



On 3/24/10, yooniverse <yoon.s.chung at chase.com> wrote:
>
>
>
> Hi,
>
>
>
> I have a request from a user who would like to know if there is a way to
> have a job restarted automatically if the app it executes exits for any
> reason other than a client-side termination of the job (qdel, breaking
> a qsub -sync, etc.) or an exit code of 0.  The app is not very reliable,
> but it must continue to run without intervention.
>
>
>
> I know that SGE can restart a job if the execd terminates abnormally (e.g.,
> server crash), but was wondering if there is an interesting way to make SGE
> behave this way without having to customize his submission script to have
> some kind of conditional logic to resubmit.
>
>
>
> Any thoughts?
>
>
>
> Thanks,
>
> Yoon

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=251148

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=251326



