[GE users] Eqw because of exit_status 100

Andreas.Haas at Sun.COM
Tue Jun 26 14:48:37 BST 2007


On Tue, 26 Jun 2007, SLIM H.A. wrote:

>
> Dear all
>
> If a program exits with code 100, SGE marks the job as being in error
> ("E") state and tries to rerun it. Some of our users run an application
> that returns 100 as a general error state, but there is no reason to
> rerun the program with the same input; it simply failed. qacct -j gives
> these lines:
>
> failed       30  : rescheduling on application error
> exit_status  100
>
> The users submit these jobs in bulk, causing qstat to produce an
> unnecessarily long list of jobs in error.
>
> Is there a way to avoid this (other than the application not using 100
> as an exit code)?

You simply add FORBID_APPERROR=true to the qmaster_params list of your global
cluster configuration, as described in sge_conf(5).
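
A minimal sketch of that edit follows, under some assumptions. The function
name add_forbid_apperror is hypothetical and not part of SGE. It assumes the
configuration text has the "name value" layout that `qconf -sconf global`
prints, that an unset list is shown as "none", and that existing entries are
comma-separated. It only rewrites text on stdin; it does not touch the
cluster.

```shell
# Hypothetical helper (not shipped with SGE): reads a cluster configuration
# on stdin and prints it with FORBID_APPERROR=true set in qmaster_params.
add_forbid_apperror() {
  awk '
    $1 == "qmaster_params" {
      found = 1
      # Everything after the parameter name is the current value list.
      params = ""
      if (NF > 1) {
        params = $0
        sub(/^[^ \t]+[ \t]+/, "", params)
      }
      if (params == "" || params == "none") {
        # Assumed default: an unset list is printed as "none".
        params = "FORBID_APPERROR=true"
      } else if (params ~ /FORBID_APPERROR=/) {
        # Already present with some value: force it to true.
        sub(/FORBID_APPERROR=[^, \t]*/, "FORBID_APPERROR=true", params)
      } else {
        # Assumed separator between existing entries: a comma.
        params = params ",FORBID_APPERROR=true"
      }
      print "qmaster_params " params
      next
    }
    { print }
    END { if (!found) print "qmaster_params FORBID_APPERROR=true" }
  '
}
```

For example, `qconf -sconf global | add_forbid_apperror` prints the edited
configuration for review. The change itself can then be made in the editor
that `qconf -mconf global` opens.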

Regards,
Andreas
