[GE users] Drmaa and how it handles tasks in error (exit_status == 100)

King, Marion G. mgking at hess.com
Sat Jan 26 15:50:08 GMT 2008


> Hello all.
> 
> I am submitting jobs to Grid Engine (SGE) using DRMAA. I have two
> questions about how DRMAA communicates with SGE when a job's task
> exits with status 100.
> 
> First, as I understand it, if a job or task exits with status 100,
> SGE sets an error flag and reschedules that task unless
> FORBID_APPERROR has been set in the Grid Engine configuration. Is a
> response sent back to the DRMAA library, causing drmaa_wait() to
> return? (I assume this is the case.) If so, is that particular task
> still in the queue (any queue)?
> 
> Second, according to the Job States diagram, Eqw is considered FAILED
> by DRMAA. Can I issue a
> drmaa_control( jobid, DRMAA_CONTROL_RESUME ... ) on a task that is in
> error, or is the job's task unavailable at that point? And if
> drmaa_control() is possible at all, would the task be in the error
> queue or simply re-queued? By the way, I'm using the C bindings.
> 
> Thanks for any help.
> 
> Marion
> 
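
For reference, the exit-status rule the first question relies on can be
written down as a small C sketch. It assumes the behaviour described in
sge_conf(5): exit status 99 asks for a reschedule (suppressed by
FORBID_RESCHEDULE), and exit status 100 puts the job into error state
(suppressed by FORBID_APPERROR). The names job_outcome and classify_exit
are hypothetical and exist only for this illustration; they are not part
of SGE or of the DRMAA C bindings, and the sketch says nothing about when
drmaa_wait() returns.

```c
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch; not SGE or DRMAA names. */
typedef enum {
    OUTCOME_FINISHED,     /* job leaves the system as a normal completion */
    OUTCOME_RESCHEDULED,  /* job is put back into the pending list */
    OUTCOME_ERROR_STATE   /* job is kept in error state (the "E" in Eqw) */
} job_outcome;

/*
 * Map a job's exit status to what the scheduler is assumed to do with it.
 * forbid_reschedule and forbid_apperror mirror the FORBID_RESCHEDULE and
 * FORBID_APPERROR settings in the cluster configuration.
 */
job_outcome classify_exit(int exit_status,
                          bool forbid_reschedule,
                          bool forbid_apperror)
{
    if (exit_status == 99 && !forbid_reschedule)
        return OUTCOME_RESCHEDULED;
    if (exit_status == 100 && !forbid_apperror)
        return OUTCOME_ERROR_STATE;
    /* Any other status, or a suppressed 99/100, is treated as finished. */
    return OUTCOME_FINISHED;
}
```

Under these assumptions, classify_exit(100, false, false) yields
OUTCOME_ERROR_STATE rather than OUTCOME_RESCHEDULED, so a task exiting
with 100 would be held in error state instead of being run again.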


