[GE users] job_pid: Permission denied

reuti reuti at staff.uni-marburg.de
Wed Mar 3 08:48:58 GMT 2010


On 02.03.2010 at 20:25, rsayde wrote:

> OK, I think I understand it now. It was my application that was  
> returning error code 99. It seems like the sge scripts see that as a  
> reschedule request.

Was the call the last executable line in the script? If so, the
interpreting shell takes its exit status as the value it returns to
the calling process.
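
A small illustration of that rule, as a sketch only and not taken from
the original job script: the inner `sh -c` stands in for a job script,
and `exit 99` stands in for the application. Both are placeholders.

```shell
#!/bin/sh
# Case 1: the "application" is the last command of the "job script",
# so its exit status becomes the exit status of the script itself.
rc_last=0
sh -c 'sh -c "exit 99"' || rc_last=$?

# Case 2: another command runs after the "application", so the script
# exits with that later command's status and the 99 is lost.
rc_masked=0
sh -c 'sh -c "exit 99"; true' || rc_masked=$?

echo "application call last:   $rc_last"
echo "followed by another cmd: $rc_masked"
```

In the first case the caller sees 99; in the second it sees 0.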


> So for now I have wrapped my application in a script that checks for  
> non-zero return status. The script now only returns 0 or 1 and I am  
> no longer having problems.
>
> But I guess I don't understand why any of this would matter.
>
> I probably won't turn off sge requeuing unless there is another good  
> reason to do so.
>
> So is it possible that the reason it couldn't requeue the job is  
> because a return of 99 from the application is not the same as a  
> return of 99 from the shepherd script?

No, the origin doesn't matter; it comes out the same in the end. Does
your application's setup need some precautions which, for whatever
reason, are only in place for its first run, so that the first run
leaves a file or directory in a dirty state that is certain to crash
the application on a rerun?

Maybe mapping the 99 to 100, i.e. an application error that puts the
job into error state, would be better then. You could then clean up
the problematic state first and clear the error to rerun the job.
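
A minimal sketch of such a mapping, assuming a POSIX shell wrapper. The
names `map_status` and `run_mapped` and the path `./my_application` are
made up for illustration and are not part of Grid Engine.

```shell
#!/bin/sh
# Hypothetical wrapper helpers: translate an application's exit status
# so that 99 is not taken as a reschedule request.

map_status() {
    # 99 asks Grid Engine to reschedule the job; report 100 instead,
    # which puts the job into error state. Other codes pass through.
    if [ "$1" -eq 99 ]; then
        echo 100
    else
        echo "$1"
    fi
}

run_mapped() {
    # Run the given command line and return the translated status.
    rc=0
    "$@" || rc=$?
    return "$(map_status "$rc")"
}

# In a job script, the last line could then look like this
# (./my_application is a placeholder):
#   run_mapped ./my_application "$@"
```

Because the `run_mapped` call is the last line of the job script, its
return value is what the job reports back.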

-- Reuti

>
> Thanks for the help.
>
> Richard

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=246816

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


