[GE users] qresub as other user

Andreas Haas Andreas.Haas at Sun.COM
Wed Nov 16 15:55:52 GMT 2005


On Wed, 16 Nov 2005, Sebastian Stark wrote:

> > > Another case where I think qresub as an admin can be useful is the
> > > following:
> > >
> > >   - User submits job
> > >   - Job runs, errors out because of some problem
> > >   - Admin fixes problem
> > >   - Admin qresubs other users' jobs
> >
> > There are two error states: a job error and a queue error. The job
> > error is for cases where the end user made a mistake that must be
> > fixed before a rerun is possible. The queue error is for set-up
> > problems that can be fixed by admins only.
>
> 99% of the errors in our cluster are like this:
>
>   failed changing into working directory:11/15/2005 17:48:30 [1792:20744]:
>   error: can't chdir to /agbs/cluster/chrisd: No such file or direct
>
> (of course people notice right *after* submitting thousands of jobs...)
>
> In the case of an inaccessible NFS mount, a user error occurs that cannot
> be fixed by the user.

I understand. The problem here is that one cannot easily tell whose
fault it was:

(1) it is clearly the user's fault if he used the -cwd option to run a
    job on a machine where the qsub current working directory doesn't
    exist and should not exist
(2) it is the admin's fault if the directory should exist but wasn't
    mounted.

The question is simply: how should Grid Engine know whether that
directory should have been available? Not even the use of the -cwd
option can serve as a reasonable indication.

Grid Engine could always treat this as a queue error, but then any
ordinary user could trigger the queue error state for the entire
cluster simply by misusing the -cwd option ...
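
To make the dilemma concrete, here is a minimal sketch in Python. It is
not Grid Engine code and does not reflect how Grid Engine decides
anything. It assumes a hypothetical, admin-maintained list of directory
prefixes that are supposed to be available on every execution host
(EXPECTED_PREFIXES) and a made-up helper, classify_chdir_failure.
Neither exists in Grid Engine; they only show what extra information
would be needed to tell case (1) from case (2).

```python
import os

# Hypothetical list of directory prefixes an admin promises to provide on
# every execution host (for example NFS mounts). Grid Engine has no such
# setting; it is assumed here purely for illustration.
EXPECTED_PREFIXES = ["/agbs/cluster"]


def classify_chdir_failure(cwd, expected_prefixes=EXPECTED_PREFIXES):
    """Classify a job's working directory as 'ok', 'queue_error' or 'job_error'.

    'queue_error': the directory is missing but lies below a prefix that the
                   admin is expected to provide (case 2, the admin's fault).
    'job_error':   the directory is missing and nobody promised it would
                   exist on this host (case 1, the user's fault).
    """
    if os.path.isdir(cwd):
        return "ok"
    norm = os.path.normpath(cwd)
    for prefix in expected_prefixes:
        prefix = os.path.normpath(prefix)
        # Match the prefix itself or anything strictly below it.
        if norm == prefix or norm.startswith(prefix + os.sep):
            return "queue_error"
    return "job_error"
```

Even with such a list the guess can be wrong: a user who mistypes a
subdirectory below an expected prefix would still be classified as an
admin-side problem, which is exactly the misuse risk described above.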

Regards,
Andreas
