[GE users] Jobs remaining in d state

Jinal Jhaveri jajhaveri at lbl.gov
Tue May 16 16:30:14 BST 2006



Awesome!! Thanks Chris!!
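
For the archives, here is a minimal sketch of how the flag from the manpage excerpt quoted below might be checked. The helper name has_forced_qdel is made up for illustration, and it assumes the configuration is printed one "name value" pair per line, as qconf -sconf does; treat it as a starting point rather than a tested recipe.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): reads a cluster
# configuration dump on stdin and succeeds if the qmaster_params
# entry contains the ENABLE_FORCED_QDEL flag.
has_forced_qdel() {
    grep -Eq '^qmaster_params[[:space:]].*ENABLE_FORCED_QDEL'
}

# Intended use on a host with Grid Engine installed:
#   qconf -sconf | has_forced_qdel && echo "users may run qdel -f"
#
# If the flag is missing, a manager can add it by editing the global
# configuration (opens an editor):
#   qconf -mconf
# and setting the entry to:
#   qmaster_params   ENABLE_FORCED_QDEL
```

Once the flag is set, a regular user should be able to run qdel -f on their own stuck job; per the manpage, a regular deletion is still attempted first for non-administrators.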




On 5/16/06, Chris Dagdigian <dag at sonsorol.org> wrote:
>
> There is an additional config parameter required to allow users to
> issue "qdel -f" commands.
>
> From the qdel manpage:
>
> > OPTIONS
> >        -f     Force action for running jobs. The job(s) are deleted from the list of jobs
> >               registered at sge_qmaster(8) even if the sge_execd(8) controlling the job(s)
> >               does not respond to the delete request sent by sge_qmaster(8).
> >
> >        Users which are neither Grid Engine managers nor operators can only use the -f option
> >        (for their own jobs) if the cluster configuration entry qmaster_params contains the
> >        flag ENABLE_FORCED_QDEL (see sge_conf(5)). However, behavior for administrative and
> >        non-administrative users differs. Jobs are deleted from the Grid Engine database
> >        immediately in case of administrators. Otherwise, a regular deletion is attempted first
> >        and a forced cancellation is only executed if the regular deletion was unsuccessful.
>
>
> Regards,
> Chris
>
>
>
>
> On May 16, 2006, at 11:07 AM, Jinal Jhaveri wrote:
>
> > Surprisingly in such cases, qdel -f doesn't show an effect when it's given
> > by the user but when the gridmaster issues it, it works ok. Any suggestions?
> >
> > On 5/16/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> >>
> >> Hi,
> >>
> >> On 16.05.2006 at 11:56, Duncan Mortimer wrote:
> >>
> >> > We occasionally see a similar situation, the shepherd hangs around
> >> > and can't be killed (HUP or KILL) - looking through the process
> >> > listing the child process can be found and appears to be a zombie,
> >> > having no parent. This is under Mac OS X.
> >> > Our only solution to clear the locked slot is to reboot the cluster
> >> > node.
> >>
> >> Although it will not remove the zombie from the node, you can pass
> >> the -f option to qdel to force the slot to be freed. - Reuti
> >>
> >>
> >> > Duncan
> >> >
> >> > Duncan Mortimer
> >> > duncan at fmrib.ox.ac.uk
> >> >
> >> >
> >> >
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> >>
>


