[GE users] qrsh ctrl-c jobs remain pending

Kirk Patton kpatton at transmeta.com
Fri Jan 6 22:27:25 GMT 2006


O.K.,

I think I understand you.  My setup does use '-now n'.
I want the jobs submitted with qrsh to be considered interactive, but to pend
if the requested resource is not available.

We were originally using LSF 4.01, and the dispatch behavior
of SGE was configured to behave like LSF.

Kirk
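
A minimal sketch of automating the manual 'qdel job_id' cleanup mentioned
in this thread. It assumes the default qstat listing (job ID in column 1,
job name in column 3, state in column 5, pending jobs shown as 'qw'). The
function name pending_job_ids is hypothetical and not part of SGE. It only
filters text read on stdin, and it cannot tell an orphaned qrsh job from one
that is legitimately waiting, so review the IDs before handing them to qdel.

```shell
#!/bin/sh
# Print the job IDs of pending ("qw") jobs found in qstat output on stdin.
# Assumption: default qstat layout (col 1 = job ID, col 3 = name, col 5 = state).
# $1 is an optional job name to match; empty or missing matches any name.
pending_job_ids() {
    awk -v name="$1" '
        # Data rows start with a numeric job ID; header and separator rows do not.
        $1 ~ /^[0-9]+$/ && $5 == "qw" && (name == "" || $3 == name) { print $1 }
    '
}

# Hypothetical usage on a live cluster (not executed here):
#   for id in $(qstat -u "$USER" | pending_job_ids); do qdel "$id"; done
```

Note that qstat may truncate long job names, so an exact name match can miss
jobs. Leaving the name argument empty lists every pending job for the user.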

On Fri, Jan 06, 2006 at 10:47:51PM +0100, Reuti wrote:
> Hi Kirk,
> 
> Am 06.01.2006 um 22:30 schrieb Kirk Patton:
> 
> >Hello Reuti,
> >
> >The queues on the system are configured as
> >
> >qtype                 BATCH INTERACTIVE
> >
> >I do not partition my queues into separate types.  They
> >are very generic.
> >
> 
> yes, but I have separate queues. What I was trying to say is that
> -now n changes the request from an interactive queue to a
> conventional batch queue from SGE's point of view - although it's
> still interactive. So the internal logic will also handle this as
> a batch job.
> 
> If the general opinion now turns out to be that it should still run
> in an interactive queue, there could be some logic to handle
> sighup/sigterm in a nicer way.
> 
> -- Reuti
> 
> 
> >Kirk
> >
> >On Fri, Jan 06, 2006 at 10:21:43PM +0100, Reuti wrote:
> >>I forgot:
> >>
> >>Am 06.01.2006 um 22:18 schrieb Reuti:
> >>
> >>>Wow, I wasn't aware of this:
> >>>
> >>>reuti@master:~> qrsh -l hostname=node34 echo \$QUEUE
> >>>login
> >>>reuti@master:~> qrsh -l hostname=node34 -now n echo \$QUEUE
> >>>vast
> >>>reuti@master:~>
> >>>
> >>>Is this intended? So I think your jobs are also running in a
> >>>non-interactive queue.
> >>>
> >>>Maybe the sighup shouldn't be caught by an interactive qrsh (and
> >>>it should still run in an interactive queue) - I mean:
> >>>
> >>>root     13514  1778 13514  \_ sshd: reuti [priv]
> >>>reuti    13517 13514 13514      \_ sshd: reuti@pts/4
> >>>reuti    13518 13517 13518          \_ -sh
> >>>reuti    13681 13518 13681              \_ qrsh -l hostname=node34 -now n /home/reuti/test.sh
> >>>
> >>>after closing the window:
> >>>
> >>>reuti    13681     1 13681 qrsh -l hostname=node34 -now n /home/reuti/test.sh
> >>>
> >>
> >>killing this lonely process still leaves the job pending. - Reuti
> >>
> >>>-- Reuti
> >>>
> >>>
> >>>Am 06.01.2006 um 21:42 schrieb Kirk Patton:
> >>>
> >>>>Here is how to reproduce the issue.
> >>>>
> >>>>Submit a job that is guaranteed to pend.  For testing, I submitted
> >>>>to a disabled queue...
> >>>>
> >>>>Next, kill the terminal window where the qrsh job was submitted
> >>>>from.
> >>>>
> >>>>Check on the job and it should still be pending.
> >>>>
> >>>>One of my users has stated that the jobs that remain hanging around
> >>>>do start up and consume processor time once the resource becomes
> >>>>available.
> >>>>
> >>>>My test case has been hanging for the last few minutes.
> >>>>
> >>>>Thanks,
> >>>>Kirk
> >>>>
> >>>>On Thu, Jan 05, 2006 at 11:52:46PM +0100, Reuti wrote:
> >>>>>Am 05.01.2006 um 23:38 schrieb Kirk Patton:
> >>>>>
> >>>>>>Hello all,
> >>>>>>
> >>>>>>I have an annoying issue with qrsh jobs and SGE.  If a user
> >>>>>>ctrl-c's a job before it is dispatched, it remains pending in
> >>>>>>the queue until a user manually runs 'qdel job_id'.
> >>>>>>
> >>>>>>Does anyone know why jobs would behave in this manner?
> >>>>>>Any fixes?
> >>>>>>
> >>>>>
> >>>>>I don't see this on x86, SuSE 9.3, SGE 6.0u6. Do you mean qrsh or
> >>>>>qrsh <command> and pressing ctrl-c just after <return>?
> >>>>>
> >>>>>Is QLOGIN also mentioned as job name for this pending job? - Reuti
> >>>>>
> >>>>>>Thanks,
> >>>>>>Kirk
> >>>>>>
> >>>>>>
> >>>>>>-- 
> >>>>>>Kirk Patton
> >>>>>>Unix Administrator
> >>>>>>Transmeta Inc.
> >>>>>>Tel. 408 919-3055
> >>>>>>
> >>>>>>---------------------------------------------------------------------
> >>>>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >>>>>>For additional commands, e-mail: users-help at gridengine.sunsource.net
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
> 
> 

-- 
Kirk Patton
Unix Administrator
Transmeta Inc.
Tel. 408 919-3055




