[GE users] Question regarding SGE submission methods (scheduling timeouts)

Reuti reuti at staff.uni-marburg.de
Thu Jun 22 18:15:44 BST 2006


On 22.06.2006 at 18:53, Kogan, Felix wrote:

> Yes, thanks, a combination of "max_unheard" (which by default is too
> short, I think) and "reschedule_unknown" might help in some cases
> (when accidentally rerunning a job while the original is not quite
> dead is harmless).
>
>
> I certainly would like to request this enhancement. How hard is it to
> get a sunsource account? The page you suggested doesn't have any entry
> forms, only query. Where can I find the explanation of the necessary
> procedures?

http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=14390
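
For reference, the submission-wrapper idea quoted further down (poll
qstat, then qdel the job once it has been pending too long) could look
roughly like the sketch below. It is only an illustration, not something
SGE ships: the function names, the "not_pending"/"deleted" return values
and the 30-second poll interval are made up, and qstat_state() assumes
the default plain qstat layout with the state in the fifth column. It
also needs a job ID up front, which is exactly what is hard to get for
interactive submissions, and it costs one waiting process per job.

```python
import subprocess
import time


def qstat_state(job_id):
    """Return the state column for job_id from plain `qstat`, or None.

    Assumption: default qstat layout, state is the fifth column.
    """
    out = subprocess.run(["qstat"], capture_output=True, text=True).stdout
    for line in out.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0] == str(job_id):
            return fields[4]
    return None


def qdel(job_id):
    """Ask SGE to delete the job; errors are ignored in this sketch."""
    subprocess.run(["qdel", str(job_id)], check=False)


def enforce_pending_timeout(job_id, timeout, get_state=qstat_state,
                            delete=qdel, poll=30.0,
                            clock=time.monotonic, sleep=time.sleep):
    """Delete job_id if it is still pending after `timeout` seconds.

    Returns "not_pending" if the job left the pending state (or is no
    longer listed by qstat) in time, "deleted" if the timeout expired.
    """
    deadline = clock() + timeout
    while True:
        state = get_state(job_id)
        # "qw", "hqw" and "Eqw" all contain "qw" and count as pending here.
        if state is None or "qw" not in state:
            return "not_pending"
        if clock() >= deadline:
            delete(job_id)
            return "deleted"
        sleep(poll)
```

Called as enforce_pending_timeout(job_id, 600), this would give a job ten
minutes to leave the pending state. The get_state, delete, clock and
sleep hooks are only there so the loop can be exercised without a live
cluster.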

-- Reuti


> Thanks,
>
> Felix
>
> -----Original Message-----
> From: Ron Chen [mailto:ron_chen_123 at yahoo.com]
> Sent: Thursday, June 22, 2006 3:25 AM
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Question regarding SGE submission methods
> (scheduling timeouts)
>
> BTW, another useful flag is "reschedule_unknown".
>
> The timeout feature makes sense, and if you want to request it
> to be added to the next version of SGE, you can enter an
> enhancement request to the issue DB:
>
> http://gridengine.sunsource.net/servlets/ProjectIssues
>
> But before you do that, you will need to get a sunsource account
> first.
>
>  -Ron
>
>
>
> --- "Kogan, Felix" <Felix-Kogan at deshaw.com> wrote:
>> Oh, you mean a wrapper for the SGE submission utilities. I thought it
>> was about a wrapper for the jobs themselves. Well, how would you get a
>> job ID in the wrapper in the case of sqrsh? Also, users often submit
>> hundreds of jobs in short succession using sqsub (even though we try
>> to encourage them to use job arrays). The approach you described would
>> mean hundreds of processes on the calling host waiting for the jobs to
>> be scheduled and constantly calling qstat. Not a very healthy
>> situation. No, it really would be more useful and convenient to have
>> this functionality in the scheduler.
>>
>> Thanks for mentioning "max_unheard". I've missed it somehow. It is
>> really useful.
>>
>> --
>> Felix
>>
>> -----Original Message-----
>> From: Rayson Ho [mailto:rayrayson at gmail.com]
>> Sent: Tuesday, June 20, 2006 1:51 PM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Question regarding SGE submission methods
>> (scheduling timeouts)
>>
>> I don't understand why pending jobs wouldn't be solved by this: as
>> long as you can get the job's status via qstat, you can qdel the job.
>>
>> Also, to handle zombie jobs, see "max_unheard" in sge_conf(5).
>>
>> Rayson
>>
>>
>>
>> On 6/20/06, Kogan, Felix <Felix-Kogan at deshaw.com> wrote:
>>> Sure, we thought about this. This wouldn't solve the problem of
>>> pending jobs, and you're right about the jobs "running" on the "dead"
>>> nodes. We've created a reaper script that periodically deletes such
>>> "stuck" jobs (we call them zombies). Do you know of any other method
>>> of getting rid of zombies?
>>>
>>
>>
> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail:
>> users-help at gridengine.sunsource.net
>>
>
>