[GE users] array job has taken over

mbay2002 jeff at haferman.com
Wed Aug 12 22:33:52 BST 2009


reuti wrote:
> Am 12.08.2009 um 20:04 schrieb mbay2002:
> 
>> Hi -
>> I'm pretty new to SGE, I know very basic administrative stuff and am
>> still learning.
>>
>> We have a vanilla install with 144 nodes, each with 8 cores.  We have
>> not done any policy configuration.
>>
>> A user has submitted an array job with 36000 elements.  It was the
>> earliest job submitted, and now her job has taken over every core on
>> the cluster.
>>
>> We have several users waiting who just need a few cores, some just
>> need a single core.  Many of their jobs have a higher priority than
>> the running array job.
>>
>> But what happens is that every time an array task finishes, the next
>> task of the same job starts to execute on the freed core.
> 
> You can put the job on hold:
> 
> $ qhold <jobid>
> 
> the running ones will continue, but no new ones will start.
> 


After I posted, I tried

qalter -h <jobid>

Is this any different from qhold? I ask especially because the qhold
manpage says to use qalter to remove the hold.
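
For comparison, here is a rough sketch of the hold and release commands as I read them in the qhold(1), qrls(1) and qalter/submit(1) man pages for SGE 6.x. The assumption to check is that qalter wants an explicit hold list after -h (u to set a user hold, U to remove it). The run wrapper and the function names are made up for illustration; the wrapper only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): print the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Place a user hold: pending tasks stop being scheduled,
# tasks that are already running keep going.
hold_with_qhold()  { run qhold "$1"; }
hold_with_qalter() { run qalter -h u "$1"; }   # assumed equivalent to qhold

# Remove the user hold again.
release_with_qrls()   { run qrls "$1"; }
release_with_qalter() { run qalter -h U "$1"; }   # assumed equivalent to qrls
```

Calling hold_with_qhold 12345 (12345 is a placeholder job id) prints the qhold command line, so it can be checked before it is run for real with RUN=1.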

This is a good temporary solution, but hopefully I can come up with a
policy-driven solution so that I do not need to babysit the jobs.
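
One policy-driven option I am aware of is a resource quota set that caps the slots any single user can occupy. The sketch below only generates the rule text. The rule name max_slots_per_user and the cap of 400 slots (out of 144 x 8 = 1152) are hypothetical values chosen for illustration, and I have not tested this on our cluster.

```shell
#!/bin/sh
# Print a resource quota set that limits every user to N slots.
# N defaults to 400, an arbitrary example value.
rqs_rule() {
    cat <<EOF
{
   name         max_slots_per_user
   description  "Cap the slots any single user can occupy"
   enabled      TRUE
   limit        users {*} to slots=${1:-400}
}
EOF
}
```

The output could be saved to a file, reviewed, and then loaded with qconf -Arqs <file>. A related knob is max_aj_instances in the global configuration (qconf -mconf), which limits how many tasks of one array job may run at the same time.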

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=212057
