[SGE-discuss] suspend threshold ping-pong
stella_levin2003 at yahoo.com
Sat Oct 8 23:11:18 BST 2011
Yes. In our environment the jobs are memory-intensive and memory is a critical resource.
In some cases we cannot correctly predict the size of a job.
Thanks for the reply.
From: Reuti <reuti at staff.uni-marburg.de>
To: Stella Levin <stella_levin2003 at yahoo.com>
Cc: "sge-discuss at liv.ac.uk" <sge-discuss at liverpool.ac.uk>
Sent: Friday, October 7, 2011 1:51 AM
Subject: Re: [SGE-discuss] suspend threshold ping-pong
On 02.10.2011 at 11:29, Stella Levin wrote:
> Hi sge-discuss group,
> we defined
> suspend_thresholds mt_mem_swap_io=1
> where mt_mem_swap_io=1 whenever the host is writing to swap (the "so" column of vmstat)
> We experience "ping-pong" behavior, with jobs being suspended and continued repeatedly.
> A job starts to write to swap and is suspended. After the suspension no other job is writing to swap, so within suspend_interval the job is continued... then suspended again and continued again.
> Sometimes we cannot predict the size of a job exactly, and it starts to swap.
> - Is it possible to continue the job under different threshold conditions, for example when there is enough free memory on the host for the job, or something similar?
> - Are there other options to solve the problem?
Unfortunately this is a known "bug/feature/issue", and there is already an issue filed for it, though it has not been implemented.
Is your current setup to schedule jobs to a node until it starts to swap?
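
As a possible workaround on the load sensor side, the mt_mem_swap_io value could be held at 1 for a while after the last observed swap-out, so that the threshold does not clear the moment the job is suspended. The following is only a sketch under assumptions, not something from the issue mentioned above and not an SGE feature: it assumes mt_mem_swap_io is reported by a custom load sensor, that vmstat prints the Linux procps column layout with an "so" column, and that the host name returned by the system matches the one SGE knows. The names HoldDown, hold_seconds, swap_out_flag and run_load_sensor are made up for illustration.

```python
import socket
import subprocess
import sys
import time


def swap_out_flag(vmstat_output):
    """Return 1 if the last vmstat sample shows swap-out ("so" > 0), else 0.

    Assumes the Linux procps layout, where one header row names the columns.
    """
    rows = [ln.split() for ln in vmstat_output.splitlines() if ln.strip()]
    header = next((cols for cols in rows if "so" in cols), None)
    if header is None:
        raise ValueError('no "so" column found in vmstat output')
    so_index = header.index("so")
    return 1 if int(rows[-1][so_index]) > 0 else 0


class HoldDown:
    """Keep reporting 1 for hold_seconds after the last raw 1 (hypothetical helper)."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.last_swap = None  # time of the most recent swap-out seen

    def update(self, raw_flag, now):
        if raw_flag:
            self.last_swap = now
        if self.last_swap is not None and now - self.last_swap < self.hold_seconds:
            return 1
        return 0


def sample_vmstat():
    # "vmstat 1 2" prints two samples; the first is the average since boot,
    # so only the second (last) line reflects current activity.
    out = subprocess.run(["vmstat", "1", "2"], capture_output=True,
                         text=True, check=True).stdout
    return swap_out_flag(out)


def _emit_stdout(text):
    sys.stdout.write(text)
    sys.stdout.flush()


def run_load_sensor(lines=sys.stdin, emit=_emit_stdout, sample=sample_vmstat,
                    hold_seconds=300, clock=time.monotonic, host=None):
    """Load sensor loop: one begin/end report per input line, stop on "quit"."""
    host = host or socket.gethostname()
    hold = HoldDown(hold_seconds)
    for line in lines:
        if line.strip() == "quit":
            break
        value = hold.update(sample(), clock())
        emit("begin\n%s:mt_mem_swap_io:%d\nend\n" % (host, value))
```

A script that calls run_load_sensor() with its defaults could then be registered as the load sensor. The hold time is a trade-off: if hold_seconds is longer than suspend_interval, a suspended job stays suspended for at least that long after swapping stops, which damps the ping-pong but also delays jobs that could safely continue. It does not address the underlying request of resuming on a different condition than the one used for suspending.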
> Thanks a lot.