[GE users] job not killed when h_cpu limit reached

Rene Salmon rsalmon at tulane.edu
Fri Aug 20 21:22:59 BST 2004


I think I know what the problem is.  All the MPI slave processes use the
CPU all the time, so they all reached their CPU time limit and got killed.

The MPI master process, however, only spawns some slaves and then sleeps,
so it has a CPU time utilization of just 00:00:31.  It has not reached the
limit, which is why it is still in the queue, doing nothing but sleeping.
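
To illustrate the accounting difference described above, here is a minimal
Python sketch. It is not SGE or MPI code, and the function names are made up
for the example. It only shows that a process which sleeps accrues almost no
CPU time, while one that computes accrues it steadily.

```python
import time


def cpu_seconds_used(task):
    """Return the CPU time (user + system) this process spends running task()."""
    start = time.process_time()
    task()
    return time.process_time() - start


def sleep_briefly():
    # Stand-in for the MPI master: waits without computing.
    time.sleep(0.3)


def spin_briefly():
    # Stand-in for an MPI slave: computes until it has burned 0.3 s of CPU time.
    deadline = time.process_time() + 0.3
    while time.process_time() < deadline:
        pass


if __name__ == "__main__":
    print("sleeping:", cpu_seconds_used(sleep_briefly))
    print("spinning:", cpu_seconds_used(spin_briefly))
```

The sleeping task takes about as long on the wall clock as the spinning one,
but only the spinning one would ever approach a CPU time limit.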

I do not really want to use wallclock time to fix this.  Does anyone have
any ideas on how to get SGE to kill the entire MPI job once any of the
slave processes reaches its CPU limit?
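
One possible direction, sketched in Python under loud assumptions: a small
watchdog that starts a set of processes and terminates the survivors as soon
as any one of them exits abnormally (for example, killed by a signal). This
is not an SGE feature and not how SGE or any MPI launcher works internally;
run_until_first_failure and the two stand-in commands are hypothetical, and
the sketch only covers processes started on the same host.

```python
import signal
import subprocess
import sys
import time


def run_until_first_failure(commands, poll_interval=0.1):
    """Start every command; if one exits abnormally, terminate the rest.

    Returns the return codes in the order the commands were given.
    A negative return code -N means the process was killed by signal N.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            codes = [p.poll() for p in procs]
            if any(c is not None and c != 0 for c in codes):
                break  # some process failed or was killed
            if all(c is not None for c in codes):
                break  # everything finished cleanly
            time.sleep(poll_interval)
    finally:
        for p in procs:
            if p.poll() is None:
                p.terminate()  # send SIGTERM to anything still running
        for p in procs:
            p.wait()
    return [p.returncode for p in procs]


if __name__ == "__main__":
    # Hypothetical stand-ins: a "slave" that gets killed by a signal and
    # a "master" that would otherwise just sleep for a long time.
    slave = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
    master = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_until_first_failure([slave, master]))
```

With something like this, the sleeping process no longer outlives the ones
that were killed, without putting a wallclock limit on the job. Slaves
started on other hosts would need a different mechanism.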

Thank you
Rene






On Fri, 20 Aug 2004, Reuti wrote:

> Hi,
>
> >I have some queues set up which run parallel jobs and single-CPU jobs.
> >I have set these CPU time limits on all the queues.
>
> Maybe the wallclock time is a better choice as a limit for your application.
> When the jobs are mostly idling, they will not generate much used CPU time.
>
> Cheers - Reuti
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
