[GE users] Schedule one job at a time

sgenedharvey sge at nedharvey.com
Mon Feb 1 03:38:51 GMT 2010

> many jobs are scheduled
> and started at the same time which gives us problems with some other
> resources.

If there's a way to limit the rate of starting jobs, I don't know it.  Instead, if the problem is too many jobs starting simultaneously, I would suggest:

(a) You could decrease the scheduling interval (schedule_interval in the scheduler configuration).  Instead of dispatching jobs to execution hosts every 15 seconds, you could dispatch every 1 to 3 seconds, so each scheduling run tends to have fewer pending jobs to start at once.  This lowers the probability of two jobs starting at the same time.  You could also limit the rate at which you submit jobs to the queue.  Instead of "for ((i=0; i<100; i++)) ; do qsub somejob ; done" you could do something like "for ((i=0; i<100; i++)) ; do qsub somejob ; sleep 3 ; done".

(b) You could make each job sleep a random number of seconds before it starts its real work.  This would again decrease the probability of a collision.

(c) If that's not good enough, and you truly need mutually exclusive, serial launching of jobs that then run in parallel on multiple machines, then I haven't helped at all and you're back where you started.  You either need SGE to have some mechanism to limit the job dispatch rate, or you need some third-party mutex package to eliminate the race condition around the critical section.
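
For (a), here is a rough sketch of the throttled submission loop as a reusable shell function.  The name submit_throttled and its argument order are made up for illustration, "somejob" is the same placeholder as above, and the 3-second pause is only a starting guess to tune against your scheduling interval.

```shell
# Hypothetical helper: submit_throttled COUNT DELAY CMD [ARGS...]
# Runs CMD COUNT times, pausing DELAY seconds between runs.
# Variables are prefixed with st_ because plain sh has no local variables.
submit_throttled() {
    st_count=$1
    st_delay=$2
    shift 2
    st_i=0
    while [ "$st_i" -lt "$st_count" ]; do
        # Stop at the first failed submission instead of piling up errors.
        "$@" || return $?
        st_i=$((st_i + 1))
        # Pause between submissions, but not after the last one.
        if [ "$st_i" -lt "$st_count" ]; then
            sleep "$st_delay"
        fi
    done
}

# Example (not run here): 100 submissions, 3 seconds apart.
# submit_throttled 100 3 qsub somejob
```

Called as "submit_throttled 100 3 qsub somejob", this behaves like the second loop in (a), except that it stops at the first failed submission and skips the pause after the last job.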

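For (b), something like the following at the top of the job script might do.  It assumes the job runs under a POSIX shell and that the execution hosts have /dev/urandom; the helper names random_below and random_delay are hypothetical, and the 30-second ceiling in the example is arbitrary.

```shell
# Hypothetical helper: print a random integer in the range 0 .. MAX-1.
# Reads two bytes from /dev/urandom (a value from 0 to 65535) and reduces
# it modulo MAX; the slight modulo bias does not matter for this purpose.
random_below() {
    rb_max=$1
    rb_raw=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $((rb_raw % rb_max))
}

# Hypothetical helper: sleep a random whole number of seconds below MAX.
random_delay() {
    sleep "$(random_below "$1")"
}

# Example (not run here): wait up to 29 seconds before the real work.
# random_delay 30
```

The reason for reading /dev/urandom rather than using a generator seeded from the clock is that jobs started in the same second would otherwise tend to pick the same delay, which defeats the purpose.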

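For (c), short of a real mutex package, a lock directory on a filesystem shared by all execution hosts is a crude substitute.  This sketch assumes that mkdir is atomic on that filesystem, which you should verify for your own NFS setup.  The name with_lock and the paths in the example are hypothetical.  Note that the lock is not released if the job is killed while holding it, so stale locks need manual cleanup.

```shell
# Hypothetical helper: with_lock LOCKDIR CMD [ARGS...]
# Runs CMD while holding LOCKDIR as a mutex, then releases it.
with_lock() {
    wl_lock=$1
    shift
    # mkdir either creates the directory or fails, so only one caller
    # at a time gets past this loop; the others retry every second.
    until mkdir "$wl_lock" 2>/dev/null; do
        sleep 1
    done
    # Run the critical section and remember its exit status.
    wl_status=0
    "$@" || wl_status=$?
    # Release the lock and pass the command's status back to the caller.
    rmdir "$wl_lock"
    return "$wl_status"
}

# Example (not run here), with made-up paths:
# with_lock /shared/locks/startup.lock ./critical_startup_step
```

Each job would wrap only the part that contends for the other resources, so the jobs still run in parallel once they are past it.
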