[GE users] automatic suspension on full cluster

massot bernard.massot at ens.fr
Wed Feb 10 14:00:52 GMT 2010


I'm trying to deal with the problem of a full cluster (no available
slots). I have a cluster to which all users have equal access. The
problem is that when no slots remain, some users have to wait until the
jobs of other users, who sometimes run a lot of jobs, have finished.
That's unfair. On the other hand, per-user job quotas are often a waste
of resources, since they restrict users even when the cluster is not
full.

Here is the ideal configuration for my cluster. Anyone can submit as
many jobs as they want as long as the cluster is not full. If the
cluster is full and someone wants to submit a job, then instead of this
job staying pending, the user who is running the largest number of jobs
gets one of their jobs suspended, which frees a slot so that the new job
can run. As soon as slots become available again, the jobs that were
suspended because the cluster was full are resumed.
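
To make the policy above concrete, here is a minimal Python sketch of
the decision step only. It is an illustration under stated assumptions,
not a Grid Engine feature: the function names, the (job_id, owner) input
format, the tie-breaking by user name, the choice of the highest job id
as the job to suspend, and the first-suspended-first-resumed order are
all hypothetical choices that the description above does not specify.

```python
from collections import Counter


def pick_job_to_suspend(running, requester):
    """Return the id of the job to suspend, or None if nothing should be.

    running:   list of (job_id, owner) tuples for jobs holding a slot
    requester: user whose new job is waiting for a slot
    """
    counts = Counter(owner for _job_id, owner in running)
    if not counts:
        return None
    # User with the most running jobs; ties broken by name (assumption).
    heaviest = min(counts, key=lambda user: (-counts[user], user))
    # Assumption: never suspend a job of the requester to run their own job.
    if heaviest == requester:
        return None
    # Assumption: suspend that user's highest-numbered (newest) job.
    return max(job_id for job_id, owner in running if owner == heaviest)


def jobs_to_resume(suspended, free_slots):
    """Return the suspended job ids to resume, oldest suspension first."""
    return list(suspended[:max(free_slots, 0)])


running = [(101, "alice"), (102, "alice"), (103, "alice"), (104, "bob")]
victim = pick_job_to_suspend(running, "carol")
```

Actually suspending and resuming the chosen jobs, and finding out which
jobs are running, is left out of the sketch.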

I could build a system based on cron jobs that suspend and resume jobs
and adjust the "slots" queue attribute on the fly, but that sounds like
a rather ugly solution.

Can you think of an elegant way to configure my ideal cluster?

Bernard Massot

