[GE users] Reserving resources or machines for queued jobs?
neil.baker at crl.toshiba.co.uk
Mon Oct 6 20:58:14 BST 2008
I've been reading through the resources chapter of the Grid Engine manual
and have been running some initial tests to try it out. It does appear to
be very powerful, but I am left wondering what will happen to jobs in the
following scenario, and whether it is possible for jobs to reserve machines.
We have some users with large jobs requiring 3GB+ of memory (almost all
of a machine's memory), but the vast majority of jobs require less than
512MB of memory. Our execution hosts all have 4GB of memory and between 4
and 8 slots. Both types of jobs have the same priority.
As a result, if I create a memory resource that is used up as jobs are run
(with each job specifying the memory it requires to run), won't the jobs
requiring less memory always jump the queue, because there will never be
quite enough free memory to allow the large jobs to run? That is, won't the
jobs requiring almost all of the memory be left waiting until a machine
becomes available with enough memory (which will only occur once all the
smaller jobs have completed)? At busy times, the queue is never empty of
these smaller jobs for a week or more, so I'm worried that the larger jobs
will not be run.
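
To make the concern concrete, here is a toy Python model of a single 4GB
host. It is only a sketch of the scenario as I understand it, not a model of
Grid Engine's actual scheduler: the function name, the job runtime, the
arrival rate and the tick at which the large job is submitted are all made-up
assumptions, and slots are ignored because memory is the limiting resource.

```python
HOST_MEM_MB = 4096   # memory per execution host, as described above
SMALL_MB = 512       # upper bound for the typical small job
LARGE_MB = 3072      # the 3GB+ job
SMALL_RUNTIME = 4    # assumed runtime of a small job, in ticks


def simulate(ticks, small_arrivals_per_tick=2, large_arrival=3):
    """Return the tick at which the large job starts, or None if it never does.

    Jobs are dispatched greedily: whatever fits in the free memory right now
    is started, and nothing is held back for a job that does not fit yet.
    """
    running = []        # (finish_tick, mem_mb) for each running job
    small_waiting = 0   # small jobs queued but not yet started
    for now in range(ticks):
        # Release the memory of jobs that have finished by this tick.
        running = [(end, mem) for end, mem in running if end > now]
        free = HOST_MEM_MB - sum(mem for _, mem in running)

        # The large job is considered first, but starts only if it fits now.
        if now >= large_arrival and free >= LARGE_MB:
            return now

        # Otherwise the free memory is handed to waiting small jobs.
        small_waiting += small_arrivals_per_tick
        while small_waiting > 0 and free >= SMALL_MB:
            running.append((now + SMALL_RUNTIME, SMALL_MB))
            free -= SMALL_MB
            small_waiting -= 1
    return None


if __name__ == "__main__":
    print("steady small-job load:", simulate(1000))
    print("no small jobs at all: ", simulate(1000, small_arrivals_per_tick=0))
```

With a steady stream of small jobs, only about 1GB is ever free at once in
this model, so the large job is still waiting when the simulation ends.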
If this is the case, is there a way for a large job to reserve a resource
(or rather, reserve a machine), so that no new jobs are started on that
machine and the jobs already running on it gradually complete, freeing up
the resource the large job needs to run?
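
The behaviour I am hoping for can be sketched as follows. This is purely
illustrative Python, not a Grid Engine feature or API: the function
`earliest_start_when_drained` and the example finish times are hypothetical,
and it assumes the finish time of every running job is known in advance.

```python
def earliest_start_when_drained(running, host_mem_mb, needed_mb, now=0):
    """Return the tick at which `needed_mb` becomes free on a reserved host.

    `running` is a list of (finish_tick, mem_mb) pairs for jobs already on
    the host. The host is assumed to accept no new jobs once it is reserved,
    so free memory can only grow as the running jobs complete.
    Returns None if the host can never provide `needed_mb`.
    """
    free = host_mem_mb - sum(mem for _, mem in running)
    if free >= needed_mb:
        return now
    # Walk through the running jobs in order of completion.
    for finish, mem in sorted(running):
        free += mem
        if free >= needed_mb:
            return max(finish, now)
    return None


if __name__ == "__main__":
    # Six hypothetical 512MB jobs on a 4GB host, finishing at ticks 4, 5 and 6.
    running = [(4, 512), (4, 512), (5, 512), (5, 512), (6, 512), (6, 512)]
    print(earliest_start_when_drained(running, 4096, 3072, now=3))
```

In other words, once the machine stops taking new small jobs, the large job
has a bounded wait that depends only on the jobs already running there.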