[GE users] Cannot run because resources requested are not available
reuti at staff.uni-marburg.de
Wed Aug 18 11:09:31 BST 2010
Am 18.08.2010 um 10:03 schrieb spow_:
> Sorry for the delayed answer, the access to the problematic cluster is very restricted.
> > Date: Mon, 16 Aug 2010 16:43:05 +0200
> > From: reuti at staff.uni-marburg.de
> > To: users at gridengine.sunsource.net
> > Subject: Re: [GE users] Cannot run because resources requested are not available
> > Hi,
> > Am 16.08.2010 um 16:32 schrieb spow_:
> > > I am requesting two instances because I read that mem_free alone may result in oversubscription.
> > Not when you make mem_free consumable and attach a feasible value in the exechost definition. When the measured value is lower than SGE's internal bookkeeping of this complex, the measured value is used.
> > > The mem_token is supposed to reserve the amount of RAM at submission, whereas mem_free does not guarantee it (from what I have understood).
> > >
> > > I found this tip in a discussion in which you participated. You had a preference for using h_vmem, but it kills jobs that exceed their requested limit, so I'd rather use token+free for now.
> > http://gridengine.info/2009/12/01/adding-memory-requirement-awareness-to-the-scheduler
> > Just use what fits better to your needs.
> I'll be looking into this if the current method keeps failing.
> > > As for the parallel jobs, they run fine if no resources requests are made.
> > > The number of slots defined in the PE is equal to 2 times the PE 'size' : i.e. MPI-4 that is used by a queue spanning from host 1 to host 4 has 8 slots (because hosts are dual-core).
> > What allocation_rule and what did you request in `qsub`?
> My allocation_rule is $round_robin.
> The qsub looks like this : qsub -hard -l mem_token=1G -l mem_free=1G -pe "mpi*" 8 <jobname>
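For reference, a PE matching that request might look like the following sketch (shown in `qconf -sp` format); the PE name, slot count, and every field value here are assumptions for illustration, not taken from the cluster in question:

```shell
# Hypothetical output of: qconf -sp mpi-4
# 8 slots = 4 dual-core hosts; $round_robin spreads tasks one host at a time.
pe_name            mpi-4
slots              8
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
```

With `-pe "mpi*" 8`, any PE whose name matches the wildcard and can provide 8 slots is eligible.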
> I have then removed mem_free as a consumable in the exechost (and left mem_token and slots in the exechost consumable/fixed attributes).
So mem_free is only a load value now?
> If I now submit qsub -hard -l mem_free=1G -pe "mpi*" 8 <jobname> (i.e. no mem_token request) it does work.
Then you might indeed consume more than is available, as it's only a snapshot of the actual memory usage, which varies over time. When "mem_free" is made consumable, the lower of a) the measured free memory or b) the computed consumable complex is taken into account.
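A minimal sketch of making mem_free consumable, as described above; the hostname and the 4G per-host value are assumptions for illustration:

```shell
# In the complex configuration (qconf -mc), set the mem_free row to consumable:
#name      shortcut  type    relop  requestable  consumable  default  urgency
mem_free   mf        MEMORY  <=     YES          YES         0        0

# In the exechost definition (qconf -me node01; hostname assumed),
# attach a feasible per-host value so SGE can do its bookkeeping:
complex_values        mem_free=4G
```

With this in place, each `-l mem_free=1G` request is debited against the 4G bookkeeping value, and the scheduler uses the lower of the booked and the measured free memory.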
> The problem seems to come from the consumable definition in the exechost, but I do not have this problem on the test cluster. (where only mem_token and slots are defined, so it's basically the same configuration)
Is the complex definition the same, and are the RQSs (if any are defined) also the same?
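To compare the two clusters, the relevant configuration can be dumped on each and diffed; these are standard qconf options:

```shell
# Dump the pieces that affect resource-based scheduling decisions:
qconf -sc                 # complex (resource attribute) definitions
qconf -se <hostname>      # exechost definition, incl. complex_values
qconf -srqsl              # list resource quota sets, if any
qconf -srqs               # show all RQS rules
```

A consumable defined on one cluster but not the other, or an extra RQS rule, would explain a request that is schedulable on the test cluster but not on the production one.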