[GE users] Cannot run because resources requested are not available
miomax_ at hotmail.com
Wed Aug 18 09:03:51 BST 2010
Sorry for the delayed answer; access to the problematic cluster is very restricted.
> Date: Mon, 16 Aug 2010 16:43:05 +0200
> From: reuti at staff.uni-marburg.de
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Cannot run because resources requested are not available
> Am 16.08.2010 um 16:32 schrieb spow_:
> > I am requesting two resources because I read that mem_free may result in oversubscription if used alone.
> Not when you make it (mem_free) consumable and attach a feasible value in the exechost definition. When the measured value is lower than SGE's internal bookkeeping of this complex, the measured value will be used.
> > The mem_token is supposed to reserve the amount of RAM at submission, whereas mem_free does not guarantee it (from what I have understood).
> > I found this tip in a discussion in which you participated. You had a preference for using h_vmem, but it kills the jobs that are wrongly defined, so I'd rather use token+free for now.
> Just use what fits better to your needs.
I'll be looking into this if the current method keeps failing.
> > As for the parallel jobs, they run fine if no resource requests are made.
> > The number of slots defined in the PE is equal to 2 times the PE 'size', i.e. MPI-4, which is used by a queue spanning host 1 to host 4, has 8 slots (because the hosts are dual-core).
> What allocation_rule do you use, and what did you request in `qsub`?
My allocation_rule is $round_robin.
The qsub looks like this: qsub -hard -l mem_token=1G -l mem_free=1G -pe "mpi*" 8 <jobname>
I then removed mem_free as a consumable from the exechost (leaving mem_token and slots in the exechost consumable/fixed attributes).
If I now submit qsub -hard -l mem_free=1G -pe "mpi*" 8 <jobname> (i.e. no mem_token request), it does work.
The problem seems to come from the consumable definition in the exechost, but I do not have this problem on the test cluster (where only mem_token and slots are defined, so it's basically the same configuration).
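One thing that may be worth checking, sketched below in Python: as far as I understand it, a consumable requested with -l in a parallel job is counted per slot (assuming the complex is defined with consumable YES rather than JOB), so with $round_robin and 8 slots over 4 dual-core hosts each host would need 2G of mem_token, not 1G. The host names and the capacity values are made up for illustration, and the real scheduler takes more into account than this.

```python
from collections import Counter


def round_robin_slots(hosts, slots_per_host, requested_slots):
    """Hand out slots one at a time, cycling over the hosts in order."""
    free = {h: slots_per_host for h in hosts}
    granted = Counter()
    remaining = requested_slots
    while remaining > 0:
        progressed = False
        for h in hosts:
            if remaining == 0:
                break
            if free[h] > 0:
                free[h] -= 1
                granted[h] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("not enough free slots for the request")
    return dict(granted)


def hosts_short_of_consumable(granted, per_slot_request_gb, capacity_gb):
    """Map each host whose capacity is too small to the amount it would need."""
    return {
        h: n * per_slot_request_gb
        for h, n in granted.items()
        if n * per_slot_request_gb > capacity_gb[h]
    }


# Hypothetical cluster: 4 dual-core hosts, job asks for 8 slots and 1G per slot.
hosts = ["host1", "host2", "host3", "host4"]
granted = round_robin_slots(hosts, slots_per_host=2, requested_slots=8)

# Hypothetical mem_token values from the exechost definitions (in GB).
capacity_gb = {"host1": 2, "host2": 2, "host3": 1, "host4": 2}
short = hosts_short_of_consumable(granted, per_slot_request_gb=1, capacity_gb=capacity_gb)

print(granted)
print(short)
```

In this made-up case host3 only offers 1G of mem_token but would need 2G, which would be enough to keep the whole job pending. If the production exechosts carry smaller mem_token values than the test cluster, that could explain the difference.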
> As asked: does a parallel job run w/o any resource request?
Parallel jobs run just fine if no resource requests are made at all.