[GE users] Cannot run because resources requested are not available
reuti at staff.uni-marburg.de
Mon Aug 16 15:43:05 BST 2010
On 16.08.2010 at 16:32, spow_ wrote:
> I am requesting two instances because I read that mem_free may result in oversubscription if used alone.
Not when you make it (mem_free) consumable and attach a feasible value in the exechost definition. When the measured value is lower than SGE's internal bookkeeping of this complex, the measured value will be used.
> The mem_token is supposed to reserve the amount of RAM at submission, whereas mem_free does not guarantee it (from what I have understood).
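
A rough sketch of that rule, as a simplified model rather than SGE's actual code. The function and parameter names are made up for illustration, and all values are assumed to be in MB:

```python
def schedulable_memory(configured_total, reserved_by_jobs, measured_free):
    """Simplified model of a consumable mem_free on one exechost.

    configured_total: value attached in the exechost definition (assumed MB)
    reserved_by_jobs: sum of mem_free requests of jobs already running there
    measured_free:    free memory currently reported as the load value
    """
    # Internal bookkeeping: configured value minus what running jobs requested.
    booked = configured_total - reserved_by_jobs
    # The lower of bookkeeping and measurement is what a new job can get.
    return min(booked, measured_free)


if __name__ == "__main__":
    # Bookkeeping is the limit: 4096 - 1024 = 3072 < 3500 measured.
    print(schedulable_memory(4096, 1024, 3500))
    # Measurement is the limit: only 2000 MB are actually free.
    print(schedulable_memory(4096, 1024, 2000))
```

Because the bookkeeping shrinks with every request that is granted, a second job cannot be promised memory that the first one already requested, even if the first one has not touched it yet.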
> I found this tip in a discussion in which you participated. You had a preference for using h_vmem, but it kills the jobs that are wrongly defined, so I'd rather use token+free for now.
Just use what fits better to your needs.
> As for the parallel jobs, they run fine if no resources requests are made.
> The number of slots defined in the PE is equal to 2 times the PE 'size' : i.e. MPI-4 that is used by a queue spanning from host 1 to host 4 has 8 slots (because hosts are dual-core).
What allocation_rule is set in the PE, and what did you request in `qsub`?
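
The allocation_rule matters because it decides how the requested slots may be spread over the hosts. A minimal sketch of that check, assuming a simplified model that ignores queues, load thresholds and other resource requests; `can_place` and its arguments are hypothetical names, not part of SGE:

```python
def can_place(requested, free_slots_per_host, allocation_rule):
    """Return True if `requested` slots fit under the PE's allocation_rule.

    free_slots_per_host: list of free slots, one entry per host
    allocation_rule:     "$pe_slots", "$fill_up", "$round_robin",
                         or a positive integer given as a string
    """
    if allocation_rule == "$pe_slots":
        # All slots must come from one single host.
        return any(free >= requested for free in free_slots_per_host)
    if allocation_rule in ("$fill_up", "$round_robin"):
        # Slots may be collected from several hosts in any split.
        return sum(free_slots_per_host) >= requested
    # Fixed number of slots per host.
    per_host = int(allocation_rule)
    if per_host <= 0 or requested % per_host != 0:
        return False
    hosts_needed = requested // per_host
    usable_hosts = sum(1 for free in free_slots_per_host if free >= per_host)
    return usable_hosts >= hosts_needed


if __name__ == "__main__":
    dual_core_hosts = [2, 2, 2, 2]  # four hosts, two free slots each
    for rule in ("$pe_slots", "$fill_up", "2"):
        print(rule, can_place(4, dual_core_hosts, rule))
```

With four dual-core hosts, a 4-slot request fits under `$fill_up` or a fixed rule of 2, but never under `$pe_slots`, since no single host offers 4 slots.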
> I have assumed this is correct because it's a part of an older configuration I was asked not to modify.
As asked: does a parallel job run w/o any resource request?
> > Date: Mon, 16 Aug 2010 16:18:08 +0200
> > From: reuti at staff.uni-marburg.de
> > To: users at gridengine.sunsource.net
> > Subject: Re: [GE users] Cannot run because resources requested are not available
> > Hi,
> > On 16.08.2010 at 16:12, spow_ wrote:
> > > I'm trying to copy the configuration I have built on my test server (2 hosts) to the real one (20+ hosts).
> > > The versions are slightly different, which may cause the error: I am trying to copy my configuration to an N1 6.0u2 installation, and did my tests on a slightly newer version (6.1, I think).
> > > In the end, the new 6.2u6 should be installed on the cluster, but I'll probably be gone before this happens.
> > >
> > > When trying to run a parallel job which requests hard resources (e.g. qsub -hard -l mem_free=1G -l mem_token=1G
> > why are you requesting two instances for the memory?
> > > -pe "mpi*" 4 jobname) on the 6.0u2 managing the 20+ hosts cluster, I have the following statements when clicking the 'Why' button :
> > >
> > > Cannot run because resources requested are not available for parallel job.
> > > Cannot run because available slots combined under PE "name_of_PE" are not in range of job.
> > - PE attached to a queue - does a job w/o mem_free/token run?
> > - Number of slots in the PE definition reflects the number of cores in the cluster?
> > -- Reuti
> > > The only difference I can think of between the 2 configurations is that the one I used to work on has a tight integration whereas the one I'm currently working on doesn't (i.e. control slaves = true, job is first task = true).
> > > I have defined mem_token and mem_free as consumables, and added them in the host configuration.
> > > I have been carefully reviewing host definitions, queue configurations, complex definitions ... and I cannot think of any mechanism that would block the jobs that way. I also tried different variations of the qsub (removing -hard or part of the arguments, setting mem_token=50M ...), but it doesn't work.
> > >
> > > Thanks for having read,
> > > have a nice day,
> > > GQ
> > ------------------------------------------------------
> > http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=274720
> > To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].