[GE users] Cannot run because resources requested are not available

reuti reuti at staff.uni-marburg.de
Mon Aug 16 15:43:05 BST 2010


Hi,

Am 16.08.2010 um 16:32 schrieb spow_:

> I am requesting two instances because I read that mem_free may result in oversubscription if used alone.

Not when you make it (mem_free) consumable and attach a feasible value to it in the exechost definition. When the measured value is lower than SGE's internal bookkeeping for this complex, the measured value is used.
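
For illustration, a minimal sketch of such a setup (the host name and
the 16G value are only placeholders):

  # qconf -mc  -- make mem_free a consumable:
  #name      shortcut  type    relop  requestable  consumable  default  urgency
  mem_free   mf        MEMORY  <=     YES          YES         0        0

  # qconf -me node01  -- attach a feasible value on each exec host:
  complex_values        mem_free=16G

With this in place the scheduler subtracts each running job's request
from the host's mem_free, and the lower of the measured and the booked
value is what counts for scheduling.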


> The mem_token is supposed to reserve the amount of RAM at submission, whereas mem_free does not guarantee it (from what I have understood).
> 
> I found this tip in a discussion in which you participated. You had a preference for using h_vmem, but it kills the jobs that are wrongly defined, so I'd rather use token+free for now.

http://gridengine.info/2009/12/01/adding-memory-requirement-awareness-to-the-scheduler

Just use whatever fits your needs better.
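
For comparison, the two styles of request would look roughly like this
(PE name and values taken from your example):

  # consumable bookkeeping only -- a job exceeding its request is not killed:
  qsub -hard -l mem_free=1G -l mem_token=1G -pe "mpi*" 4 jobname

  # enforced per-process limit -- a job exceeding 1G of virtual memory is killed:
  qsub -hard -l h_vmem=1G -pe "mpi*" 4 jobname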


> As for the parallel jobs, they run fine if no resource requests are made.
> The number of slots defined in the PE is equal to 2 times the PE 'size': e.g. MPI-4, which is used by a queue spanning hosts 1 to 4, has 8 slots (because the hosts are dual-core).

Which allocation_rule is set, and what did you request in `qsub`?
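
For reference, the relevant part of such a PE definition might look like
the sketch below (allocation_rule and the tight-integration settings are
only assumptions, please check yours with `qconf -sp MPI-4`):

  pe_name            MPI-4
  slots              8
  allocation_rule    $round_robin
  control_slaves     FALSE
  job_is_first_task  TRUE

With $round_robin the requested slots are spread one by one across the
hosts; with an integer like 2, exactly that many slots are taken per host.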


> I have assumed this is correct because it's a part of an older configuration I was asked not to modify.

As asked: does a parallel job run w/o any resource request?
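
I.e. something like the following, with the resource requests stripped
(PE name taken from your example):

  qsub -pe "mpi*" 4 jobname

If this one is scheduled but the version with mem_free/mem_token stays
pending, the memory consumables are the part to look at.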

-- Reuti


> > Date: Mon, 16 Aug 2010 16:18:08 +0200
> > From: reuti at staff.uni-marburg.de
> > To: users at gridengine.sunsource.net
> > Subject: Re: [GE users] Cannot run because resources requested are not available
> > 
> > Hi,
> > 
> > Am 16.08.2010 um 16:12 schrieb spow_:
> > 
> > > I'm trying to copy the configuration I have built on my test server (2 hosts) to the real one (20+ hosts).
> > > The versions are slightly different, which may cause the error: I am trying to copy my configuration onto an N1 6.0u2, and did my tests on a slightly newer version (6.1, I think).
> > > In the end, the new 6.2u6 should be installed on the cluster, but I'll probably be gone before this happens.
> > > 
> > > When trying to run a parallel job which requests hard resources (e.g. qsub -hard -l mem_free=1G -l mem_token=1G
> > 
> > Why are you requesting two instances for the memory?
> > 
> > 
> > > -pe "mpi*" 4 jobname) on the 6.0u2 managing the 20+ hosts cluster, I have the following statements when clicking the 'Why' button :
> > > 
> > > Cannot run because resources requested are not available for parallel job.
> > > Cannot run because available slots combined under PE "name_of_PE" are not in range of job.
> > 
> > - Is the PE attached to a queue, and does a job run w/o the mem_free/token request?
> > - Does the number of slots in the PE definition reflect the number of cores in the cluster?
> > 
> > -- Reuti
> > 
> > 
> > > The only difference I can think of between the 2 configurations is that the one I used to work on has a tight integration whereas the one I'm currently working on doesn't (i.e. control_slaves = true, job_is_first_task = true).
> > > I have defined mem_token and mem_free as consumables, and added them to the host configuration.
> > > I have been carefully reviewing host definitions, queue configurations, complex definitions ... and I cannot think of any mechanism that would block the jobs that way. I also tried different variations of the qsub (removing -hard or part of the arguments, setting mem_token=50M ...), but it doesn't work.
> > > 
> > > Thanks for having read,
> > > have a nice day,
> > > GQ
> > 