[GE users] Cannot run because resources requested are not available

spow_ miomax_ at hotmail.com
Mon Aug 16 15:12:17 BST 2010


Hi,

I'm trying to copy the configuration I built on my test server (2 hosts) to the production cluster (20+ hosts).
The versions are slightly different, which may be causing the error: I am copying my configuration to an N1 Grid Engine 6.0u2 installation, but did my tests on a slightly newer version (6.1, I think).
Eventually the new 6.2u6 should be installed on the cluster, but I'll probably be gone before that happens.

When I try to run a parallel job that requests hard resources (e.g. qsub -hard -l mem_free=1G -l mem_token=1G -pe "mpi*" 4 jobname) on the 6.0u2 installation managing the 20+ host cluster, I get the following messages when clicking the 'Why' button:

Cannot run because resources requested are not available for parallel job.
Cannot run because available slots combined under PE "name_of_PE" are not in range of job.

The only difference I can think of between the two configurations is that the one I used to work on has tight integration whereas the one I'm currently working on doesn't (i.e. control slaves = true, job is first task = true).
I have defined mem_token and mem_free as consumables, and added them to the host configuration.
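
For readers following along, a consumable setup of this kind usually looks roughly like the excerpt below. This is only a sketch: the shortcut "mt" and the 4G capacities are illustrative placeholders, not the values from my cluster.

```
# qconf -sc (excerpt)
#name       shortcut  type    relop  requestable  consumable  default  urgency
mem_free    mf        MEMORY  <=     YES          YES         0        0
mem_token   mt        MEMORY  <=     YES          YES         0        0

# qconf -se <hostname> (excerpt); capacities are placeholders
complex_values   mem_free=4G,mem_token=4G
```
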
I have carefully reviewed the host definitions, queue configurations and complex definitions, and I cannot think of any mechanism that would block the jobs this way. I also tried variations of the qsub command (removing -hard or some of the arguments, setting mem_token=50M, ...), but none of them works.
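
The review described above corresponds roughly to the read-only commands in this sketch. The host name node01 and queue name all.q are assumptions made for illustration, and name_of_PE stands in for the real parallel environment; the script only lists the commands when the Grid Engine client tools are not installed.

```shell
#!/bin/sh
# Sketch: list (or run) read-only Grid Engine commands for reviewing the
# PE, complexes, host and queue definitions. All three values below are
# placeholders and should be replaced with real object names.
PE="${PE:-name_of_PE}"
HOST="${HOST:-node01}"      # hypothetical execution host
QUEUE="${QUEUE:-all.q}"     # hypothetical cluster queue

review_commands() {
    cat <<EOF
qconf -sp $PE
qconf -sc
qconf -se $HOST
qconf -sq $QUEUE
qhost -F mem_free,mem_token
EOF
}

# Run each command only if qconf is on the PATH; otherwise just print
# the list of commands that would be run.
if command -v qconf >/dev/null 2>&1; then
    review_commands | while read -r cmd; do
        echo "== $cmd"
        $cmd
    done
else
    review_commands
fi
```

Comparing the slots value and the queue pe_list entries against the number of slots requested with -pe is the part most directly related to the "not in range of job" message.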

Thanks for reading,
have a nice day,
GQ
