[GE users] question about parallel environment and messages about queue instance

reuti reuti at staff.uni-marburg.de
Tue May 25 17:52:09 BST 2010


On 25.05.2010 at 17:30, jnorris wrote:

> The slots that are granted are what I put in the email
> 
> #$ -pe ompi  9-10

But does this grant 9 or 10 slots? What happens when you request exactly 10, and in the other case exactly 20?
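
One way to see what was actually granted is to have the job itself report it. The following is a minimal sketch, assuming the usual Grid Engine job environment where `NSLOTS` and `PE_HOSTFILE` are set for parallel jobs and each hostfile line starts with the host name followed by the slot count on that host. The function name `count_granted_slots` is only illustrative.

```shell
#!/bin/sh
#$ -pe ompi 9-10
#$ -j y

# Sum the second column (slots per host) of a PE hostfile.
# Assumed line format: <host> <slots> <queue instance> <processor range>
count_granted_slots() {
    awk '{ total += $2 } END { print total + 0 }' "$1"
}

# NSLOTS is the total number of slots the scheduler granted to this job.
echo "NSLOTS=${NSLOTS:-unset}"

# PE_HOSTFILE lists the hosts and how many slots were granted on each.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "slots in PE_HOSTFILE: $(count_granted_slots "$PE_HOSTFILE")"
    cat "$PE_HOSTFILE"
fi
```

Submitting this once with a fixed request of 10 and once with 20 (for example by changing the `-pe` line to `ompi 10` and then `ompi 20`) should show whether the hang depends on the slot count itself or on how the slots end up distributed across hosts.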

-- Reuti


> 
> In the ompi PE there are 232 slots in total.
> 
> 
> 
> reuti wrote:
>> Hi,
>> 
>> On 21.05.2010 at 23:44, jnorris wrote:
>> 
>> 
>>> Hello to all,
>>> 
>>> I submitted a job with qsub and it went into the running state, but it just
>>> hangs there. It is a very simple job: just a "hello world" on each node it runs on.
>>> 
>>> 
>>> The parallel environment request that I pass has min 9, max 10. When I change
>>> this to min 9, max 20, it runs fine.
>>> 
>> 
>> Can you check how many slots are granted for the job (all 20 in the case of 9-20)?
>> 
>> Maybe it's a problem with the algorithm of the application, which fails for 10 slots.
>> 
>> -- Reuti
>> 
>> 
>> 
>>> A user on my cluster wants to use min 9 and max 10 - I am not sure why this
>>> causes the job to hang.
>>> 
>>> 
>>> qstat -j ###
>>> 
>>> reveals the following:
>>> 
>>> parallel environment:  ompi range: 9,10
>>> verify_suitable_queues:     2
>>> project:                    joseph_project
>>> usage    1:                 cpu=00:00:00, mem=0.00000 GBs, io=0.00000, vmem=180.562M, maxvmem=235.559M
>>> scheduling info:            queue instance "ethernet.q@c11.elcapitan.ucmerced.edu" dropped because it is temporarily not available
>>>                             queue instance "ethernet.q@c19.elcapitan.ucmerced.edu" dropped because it is temporarily not available
>>>                             queue instance "ethernet.q@c22.elcapitan.ucmerced.edu" dropped because it is temporarily not available
>>>                             queue instance "ethernet.q@c33.elcapitan.ucmerced.edu" dropped because it is temporarily not available
>>>                             queue instance "ethernet.q@c54.elcapitan.ucmerced.edu" dropped because it is temporarily not available
>>>                             queue instance "ethernet.q@c58.elcapitan.ucmerced.edu" dropped because it is full
>>>                             queue instance "ethernet.q@c59.elcapitan.ucmerced.edu" dropped because it is full
>>> 
>>> -- 
>>> Joseph Norris
>>> Application Developer & Server Administrator
>>> 209-228-4576
>>> jnorris at ucmerced.edu
>>> 
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=258164
>>> 
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>> 
>> 
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=258435
>> 
>> 
> 
> -- 
> Joseph Norris   
> Applications Developer/Server Admin
> 209-228-4576
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=258474
> 

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=258491



