[GE users] Strange behavior with tight integration: no free queue for job

reuti reuti at staff.uni-marburg.de
Thu Nov 27 21:01:52 GMT 2008


On 26.11.2008 at 10:24, jlopez wrote:

> <snip>
>> Since many mpirun calls are used in your setup, maybe a previous task
>> (which should already have left the node) was still active on node
>> cn099.
>>
>> Is this happening all the time or only for certain jobs?
>>
>> -- Reuti
>>
>>
>>
> The message "no free queues" appears in only a very small portion of
> the MPI jobs. I have been analyzing the logs, and even for UPC jobs the
> message does not usually appear. Also, even though the message appears
> several times in the logs, only a few of the jobs that got it finally
> fail; the rest are still able to continue.
>
> One question: if you run a qrsh -inherit to a slave node and then send
> a new qrsh before the first one has finished, does it fail because of
> no free queues? If so, I could do some tests.

The jobs will fail if more qrsh calls are sent to a node than SGE
granted for it. But IIRC the error message would be different then.
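
If you want to test this, something like the following could be run from
inside a tightly integrated job script. It is only a rough sketch: the
helper names granted_slots and oversubscribe are made up for illustration,
and it assumes the usual $PE_HOSTFILE layout (hostname, slots, queue,
processor range) and that the host name you pass matches the first column
exactly.

```shell
#!/bin/sh
# Sketch only: meant to be sourced or pasted into a tightly integrated job script.

# granted_slots HOST FILE
# Sum the slot counts (column 2) that FILE lists for HOST.
# Assumes the $PE_HOSTFILE layout: hostname slots queue processor-range
granted_slots() {
    awk -v host="$1" '$1 == host { n += $2 } END { print n + 0 }' "$2"
}

# oversubscribe HOST   (hypothetical test helper)
# Start one more "qrsh -inherit" on HOST than the hostfile grants,
# so that the last call should be the one SGE refuses.
oversubscribe() {
    host=$1
    slots=$(granted_slots "$host" "$PE_HOSTFILE")
    i=0
    # i runs from 0 to slots inclusive, i.e. slots + 1 calls in total
    while [ "$i" -le "$slots" ]; do
        qrsh -inherit "$host" sleep 60 &
        i=$((i + 1))
    done
    wait
}
```

Calling "oversubscribe cn099" in the job script and then checking the job's
error output and the messages file of that node should show which message
you really get in this case.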

-- Reuti

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=90144
