[GE users] Strange behavior with tight integration: no free queue for job

jlopez jlopez at cesga.es
Wed Nov 26 09:24:15 GMT 2008


Hi Reuti,

reuti wrote:
> Am 20.11.2008 um 16:29 schrieb jlopez:
>
>   
>> reuti wrote:
>>     
>>> Hi Javier, Am 20.11.2008 um 11:13 schrieb jlopez:
>>>       
>>>> Hi Reuti, reuti wrote:
>>>>         
>>>>> Hi, Am 19.11.2008 um 18:23 schrieb jlopez:
>>>>>           
>>>>>> Hi all, today we have seen a strange issue with an MPI
>>>>>>             
>>>>> which MPI implementation?
>>>>>           
>>>> HP-MPI
>>>>         
>>> which version - at least 2.2.5?
>>>       
>>  Yes, we are using: HP MPI 02.02.05.01 Linux IA64
>>     
>>>>>> job that uses tight integration. The job started but after less  
>>>>>> than 5 seconds it finished.
>>>>>>             
>>>>> What mpirun/mpiexec syntax did you use?
>>>>>           
>>>> The mpirun is launched with the following options: mpirun -prot
>>>> -hostfile $TMPDIR/machines, and the environment variable MPI_REMSH
>>>> is set to "$TMPDIR/rsh" to use tight integration. The rsh wrapper
>>>> prints the qrsh commands it launches, and these were the qrsh
>>>> commands executed:
>>>>         
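For reference, the wrapper itself is nothing exotic. A minimal sketch,
assuming an rsh-style call convention of "<wrapper> [-n] <host>
<command...>" and a made-up log file name, would be something like the
following; our real script only differs in details:

    #!/bin/sh
    # simplified sketch of a $TMPDIR/rsh wrapper for tight integration
    # (illustrative only, not the exact script we run)
    if [ "$1" = "-n" ]; then        # drop an optional rsh-style -n flag
        shift
    fi
    host=$1
    shift
    # print the command we are about to start, then hand over to SGE
    echo "qrsh -inherit $host $*" >> "$TMPDIR/rsh-wrapper.log"
    exec qrsh -inherit "$host" "$@"
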
>>> Is this just one mpirun below?
>>>       
>> Probably not. In this case mpirun is called indirectly by a
>> Berkeley UPC program, so it seems it internally generates several
>> calls to mpirun.
>>     
>>> Usually HP-MPI collects the tasks for every node (even when there
>>> are several lines) and calls rsh only once; the others are
>>> created as threads. (At least, this is my observation - we have
>>> only executables of our applications with embedded HP-MPI.)
>>>       
>> Yes, I have seen the same behavior when running HP-MPI directly,
>> just one rsh per node. In this case it is a bit tricky because mpirun
>> calls are generated internally by UPC.
>>     
>>> master node=10.128.1.32, slave nodes=4*10.128.1.12 /
>>> 4*10.128.1.40 / 1*10.128.1.99
>>> This was the intended allocation with 10 slots?
>>>       
>> The actual job allocation (according to the logs) was 4 nodes of
>> num_proc 8: .32, .12, .40 and .99, where .32 was the master and the
>> others were slaves (the job requested 4 MPI slots with num_proc=8).
>>
>> Your comment about the connections is very interesting. I do not
>> know why it makes 4 connections to each node (it could be a
>> reconnection attempt), but it is very surprising that .99 is the
>> only node where it makes just 1 connection. The only reason I can
>> see is the "no free queues" problem on this node. Do you know what
>> happens when you get a "no free queues" message in one execd? Does
>> the qrsh command hang on the master until it is scheduled?
>>     
>
> As several mpirun calls are used in your setup, maybe a previous task
> (which should have already left the node) was still active on node cn099.
>
> Is this happening all the time or only for certain jobs?
>
> -- Reuti
>
>
>   
The message "no free queues" appears only in a very small portion of the 
mpi jobs. I have been analyzing the logs and even for upc jobs the 
message does not appear usually. The other fact is that even if this 
message appears several times in the logs only a few of the jobs that 
got this message finally fail, the rest are still able to continue.

One doubt: if you run a qrsh -inherit to a slave node and, before this
qrsh has finished, you send a new qrsh, does the second one fail because
of "no free queues"? If so, I could do some tests.
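
To be concrete, the test I have in mind would be roughly the following,
run from the master task of a parallel job (only a sketch; it assumes
the first line of $PE_HOSTFILE is the master host, so the second line
names a slave node):

    #!/bin/sh
    # pick one slave node from the allocation (assumes line 2 is a slave)
    slave=`awk 'NR == 2 { print $1 }' "$PE_HOSTFILE"`

    # first qrsh -inherit keeps a task running on that node
    qrsh -inherit "$slave" sleep 60 &

    sleep 5    # give the first task time to start

    # second qrsh -inherit to the same node while the first is still
    # running: does it fail with "no free queue", or does it just wait?
    qrsh -inherit "$slave" hostname

    wait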

Thanks,
Javier

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=89911




