[GE users] Transfer Queues

Reuti reuti at staff.uni-marburg.de
Tue Aug 8 07:19:55 BST 2006


Hi,

On 07.08.2006 at 18:05, Chris Chambers wrote:

> Hi,
>   Thanks for the reply.  The ports are the same for both clusters,  
> but the cell names are different. Do they have to be the same?   
> Also, the way we set it up, $SGE_ROOT is the same for both  
> clusters; it is a shared directory.  We only did this after it  
> seemed like there were communication problems, but it didn't fix  
> anything.

Did you try setting SGE_CELL to the name of the cell being  
queried, right at the beginning of the load sensor?
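
A minimal sketch of that idea in Python, not the Howto's actual script: the cell name `remote_cell`, the complex name `remote_pending`, reporting under the host `global`, and all function names are assumptions made for illustration. It overrides SGE_CELL only in the environment passed to qstat, counts the pending jobs of the cell being queried, and emits one begin/end load report each time a line arrives on stdin, until it reads "quit".

```python
import os
import subprocess
import sys

REMOTE_CELL = "remote_cell"      # hypothetical: name of the cell being queried
COMPLEX_NAME = "remote_pending"  # hypothetical: complex defined for the load value
REPORT_HOST = "global"           # assumption: report the value as a global load value


def remote_env(cell, base_env=None):
    """Return a copy of the environment with SGE_CELL pointing at `cell`."""
    env = dict(os.environ if base_env is None else base_env)
    env["SGE_CELL"] = cell
    return env


def count_pending(qstat_output):
    """Count job lines (those starting with a numeric job ID) in qstat output."""
    count = 0
    for line in qstat_output.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            count += 1
    return count


def query_remote_pending():
    """Run qstat against the remote cell and return its number of pending jobs."""
    result = subprocess.run(
        ["qstat", "-u", "*", "-s", "p"],  # pending jobs of all users
        env=remote_env(REMOTE_CELL),
        capture_output=True,
        text=True,
    )
    return count_pending(result.stdout)


def format_report(host, name, value):
    """Build one load report block as a load sensor writes it to stdout."""
    return "begin\n{}:{}:{}\nend\n".format(host, name, value)


def run_sensor(stdin=sys.stdin, stdout=sys.stdout, query=query_remote_pending):
    """Emit one report per input line; stop when the line is 'quit'."""
    for line in stdin:
        if line.strip() == "quit":
            break
        stdout.write(format_report(REPORT_HOST, COMPLEX_NAME, query()))
        stdout.flush()
```

To try it from the command line, add a call to `run_sensor()` at the end of the script; each press of return should then print a fresh value, which makes it easy to see whether the number really comes from the other cell.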

-- Reuti


> Thanks,
>
> Chris
>
> Reuti wrote:
>> On 04.08.2006 at 22:37, Chris Chambers wrote:
>>> Reuti,
>>>     Thanks for all your help.  I am still having some trouble  
>>> getting the load sensor to work, and I was wondering if you know  
>>> how it communicates to the remote cluster.  For example, does it  
>>> use rlogin or telnet, or something else?
>> It uses SGE's own protocol, and although it might run (as  
>> suggested in the Howto) on the local headnode, it will query the  
>> remote cluster by using a different $SGE_ROOT. Are you using the  
>> same cell name and ports for both clusters?
>> -- Reuti
>>> Thanks,
>>>
>>> Chris
>>>
>>> Reuti wrote:
>>>> Hi,
>>>> On 02.08.2006 at 23:52, Chris Chambers wrote:
>>>>> Yes, I defined the complex for the load value.  And each time I  
>>>>> hit return the number of jobs waiting is returned.  However,  
>>>>> the number stays the same since they never get run on the  
>>>>> transfer queue.
>>>> This should be the number of jobs in the other (i.e. remote) cluster.
>>>>> When you look at the transfer queue, using either qmon or qstat  
>>>>> -f, should the transfer queue have values for the load average?
>>>> The load is just the load of the machine where the load sensor  
>>>> resides. It has no practical meaning, but should of course be  
>>>> available. You have an execd running there, as it will execute the  
>>>> load sensor. Are you using the same ports for qmaster and execd  
>>>> on both clusters?
>>>> But the Howto was for SGE 5.x, so it might need some tuning to  
>>>> get it working under 6.0. E.g. there is no longer a "COMMD_PORT"  
>>>> as the environment variables are different.
>>>> -- Reuti
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> Reuti wrote:
>>>>>> On 02.08.2006 at 22:22, Chris Chambers wrote:
>>>>>>> Hi Reuti,
>>>>>>>   Thanks for the reply.
>>>>>>>
>>>>>>>   I did adjust the load sensor to the cluster settings.   
>>>>>>> However, I am not sure if it works correctly.  When I run it  
>>>>>>> from the command line, it returns the number of jobs that are  
>>>>>>> waiting and then it hangs.  It seems to be waiting for input,  
>>>>>>> but I am not sure exactly what this input is supposed to be.
>>>>>> Each time you press return you should get a new value. You  
>>>>>> also defined the complex for the load value? - Reuti
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> Reuti wrote:
>>>>>>>> Hi,
>>>>>>>> On 02.08.2006 at 20:07, Chris Chambers wrote:
>>>>>>>>> Hello,
>>>>>>>>>    I am trying to set up transfer queues and am having some  
>>>>>>>>> difficulties.
>>>>>>>>>
>>>>>>>>> I have two clusters that I am trying to tie together with  
>>>>>>>>> transfer queues.  I followed the instructions laid out in  
>>>>>>>>> the how-to, but it seems like the clusters aren't  
>>>>>>>>> communicating with each other.
>>>>>>>>>
>>>>>>>>> Whenever I try to submit a job to the transfer queue it  
>>>>>>>>> simply says that the queue is temporarily not available.  I  
>>>>>>>>> thought that this might be the load sensor, but I haven't  
>>>>>>>>> been able to figure out a problem with it.
>>>>>>>> Did you adjust the load sensor to your cluster settings? Does  
>>>>>>>> the load sensor work if started from the command line?  
>>>>>>>> - Reuti
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net