[GE users] Transfer Queues

Chris Chambers chambech at mls.jpl.nasa.gov
Mon Aug 7 17:05:44 BST 2006


Hi,
   Thanks for the reply.  The ports are the same for both clusters, but 
the cell names are different.  Do they have to be the same?  Also, the 
way we set it up, $SGE_ROOT is the same for both clusters; it is a 
shared directory.  We only did this after it seemed like there were 
communication problems, but it didn't fix anything.

Thanks,

Chris
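
For reference, the setup described above (one shared $SGE_ROOT, the same ports, different cell names) can be written down as a pair of environments, one per cluster. This is only a sketch under assumptions: the path /shared/sge, the cell names cell_a and cell_b, and the port numbers 536/537 are hypothetical placeholders, and cluster_env and act_qmaster_path are illustrative helper names, not part of SGE. The variable names SGE_ROOT, SGE_CELL, SGE_QMASTER_PORT and SGE_EXECD_PORT are the ones SGE 6.0 reads. Whether different cell names explain the problem in this thread is not settled here.

```python
import os


def cluster_env(sge_root, cell, qmaster_port, execd_port, base=None):
    """Return a copy of an environment that points SGE 6.x clients at one cluster."""
    env = dict(os.environ if base is None else base)
    env["SGE_ROOT"] = sge_root
    env["SGE_CELL"] = cell
    env["SGE_QMASTER_PORT"] = str(qmaster_port)
    env["SGE_EXECD_PORT"] = str(execd_port)
    # COMMD_PORT belongs to SGE 5.x; drop it so it cannot be mistaken for a 6.0 setting.
    env.pop("COMMD_PORT", None)
    return env


def act_qmaster_path(env):
    """Path of the file that names the active qmaster host for the selected cell."""
    return os.path.join(env["SGE_ROOT"], env["SGE_CELL"], "common", "act_qmaster")


# Hypothetical values mirroring the description: shared root, same ports, different cells.
base = {"PATH": "/usr/bin", "COMMD_PORT": "535"}
local = cluster_env("/shared/sge", "cell_a", 536, 537, base=base)
remote = cluster_env("/shared/sge", "cell_b", 536, 537, base=base)

print(act_qmaster_path(local))
print(act_qmaster_path(remote))
```

With a shared $SGE_ROOT, the cell name is the only thing that selects which cluster's common directory a client reads, so a load sensor that queries the remote cluster would need the remote cell in its environment.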

Reuti wrote:
> On 04.08.2006 at 22:37, Chris Chambers wrote:
> 
>> Reuti,
>>     Thanks for all your help.  I am still having some trouble getting 
>> the load sensor to work, and I was wondering if you know how it 
>> communicates to the remote cluster.  For example, does it use rlogin 
>> or telnet, or something else?
> 
> It uses SGE's own protocol, and although it might run (as suggested in 
> the Howto) on the local headnode, it will query the remote cluster by 
> using a different $SGE_ROOT. You are using the same cell name and ports 
> for both clusters?
> 
> -- Reuti
> 
> 
>> Thanks,
>>
>> Chris
>>
>> Reuti wrote:
>>> Hi,
>>> On 02.08.2006 at 23:52, Chris Chambers wrote:
>>>> Yes, I defined the complex for the load value.  And each time I hit 
>>>> return the number of jobs waiting is returned.  However, the number 
>>>> stays the same since they never get run on the transfer queue.
>>> This should be the number of jobs waiting in the other (i.e. remote) cluster.
>>>> When you look at the transfer queue, using either qmon or qstat -f, 
>>>> should the transfer queue have values for the load average?
>>> The load is just the load of the machine where the load sensor 
>>> resides. It has no practical meaning, but should of course be 
>>> available. You have an execd running there, as it will execute the 
>>> load sensor. Are you using the same ports for qmaster and execd on 
>>> both clusters?
>>> But the Howto was for SGE 5.x, so it might need some tuning to get it 
>>> working under 6.0. E.g. there is no longer a "COMMD_PORT" as the 
>>> environment variables are different.
>>> -- Reuti
>>>>
>>>> Thanks,
>>>>
>>>> Chris
>>>>
>>>> Reuti wrote:
>>>>> On 02.08.2006 at 22:22, Chris Chambers wrote:
>>>>>> Hi Reuti,
>>>>>>   Thanks for the reply.
>>>>>>
>>>>>>   I did adjust the load sensor to the cluster settings.  However, 
>>>>>> I am not sure if it works correctly.  When I run it from the 
>>>>>> command line, it returns the number of jobs that are waiting and 
>>>>>> then it hangs.  It seems to be waiting for input, but I am not 
>>>>>> sure exactly what this input is supposed to be.
>>>>> Each time you press return you should get a new value. You also 
>>>>> defined the complex for the load value? - Reuti
>>>>>> Thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> Reuti wrote:
>>>>>>> Hi,
>>>>>>> On 02.08.2006 at 20:07, Chris Chambers wrote:
>>>>>>>> Hello,
>>>>>>>>    I am trying to set up transfer queues and am having some 
>>>>>>>> difficulties.
>>>>>>>>
>>>>>>>> I have two clusters that I am trying to tie together with 
>>>>>>>> transfer queues.  I followed the instructions laid out in the 
>>>>>>>> how-to, but it seems like the clusters aren't communicating with 
>>>>>>>> each other.
>>>>>>>>
>>>>>>>> Whenever I try to submit a job to the transfer queue it simply 
>>>>>>>> says that the queue is temporarily not available.  I thought 
>>>>>>>> that this might be the load sensor, but I haven't been able to 
>>>>>>>> figure out a problem with it.
>>>>>>> Did you adjust the load sensor to your cluster settings? Is the 
>>>>>>> load sensor working when started from the command line? - Reuti
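
The behaviour described in this thread (the sensor prints a value, then appears to hang until return is pressed) matches the usual SGE load sensor protocol: the sensor waits for a line on stdin, answers with a block delimited by "begin" and "end" containing host:complex:value lines, and exits when it reads "quit". Below is a minimal sketch of that loop, not the script from the Howto. The complex name remote_pending, the host name headnode and the sample qstat text are hypothetical, and using "qstat -s p" to list pending jobs in the remote cluster is an assumption about how the value could be obtained.

```python
import io
import subprocess
import sys


def count_jobs(qstat_output):
    """Count job rows in plain qstat output (rows whose first field is a numeric job ID)."""
    count = 0
    for line in qstat_output.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            count += 1
    return count


def query_remote_pending(env):
    """Run qstat against the cluster selected by env (SGE_ROOT, SGE_CELL, ports)."""
    # Assumption: "qstat -s p" restricts the listing to pending jobs.
    result = subprocess.run(["qstat", "-s", "p"], env=env,
                            capture_output=True, text=True, check=True)
    return count_jobs(result.stdout)


def run_sensor(get_value, host, complex_name, stdin=sys.stdin, stdout=sys.stdout):
    """Answer each input line with one report; stop on 'quit' or end of input."""
    while True:
        line = stdin.readline()
        if line == "" or line.strip() == "quit":
            break
        stdout.write("begin\n%s:%s:%s\nend\n" % (host, complex_name, get_value()))
        stdout.flush()  # the reader on the other end of the pipe needs the report promptly


# Illustrative text only, not captured from a real cluster.
sample = """job-ID  prior   name   user   state submit/start at     queue  slots
-----------------------------------------------------------------------
    101 0.00000 sim_a  chris  qw    08/02/2006 13:01:10            1
    102 0.00000 sim_b  chris  qw    08/02/2006 13:02:45            1
"""

# Simulate two requests followed by "quit", as when pressing return twice by hand.
demo_in = io.StringIO("\n\nquit\n")
demo_out = io.StringIO()
run_sensor(lambda: count_jobs(sample), "headnode", "remote_pending", demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

In a real sensor, get_value would call query_remote_pending with an environment pointing at the remote cluster, and the complex name would have to match the complex defined for the load value.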
>>>>>>> --------------------------------------------------------------------- 
>>>>>>>
>>>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net




