[GE users] How to configure GE to send jobs to two clusters

Reuti reuti at Staff.Uni-Marburg.DE
Fri Apr 25 15:34:40 BST 2008


Hi,

On 25.04.2008 at 11:38, Esteban Freire wrote:

>>> We have two GE clusters configured, one installed locally and the
>>> other one on an external machine running the qmaster server. We would
>>> like to know if it is possible to configure our submit host to
>>> submit jobs to two different qmasters. I think we can achieve this
>>> by playing with the variable $SGE_CELL, but I would appreciate some
>>> help.
>>
>> are you sharing the $SGE_ROOT between both clusters?
>> The $SGE_CELL is not "default" in at least one of the clusters?
>> The sge_qmaster/sge_execd ports are the same in both clusters?
>> The same user accounts exist in both clusters?
>> The /home is shared?
>>
> No, we are not sharing the $SGE_ROOT between both clusters. They
> are different installations, installed in different paths, and
> they are also different versions: one is installed with
> *SGE 6.0u6* and the other one with *GE 6.1u3*.
>
> At the moment, the $SGE_CELL is "default" in both clusters. How
> should I configure it?
>
> The sge_qmaster/sge_execd ports are not the same in both clusters.
> Should they be the same?
>
> Yes, the same user accounts exist in both clusters, but /home is not
> shared.

An additional question: are the user IDs also the same, i.e. the
numerical representation? As you don't share /home, you would
need some kind of file staging from one cluster to the other, copying
the results back after the job has run.
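
Whether the numeric IDs match can be checked quickly from the submit host. The following is only a sketch under assumptions: `remotehost` stands for any host of the other cluster, working ssh access to it is assumed, and the function names `uids_match` and `check_user` are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare a user's numeric UID on this host and on a host of the
# other cluster. Host name and ssh access are assumptions, not part of SGE.

# Succeeds only if both UIDs are non-empty and identical.
uids_match() {
    # $1 = local numeric uid, $2 = remote numeric uid
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Looks up the UID locally and via ssh, then reports the result.
check_user() {
    user=$1
    remote=$2
    local_uid=$(id -u "$user") || return 2
    remote_uid=$(ssh "$remote" id -u "$user") || return 2
    if uids_match "$local_uid" "$remote_uid"; then
        echo "$user: uid $local_uid on both hosts"
    else
        echo "$user: uid differs (local $local_uid, remote $remote_uid)" >&2
        return 1
    fi
}

# Example call (placeholder names):
#   check_user esteban remotehost
```

If the numbers differ, files copied between the clusters may end up owned by the wrong account, so this is worth checking before setting up any file staging.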

>>> Another question we are interested in: can we also get a
>>> qstat result for both qmasters?
>>
>> If you write a wrapper for qstat (and qsub) that selects a
>> different cell name on the fly, it should work. It will look up the
>> other qmaster by reading its name from that cell's act_qmaster file,
>> which is different. In case you don't share $SGE_ROOT, you can just copy
>> the static information in $SGE_ROOT/other_cell to the other cluster.
>>
> Sorry, how can I select a different cellname with qstat?

If you had the same version of SGE in both clusters (with
identical ports for sge_qmaster) and a copy of $SGE_ROOT/other_cluster
in the other cluster to get the proper act_qmaster file, you could
issue:

reuti@server:~> SGE_CELL=dummy qstat
error: cell directory "/usr/sge/dummy" doesn't exist

Of course, I have no other cell directory there, so I get the error.
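
A wrapper along these lines could loop over all cell directories and call qstat once per cell. This is a minimal sketch, not a tested setup: it assumes the conditions above hold (same SGE version, identical sge_qmaster port, the other cell directory copied below $SGE_ROOT), and the function name `qstat_all_cells` and the `QSTAT` override variable are hypothetical.

```shell
#!/bin/sh
# Sketch of a qstat wrapper that queries every cell found under $SGE_ROOT.
# A directory counts as a cell if it contains common/act_qmaster.
# QSTAT may be set to another command for dry runs; it defaults to qstat.
qstat_all_cells() {
    root=${SGE_ROOT:?SGE_ROOT must be set}
    for f in "$root"/*/common/act_qmaster; do
        # Skip the unexpanded pattern if no cell directory exists.
        [ -f "$f" ] || continue
        # The cell name is the directory two levels above act_qmaster.
        cell=$(basename "$(dirname "$(dirname "$f")")")
        echo "=== cell: $cell (qmaster: $(cat "$f")) ==="
        # Select the cell for this single call only.
        SGE_CELL=$cell "${QSTAT:-qstat}" "$@"
    done
}
```

Called as `qstat_all_cells -u someuser`, it passes the arguments unchanged to each qstat call. A qsub wrapper would instead pick exactly one cell, since a job can only go to one qmaster.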

-- Reuti


>> -- Reuti
>>
>> PS: Another option could be: http://gridengine.sunsource.net/howto/TransferQueues/transferqueues.html
>> The included scripts are for 5.3 and will need some tweaking to work
>> under 6.0.
>>
>>
> Ok, I have to read this.
>
> Thanks,
> Esteban
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

