[GE users] Multiple Existing Clusters

Reuti reuti at staff.uni-marburg.de
Thu Nov 3 22:04:59 GMT 2005

Hi Mat,

On 02.11.2005 at 15:24, Bradford, Matthew wrote:

> Dear all,
> We currently have a setup that consists of several PC clusters,
> each with its own master server and configuration, which we have
> previously had very little to do with. The clusters normally
> run parallel MPI applications within their own cell. Each
> cluster is also set up so that the internal cluster nodes are  
> hidden from the general network, with only the master server  
> visible to the network. Normally, a user remotely logs in to the  
> master server and uses it as a submission host. As is usual, each  
> project submits their jobs to their own cluster, and there is no  
> communication or integration of any of the clusters. All clusters  
> are operating within the same domain.
> We are now looking at how this can be improved. There are probably  
> several things I don't really understand and would appreciate any  
> advice that I could get.
> 1. What is the best way of integrating all the clusters so that a  
> user can submit a job without needing to submit it to a specific  
> cluster?
>         If we were starting from scratch then I would assume that  
> the simplest way would be not to have master servers running on
> each independent cluster, but have a single, central master server,  
> with cluster queues and parallel environments set up that manage  
> each cluster. The user would then only need to submit their job to  
> the master server, stating that it should run within a parallel  
> environment, without needing to identify the specific queue in  
> which it runs. (As explained in
> http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=13455 ).
> This type of set-up
> would require that each node within the cluster is visible from the  
> central master server and therefore each node would require a  
> separate network connection.
> 2. Is using TransferQueues a good way of integrating several clusters?
>         If we were not looking to modify the existing set-up of the  
> clusters a great deal, is the alternative to use TransferQueues? If
> I understand this correctly, a local cluster can have additional
> queues set up as transfer queues for each remote cluster that
> should be made available. A user would then log on to their own
> cluster, submit a job, and it may be sent to any of the other
> clusters for which a transfer queue exists. This mechanism would
> not require the internal cluster nodes to be made available to the
> general network, but would mean that each individual cluster would  
> have to be administered separately.
> 3. Could we integrate clusters using the Globus-SGE integration?
>         I am not sure how this will function, but assume that a  
> Globus component will act as the central submission point, and will  
> make submissions to any of the SGE controlled clusters. Do the  
> clusters feed back their resource level to the Globus component,  
> allowing Globus to decide which cluster the submitted job should be  
> sent to? Is this really intended for global rather than campus grid
> setups?
> 4. Is there an alternative way of integrating several existing  
> clusters into a single campus grid environment?
> I'm not sure if questions like this are suitable for this mailing  
> list, but I'm not sure where else to go. I have looked at several  
> "How to" docs, but am still a bit unclear on the best way forward.

You ordered the choices in the same order I would try them. For 1.,
the question is whether you have managed switches and could use them
to set up a VPN for just the compute nodes, so that you don't have to
worry about where on the campus a node is located. That network would
certainly be separate from the users' normal workstations and from the
Internet, and not reachable from the outside world. At some point the
load might get hard on the file server, so the question is: how many
grids and nodes are we talking about?

Transfer queues are also a good choice, since they work without any
other software having to be installed. The question there is the
restriction mentioned on the Howto page: is it possible to have the
same namespace and a common filesystem?
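
For a first impression of the namespace question, a rough Python
sketch along the following lines could compare passwd-style dumps
(e.g. the output of "getent passwd") taken on two master servers. It
assumes that "same namespace" means the same user names mapping to the
same uid/gid on every cluster, which is one reading of the restriction
and not a quote from the Howto. The function names and the sample
entries are made up for illustration, and the common filesystem is not
checked at all.

```python
# Sketch only: find users whose uid/gid differ between two clusters.
# Assumes "same namespace" means identical name -> (uid, gid) mappings.


def parse_passwd(text):
    """Map user name -> (uid, gid) from passwd-format lines."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) < 4:
            continue  # skip malformed lines
        users[fields[0]] = (int(fields[2]), int(fields[3]))
    return users


def namespace_conflicts(passwd_a, passwd_b):
    """Return users present in both dumps whose uid or gid differ."""
    a = parse_passwd(passwd_a)
    b = parse_passwd(passwd_b)
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if a[name] != b[name]
    }


if __name__ == "__main__":
    # Made-up entries standing in for two hypothetical clusters.
    cluster_a = (
        "alice:x:1001:100::/home/alice:/bin/bash\n"
        "bob:x:1002:100::/home/bob:/bin/bash\n"
    )
    cluster_b = (
        "alice:x:1001:100::/home/alice:/bin/bash\n"
        "bob:x:2002:100::/home/bob:/bin/bash\n"
    )
    conflicts = namespace_conflicts(cluster_a, cluster_b)
    for name, (ids_a, ids_b) in conflicts.items():
        print(f"{name}: cluster A {ids_a}, cluster B {ids_b}")
```

Users that exist on only one cluster are not reported here; whether
that matters depends on who is allowed to use the transfer queues.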

-- Reuti
