[GE users] Virtualization and GridEngine

Ignacio Martin Llorente llorente at dacya.ucm.es
Mon Nov 10 20:28:41 GMT 2008


As a final comment, I have been evaluating your proposal to have a
cluster able to both execute jobs and run VMs. You could have two
front-ends for the cluster (they could be hosted on the same system)
and dynamically enable/disable worker nodes for OpenNebula and SGE.
For example, in a 10-node cluster, you could enable 2 nodes as SGE
execution hosts and 8 nodes as ONE (OpenNebula) hosts. Then, to meet
a peak demand in the SGE cluster, you could suspend or migrate the VMs
running on 5 of the ONE nodes, disable those nodes (this is easy, it
only requires running a disable command) and add the 5 nodes to the
SGE cluster.
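
A minimal sketch of the repartitioning described above, as a toy model
only. The class, method and action names are hypothetical, and no real
SGE or OpenNebula command is run; the code just plans the steps in the
order given in the text (free the node, disable it in OpenNebula, add
it to SGE).

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of one cluster shared by two managers (hypothetical names)."""
    sge_nodes: set = field(default_factory=set)  # nodes acting as SGE execution hosts
    one_nodes: set = field(default_factory=set)  # nodes acting as OpenNebula hosts

    def move_to_sge(self, count):
        """Plan the hand-over of `count` nodes from OpenNebula to SGE.

        Returns the ordered list of (action, node) steps; nothing is executed.
        """
        if count > len(self.one_nodes):
            raise ValueError("not enough OpenNebula nodes to move")
        steps = []
        for node in sorted(self.one_nodes)[:count]:
            steps.append(("suspend_or_migrate_vms", node))  # free the node first
            steps.append(("disable_in_opennebula", node))   # stop new VM placement
            steps.append(("add_to_sge", node))              # register as execution host
            self.one_nodes.remove(node)
            self.sge_nodes.add(node)
        return steps


# The 10-node example: 2 SGE execution hosts and 8 ONE hosts to start with.
nodes = [f"node{i:02d}" for i in range(1, 11)]
cluster = Cluster(sge_nodes=set(nodes[:2]), one_nodes=set(nodes[2:]))
plan = cluster.move_to_sge(5)
print(len(cluster.sge_nodes), len(cluster.one_nodes))
```

After the move, the model holds 7 SGE nodes and 3 ONE nodes. In a real
setup each planned step would map to whatever command the site uses for
that manager.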

Thanks. It is very important for us to know the requirements of
potential end users.


Ignacio M. Llorente, Full Professor (Catedratico): web http://dsa-research.org/llorente 
  and blog http://imllorente.dsa-research.org
DSA Research Group:  web http://dsa-research.org and blog http://blog.dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org

On 10/11/2008, at 19:09, daireb wrote:

> Ignacio,
> Eek... two different threads for us to swap emails in!
> ----- "Ignacio Martin Llorente" <llorente at dacya.ucm.es> wrote:
>>> In our case I can see our compute farm becoming a useful way of
>>> providing cheap development 'hardware' for software engineers to
>>> play around with (a la EC2). I doubt we are going to run a permanent
>>> home server on a virtual cluster anytime soon but certainly
>>> development servers and compute farm execution hosts are good
>>> candidates.
>> Sure, but if you are planning to do this, why don't you virtualize
>> the whole cluster? Then you can boot the execution hosts on demand
>> with the required pre-configured environments, the development
>> servers... with added benefits such as consolidation, dynamic
>> resizing...
>> As an alternative, if for performance reasons you want to use bare
>> metal for the execution of the jobs, you could use SGE for the
>> management of the jobs and OpenNebula for the management of the VMs
>> in the same cluster. In both managers you can specify the hosts, so
>> you can dynamically allocate hosts to both managers.
> Maybe one day it will be possible to simply virtualise the whole
> cluster. However, as it stands there are still performance
> advantages to using bare metal and maximising resources. In the case
> of desktop machines, I think it will be a while before you can
> virtualise them and get good performance (PCI passthrough is in the
> works). We are interested in launching VMs dynamically on desktop
> machines to make use of idle CPU cycles; it may be that OpenNebula
> would be a better option for this. I thought it might be a smoother
> transition for us to allow jobs and VMs side by side on the
> compute farm. We can roll out VM usage in stages without completely
> overhauling everything in one go.
> But I do get your points.
>> Job priority is used to allocate available slots. When your cluster
>> has 5 slots, the manager submits the 5 jobs with the highest
>> priority. When you submit a service with an ordering for starting
>> its VMs, however, the manager cannot boot a VM until the previous
>> one has finished its boot process. In addition, you need a rollback
>> process in case one of the VMs fails...
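
A minimal sketch of the ordered start with rollback described in the
quoted paragraph above. The `boot` and `shutdown` callbacks are
hypothetical stand-ins for whatever the VM manager provides; this is an
illustration of the control flow, not OpenNebula's implementation.

```python
def start_service(vm_names, boot, shutdown):
    """Boot VMs strictly in order; undo everything if one fails.

    `boot(name)` must block until the VM has finished booting and return
    True on success; `shutdown(name)` stops a VM that is already up.
    Both are hypothetical callbacks supplied by the caller.
    """
    started = []
    for name in vm_names:
        if boot(name):
            started.append(name)
            continue
        # Rollback: stop the VMs that did come up, newest first.
        for running in reversed(started):
            shutdown(running)
        return []
    return started


log = []


def fake_boot(name):
    log.append(("boot", name))
    return name != "app"  # pretend the "app" VM fails to boot


def fake_shutdown(name):
    log.append(("shutdown", name))


result = start_service(["db", "nfs", "app", "web"], fake_boot, fake_shutdown)
```

In this run the third VM fails, so "web" is never booted and the two
VMs that were already up are shut down in reverse order. A priority
queue of independent jobs needs neither the ordering nor the undo step,
which is the difference the paragraph points at.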
> I see what you mean. I suppose there is always a way though - maybe
> use job dependencies? A job has to run inside the VM before the next
> VM (job) can start. Rollback might be done by resubmitting the
> dependency chain from scratch and/or exiting with status 99 to
> reschedule on failure. But again, these are just hacks that try to
> replicate a proper VM manager like OpenNebula. It all comes back to
> whether there really are any advantages in using the same scheduler
> to manage VMs and jobs simultaneously... You're starting to convince me!
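
A rough sketch of the job-dependency idea in the quoted paragraph
above: each step is submitted with `qsub -N <name>` and held on the
previous step with `-hold_jid`. The script names and the `vmstep`
naming scheme are hypothetical, and the function only builds the
command lines; it does not submit anything.

```python
def chained_qsub_commands(scripts):
    """Build one qsub command line per script, each held on the previous job."""
    commands = []
    previous = None
    for index, script in enumerate(scripts):
        name = f"vmstep{index}"  # hypothetical job naming scheme
        argv = ["qsub", "-N", name]
        if previous is not None:
            argv += ["-hold_jid", previous]  # wait for the previous job to finish
        argv.append(script)
        commands.append(argv)
        previous = name
    return commands


# Hypothetical scripts, each of which would start one VM and wait for it.
cmds = chained_qsub_commands(["boot_db.sh", "boot_app.sh", "boot_web.sh"])
for argv in cmds:
    print(" ".join(argv))
```

Each script could exit with status 99 on failure so that Grid Engine
reschedules it, as suggested in the text. Stopping the VMs that are
already running would still have to be handled separately.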
>> My experience is that you can get that, but at the expense of
>> efficiency and functionality. I understand what you mean. We come
>> from the computing world and we originally thought that was possible.
>> However, job and VM management are quite different. Now we see the
>> benefits of decoupling job and VM management.
> I started this thread to try and get my head around the differences  
> between job scheduling and emerging VM managers like OpenNebula.  
> Your input has certainly helped - thanks!
> Daire
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88412
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

