[GE users] Virtualization and GridEngine

Ignacio Martin Llorente llorente at dacya.ucm.es
Thu Nov 6 14:40:07 GMT 2008


Hi,

Years ago, when we decided to design and develop the OpenNebula VM
Manager, we evaluated existing job managers (mainly SGE and GridWay)
with a view to adapting their capabilities to manage the VM
life-cycle. However, we found the following limitations:

1. VM structure: The definition of a VM requires special treatment:
images with fixed and variable parts for migration, contextualization
parameters...
2. VM life-cycle: VM management requires fixed and transient states
for contextualization, live migration...
3. VM duration: VMs run for very long time periods ("forever").
4. VM groups (services): VMs are not independent entities. A service
consists of a group of interconnected VMs (not an array in the job
management sense).
5. VM elasticity: Groups of VMs can grow to satisfy a given SLO, and
you could even dynamically update the memory or CPU requirements of a
running VM.
6. Finally, the aim of the scheduling heuristics is different. While
in job management we try to optimize performance criteria such as
turnaround time, throughput...; in VM management, we focus on capacity
provision, for example the probability of SLA violation for a given
cost of provisioning, including support for server consolidation,
partitioning...
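
To make point 2 more concrete, here is a minimal sketch of a VM
life-cycle with fixed and transient states. The state names and the
transition table are assumptions made up for illustration; they are not
the actual states used by OpenNebula or by any job manager.

```python
from enum import Enum


class VMState(Enum):
    """Hypothetical VM states, for illustration only."""
    PENDING = "pending"
    BOOTING = "booting"        # transient: contextualization in progress
    RUNNING = "running"
    MIGRATING = "migrating"    # transient: live migration in progress
    DONE = "done"


# States the VM only passes through on its way to a fixed state.
TRANSIENT_STATES = {VMState.BOOTING, VMState.MIGRATING}

# Allowed transitions (assumed, not taken from any real VM manager).
TRANSITIONS = {
    VMState.PENDING: {VMState.BOOTING},
    VMState.BOOTING: {VMState.RUNNING},
    VMState.RUNNING: {VMState.MIGRATING, VMState.DONE},
    VMState.MIGRATING: {VMState.RUNNING},
    VMState.DONE: set(),
}


def is_transient(state):
    """Return True if the state is a transient one."""
    return state in TRANSIENT_STATES


def advance(current, target):
    """Return the target state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

A typical job life-cycle has no counterpart to a RUNNING -> MIGRATING
-> RUNNING loop, which is one reason why a job manager's state model
does not map cleanly onto VMs.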

That does not mean that virtualization cannot be integrated with job
managers. I know of two approaches:

A. VMs to Provide Pre-Created Software Environments for Jobs

As described by Andreas, some job managers provide extensions that
create VMs on a per-job basis so as to provide a pre-defined
environment for job execution. These approaches still manage jobs, and
the VMs are bound to a given host and only exist during job
execution.


B. Job Managers on Top of a Virtualized Infrastructure

An SGE cluster service can run on top of a virtual infrastructure,
managed for example by OpenNebula; see:

http://gridgurus.typepad.com/grid_gurus/2008/10/elastic-managem.html

Notice that this approach provides a full separation between the
service and the infrastructure. In other words, you run two
independent managers: the job manager and the VM manager. In addition,
you could add a new manager, the service manager. For example, Hedeby
(http://hedeby.sunsource.net/) could be used to request new virtual
worker nodes on demand when the number of pending jobs exceeds a given
threshold.
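
The threshold rule could be sketched roughly as follows. This is only
an illustration of the decision a service manager might take; the
function name, its parameters and the default values are all
hypothetical, and it does not use the Hedeby, SGE or OpenNebula APIs.

```python
import math


def workers_to_request(pending_jobs, threshold,
                       jobs_per_worker=4, max_new_workers=8):
    """Return how many new virtual worker nodes to request.

    All parameters are assumptions for this sketch:
      pending_jobs    -- pending jobs reported by the job manager
      threshold       -- pending jobs tolerated before scaling out
      jobs_per_worker -- jobs one new worker is expected to absorb
      max_new_workers -- cap on workers requested in one round
    """
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")

    excess = pending_jobs - threshold
    if excess <= 0:
        # At or below the threshold: no new workers are needed.
        return 0

    # Enough workers to absorb the excess, limited by the cap.
    needed = math.ceil(excess / jobs_per_worker)
    return min(needed, max_new_workers)
```

The service manager would call something like this periodically with
the queue length obtained from the job manager, and pass the result on
to the VM manager as a request for new worker VMs.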

Regards


--
Ignacio M. Llorente, Full Professor (Catedratico): web http://dsa-research.org/llorente 
  and blog http://imllorente.dsa-research.org
DSA Research Group:  web http://dsa-research.org and blog http://blog.dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org







On 06/11/2008, at 14:29, Andreas.Haas at Sun.COM wrote:

> Hi Diare,
>
> I know at least of one case where exactly this has been done:
>
> - prolog/epilog used to launch/collapse the VM
> - starter method used to start the job inside the VM via ssh
> - termination/suspension/migration be implemented similarly
>
> it's probably not the worst solution, but I can not say for sure,
> if it went into production finally.
>
> Regards,
> Andreas
>
> On Thu, 6 Nov 2008, Daire Byrne wrote:
>
>> Hi,
>>
>> I was wondering if anybody has ever considered using GridEngine to  
>> manage virtualized machines? There are many virtual machine  
>> managers appearing these days (e.g. Redhat's oVirt, OpenNebula,  
>> XenServer etc.) which all are moving towards managing many VMs on a  
>> cluster with features such as load balancing, live migration and  
>> server hardware matching.
>>
>> Apart from live migration I think that most of the features of a VM  
>> manager could be replicated in GridEngine where a VM is a job.  
>> Controlling the job controls the VM with some starter_method,  
>> suspend_method, resume_method and terminate_method scripting. I did  
>> a quick search on Google but didn't really find anything relevant  
>> which usually means either nobody has tried it or it's a stupid  
>> idea! With two queues (say) the VMs could be started on one and  
>> then once they start up they become part of another execution queue.
>>
>> Daire
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88197
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>
>
> http://gridengine.info/
>
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88208
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88216

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


