[GE users] How to combine SGE and Condor under the same submit system?

Reuti reuti at staff.uni-marburg.de
Tue Sep 16 11:21:10 BST 2008


Hi,

On 16.09.2008 at 09:14, Atle Rudshaug wrote:

> Reuti wrote:
>> Hi,
>>
>> On 15.09.2008 at 16:32, Atle Rudshaug wrote:
>>
>>> I want to combine a dedicated cluster and the office workstations
>>> into a grid-like system. I want large MPI jobs to run on a Rocks
>>> cluster (which uses SGE) and smaller (MPI and non-MPI) jobs to
>>> run on available SMP workstations (using some technique for cycle
>>> scavenging with checkpointing and migration).
>>>
>>> AFAIK Condor is better for cycle scavenging and
>>> checkpointing/migration. Is there a way to combine SGE (for the
>>> cluster MPI jobs) and Condor (for cycle scavenging on the
>>> workstations) under the same submit system (GridWay? Condor-G?
>>> Condor-C?)? Or can SGE be used for the workstation cycle
>>> scavenging, from the already up and running Rocks cluster, as well?
>>
>> you can combine them by using just the checkpointing feature of  
>> Condor:
>>
>> http://gridengine.sunsource.net/howto/checkpointing.html (near the  
>> end Condor integration is outlined)
>>
>> and still submit all jobs in SGE. You can handle the workstations
>> e.g. with a calendar, so that they are available only outside
>> working hours.
>>
>> In contrast to a real Condor cluster (where no NFS needs to be
>> present), you will either need NFS or have to script some file
>> staging on your own.
>>
>> -- Reuti
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>
>>
> Ok, will migration be possible with this solution? A problem with  
> the calendar setup is that there are different workstations  
> available for longer periods of time throughout the working day as  
> well (e.g. user out on consultant work). It would be nice to have a  
> system that will utilize these automatically, saving the sys-admin  
> the work of setting up calendars every day for different machines.
>
> Have I understood correctly that if I set up a Condor grid for  
> scavenging, GridWay (or some other tool) could be used to submit  
> jobs to both the SGE and the Condor grid? Or can SGE send jobs to  
> Condor or vice-versa?

a) Only the checkpointing feature of Condor will be used, which means
that you have to recompile your programs with a standalone version of
Condor and then execute them under SGE as outlined in the Howto.
This might be necessary if you have to migrate the jobs in the
morning because their runtime is longer than the nightly hours.

b) You can implement some kind of transfer queue in SGE, which will
forward the jobs to Condor, to which the workstations are connected.
You will find an outline here:
http://gridengine.sunsource.net/howto/TransferQueues/transferqueues.html
It will need some rework for 6.0 and for cooperation with Condor.

c) As said, the main difference is that applications running under
Condor get their system calls to the kernel caught and redirected to
the submitting machine to access input and output files. Hence the
mentioned NFS is not necessary.
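
To make b) a bit more concrete, here is a minimal sketch of what the
forwarding step of such a transfer queue could look like. It is not
taken from the Howto: the function names and the file names job.out,
job.err, job.log and job.submit are made up for illustration, and it
assumes that condor_submit and condor_wait are in the PATH of the
machine serving the transfer queue.

```python
import subprocess
from pathlib import Path


def condor_submit_description(executable, arguments, workdir, universe="vanilla"):
    """Return the text of a Condor submit description file for one job."""
    lines = [
        f"universe = {universe}",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append("arguments = " + " ".join(arguments))
    lines += [
        f"initialdir = {Path(workdir)}",
        "output = job.out",
        "error = job.err",
        "log = job.log",
    ]
    if universe == "vanilla":
        # Without NFS, let Condor move the files itself. Jobs relinked
        # for the standard universe use remote system calls instead.
        lines += [
            "should_transfer_files = YES",
            "when_to_transfer_output = ON_EXIT",
        ]
    lines.append("queue")
    return "\n".join(lines) + "\n"


def forward_to_condor(executable, arguments, workdir, universe="vanilla"):
    """Hypothetical body of the transfer queue job: submit, then wait."""
    workdir = Path(workdir)
    submit_file = workdir / "job.submit"
    submit_file.write_text(
        condor_submit_description(executable, arguments, workdir, universe)
    )
    subprocess.run(["condor_submit", str(submit_file)], check=True)
    # Block until the Condor job has left the queue, so that the SGE job
    # occupying the transfer queue slot ends together with it.
    subprocess.run(["condor_wait", str(workdir / "job.log")], check=True)
```

The SGE job in the transfer queue would then only call
forward_to_condor() instead of the application itself. Killing or
suspending the forwarded job from SGE is not covered here.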

Whether file staging and/or NFS is an option for you is something
you must decide when implementing it.
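
If you go for file staging, the job wrapper could roughly look like
the following sketch. The function run_staged() and its parameters
are hypothetical. The plain copy stands in for whatever transport you
actually have between the submit host and the workstation (scp, rsync
or similar).

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_staged(command, inputs, outputs, submit_dir, scratch_root=None):
    """Stage inputs to local scratch, run the command, stage outputs back."""
    submit_dir = Path(submit_dir)
    scratch = Path(tempfile.mkdtemp(prefix="stage-", dir=scratch_root))
    try:
        # Stage in: replace the copy by scp/rsync if there is no shared
        # file system between submit host and workstation.
        for name in inputs:
            shutil.copy2(submit_dir / name, scratch / name)
        result = subprocess.run(command, cwd=scratch)
        # Stage out only what the job actually produced.
        for name in outputs:
            produced = scratch / name
            if produced.exists():
                shutil.copy2(produced, submit_dir / name)
        return result.returncode
    finally:
        # Leave nothing behind on the workstation.
        shutil.rmtree(scratch, ignore_errors=True)
```

The job script submitted to SGE would call run_staged() with the real
application as command, so the application itself sees only local
files.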

If possible, I would first try to implement it using SGE only, then,
if necessary, add checkpointing with the Condor libraries, and only
as a last resort introduce a second queueing system for some nodes
(first by using a transfer queue; only if that doesn't work, by
giving the users direct access to Condor). But GridWay/Globus is far
too much for a local cluster IMO.

-- Reuti


>
> - Atle





