[GE users] Jobs for cluster management?

Chris Dagdigian dag at sonsorol.org
Fri Dec 23 15:06:16 GMT 2005


The problem is that nobody manages their cluster the same way,  
especially when it comes to the specifics of system and OS management.

For instance, most operators of larger clusters would never concern  
themselves with the details of a single node - they build their  
infrastructures so as to allow for a complete unattended bare-metal  
OS installation over the network.  When these systems exist, all you  
need to do is touch a TFTP config file and remotely power cycle a  
node to have it completely wiped and replaced with the newest image.   
Updating systems becomes a single-click operator task that can either
be done by hand on a busy system whenever the situation allows, or be
trivially scripted.  Deep SGE integration is unnecessary.
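
As a rough illustration, at such a site re-imaging a node might boil
down to a script along these lines (a sketch only -- the pxelinux-style
TFTP layout, the "reinstall" profile name and the BMC naming/credentials
are placeholders, not any particular product's conventions):

    #!/bin/sh
    # Hypothetical node rebuild via PXE + IPMI (paths are site-specific).
    NODE=$1
    HEXIP=$(gethostip -x "$NODE")       # pxelinux names its configs by hex IP
    cd /tftpboot/pxelinux.cfg || exit 1
    ln -sf profiles/reinstall "$HEXIP"  # next PXE boot wipes and reinstalls
    ipmitool -I lan -H "${NODE}-bmc" -U admin -P secret chassis power cycle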

I guess my take, then, is that there will never be a suitable
one-solution-fits-all way to do this within Grid Engine, and I'd rather
have the developers working on scheduling/job-related RFEs and
enhancements.

Although ...

If we could narrow this down into a targeted RFE then it would
certainly be worth doing. For instance -- what about an enhancement
request that would allow an SGE operator to attach a custom status
message to queues in the disabled ("d") state? The status message
would allow us to discern why nodes are disabled ("broken" vs
"needs_bios_update"), and we could also use qselect or XML qstat
output to programmatically discover the nodes that are in "d" state
because they require maintenance.
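
The discovery half of that already works today with nothing more than
the stock tools -- roughly like this, assuming an SGE 6.x qstat/qselect
that supports the -qs state filter and -xml output:

    # queue instances currently in the disabled ("d") state
    qselect -qs d

    # the same information as XML, for scripted parsing
    qstat -f -qs d -xml

The missing piece is the "why", which is exactly what the custom status
message would add.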

Having a custom state message associated with "d" queues would be
pretty useful. Other ideas? If we are going to make an RFE targeted
towards system management it should probably be very detailed and
specific as to how it would work.
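
In the meantime the "why" can be faked with a thin wrapper that records
the reason out-of-band; a minimal sketch (the script name and the shared
/sge/admin path are made up for illustration):

    #!/bin/sh
    # disable-with-reason.sh <queue_instance> <reason>
    # Disables the queue instance and logs why, since SGE itself has
    # nowhere to attach a per-queue status message today.
    QI=$1; REASON=$2
    qmod -d "$QI"
    echo "$(date '+%Y-%m-%d %H:%M') $QI $REASON" >> /sge/admin/disabled-reasons.log

A grep of that log (or a join against "qselect -qs d" output) then tells
you which disabled nodes are waiting on maintenance rather than simply
broken.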

-Chris





On Dec 23, 2005, at 8:10 AM, Jon Lockley wrote:

> Well, kind of.
>
> You'd have to "qmod -d" everything you need to upgrade and make a
> list of all those nodes. Then frequently run a cron (or similar) job
> to check when these nodes become empty. After applying the upgrade you
> then need to remove each node from the list (so that you don't try to
> upgrade it again). You shouldn't rely on the queue state as a test of
> upgrades, as there are all sorts of reasons why it might be disabled.
>
> So yes, it's doable with some scripting, but *if* other folks are
> doing the same stuff, would it make any sense (and is it worth the
> hassle!) to make it part of SGE?
>
> Thanks,
>
> Jon
>
> On Fri, 23 Dec 2005, Chris Dagdigian wrote:
>
>>
>> Sounds like using Grid Engine's disable-queue function ("qmod -d
>> <queue instance>") would get you the same thing:
>>
>> - running jobs are untouched in disabled queues
>> - no user jobs are ever touched, suspended, re-queued or killed
>> - no new work gets sent to disabled queues (thus draining the machine)
>> - you can easily disable every node in the cluster ("qmod -d '*'") or
>>   do it in manageable groups
>> - you know which nodes still need admin work done because they are in
>>   state 'd'
>> - a node that is rebooted for admin reasons (update, new kernel, etc.)
>>   will still come online in 'disabled' state
>>
>>
>> -Chris
>>
>>
>> On Dec 23, 2005, at 7:46 AM, Jon Lockley wrote:
>>
>>> Hi everyone,
>>>
>>> I'm wondering if the following is already possible (in a non-kludgy
>>> way) or whether it's something sensible to ask for as a new feature.
>>>
>>> Traditionally, when we want to upgrade the software on nodes in a
>>> cluster we drain work off those nodes by shortening the wall clock
>>> limit every few hours such that it reaches zero when the work is
>>> scheduled.  This is a bit of a pain for the users but they prefer it
>>> to the alternative: killing all running jobs at a scheduled time.
>>> Obviously this means the cluster gets fairly empty, so I'm wondering
>>> if there's a better option.
>>>
>>> My idea is to have some form of "management job" in the SGE software.
>>> Management jobs run once and once only on each node selected (usually
>>> the whole cluster, I guess) as soon as the current (user) job on it
>>> finishes.  In other words, they jump ahead of regular user jobs on
>>> nodes which haven't yet run the management job.  The node in question
>>> could then be automatically released back to normal duties, and
>>> eventually the whole cluster will have been upgraded/changed.
>>>
>>> The advantages of doing things this way are 1) you don't have to
>>> empty the cluster or kill jobs to do upgrades, 2) you're not changing
>>> anything while users have jobs running, and 3) SGE keeps track of
>>> which machines do/don't still need to execute the management tasks.
>>>
>>> I grant that this wouldn't be appropriate for every upgrade, e.g.
>>> where post-upgrade nodes can't work with pre-upgrade nodes for
>>> parallel applications.  However, I can see a lot of scenarios where
>>> it makes sense to couple the job scheduler with cluster management
>>> tasks to keep the cluster as "alive" as possible at all times.
>>>
>>> So as I said, I'm curious to know if/how this can be done, or
>>> alternatively whether other people would find it a useful SGE
>>> feature.
>>>
>>> All the best,
>>>
>>> Jon
>>>
> --------------------------------------------------------------------------
> | Dr Jon Lockley, Centre Manager   |                                      |
> | Oxford Supercomputing Centre     | Email jon.lockley at comlab.ox.ac.uk  |
> | Oxford University Computing Lab. | Tel +44 (0)1865 283569               |
> | Wolfson Building                 | Fax +44 (0)1865 273839               |
> | Parks Rd.                        | www.osc.ox.ac.uk                     |
> | Oxford, OX1 3QD                  | "Out of Darkness Cometh Light"       |
> | UK                               |                                      |
> --------------------------------------------------------------------------