Opened 51 years ago
Last modified 10 years ago
#866 new task
IZ225: need 'sdmadm add/remove/modify_slo'
Reported by: | rhierlmeier | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | hedeby | Version: | 1.0 |
Severity: | | Keywords: | Sun cli |
Cc: |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=225]
Issue #: 225
Platform: Sun
Reporter: rhierlmeier (rhierlmeier)
Component: hedeby
OS: All
Subcomponent: cli
Version: 1.0
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: TASK
Target milestone: 1.0u5next
Assigned to: adoerr (adoerr)
QA Contact: adoerr
URL: *
Summary: need 'sdmadm add/remove/modify_slo'
Status whiteboard:
Attachments:
Issue 225 blocks:
Votes for issue 225:
Opened: Wed Nov 21 08:04:00 -0700 2007
------------------------
I think we should have the possibility to change the SLOs on the fly (without stopping/starting the service).

In GEAdapter I implemented such a feature in the reload method. This method reads the new configuration, compares it with the old one and decides whether a restart of the service is necessary. If only SLOs have been changed, the service is not restarted; only the SLOManager is reconfigured.

With this approach it is possible to add/modify/remove SLOs on the fly with the mod_config and reload_component commands.

However, to implement dedicated add/modify/remove/show SLO commands, the following steps are needed:

- The service interface needs a new method setSLOs.
- In hedeby-common.xsd we have to define a global slo element.
- The AbstractServiceConfig XML type already has the slo element; it is defined in hedeby-common.xsd.
- We have to implement the following cli commands:

    sdmadm add_slo -s <service name> [-f <slo file>]
    sdmadm mod_slo -s <service name> -n <slo name>
    sdmadm remove_slo -s <service name> -n <slo name>

------- Additional comments from rhierlmeier Wed Nov 21 08:26:33 -0700 2007 -------
type changed to task

------- Additional comments from crei Fri Apr 4 04:47:21 -0700 2008 -------
Supporting these commands later

------- Additional comments from rhierlmeier Tue Aug 5 03:41:34 -0700 2008 -------
The commands for modifying SLOs on the fly are very important. We should implement them in the near future.
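The reload decision described in the original report (compare the new configuration with the old one; restart only if something other than the SLOs changed) could be sketched roughly as follows. This is an illustration only, not the GEAdapter code: ServiceConfig, ReloadAction and decide are hypothetical names, and the configuration is reduced to a cluster name plus a list of SLO names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class ReloadDecisionSketch {

    /** Hypothetical, heavily simplified service configuration. */
    static final class ServiceConfig {
        final String clusterName;   // stands in for all non-SLO settings
        final List<String> slos;    // SLO names only, for illustration

        ServiceConfig(String clusterName, List<String> slos) {
            this.clusterName = clusterName;
            this.slos = new ArrayList<>(slos);
        }

        /** True if everything except the SLO list is unchanged. */
        boolean sameExceptSlos(ServiceConfig other) {
            return Objects.equals(clusterName, other.clusterName);
        }
    }

    /** The two outcomes named in the report. */
    enum ReloadAction { RECONFIGURE_SLO_MANAGER, RESTART_SERVICE }

    /** Restart only when something other than the SLOs differs. */
    static ReloadAction decide(ServiceConfig oldConfig, ServiceConfig newConfig) {
        return oldConfig.sameExceptSlos(newConfig)
                ? ReloadAction.RECONFIGURE_SLO_MANAGER
                : ReloadAction.RESTART_SERVICE;
    }

    public static void main(String[] args) {
        ServiceConfig before = new ServiceConfig("gridA", Arrays.asList("slo1"));
        ServiceConfig sloChange = new ServiceConfig("gridA", Arrays.asList("slo1", "slo2"));
        ServiceConfig clusterChange = new ServiceConfig("gridB", Arrays.asList("slo1"));

        System.out.println(decide(before, sloChange));     // RECONFIGURE_SLO_MANAGER
        System.out.println(decide(before, clusterChange)); // RESTART_SERVICE
    }
}
```

A dedicated add/modify/remove SLO command would always end up in the first branch, which is why it would not need a service restart.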
------- Additional comments from afisch Tue Aug 12 06:15:02 -0700 2008 -------
Extending the CLI with dynamic modifySLO commands

Description:
SDM could benefit from a set of commands that allow modifying SLOs independently of the component config modification. Usually the setup of the components is done only once, whereas SLO modification might be a more frequent task during the lifetime of an SDM system. Typical reasons might be a changed use case for a managed service or the adoption of new services into a running SDM system.

Evaluation:
This issue is rated p3. It is not a mandatory feature but a handy one: a set of dedicated commands to modify SLOs would simplify administration, and SLO management would be separated from the service/component setup.

A couple of remarks should be considered before implementing this feature:

1.) The update of changed SLOs might be implemented dynamically, i.e. without restarting the service. For the GE Adapter it is possible to update its configuration without an explicit service shutdown. This could be implemented for the update_SLO command in a similar fashion. However, during a service shutdown the managed domain may remain active. This is the case for a GE instance. It follows that the dynamic modification feature is not mandatory as long as restarting the service does not interrupt the managed service domain.

2.) It should be clarified whether SLO modifications lead to implicit updates (aka reload) or not. If an update is done implicitly, it would be sufficient to create/modify commands to show / add / remove / modify SLOs. Otherwise a fifth command to update is needed. From the usability point of view, the behavior of the SLO modification should be consistent with other SDM command sets. If a modify_SLO leads to an implicit update, the user might expect this behavior for other modification actions, too (e.g. modify component needs an explicit update). From this point of view a separate update would be reasonable.

3.) It should be discussed whether there are modify_SLO scenarios with side effects that have to be considered. Here is one example: when a set of services get their SLOs modified, there might be a time delay between the separate modifications. This time delay can lead to an imbalanced system. If, for example, the urgency of the services is raised/lowered, the resources might migrate in an uncoordinated way until all changes are applied. This point should be discussed together with 2.), as a separate reload might avoid such problems, especially if it could be done for a complete SDM system.

Suggested Fix/Work Around:
Currently the SLOs have to be edited by modifying the corresponding component configs.

Analysis:
In order to allow the separate modification of SLOs, a set of commands has to be developed. Additionally, man pages and wiki documentation for the commands have to be updated/created. As the service name is unique, it is reasonable to consider serviceName:SloName as the unique identifier for any SLO manipulation.

1.) addSLO: This command adds a new SLO to an existing service. It shows an editor with a default SLO XML template.

    a(dd_)slo -s(ervice) <Service name> -n(ame) <SLO name> -t(emplate) <SLO template>

2.) removeSLO: This command removes all SLOs, or the SLO with the specified name, from a service.

    r(emove_)slo -s(ervice) <Service name> [-n(ame) <SLO name>]

3.) modifySLO: This command opens all SLOs, or a single SLO, in vi to allow modification.

    m(odify_)slo -s(ervice) <Service name> [-n(ame) <SLO name>]

4.) showSLO: shows all SLOs of a) the system, b) the service, or lists c) a single SLO.

    s(how_)slo [-s(ervice) <Service name> [-n(ame) <SLO name>]]

The show command exists, but it should be extended with the following features: a -detail/-all flag and/or a format option might be useful here (to list dependent resources, just the name, etc.).
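As an illustration of the serviceName:SloName identifier proposed in the Analysis, a minimal sketch of such a key is shown below. The class name SloId and its parse method are hypothetical; the sketch also assumes that service names never contain a colon, which the ticket does not state.

```java
import java.util.Objects;

public class SloIdSketch {

    /** Hypothetical composite key "serviceName:sloName" for one SLO. */
    static final class SloId {
        final String serviceName;
        final String sloName;

        SloId(String serviceName, String sloName) {
            this.serviceName = Objects.requireNonNull(serviceName);
            this.sloName = Objects.requireNonNull(sloName);
        }

        /** Splits at the first colon (assumption: service names contain no colon). */
        static SloId parse(String text) {
            int sep = text.indexOf(':');
            if (sep <= 0 || sep == text.length() - 1) {
                throw new IllegalArgumentException("expected serviceName:sloName, got: " + text);
            }
            return new SloId(text.substring(0, sep), text.substring(sep + 1));
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SloId)) {
                return false;
            }
            SloId other = (SloId) o;
            return serviceName.equals(other.serviceName) && sloName.equals(other.sloName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(serviceName, sloName);
        }

        @Override
        public String toString() {
            return serviceName + ":" + sloName;
        }
    }

    public static void main(String[] args) {
        SloId id = SloId.parse("spare_pool:fixed_usage");
        System.out.println(id.serviceName); // spare_pool
        System.out.println(id.sloName);     // fixed_usage
    }
}
```

The same SLO name under two different services yields two distinct keys, which matches the idea that only the service name has to be unique system-wide.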
For the commands that can affect a set of SLOs (remove, show, update) it would be helpful to use a filter option to specify the set of SLOs:

    [-s(lo_)f(ilter) <e.g. 'type = "MaxPendingJobsSLO"'>]

or to enumerate them explicitly (or simply allow this for the -n(ame) option):

    [-l(ist) slo1,slo2,...]

The above commands should be implemented similarly to the corresponding modify component/service config commands, but they should only modify the configuration subset that concerns the SLO definition.

A command that should be considered separately is the updateSLO command:

updateSLO: This command updates the SLOs of a) the system, b) a service. The system-wide reload would allow a synchronized way to apply new resources (see Evaluation section). The command might be obsolete if a reload is included in commands 1.) - 4.)

    u(pdate_)slo [-s(ervice) <Service name> [-n(ame) <slo name>]]

A good starting point to implement the functionality is the inner class ReloadAction:execute in the file com.sun.grid.grm.service.impl.ge.GEServiceImpl.java in :

    ...
    if (getServiceState().equals(ServiceState.RUNNING) || getServiceState().equals(ServiceState.UNKNOWN)) {
        // We have a valid configuration, check if a reconnect is necessary
        if (config.isSameCluster(oldConfig) && jgdi.isConnected()) {
            log.log(Level.INFO, "gsi.sameCluster", getName());
            log.log(Level.FINE, "gsi.reinitSLOs", getName());
            hostManager.stop(false);
            hostManager.start();
            sloManager.interrupt();
            sloManager.setSLOs(config.getSLOs());
            sloManager.setUpdateInterval(config.getSloUpdateInterval().getValueInMillis());
            sloManager.triggerUpdate();
            setState(ComponentState.STARTED);
            // Issue 421: It can be that we have meanwhile missed an EXECD_DEL or
            // EXECD_ADD event. Trigger mergeResources manually
            hostManager.mergeResources(jgdi.getExecHostList());
        } else {
            log.log(Level.INFO, "gsi.newCluster", getName());
            // We have a completely new cluster
            // We really have to stop and restart this component
            try {
                StopAction stopAction = new StopAction();
                stopAction.execute();
                StartAction startAction = new StartAction();
                startAction.execute();
            } catch (GrmException ex) {
                log.log(Level.WARNING, ex.getLocalizedMessage(), ex);
                throw ex;
            }
        }
    } else {
        // If the service is not running we reconfigure only the SLOManager
        sloManager.setSLOs(config.getSLOs());
        setState(ComponentState.STARTED);
    }

The behavior can be outlined like this:

    if ServiceState Running/Unknown:
        if managed GE-system is valid:
            => Restart Host manager (GE Adapter specific, not SLO specific)
            => Restart SLO manager
        else
            => restart service/component
    else
        => update sloManager

This behavior allows reloading the component without shutting down the corresponding service where possible. It would be reasonable to separate the SLO reload action from the general component reload action. This dedicated updateSLO method needs to be implemented by every Service (e.g. the spare pool or any future service adapter). The functionality cannot be fully implemented as a general function because details of the managed service domain have to be considered. Therefore it would be reasonable to extend the Service interface (com.sun.grid.grm.service.Service) with an updateSLO() method, as SLO reload is a service-related task and not a general component task.

However, this reload method would be similar to the one in the GrmComponent interface:

    /**
     * Triggers a reload of a SLO configuration
     *
     * @throws com.sun.grid.grm.GrmException when an error happened. It can
     *         also be a ReloadSLOsNotSupportedException, when the Service does not support the
     *         reload.
     */
    public void reloadSLOs() throws GrmException;

A force option should not be necessary.
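To make the proposed reloadSLOs() contract more concrete, here is a small self-contained sketch. It is not Hedeby code: GrmException and Service are reduced stand-ins for the real types, ReloadSLOsNotSupportedException takes its name from the Javadoc above but its body is assumed, and DemoService with its SLO list of plain strings is purely hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SloReloadSketch {

    /** Stand-in for com.sun.grid.grm.GrmException (assumed to be a checked exception). */
    static class GrmException extends Exception {
        GrmException(String message) {
            super(message);
        }
    }

    /** Name from the ticket's Javadoc; constructor and message are assumptions. */
    static class ReloadSLOsNotSupportedException extends GrmException {
        ReloadSLOsNotSupportedException(String serviceName) {
            super("Service " + serviceName + " does not support reloading SLOs");
        }
    }

    /** Reduced stand-in for the Service interface, showing only the proposed method. */
    interface Service {
        String getName();

        void reloadSLOs() throws GrmException;
    }

    /** Hypothetical service that swaps its active SLOs without a restart. */
    static class DemoService implements Service {
        private final String name;
        private final boolean reloadSupported;
        private List<String> configuredSlos = new ArrayList<>();
        private List<String> activeSlos = new ArrayList<>();

        DemoService(String name, boolean reloadSupported) {
            this.name = name;
            this.reloadSupported = reloadSupported;
        }

        @Override
        public String getName() {
            return name;
        }

        /** Simulates editing the stored SLO configuration. */
        void setConfiguredSlos(List<String> slos) {
            configuredSlos = new ArrayList<>(slos);
        }

        List<String> getActiveSlos() {
            return Collections.unmodifiableList(activeSlos);
        }

        @Override
        public void reloadSLOs() throws GrmException {
            if (!reloadSupported) {
                throw new ReloadSLOsNotSupportedException(name);
            }
            // Only the SLO list is replaced; nothing else is stopped or restarted.
            activeSlos = new ArrayList<>(configuredSlos);
        }
    }

    public static void main(String[] args) throws GrmException {
        DemoService service = new DemoService("demo", true);
        service.setConfiguredSlos(Arrays.asList("fixed_usage", "max_pending_jobs"));
        service.reloadSLOs();
        System.out.println(service.getActiveSlos());
    }
}
```

Keeping the method on the Service interface lets each adapter decide what a reload means for its managed domain, while a service that cannot reload reports this through the dedicated exception instead of silently restarting.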
Finally, the commands to modify the component config should still cover the SLO modification, as this is a handy way to configure the complete component in one step.

How to test:
There should be a set of JUnit tests for each new command to check that the command can modify a DummySystem properly. For each command a Testsuite test has to be developed to test the functionality on the command line level. Each command should be tested with different scenarios:

a) add/remove/modify/show a set of SLOs
b) add/modify without saving the file to edit
c) add/remove the same SLO twice to the same service/different services
d) add/remove/modify/reload with nonexistent service
e) remove/modify/reload nonexistent SLO
f) etc.

At this point it should be considered to cover missing "modify component/system" commands with test cases, too. The corresponding tests would be very similar to these ones and should therefore be easy to implement.

ETC:
5 PD design for the SLO management module as part of the Service SDK
5 PD Implementation of the SLO management module
2 PD Implementation of the UI classes
2 PD Implementation of CLI classes
3 PD testsuite infrastructure for SLO management
2 PD concrete testsuite tests
1 PD documentation (wiki, man pages ...)
20 PD (total)

------- Additional comments from rhierlmeier Wed Nov 25 07:21:10 -0700 2009 -------
Milestone changed