[GE users] taking hosts offline

fx d.love at liverpool.ac.uk
Fri Dec 31 14:58:48 GMT 2010


Daniel Templeton <daniel.templeton at oracle.com> writes:

> Another approach is to create a host group, e.g. @disabled, and an RQS 
> that limits "hosts @disabled to slots=0".  To disable a host, just add 
> it to the host group.  The benefit of this approach is that it works for 
> all queues on the host without needing to enumerate them.

The approach I recommended uses a host group.  Don't people normally
test nodes in batch before letting users back on them?  The additional
ACL allows that.

I should have mentioned a refinement that isn't in the version I
referred to: maintaining a host comment complex that records why the
node is (semi-)disabled.  That is, sge-restrict-nodes should take a
--reason argument, which sets the string-valued `problem' complex, and
sge-unrestrict-nodes should clear it.  (The complex doesn't make the
host group redundant because, as far as I know, an RQS can't restrict
on the basis of the complex.)
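
For concreteness, here is a rough sketch of the kind of qconf calls
such a pair of scripts might issue.  It is not the actual
sge-restrict-nodes code, and it leaves out the ACL part.  It only
builds the command lines as strings, so nothing touches the
configuration.  Assumptions: the host group is called @disabled as in
the quoted message, the `problem' complex has already been defined,
the RQS name `disabled_hosts' and the helper names are made up for
illustration, and whitespace in the reason is replaced with
underscores to keep the value a single token.  The exact argument form
for -dattr may differ between Grid Engine versions, so check qconf(1).

```python
import shlex

HOSTGROUP = "@disabled"  # host group name taken from the quoted message
COMPLEX = "problem"      # string-valued complex, assumed to be defined already

# Resource quota set giving zero slots to every host in the group.
# The rule name "disabled_hosts" is an illustrative choice.
RQS_TEXT = """{
   name         disabled_hosts
   description  "No slots on hosts in @disabled"
   enabled      TRUE
   limit        hosts @disabled to slots=0
}
"""


def restrict_commands(host, reason):
    """Build the qconf calls that would (semi-)disable `host`."""
    # Assumption: keep the reason a single token by replacing whitespace.
    token = "_".join(reason.split())
    return [
        # Add the host to the host group covered by the RQS.
        ["qconf", "-aattr", "hostgroup", "hostlist", host, HOSTGROUP],
        # Record why the host is restricted.
        ["qconf", "-aattr", "exechost", "complex_values",
         f"{COMPLEX}={token}", host],
    ]


def unrestrict_commands(host):
    """Build the qconf calls that would undo restrict_commands()."""
    return [
        ["qconf", "-dattr", "hostgroup", "hostlist", host, HOSTGROUP],
        # Assumed form: delete the complex entry by name.
        ["qconf", "-dattr", "exechost", "complex_values", COMPLEX, host],
    ]


def render(commands):
    """Turn argument lists into shell-quoted command strings."""
    return [shlex.join(cmd) for cmd in commands]
```

Calling render(restrict_commands("node01", "bad disk")) gives the two
command strings for review before anything is run.  The RQS text would
be written to a file and loaded once with qconf -Arqs.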

-- 
Dave Love
Advanced Research Computing, Computing Services, University of Liverpool
AKA fx at gnu.org
