[GE users] Infiniband loadsensor

reuti reuti at staff.uni-marburg.de
Mon Aug 23 14:27:50 BST 2010


On 23.08.2010 at 15:25, reuti wrote:

> Hi Erik,
> 
> On 23.08.2010 at 15:02, erilon78se wrote:
> 
>> Does anyone know how to disable a node in SGE/OGE if a custom "load sensor" detects that the state of the link is bad?
>> 
>> * For some nodes this is "Ok", since they don't have InfiniBand.
>> * For some nodes this is "Not Ok", since they have InfiniBand.
>> 
>> I have implemented the load sensor and added a (bool) complex to SGE, but how do I disable a node which reports an inactive InfiniBand link?
> 
> It should be possible to put the node's queue instance into alarm state, which in the end works like disabling the node. An entry like:
> 
> $ qconf -sq all.q
> ...
> load_thresholds       NONE,[@infiniband=0]

Sorry, that should read: NONE,[@infiniband=yourcomplex=0]
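
To spell out the pieces: this assumes a host group named @infiniband that contains the InfiniBand nodes, and yourcomplex stands for whatever name the boolean complex was given. As a rough sketch (the complex name, its shortcut yc and the remaining column values are placeholders, not taken from the actual setup), the two entries would fit together like this:

```text
$ qconf -sc
#name         shortcut  type  relop  requestable  consumable  default  urgency
yourcomplex   yc        BOOL  ==     YES          NO          0        0

$ qconf -sq all.q
...
load_thresholds       NONE,[@infiniband=yourcomplex=0]
```

All other hosts keep the default NONE; only the hosts in @infiniband get the threshold, so nodes without InfiniBand are not affected.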

-- Reuti


> should do it. Depending on the logic you used, it might be necessary to replace the 0 with 1. The faulty nodes should then show an "a" in `qstat -f` in the column "states".
> 
> -- Reuti
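
In case it helps to compare the logic: a load sensor for this could look roughly like the sketch below. It is not the sensor from the original question, just a minimal example of the usual load sensor protocol (sge_execd sends a line on stdin for every report it wants, and "quit" to stop the sensor). The complex name yourcomplex, the device path under /sys/class/infiniband and the file name ib_sensor.sh are assumptions for illustration and need adjusting to the local setup. It reports 1 for an active link and 0 otherwise, which matches the =0 threshold above.

```shell
#!/bin/sh
# Minimal load sensor sketch: reports 1 if the InfiniBand port is ACTIVE, else 0.
# Assumptions (not from the thread): complex name, device path, script name.
COMPLEX="${COMPLEX:-yourcomplex}"
STATE_FILE="${STATE_FILE:-/sys/class/infiniband/mlx4_0/ports/1/state}"
# The host name must match the name under which SGE knows this host.
HOST_NAME="${HOST_NAME:-$(uname -n)}"

link_value() {
    # The sysfs state file contains "4: ACTIVE" while the link is up.
    if grep -qx '4: ACTIVE' "$STATE_FILE" 2>/dev/null; then
        echo 1
    else
        echo 0
    fi
}

run_sensor() {
    # One report per input line; the line "quit" ends the sensor.
    while read -r input; do
        if [ "$input" = "quit" ]; then
            return 0
        fi
        echo "begin"
        echo "$HOST_NAME:$COMPLEX:$(link_value)"
        echo "end"
    done
}

# Start the loop only when run under the (hypothetical) installed name,
# so the functions can also be sourced and tried by hand.
case "$0" in
    *ib_sensor.sh) run_sensor ;;
esac
```

The script would be registered through the load_sensor entry of the host configuration (qconf -mconf <hostname>). A node without the device file simply reports 0, which is harmless as long as that node is not a member of the @infiniband host group.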
> 
> 
>> /Erik
>> 
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=276248
>> 
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>> 
> 
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=276256



