[GE users] Infiniband loadsensor

reuti reuti at staff.uni-marburg.de
Mon Aug 23 14:25:05 BST 2010


Hi Erik,

On 23.08.2010 at 15:02, erilon78se wrote:

> Does anyone know how to disable a node in SGE/OGE if a custom "load sensor" detects that the state of the link is bad?
> 
> * For some nodes this is "Ok", since they don't have infiniband.
> * For some nodes this is "Not Ok" since they have infiniband.
> 
> I have implemented the Load Sensor and added a (bool) complex to SGE, but how do I disable a node which reports an inactive infiniband link?

It should be possible to put a node into alarm state, which in effect disables the node, since no new jobs are scheduled to it. An entry like:

$ qconf -sq all.q
...
load_thresholds       NONE,[@infiniband=0]

should do it. Depending on the logic of your load sensor (whether it reports 1 for an active link or for a failed one), you may need to replace the 0 with 1. The faulty nodes should then show an "a" in the "states" column of `qstat -f`.
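
For reference, a load sensor feeding such a threshold could look roughly like the sketch below. It is only an illustration and not Erik's actual script: the complex name `infiniband`, the `IB_SYSFS` and `COMPLEX` variables, and the use of the Linux sysfs files `/sys/class/infiniband/<dev>/ports/<n>/state` to detect the link are all assumptions. It reports 1 when at least one port is ACTIVE and 0 otherwise, using the usual begin / host:complex:value / end report format.

```shell
#!/bin/sh
# Sketch of an SGE load sensor reporting InfiniBand link state.
# Assumptions (not from the thread): the complex is named "infiniband"
# and the link state is read from the Linux sysfs port state files.

IB_SYSFS="${IB_SYSFS:-/sys/class/infiniband}"   # hypothetical override for testing
COMPLEX="${COMPLEX:-infiniband}"                # assumed complex name

# Print 1 if at least one port state file reads "4: ACTIVE", else 0.
ib_link_state() {
    for f in "$IB_SYSFS"/*/ports/*/state; do
        [ -r "$f" ] || continue      # also skips the unmatched glob pattern
        if grep -q '^4: ACTIVE' "$f"; then
            echo 1
            return 0
        fi
    done
    echo 0
}

# Emit one load report block for this host.
ib_report() {
    echo "begin"
    echo "$(hostname):$COMPLEX:$(ib_link_state)"
    echo "end"
}

# Answer each line read from stdin with one report; stop on "quit".
sensor_loop() {
    while read -r line; do
        [ "$line" = "quit" ] && return 0
        ib_report
    done
}
```

To use it as a real load sensor, the script would end with a call to `sensor_loop`, since the execution daemon requests each report by writing a line to the sensor's stdin. Note that this sketch also reports 0 on hosts without any InfiniBand device, so the threshold should only be applied to the hosts that are expected to have a link.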

-- Reuti


> /Erik

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=276253

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


