[GE users] [OT] Cluster monitoring

murple andreas.kuntzagk at mdc-berlin.de
Thu May 28 12:52:39 BST 2009


I'm searching for a monitoring solution.
I want to monitor the status of a 100+ node cluster. What I'm mostly 
interested in is the hardware status of the nodes and some attached 
(temperature, disk failure, etc.).
What solutions do you use? Here is a list of the open source products I'm 
aware of, with my impressions (mostly from reading their web pages).
Maybe somebody could comment on them.

Ganglia: Seems to be intended more for monitoring the load on the cluster
Nagios: Very powerful, but also complex to set up? Is hardware status via 
IPMI possible?
SunMC: Confusing interface, can monitor hardware in detail
Hobbit (now Xymon): Easy to set up; I have some experience with a smaller 
setup, but don't know about its hardware monitoring

regards, Andreas

