[GE users] Calculation of load average accurately

Anand S Bisen vmlinuz at abisen.com
Tue Aug 10 21:50:54 BST 2004


What is the correct way to define the load average in Sun Grid Engine
5.3ee? My cluster consists of 64 nodes, each with dual Pentium 4 3.2 GHz
processors. We currently use np_load_average in the load formula, and the
load threshold is set to 1.75.
 
What should the load formula (np_load_average) be, and what should the load
adjustment be? Right now the adjustment is 0.50 and the load threshold is
np_load_average 1.75, so new jobs are not sent to the queue if
np_load_average is > 1.75 on any of the nodes. However, when I log on to my
compute nodes I see that they are largely free and the CPUs are mostly idle,
since the running jobs use only 10-20% of each CPU. When I run programs
locally to create artificial load, the load average goes to 5 or even 7, and
only then do I see my node get a little busy.
 
Another thing I noticed, and what led me to see the underutilization of my
cluster, is that once I set up channel bonding (that is, teaming two NICs to
act as one), the load average on my Linux boxes jumped to a minimum of
1.0 1.0 1.0, even when no processes are running and the CPUs appear 100%
idle. This reduced the number of jobs being sent to the node, because Sun
Grid Engine thought the node was already loaded.
 
So my question is: is there any other way to evaluate the load on a node, or
how should I go about setting the right threshold for a dual Pentium 4
(3.2 GHz) node? It is set to 1.75 right now.
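
To make the numbers above concrete, here is a rough sketch of the arithmetic.
It assumes that np_load_average (which I believe Grid Engine reports as
np_load_avg) is simply the raw load average divided by the number of
processors, and that a node stops receiving jobs once that value is > 1.75.
It ignores the 0.50 load adjustment entirely. The function names are my own
and are not part of Grid Engine.

```python
import os


def np_load(load_avg: float, num_procs: int) -> float:
    """Load average divided by processor count (assumed definition)."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    return load_avg / num_procs


def accepts_jobs(load_avg: float, num_procs: int, threshold: float = 1.75) -> bool:
    """True while the normalized load has not gone above the threshold."""
    return not (np_load(load_avg, num_procs) > threshold)


def raw_load_at_threshold(num_procs: int, threshold: float = 1.75) -> float:
    """Raw load average at which the normalized load reaches the threshold."""
    return threshold * num_procs


def current_np_load() -> float:
    """1-minute normalized load of the local machine (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return np_load(one_minute, os.cpu_count() or 1)


if __name__ == "__main__":
    # Values taken from the post: idle baseline after bonding, then the
    # artificial loads of 5 and 7, all on a dual-processor node.
    for load in (1.0, 5.0, 7.0):
        print(load, np_load(load, 2), accepts_jobs(load, 2))
    print("raw load at threshold:", raw_load_at_threshold(2))
```

Under these assumptions the bonding baseline of 1.0 corresponds to a
normalized load of 0.5 on a dual-processor node, and the 1.75 threshold
corresponds to a raw load average of 3.5.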
 
Thanks
 
Anand


