[GE users] qmaster ignoring config changes.

griznog griznog at gmail.com
Tue Jul 20 15:38:40 BST 2010


I have an instance of Grid Engine 6.2u5, using the courtesy binaries
and running under CentOS in Amazon EC2, which has worked fine for
several months but recently started having trouble recognizing
configuration changes. We have a script which requests a new node
instance and then configures it for Grid Engine, adding it to several
hostgroups, one of which is @Ncorehosts, where N is the number of
cores. In the queue configuration we have:

slots               1,[@8corehosts=8],[@4corehosts=4],[@2corehosts=2]

This was all working fine until the last few weeks. Now, when new
instances start, they ignore the hostgroup-specific setting and are
assigned 1 slot until I either manually open the queue configuration
and save it (no changes necessary) or restart the qmaster. The
sections of the node creation script which handle this haven't
changed, and there's been no significant change to the GE config other
than adding and deleting compute instances. As a workaround I plan to
add a fake queue modification to the script after each node addition.
However, I'd rather know what made this suddenly flaky and fix the
real problem. Any suggestions for a fix, or for where to look for more
information, are appreciated.
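
For reference, the "fake queue modification" workaround might look
roughly like the sketch below. It dumps the queue configuration with
qconf -sq and loads it back unchanged with qconf -Mq. The function
name resave_queue, the QCONF override variable, and the queue name
all.q in the usage note are illustrative assumptions, not taken from
our actual script. Whether a non-interactive reload triggers the same
refresh as saving in the editor is also an assumption that would need
checking.

```shell
#!/bin/sh
# Hypothetical helper: re-save a cluster queue without changing it,
# as a stand-in for opening the queue config in the editor and saving.
# QCONF can be overridden to point at a different qconf binary.
QCONF="${QCONF:-qconf}"

resave_queue() {
    queue="$1"
    tmp="$(mktemp)" || return 1
    status=1
    # Dump the current queue configuration to a temporary file, then
    # load that same file back as a "modification".
    if "$QCONF" -sq "$queue" > "$tmp" && "$QCONF" -Mq "$tmp"; then
        status=0
    fi
    rm -f "$tmp"
    return "$status"
}
```

The node creation script would then call something like
"resave_queue all.q" after adding the new host to its hostgroups.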



