[GE users] dynamic execution nodes recognition

Krzysztof Wilk chris at gridwisetech.com
Wed May 17 12:23:08 BST 2006

Hi all,

I am struggling with the problem of dynamic recognition of execution nodes.

The master node has a permanent IP address that is known to all execution
nodes. Execution nodes have IP addresses from a known pool of addresses.
The naive approach of the master polling the execution nodes will not scale,
hence IMO the execution nodes should inform the master of their
availability. I assume that the master host and every execution node are
submit hosts as well.
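
To illustrate the push model described above, here is a minimal sketch of the bookkeeping the master would need, independent of how the announcement actually travels. The pool 192.168.10.0/24, the NodeRegistry class and the max_age timeout are all hypothetical names and values chosen for illustration; nothing here is part of Grid Engine.

```python
import ipaddress
import time

# Hypothetical address pool; the real pool is site-specific.
NODE_POOL = ipaddress.ip_network("192.168.10.0/24")


class NodeRegistry:
    """Tracks execution nodes that have announced themselves to the master."""

    def __init__(self, pool, max_age=120.0):
        self.pool = pool
        self.max_age = max_age   # seconds before an announcement goes stale
        self.last_seen = {}      # IP string -> time of the last announcement

    def announce(self, ip, now=None):
        """Record an announcement; reject addresses outside the pool."""
        if ipaddress.ip_address(ip) not in self.pool:
            return False
        self.last_seen[ip] = time.time() if now is None else now
        return True

    def available(self, now=None):
        """Return the nodes whose last announcement is still fresh."""
        now = time.time() if now is None else now
        return sorted(ip for ip, seen in self.last_seen.items()
                      if now - seen <= self.max_age)
```

With this shape the master never scans the pool: a node that stops announcing simply ages out of available(), and the cost grows with the number of live nodes rather than with the size of the pool.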

I came up with two ways the nodes could contact the master: DRMAA
file staging (sending empty files) or sending a ping (qping).
I lean toward the DRMAA approach because I think it is more reliable.
Am I right?

There is another problem arising: how should the master check for
"signals" received from the nodes?
I am thinking of two methods: a load sensor or a job prolog/epilog.
I think the former is safer, because a prolog/epilog only runs when a job
runs: if no jobs were running for a long time, no newly available
execution nodes would be found.
Am I right?
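
To make the load sensor idea concrete, here is a minimal sketch. It assumes the usual load sensor convention: the script is kept running, reads a line from stdin for each load interval, answers with a report framed by "begin" and "end" containing host:name:value lines, and exits when it reads "quit". The spool directory and the load value name announced_nodes are hypothetical; the load value would also have to be defined as a complex before it could be reported.

```python
import os
import socket
import sys

# Hypothetical directory where nodes drop their (empty) announcement files.
SPOOL_DIR = "/var/spool/sge_announce"
# Hypothetical load value name; it would have to be defined as a complex.
ATTRIBUTE = "announced_nodes"


def count_announcements(spool_dir):
    """Count announcement files; a missing directory counts as zero."""
    try:
        return len(os.listdir(spool_dir))
    except OSError:
        return 0


def format_report(host, attribute, value):
    """Build one load report in the begin / host:name:value / end format."""
    return "begin\n{}:{}:{}\nend\n".format(host, attribute, value)


def run(stdin=sys.stdin, stdout=sys.stdout, spool_dir=SPOOL_DIR):
    """Answer every input line with a report until 'quit' is read."""
    host = socket.gethostname()
    for line in stdin:
        if line.strip() == "quit":
            break
        value = count_announcements(spool_dir)
        stdout.write(format_report(host, ATTRIBUTE, value))
        stdout.flush()  # the report must not sit in a buffer
```

A real sensor script would end by calling run(). Because the sensor is triggered on every load interval regardless of whether any job runs, it does not share the prolog/epilog weakness described above.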

Maybe there is a more straightforward solution to dynamic execution
node recognition.

Thanks in advance for any hints or clues,

Chris Wilk                            chris.wilk at gridwisetech.com

GridwiseTech                          office/fax: +48 12 294 71 20
