[GE users] Is there a way to run prolog scripts (as root) in all machines involved in the Parallel task?
reuti at staff.uni-marburg.de
Thu Mar 11 18:43:15 GMT 2010
On 11.03.2010 at 19:18, goncalo wrote:
> Hi Reuti
>> why do you want to do so? Open MPI is tightly integrated into SGE
>> as long as you compile it --with-sge in the configure step. SGE
>> would disallow a `qrsh -inherit ...` to a machine not in the
>> granted list of machines. Whether it uses -builtin-, rsh, ssh or
>> tight-ssh as startup method doesn't matter.
> I'm using the SGE tight integration:
> [root@hpc001 ~]# /usr/mpi/gcc/openmpi-1.2.8/bin/ompi_info | grep gridengine
>     MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.8)
>     MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.8)
> However, in the link
> it is stated:
> "Specifically, if you execute an mpirun command in a SGE job,
> it will automatically use the SGE mechanisms to launch and kill
> processes."
> What do they mean by "SGE mechanisms to launch and kill
> processes"? Do I need to have qrsh working? I've seen that I can
> get qrsh working either with rsh (which is very insecure and I do
> not like it)
It depends; often the nodes are on a private subnet which is not
reachable from the outside anyway, so rsh could be used. As it's only
used between nodes, it doesn't matter. As `qrsh -inherit ...` will
start its own rshd/sshd for each slave task, it's also possible to
disable the system-wide rshd/sshd in /etc/xinetd.d completely. Or
(what I prefer) limit it to admin staff. The sshd definition for
qrsh_daemon will then need an additional parameter which points to a
different sshd_config to allow this for users.
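
As a rough sketch of the "different sshd_config" idea, assuming the
system-wide sshd is restricted with AllowUsers/AllowGroups lines: the
helper below derives a second configuration file without those
restrictions. The function name, the file paths and the use of the
rsh_daemon/rlogin_daemon entries are assumptions for illustration,
not a tested setup.

```shell
#!/bin/sh
# Hypothetical helper: copy the system sshd_config, dropping the
# AllowUsers/AllowGroups lines that limit the system sshd to admin staff.
#   $1 = system sshd_config, $2 = config for the sshd started by qrsh
make_sge_sshd_config() {
    grep -v -i -E '^[[:space:]]*(AllowUsers|AllowGroups)[[:space:]]' "$1" > "$2"
}

# Possible use (paths are assumptions):
#   make_sge_sshd_config /etc/ssh/sshd_config /etc/ssh/sshd_config_sge
#
# The daemon entries in the SGE configuration (qconf -mconf) would then
# point sshd at the second file with -f, for example:
#   rsh_daemon     /usr/sbin/sshd -i -f /etc/ssh/sshd_config_sge
#   rlogin_daemon  /usr/sbin/sshd -i -f /etc/ssh/sshd_config_sge
```

Only the sshd instances started for `qrsh -inherit ...` read the
second file, so normal logins stay limited to the admin staff.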
> , or with ssh (but the accounting is not correctly done
Correct, unless you compile SGE on your own. You don't need to use
that SGE build afterwards, only the sshd which is built along with it.
> ). How do you guys deal with this issue?
>> In the prolog you can use a loop across the list of machines to do
>> it on your own though.
> I thought about that, but that would mean that I would have to allow
> root ssh to all hosts, and this is something I wouldn't like...
You can run a second sshd all the time which a) allows login only for
root, b) uses a different port, and c) allows login only with ssh
keys. I use this approach to shut down some machines when apcupsd
detects that the UPSes are running out of battery. With 4 UPSes
supplied by 4 different mains feeding different parts of the cluster,
the network option apcupsd offers would mean putting the logic of
"when" to shut down "what" on each node.