[GE users] parallel jobs submission help
reuti at staff.uni-marburg.de
Thu Nov 25 18:50:35 GMT 2004
> I have SGE installed on my Rocks cluster. I have also installed some
> computational chemistry software called Molpro. If I want to run this
> program interactively on 2 CPUs, I would simply issue the command
> molpro -n2 file.com
> where file.com is my input file, after doing this and looking at top I see
> the line
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
> 3943 root 25 0 49540 48M 5604 R 199.9 0.8 0:09 0
You have to set up a parallel environment (PE) in SGE to get it working
properly with parallel jobs. See "man sge_pe".
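For reference, a PE definition (as displayed by "qconf -sp <pe_name>") looks roughly like the sketch below. The slot count, paths and PE name are illustrative assumptions, not a recommendation; adjust them to your site:

```
pe_name           mpi
slots             16
user_lists        NONE
xuser_lists       NONE
start_proc_args   /opt/sge/mpi/startmpi.sh $pe_hostfile
stop_proc_args    /opt/sge/mpi/stopmpi.sh
allocation_rule   2
control_slaves    FALSE
job_is_first_task TRUE
```

The allocation_rule of 2 is what is suggested further below for dual-CPU nodes.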
> indicating that the process is running on 2 CPUs.
> I now want to emulate this using SGE so that I can launch jobs from my
> frontend node to the various compute nodes.
> I have a script:
> $ cat molpro.sh
> #$ -cwd
> #$ -j y
> #$ -S /bin/bash
> molpro -n2 test.com
Use "molpro -n $NSLOTS test.com" instead, for integration with SGE
($NSLOTS is set by SGE to the number of slots granted to the job).
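Putting that together, the job script might look like the following sketch. The PE name "mpi" is an assumption; request whatever PE you actually configured:

```shell
#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -pe mpi 2
# SGE sets $NSLOTS to the number of slots granted to this job
molpro -n $NSLOTS test.com
```

Then submit it with "qsub molpro.sh" instead of running it by hand.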
Well, we also just bought a parallel license for Molpro, and I wonder whether
you need scratch space shared by all the nodes for parallel jobs. In that case
all the traffic would go to the file server, so it might pay to set some nodes
aside and use PVFS (a parallel file system) to speed things up. Anyway, since
Molpro has only one variable for this, which defaults to $SCRATCH, you must
set SCRATCH=$TMPDIR. In a second step (for multinode jobs) you can add a loop
to startmpi.sh to create identically named scratch directories on all nodes
(if this works with Molpro at all).
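That second step could be sketched as below, to be added to startmpi.sh. This is an assumption about how your cluster is set up: it presumes password-less rsh between nodes (swap in ssh if that is what you use) and a machinefile with one line per task, so hostnames repeat:

```shell
# Reduce the machinefile (one line per task) to one line per host
unique_hosts() {
    sort -u "$1"
}

# Create an identically named scratch directory on every node of the job
create_scratch_dirs() {
    for host in $(unique_hosts "$1"); do
        rsh "$host" mkdir -p "$TMPDIR/scratch"
    done
}

# would be called from startmpi.sh as: create_scratch_dirs "$TMPDIR/machines"
```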
The other poorly documented thing is that Molpro expects the names of the
nodes in $PBS_NODEFILE, so you also have to set it to point to the
machinefile, i.e. PBS_NODEFILE=$TMPDIR/machines, with one line per task on
each node in that file.
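In the job script this boils down to two export lines. $TMPDIR is set by SGE on every node of the job; the file name "machines" matches the stock mpi PE scripts, so check what your startmpi.sh actually writes:

```shell
# Point Molpro at SGE's per-job scratch area and at the machinefile
export SCRATCH=$TMPDIR
export PBS_NODEFILE=$TMPDIR/machines
```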
Okay, the first step is to get it working on one node, and I would also
suggest starting with the mpi PE example. For the setup of the PE I intend to
use an allocation rule of 2. This way you can use the cleanipcs script from
e.g. MPICH to remove shared memory segments which may still be there after
the end of the job, without killing other jobs of the same user on the node.
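The reason this is safe with an allocation rule of 2 on dual-CPU nodes is that each job then owns all the slots of its nodes, so the user cannot have a second job running there. A hedged sketch of a stop_proc_args wrapper (both paths are site-specific assumptions; cleanipcs ships with MPICH):

```shell
#!/bin/sh
# Run the normal MPICH stop script, then remove shared memory
# segments the job may have left behind on this node.
/opt/sge/mpi/stopmpi.sh "$@"
/opt/mpich/sbin/cleanipcs
```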
Sorry, many things at once. And be assured: it's not a naive question, nor a
trivial setup to get working the correct way. BTW: are you using rsh or ssh
between the nodes?
Cheers - Reuti
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net