[GE users] parallel jobs submission help

Reuti reuti at staff.uni-marburg.de
Thu Nov 25 18:50:35 GMT 2004



Hi,

> I have SGE installed on my Rocks cluster. I have also installed some
> computational chemistry software called Molpro. If I want to run this
> program interactively on 2 CPUs I would simply issue the command
> 
> molpro -n2 file.com
> 
> where file.com is my input file, after doing this and looking at top I see
> the line
> 
>  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
> 3943 root      25   0 49540  48M  5604 R    199.9  0.8   0:09   0 molprop_2002

you have to set up a parallel environment (PE) in SGE to get it working properly 
with parallel jobs. See "man sge_pe".
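
To give an idea, a PE definition looks roughly like this (only a sketch along 
the lines of the mpi example shipped with SGE - the PE name "mpi" and the paths 
to the start/stop scripts are assumptions, adjust them to your installation):

$ qconf -sp mpi
pe_name           mpi
slots             999
user_lists        NONE
xuser_lists       NONE
start_proc_args   /usr/sge/mpi/startmpi.sh $pe_hostfile
stop_proc_args    /usr/sge/mpi/stopmpi.sh
allocation_rule   $fill_up
control_slaves    FALSE
job_is_first_task TRUE

The PE must also be made available in the queues where the jobs should run, 
and is requested with "qsub -pe" at submission time (see below).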
 
> indicating that the process is running on 2 CPUs.
> 
> I now want to emulate this using SGE so that I can launch jobs from my
> frontend node to the various compute nodes.
> 
> I have a script:
> 
> $ cat molpro.sh
> #!/bin/bash
> #
> #$ -cwd
> #$ -j y
> #$ -S /bin/bash
> 
> molpro -n2 test.com

Use "molpro -n $NSLOTS test.com" here for the integration with SGE, so that 
Molpro picks up the number of slots granted by the scheduler.
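
You then request the PE at submission time, e.g. (assuming the PE is named 
"mpi" and you want 2 slots):

$ qsub -pe mpi 2 molpro.sh

Inside the job, SGE sets $NSLOTS to the number of granted slots (2 here), so 
the script needs no hardcoded CPU count.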

Well, we also just bought a parallel license for Molpro, and I wonder whether 
you need scratch space shared between all the nodes for parallel jobs. In that 
case all the traffic would go to the file server, so it might pay off to set 
some nodes aside and use PVFS (a parallel file system) to speed things up. 
Anyway, because Molpro has only one variable for this, which is set by default 
to $SCRATCH, you must set SCRATCH=$TMPDIR. In a second step (for multi-node 
jobs) you can add a loop to startmpi.sh to create identically named scratch 
directories on all nodes (if this works with Molpro at all).
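
Such a loop could look like this (only an untested sketch - it assumes rsh 
between the nodes and that startmpi.sh has already written the machine file 
to $TMPDIR/machines, as the stock example does):

# at the end of startmpi.sh: create an identically named
# scratch directory on every node of the job
for node in `sort -u $TMPDIR/machines`; do
    rsh $node mkdir -p $TMPDIR
done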

The other not so well documented thing is that Molpro expects the names of the 
nodes in $PBS_NODEFILE, so you also have to set this variable to point to the 
machine file, i.e. PBS_NODEFILE=$TMPDIR/machines, and have one line per task 
for each node in this file.
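
Putting both settings together, the job script could look like this (a sketch 
under the assumptions above: PE named "mpi", machine file written to 
$TMPDIR/machines):

#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -pe mpi 2

# point Molpro's scratch space to the node-local directory set up by SGE
export SCRATCH=$TMPDIR

# Molpro looks for the list of nodes in the PBS-style variable
export PBS_NODEFILE=$TMPDIR/machines

molpro -n $NSLOTS test.com

(The "-pe mpi 2" request can of course also stay on the qsub command line 
instead of in the script.)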

Okay, the first step is to get it working on one node, and I would also suggest 
starting with the mpi PE example. For the setup of the PE I intend to use an 
allocation_rule of 2, i.e. always two slots per (dual-CPU) node, so a job gets 
its nodes for itself. This way you can use the cleanipcs script from e.g. MPICH 
to remove shared memory segments which may still be there after the end of the 
job, without killing other jobs of the same user on the node.
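
The call could go into the stop script of the PE (a sketch - the path to 
cleanipcs is an assumption, it usually lives somewhere in the MPICH 
installation; note that cleanipcs removes all IPC resources of the calling 
user on a node, hence the need for exclusive nodes):

# at the end of stopmpi.sh: remove leftover shared memory
# segments and semaphores on every node of the job
for node in `sort -u $TMPDIR/machines`; do
    rsh $node /usr/local/mpich/sbin/cleanipcs
done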

Sorry, many things at once. And be assured: it's neither a naive question, nor 
a trivial setup to get it working in the correct way. BTW: are you using rsh 
or ssh between the nodes?

Cheers - Reuti
