[GE users] Correct accounting with mpich-mx tight integration

Chris Rudge chris.rudge at astro.le.ac.uk
Thu May 10 12:51:50 BST 2007


I've set up a PE to get tight MPI integration with mpich-mx but wonder
if there's a way to get walltime accounting correct.

With sge_mpirun, rsh is replaced by 'qrsh -inherit'. Because mpich-mx
uses rsh (now qrsh) to launch every process, the PE has to have "job is
first task" (job_is_first_task) set to FALSE. As a result, a 16-CPU job
appears to produce 17 sets of accounting: one for each of the 16 MPI
processes plus one for the job itself. That is fine for CPU time, but it
gives the wrong walltime.
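
As a rough illustration of the effect described above, the sketch below
builds 17 made-up accounting records for a 16-process job and compares two
ways of deriving walltime. The record layout, the task labels and the
600-second runtime are assumptions for illustration only; the field names
merely imitate those in the Grid Engine accounting file, and the sketch
assumes the per-record walltimes end up being summed.

```python
# Hypothetical records for one 16-process job that ran for 600 seconds.
# Values are invented; only the field names imitate SGE accounting fields.
JOB_START = 1_000_000   # assumed start time (epoch seconds)
RUNTIME = 600           # assumed true elapsed time in seconds


def make_records(n_procs, start, runtime):
    """Return one record for the job script plus one per MPI process."""
    records = [{"task": "job", "start_time": start,
                "end_time": start + runtime, "ru_wallclock": runtime}]
    for rank in range(n_procs):
        records.append({"task": f"qrsh-{rank}", "start_time": start,
                        "end_time": start + runtime,
                        "ru_wallclock": runtime})
    return records


def summed_wallclock(records):
    """Naive total: adds the walltime of every record."""
    return sum(r["ru_wallclock"] for r in records)


def elapsed_wallclock(records):
    """Elapsed time from the earliest start to the latest end."""
    return (max(r["end_time"] for r in records)
            - min(r["start_time"] for r in records))


records = make_records(16, JOB_START, RUNTIME)
print(len(records))                # 17
print(summed_wallclock(records))   # 10200
print(elapsed_wallclock(records))  # 600
```

Summing is the right thing for CPU time, since each process really does
consume its own CPU seconds, but for walltime only the span from earliest
start to latest end reflects how long the job actually ran.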

Is there any way to avoid this with mpich-mx?

Do other MPI distributions, e.g. Open MPI, suffer from the same problem?


Dr Chris Rudge
chris.rudge at astro.le.ac.uk

UKAFF Facility Manager & Dept. Research Computing Manager
Dept of Physics & Astronomy
University of Leicester

web.  www.ukaff.ac.uk
Tel.  +44 (0)116 2523331
Fax.  +44 (0)116 2231283
Mob.  +44 (0)794 1379420
