[GE users] DRMAA and MPI jobs

Jacek Strzelczyk szczelba at op.pl
Fri May 30 16:00:43 BST 2008


Hello,

I'm trying to run an MPI job using DRMAA. My DRMAA code looks like this:

----------------------------------
(...)
errnum = drmaa_allocate_job_template (&jt, error, DRMAA_ERROR_STRING_BUFFER);

    if (errnum != DRMAA_ERRNO_SUCCESS) {
       fprintf (stderr, "Could not create job template\n");
    }
    else {
       errnum = drmaa_set_attribute (jt, DRMAA_WD, "/usr/SGE/test/drmaa",
                                     NULL, 0);
       errnum = drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND,
                                     "/usr/SGE/test/maximum/maximumtest.sh",
                                     error, DRMAA_ERROR_STRING_BUFFER);
       if (errnum != DRMAA_ERRNO_SUCCESS) {
          fprintf (stderr, "Could not set attribute \"%s\": %s\n",
                   DRMAA_REMOTE_COMMAND, error);
       }

       else {
          char jobid[DRMAA_JOBNAME_BUFFER];

          errnum = drmaa_run_job (jobid, DRMAA_JOBNAME_BUFFER, jt,
                                  error, DRMAA_ERROR_STRING_BUFFER);

          if (errnum != DRMAA_ERRNO_SUCCESS) {
             fprintf (stderr, "Could not submit job\n");

(...)
-------------------------------

The script maximumtest.sh:

-------------------------------
#!/bin/csh -f

# our name
#$ -N MaximumTest
#
# queue
#$ -q work
#
# pe request
#$ -pe mpich2 2
#
# MPIR_HOME from submitting environment
#$ -v MPIR_HOME=/usr/SGE/mpich2,SGE_QMASTER_PORT
#
# needs in
#   $NSLOTS
#       the number of tasks to be used
#   $TMPDIR/machines
#       a valid machine file to be passed to mpirun


# enables $TMPDIR/rsh to catch rsh calls if available
  set path=($TMPDIR $path)
  set MPIR_HOME=/usr/SGE/mpich2

  ( echo 100 ; echo 100 ) | $MPIR_HOME/bin/mpirun -n $NSLOTS \
      -machinefile $TMPDIR/machines -pwdfile /usr/SGE/credentials \
      /usr/SGE/test/maximum/maximumtest 'y=x' 1
--------------------------------

After I run the DRMAA code, the job is submitted to the grid, but the
result I get is:
Error, unable to open machine file '/tmp/253.1.work/machines'

What could be wrong?
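
One possible cause, stated as an assumption rather than a confirmed diagnosis: if the DRMAA submission does not parse the "#$" lines embedded in the script, the "-pe mpich2 2" request never reaches the scheduler, so no parallel environment is started and $TMPDIR/machines is never written. A minimal sketch of passing the same requests explicitly through DRMAA_NATIVE_SPECIFICATION follows. The helper build_native_spec is hypothetical (it is not part of DRMAA); the queue and PE names are the ones from the script.

```c
#include <stdio.h>

/* Hypothetical helper, not part of DRMAA: formats qsub-style options
 * into buf so they can be used as the DRMAA_NATIVE_SPECIFICATION value.
 * Returns 0 on success, -1 on bad arguments or if buf is too small. */
int build_native_spec(char *buf, size_t len,
                      const char *queue, const char *pe, int slots)
{
    int n;

    if (buf == NULL || queue == NULL || pe == NULL || slots < 1)
        return -1;

    n = snprintf(buf, len, "-q %s -pe %s %d", queue, pe, slots);

    /* snprintf reports the length it wanted; treat truncation as an error */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Intended use next to the other drmaa_set_attribute calls (not compiled
 * here, because it needs drmaa.h and libdrmaa):
 *
 *   char spec[128];
 *   if (build_native_spec(spec, sizeof spec, "work", "mpich2", 2) == 0)
 *       errnum = drmaa_set_attribute(jt, DRMAA_NATIVE_SPECIFICATION, spec,
 *                                    error, DRMAA_ERROR_STRING_BUFFER);
 */
```

If this assumption holds, the job should then show the requested parallel environment in the qstat -j output for the job, which is a quick way to check before rerunning mpirun.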

Jacek
