[GE users] DRMAA and MPI jobs

Daniel Templeton Dan.Templeton at Sun.COM
Fri May 30 16:22:32 BST 2008


The problem is that you submitted your job as a plain batch job, so no 
parallel environment was set up for it and $TMPDIR/machines was never 
created.  To submit as a parallel job, you'll have to use either a job 
category or the native specification to pass in the -pe switch.  (See 
qsub(1).)  You will also need to configure a parallel environment.  
(See sge_pe(5) and qconf(1).)
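
As a rough sketch of the native specification route (not tested against 
your setup), the -pe request can be built as a string and set on the job 
template before drmaa_run_job is called.  The helper names build_pe_spec 
and request_pe and the HAVE_DRMAA build flag are made up for this 
illustration; "mpich2" and 2 are taken from the -pe line in your script.  
DRMAA_NATIVE_SPECIFICATION and drmaa_set_attribute come from drmaa.h.

```c
#include <stdio.h>

#ifdef HAVE_DRMAA   /* hypothetical flag: define it when drmaa.h and libdrmaa are available */
#include "drmaa.h"
#endif

/* Build a native specification such as "-pe mpich2 2" in buf.
 * Returns 0 on success, -1 on bad arguments or if buf is too small. */
int build_pe_spec(char *buf, size_t size, const char *pe_name, int slots)
{
    int n;

    if (buf == NULL || size == 0 || pe_name == NULL ||
        pe_name[0] == '\0' || slots < 1)
        return -1;

    n = snprintf(buf, size, "-pe %s %d", pe_name, slots);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

#ifdef HAVE_DRMAA
/* Attach the parallel environment request to an existing job template.
 * Returns the DRMAA error number from drmaa_set_attribute. */
int request_pe(drmaa_job_template_t *jt, const char *pe_name, int slots)
{
    char spec[128];
    char error[DRMAA_ERROR_STRING_BUFFER];
    int errnum;

    if (build_pe_spec(spec, sizeof spec, pe_name, slots) != 0)
        return DRMAA_ERRNO_INVALID_ARGUMENT;

    errnum = drmaa_set_attribute(jt, DRMAA_NATIVE_SPECIFICATION, spec,
                                 error, DRMAA_ERROR_STRING_BUFFER);
    if (errnum != DRMAA_ERRNO_SUCCESS)
        fprintf(stderr, "Could not set attribute \"%s\": %s\n",
                DRMAA_NATIVE_SPECIFICATION, error);
    return errnum;
}
#endif
```

In your code this would amount to calling request_pe (jt, "mpich2", 2) 
after setting DRMAA_REMOTE_COMMAND and before drmaa_run_job.  The 
parallel environment named in the string still has to exist on the 
cluster.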

Daniel

Jacek Strzelczyk wrote:
> Hello,
>
> I'm trying to run an mpi job using DRMAA. My DRMAA code looks like this:
>
> ----------------------------------
> (...)
> errnum = drmaa_allocate_job_template (&jt, error, DRMAA_ERROR_STRING_BUFFER);
>
>    if (errnum != DRMAA_ERRNO_SUCCESS) {
>       fprintf (stderr, "Could not create job template\n");
>    }
>    else {
>       errnum = drmaa_set_attribute (jt, DRMAA_WD, "/usr/SGE/test/drmaa", NULL, 0);
>       errnum = drmaa_set_attribute (jt, DRMAA_REMOTE_COMMAND,
>                                     "/usr/SGE/test/maximum/maximumtest.sh",
>                                     error, DRMAA_ERROR_STRING_BUFFER);
>       if (errnum != DRMAA_ERRNO_SUCCESS) {
>          fprintf (stderr, "Could not set attribute \"%s\": %s\n",
>                   DRMAA_REMOTE_COMMAND, error);
>       }
>
>       else {
>          char jobid[DRMAA_JOBNAME_BUFFER];
>
>          errnum = drmaa_run_job (jobid, DRMAA_JOBNAME_BUFFER, jt,
>                                  error, DRMAA_ERROR_STRING_BUFFER);
>
>          if (errnum != DRMAA_ERRNO_SUCCESS) {
>             fprintf (stderr, "Could not submit job\n");
>
> (...)
> -------------------------------
>
> The script maximumtest.sh :
>
> -------------------------------
> #!/bin/csh -f
>
> # our name
> #$ -N MaximumTest
> #
> # queue
> #$ -q work
> #
> # pe request
> #$ -pe mpich2 2
> #
> # MPIR_HOME from submitting environment
> #$ -v MPIR_HOME=/usr/SGE/mpich2,SGE_QMASTER_PORT
> #
> # needs in
> #   $NSLOTS
> #       the number of tasks to be used
> #   $TMPDIR/machines
> #       a valid machine file to be passed to mpirun
>
>
> # enables $TMPDIR/rsh to catch rsh calls if available
>  set path=($TMPDIR $path)
>  set MPIR_HOME=/usr/SGE/mpich2
>
>  ( echo 100 ; echo 100 ) | $MPIR_HOME/bin/mpirun -n $NSLOTS \
>      -machinefile $TMPDIR/machines -pwdfile /usr/SGE/credentials \
>      /usr/SGE/test/maximum/maximumtest 'y=x' 1
> --------------------------------
>
> After I run the DRMAA code, the job is submitted to the grid, but as a 
> result I get:
> Error, unable to open machine file '/tmp/253.1.work/machines'
>
> What can be wrong?
>
> Jacek
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
