[GE users] Stuck jobs

FL lengyel at gmail.com
Wed Nov 26 22:13:32 GMT 2008


My apologies to the group -- I had restored my configuration from
a previous one, in which dual core execution hosts had the number
of slots set to one (the computational chemists insisted on it).
Setting the number of slots per host equal to the number of cores
resolved the issue.
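
A minimal sketch for spotting this condition from the scheduler messages
quoted at the end of this message. It assumes the qstat -j output format
shown there; zero_slot_hosts is a hypothetical helper name, not part of
Grid Engine.

```bash
#!/bin/bash
# Hypothetical helper: read "qstat -j" scheduling messages on stdin and
# print each host that is reported as offering zero slots.
zero_slot_hosts() {
  awk -F'"' '/because it offers only hc:slots=/ {
      n = split($3, kv, "=")            # kv[2] holds the offered slot count
      if (n >= 2 && kv[2] + 0 == 0) print $2   # $2 is the quoted host name
  }' | sort -u
}
```

Usage, with a real job id in place of job_id:
qstat -j job_id | zero_slot_hosts
If hosts are listed, the slots setting of the queue can be inspected
with qconf -sq x86_64.q.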

Here is some background: I use a parallel execution environment even
for single-threaded Gaussian jobs and ask users to submit with a
script that ensures that the number of slots equals the number of cores
needed. This works for Core Duo machines and quads (there is a separate
PE, queue and submission script for each). Since the job submission script
may be of some interest to users of the list, I'll include it here:

[flengyel at nept bin]$ more gsub
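#!/bin/bash
# gsub: submit a Gaussian 03 job to Grid Engine.
if [ $# -lt 1 ]; then
  echo "Usage: gsub gaussianfile [qsub options]" >&2
  exit 1
fi
# Everything after the input file name is passed through to qsub.
ARGS=("$@")
QOPTS=${ARGS[@]:1}
# The unquoted here document expands $1 now, before qsub reads the script.
qsub $QOPTS <<__HereDocument__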
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -N $1
#$ -pe gauss 2
#$ -q x86_64.q

export g03root=/usr/local/gaussian
. /usr/local/gaussian/g03/bsd/g03.profile
export SGE_ROOT=/usr/local/sge
.  /usr/local/sge/default/common/settings.sh
export GAUSS_SCRDIR=/tmp

g03 $1
__HereDocument__
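
Because the here document is expanded by the submitting shell, the same
technique could also fill in the slot count. The following is only a
sketch under that assumption: make_job_script and its second argument
are hypothetical, the PE and queue names are the ones from gsub, and the
environment setup lines are omitted for brevity.

```bash
#!/bin/bash
# Hypothetical helper: print the job script that would be handed to qsub.
# Arguments: 1 = Gaussian input file, 2 = slot count (defaults to 2).
make_job_script() {
  local input=$1 slots=${2:-2}
  # Unquoted delimiter, so the two local variables are expanded right here,
  # before Grid Engine ever sees the directive lines.
  cat <<__HereDocument__
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -N $input
#$ -pe gauss $slots
#$ -q x86_64.q

g03 $input
__HereDocument__
}
```

Printing the script first makes it easy to inspect; it can then be piped
to qsub, for example: make_job_script water.com 4 | qsub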


This script uses a here document to generate the job script at submission
time, to get around the problem that values in #$ directive lines cannot
be assigned dynamically within a job submission script. The lines

ARGS=("$@")
QOPTS=${ARGS[@]:1}
qsub $QOPTS <<__HereDocument__

allow the user to pass additional options to qsub after the name of the
Gaussian input file.
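
One caveat: storing the options in a plain string means an option value
containing whitespace is split into several words. A possible variant,
sketched here with a stand-in for qsub, keeps the options in an array.
show_args and forward_opts are hypothetical names used only for this
illustration.

```bash
#!/bin/bash
# Stand-in for qsub: prints each argument it receives on its own line.
show_args() { printf '[%s]\n' "$@"; }

# Forward everything after the first argument, preserving word boundaries.
forward_opts() {
  local args=("$@")
  local qopts=("${args[@]:1}")   # slice kept as an array, not a string
  show_args "${qopts[@]}"        # each element stays a single argument
}

# Example: the value "NOTE=test run" arrives as one argument.
forward_opts water.com -v "NOTE=test run"
```

In gsub itself this would correspond to QOPTS=("${ARGS[@]:1}") followed
by qsub "${QOPTS[@]}" in place of the string form.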

On Wed, Nov 26, 2008 at 4:24 PM, FL <lengyel at gmail.com> wrote:
> Could someone tell me the meaning of the following?
> A user is attempting to run a job using a parallel execution environment.
> There are numerous hosts available, but this is the kind of output
> qstat -j job_id is returning:
>
>  has no permission for host "m52.gc.cuny.edu"
>                            (-l NONE) cannot run in queue
> "m39.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m44.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m32.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m50.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m36.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m58.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m59.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m43.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m35.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m55.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m45.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m34.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m41.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m48.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m46.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m53.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m49.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m54.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m47.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m42.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m31.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m33.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m57.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m51.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m37.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m40.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m60.gc.cuny.edu" because it offers only hc:slots=0.000000
>                            (-l NONE) cannot run in queue
> "m38.gc.cuny.edu" because it offers only hc:slots=0.000000
>
