[GE users] Stuck jobs

reuti reuti at staff.uni-marburg.de
Thu Nov 27 08:16:03 GMT 2008


On 26.11.2008 at 23:13, FL wrote:

> My apologies to the group -- I had restored my configuration from
> a previous one, in which dual-core execution hosts had the number
> of slots set to one (the computational chemists insisted on it).
> I have now set the number of slots per host equal to the number of
> cores, which resolved the issue.

I wonder why this change solved the issue. In my case, the fix was
making a resource requestable.


> Here is some background: I use a parallel execution environment even
> for single-threaded Gaussian jobs and ask users to submit with a
> script that ensures that the number of slots equals the number of
> cores needed. This works for Core Duo machines and quads (there is a
> separate PE, queue and submission script for each). Since the job
> submission script may be of some interest to users of the list, I'll
> include it here:
>
> [flengyel at nept bin]$ more gsub
> #!/bin/bash
> # Usage: gsub gaussianfile [qsub options]
> if [ $# -lt 1 ]; then
>   echo "Usage: gsub gaussianfile [qsub options]"
>   exit 1
> fi
> ARGS=("$@")
> # Everything after the first argument is passed through to qsub.
> QOPTS=("${ARGS[@]:1}")
> qsub "${QOPTS[@]}" <<__HereDocument__
> #!/bin/bash
> #$ -S /bin/bash
> #$ -cwd
> #$ -N $1
> #$ -pe gauss 2
> #$ -q x86_64.q
>
> export g03root=/usr/local/gaussian
> . /usr/local/gaussian/g03/bsd/g03.profile
> export SGE_ROOT=/usr/local/sge
> .  /usr/local/sge/default/common/settings.sh
> export GAUSS_SCRDIR=/tmp
>
> g03 $1
> __HereDocument__
>
>
> This script uses a here document to generate the job script at
> submission time, which works around the fact that shell variables are
> not expanded in the #$ directive lines of a job submission script.
> The lines
>
> ARGS=("$@")
> QOPTS=("${ARGS[@]:1}")
> qsub "${QOPTS[@]}" <<__HereDocument__
>
> allow the user to pass additional options to qsub....
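
The here-document technique described above can be tried without a
cluster. The following is only a sketch, not the author's script: it
uses `shift` instead of an array to separate the input file from the
extra qsub options, and it introduces a hypothetical GSUB_SUBMIT
variable (not part of Grid Engine) naming the command that receives the
generated script, so the output can be inspected before submitting. The
PE name `gauss` is taken from the script above.

```shell
#!/bin/sh
# Sketch of the here-document technique with a dry-run switch.
# GSUB_SUBMIT is a hypothetical variable introduced for this example:
# it names the command that receives the generated script (default: qsub).

gsub() {
    if [ $# -lt 1 ]; then
        echo "Usage: gsub gaussianfile [qsub options]" >&2
        return 1
    fi
    input=$1
    shift    # "$@" now holds only the extra qsub options, quoting intact
    # The delimiter is unquoted, so $input is expanded now, while the
    # job script is being generated, including on the "#$ -N" line.
    ${GSUB_SUBMIT:-qsub} "$@" <<EOF
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -N $input
#$ -pe gauss 2

g03 $input
EOF
}
```

Setting GSUB_SUBMIT=cat and calling `gsub water.com` prints the
generated job script instead of submitting it.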

Yep, we do it similarly. But our generated script also copies the input
file to $TMPDIR on the node and alters it to include the nodes
granted by SGE in a %LindaWorkers= line.

http://gridengine.sunsource.net/ds/viewMessage.do?dsMessageId=20763&dsForumId=38
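
A minimal sketch of that idea, not the actual script from the linked
message: it assumes the usual $PE_HOSTFILE layout with the host name in
the first column of each line, and the function names linda_workers and
prepare_input are made up for this example.

```shell
#!/bin/sh
# Sketch: build a %LindaWorkers line from an SGE-style PE hostfile and
# prepend it to a copy of the Gaussian input in a scratch directory.

# Print the first column (host name) of every hostfile line,
# joined with commas.
linda_workers() {
    awk '{ printf "%s%s", (NR > 1 ? "," : ""), $1 } END { print "" }' "$1"
}

# Copy input file $1 into directory $2 with a %LindaWorkers line built
# from hostfile $3 prepended; any %LindaWorkers line already in the
# input is dropped. Prints the path of the new file.
prepare_input() {
    input=$1
    scratch=$2
    hostfile=$3
    out="$scratch/$(basename "$input")"
    {
        echo "%LindaWorkers=$(linda_workers "$hostfile")"
        grep -iv '^%lindaworkers=' "$input"
    } > "$out"
    echo "$out"
}
```

Inside a job script this might be called as
`prepare_input "$1" "$TMPDIR" "$PE_HOSTFILE"`, with g03 then run on the
path it prints.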

-- Reuti

>
> On Wed, Nov 26, 2008 at 4:24 PM, FL <lengyel at gmail.com> wrote:
>> Could someone tell me the meaning of the following?
>> A user is attempting to run a job using a parallel execution  
>> environment.
>> There are numerous hosts available, but this is the kind of output
>> qstat -j job_id is returning:
>>
>>  has no permission for host "m52.gc.cuny.edu"
>>                            (-l NONE) cannot run in queue
>> "m39.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m44.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m32.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m50.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m36.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m58.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m59.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m43.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m35.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m55.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m45.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m34.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m41.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m48.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m46.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m53.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m49.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m54.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m47.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m42.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m31.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m33.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m57.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m51.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m37.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m40.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m60.gc.cuny.edu" because it offers only hc:slots=0.000000
>>                            (-l NONE) cannot run in queue
>> "m38.gc.cuny.edu" because it offers only hc:slots=0.000000
>>
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=90029
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=90054
