[GE users] relationship between qsub and qstat and queue allocations

reuti reuti at staff.uni-marburg.de
Thu Nov 13 14:21:25 GMT 2008


On 13.11.2008 at 15:02, Margaret Doll wrote:

> On Nov 13, 2008, at 3:52 AM, reuti wrote:
>
>> Hi Margaret,
>>
>> On 12.11.2008 at 23:36, Margaret Doll wrote:
>>
>>> qsub is not working the way I thought it should.  Each qsub may
>>> start several instances of a job, but it
>>>
>>> creates only one instance of a running job showing in qmon,
>>> creates only one instance of a queued job in "qstat -f", and
>>> seems to count as only one job on one of the compute nodes.
>>>
>>> For instance, I am using
>>>
>>> qsub -q mem16.q shll
>>>
>>> where shll includes:
>>>
>>> #!/bin/bash
>>> #$ -o $HOME/works-1/Out
>>> #$ -j y
>>> /opt/openmpi/bin/mpiexec -v -n 17 \
>>>     -machinefile $Home/works-1/machinefile $Home/works-1/mad
>>
>> Although Open MPI has tight SGE integration built in, you still
>> need to define and request a PE (parallel environment) instead of
>> supplying your own list of machines.
>>
>> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
>>
>> -- Reuti
>>
>
> I found the settings.sh in /opt/gridengine/default/common/settings.sh
>
> I believe, however, before I run
>
> qsh -pe orte 4
> mpirun -np 4 a.out
>
> I have to set up a parallel environment named orte.  How do I set up
> numerous parallel environments that correspond to the queues that I
> set up in qmon?

You'll find the information in the "N1 Grid Engine 6 Administration  
Guide" on page 155 ff.

http://gridengine.sunsource.net/documentation.html
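
As a rough sketch of what that chapter walks through, and not a tested recipe for your cluster: a PE can be defined from a file with "qconf -Ap" and then added to each queue's pe_list. The PE name "orte" and the queue "mem16.q" come from this thread; the slot count, the allocation rule, and the exact set of fields are assumptions, so compare against the output of "qconf -sp <pe>" on your own installation before using it.

```shell
#!/bin/bash
# Sketch only: PE name "orte" and queue "mem16.q" are taken from this thread.
# The slot count and allocation rule below are assumptions to adapt.
set -eu

pe_name="orte"
queues="mem16.q"        # space-separated list of queues that should offer the PE
pe_file="$(mktemp)"

# PE definition in the same layout that "qconf -sp <pe>" prints.
# The field list can differ between Grid Engine versions.
cat > "$pe_file" <<EOF
pe_name            $pe_name
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    \$fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
EOF

if command -v qconf >/dev/null 2>&1; then
    # Register the PE from the file, then attach it to each queue.
    qconf -Ap "$pe_file"
    for q in $queues; do
        qconf -aattr queue pe_list "$pe_name" "$q"
    done
else
    # No Grid Engine on this host: leave the file for inspection.
    echo "qconf not found; PE definition left in $pe_file"
fi
```

For several queues, either list them all in the hypothetical "queues" variable to share one PE, or repeat the steps with a different PE name per queue. The job script would then request the PE with a line such as "#$ -pe orte 17" and call mpiexec with "-n $NSLOTS" and without "-machinefile", so that Open MPI takes the host list from SGE.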

-- Reuti


>
>>
>>>
>>> machinefile includes:
>>>
>>> compute-0-10
>>> compute-0-11
>>> compute-0-10
>>> compute-0-11
>>> compute-0-10
>>> compute-0-11
>>> compute-0-10
>>> compute-0-11
>>> compute-0-10
>>> compute-0-11
>>> compute-0-10
>>> compute-0-11
>>> compute-0-10
>>> compute-0-11
>>> compute-0-10
>>> compute-0-11
>>> compute-0-10
>>> compute-0-11
>>>
>>> I have each of the compute nodes set to run only eight queued jobs
>>> at a time.
>>>
>>> A queued job will show up on compute-0-10 when I run "qstat -f".
>>>
>>> compute-0-10 will be running 9 instances of the program;
>>> compute-0-11 will be running 8.
>>>
>>> What am I doing incorrectly?
>>>
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88651
>>>
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88681

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


