[GE users] relationship between qsub and qstat and queue allocations

Margaret Doll Margaret_Doll at brown.edu
Thu Nov 13 14:24:45 GMT 2008


Thanks.

On Nov 13, 2008, at 9:21 AM, reuti wrote:

> On 13.11.2008 at 15:02, Margaret Doll wrote:
>
>> On Nov 13, 2008, at 3:52 AM, reuti wrote:
>>
>>> Hi Margaret,
>>>
>>> On 12.11.2008 at 23:36, Margaret Doll wrote:
>>>
>>>> qsub is not working the way I thought it should.  Each qsub may
>>>> start several instances of a job, but it
>>>>
>>>> creates only one instance of a running job in qmon,
>>>> shows only one instance of a queued job in "qstat -f", and
>>>> seems to count as only one job on one of the compute nodes.
>>>>
>>>> For instance, I am using
>>>>
>>>> qsub -q mem16.q shll
>>>>
>>>> where shll includes:
>>>>
>>>> #!/bin/bash
>>>> #$ -o $HOME/works-1/Out
>>>> #$ -j y
>>>> /opt/openmpi/bin/mpiexec -v -n 17 -machinefile $HOME/works-1/machinefile $HOME/works-1/mad
>>>
>>> Although Open MPI has tight SGE integration built in, you will
>>> need to define and request a PE (parallel environment) instead of
>>> supplying your own list of machines.
>>>
>>> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
>>>
>>> -- Reuti
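
A minimal sketch of what this suggestion could look like for the shll script, assuming a PE named orte already exists and is attached to mem16.q. The file name shll.pe and the slot count of 16 (two nodes with eight slots each) are also assumptions. With a PE request, SGE chooses the hosts and sets $NSLOTS, so no machinefile is passed:

```shell
# Write a job script that requests slots through a PE instead of a machinefile.
# Assumptions: PE "orte" exists; 16 slots; file name shll.pe is hypothetical.
cat > shll.pe <<'EOF'
#!/bin/bash
#$ -o $HOME/works-1/Out
#$ -j y
#$ -pe orte 16
# NSLOTS is set by SGE to the number of slots actually granted.
/opt/openmpi/bin/mpiexec -n "$NSLOTS" "$HOME/works-1/mad"
EOF

# Submit with: qsub -q mem16.q shll.pe
```

Submitted this way, the job should appear in qstat with 16 slots rather than as a single-slot job.
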
>>>
>>
>> I found the settings.sh in /opt/gridengine/default/common/settings.sh
>>
>> I believe, however, before I run
>>
>> qsh -pe orte 4
>> mpirun -np 4 a.out
>>
>> I have to set up a parallel environment named orte.  How do I set up
>> numerous parallel environments that correspond to the queues that I
>> set up in qmon?
>
> You'll find the information in the "N1 Grid Engine 6 Administration
> Guide" on page 155 ff.
>
> http://gridengine.sunsource.net/documentation.html
>
> -- Reuti
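
For orientation, a PE is typically defined with qconf and then added to a queue's pe_list. The sketch below writes a definition for a PE named orte to a file. The field values are modeled on the example in the Open MPI FAQ linked earlier in the thread; the slot total of 16 and the file name orte.pe are assumptions. Field names can differ between Grid Engine versions, so it is worth comparing against "qconf -sp" output for an existing PE before loading the file.

```shell
# Write a PE definition to a file (values are illustrative assumptions).
cat > orte.pe <<'EOF'
pe_name            orte
slots              16
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
EOF

# On the qmaster host, as an SGE manager, one would then run:
#   qconf -Ap orte.pe                         # add the PE from the file
#   qconf -aattr queue pe_list orte mem16.q   # attach it to a queue
#   qconf -sp orte                            # show the resulting PE
```

The same PE can be listed in the pe_list of several queues, so a separate PE per queue is not necessarily required.
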
>
>
>>
>>>
>>>>
>>>> machinefile includes:
>>>>
>>>> compute-0-10
>>>> compute-0-11
>>>> compute-0-10
>>>> compute-0-11
>>>> compute-0-10
>>>> compute-0-11
>>>> compute-0-10
>>>> compute-0-11
>>>> compute-0-10
>>>> compute-0-11
>>>> compute-0-10
>>>> compute-0-11
>>>> compute-0-10
>>>> compute-0-11
>>>> compute-0-10
>>>> compute-0-11
>>>> compute-0-10
>>>> compute-0-11
>>>>
>>>> I have each of the compute nodes set to run only eight queued jobs
>>>> at a time.
>>>>
>>>> A queued job will show up on compute-0-10 when I run "qstat -f"
>>>>
>>>> compute-0-10 will be running 9 instances of the program;
>>>> compute-0-11 will be running 8.
>>>>
>>>> What am I doing incorrectly?
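
The 9/8 split reported above is consistent with the machinefile itself rather than with any SGE limit. As a rough check, tallying the first 17 entries of the alternating host list gives the same numbers. The helper name count_slots is hypothetical, and this only counts lines; it does not model Open MPI's actual placement policy:

```shell
# Tally how many of the first N machinefile entries fall on each host.
# count_slots is a hypothetical helper, not part of SGE or Open MPI.
count_slots() {
    # $1 = number of processes, $2 = path to the machinefile
    head -n "$1" "$2" | sort | uniq -c | awk '{print $2, $1}'
}

# Usage: count_slots 17 "$HOME/works-1/machinefile"
```

Because mpiexec is started with its own host list, SGE sees only the one slot that the job script itself occupies, which is why the eight-jobs-per-node limit is not enforced here.
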
>>>>
>>>> ------------------------------------------------------
>>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88651
>>>>
>>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>>
>>
>



