[GE users] Problems with Advanced Reservations

reuti reuti at staff.uni-marburg.de
Wed Oct 13 16:32:30 BST 2010


Hi,

Am 13.10.2010 um 16:59 schrieb pablorey:

>     We don't modify anything in the GE configuration and so we don't change num_proc's relation.
> 
>     Our nodes have 16 processors. We use num_proc to specify the number of processors used by shared-memory parallel jobs (OpenMP), and the parallel environment (for example, -pe mpi NSLOTS) for distributed-memory parallel jobs (MPI, MPICH, ...). It is also possible to submit hybrid jobs by specifying both num_proc and a number of slots bigger than 1. For example:
> 
>     * If we want to run an OpenMP program using 10 processors we use "num_proc=10" without any parallel environment.

I would suggest using a PE here too. Often it is called smp or openmp, in lower or upper case, with $pe_slots as the allocation rule. This way SGE is also aware that this is a parallel job, albeit one bound to one and the same machine.
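
A minimal sketch of such a PE (the name "smp" and the slot limit are placeholders, to be adjusted to your site):

$ qconf -sp smp
pe_name            smp
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min

After adding it to the queues' "pe_list" (`qconf -mq ...`), the above OpenMP job would become `qsub -pe smp 10 ...` with no num_proc request at all.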


>     * If we want to run a pure MPI program using 10 processors we use "num_proc=1" and "-pe mpi 10" (or any other parallel environment like mpi_1p, mpi_rr, mpi_2p,...).

Unless you make num_proc consumable and change its relation to "<=" (which I would advise against), requesting exactly one core will leave the job pending forever: you are requesting a machine whose fixed feature is to have exactly one core. Does such a job run outside of an Advance Reservation?
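
For reference, a sketch of num_proc's default line in the complex configuration (`qconf -sc`), next to what the discouraged change would look like:

#name     shortcut  type  relop  requestable  consumable  default  urgency
num_proc  p         INT   ==     YES          NO          0        0
num_proc  p         INT   <=     YES          YES         0        0      (discouraged)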

NB: num_proc is a (fixed) feature of a machine (type INT), just as you could define "cpu_type" == "amd2380" for some machines as a fixed STRING. Then you could request it in a similar way: `qsub -l cpu_type=amd2380 ...`.
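
A sketch of defining such a complex via `qconf -mc` (the name "cpu_type" and the value are made up for illustration):

#name     shortcut  type    relop  requestable  consumable  default  urgency
cpu_type  ct        STRING  ==     YES          NO          NONE     0

and of attaching the value to a machine in its exechost configuration (`qconf -me <hostname>`):

complex_values   cpu_type=amd2380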

>     * If we want to run a hybrid program (OpenMP + MPI) using 5 MPI slots and 10 processors per MPI slot, we use "num_proc=10" and "-pe mpi 5" (or any other parallel environment like mpi_1p, mpi_rr, mpi_2p, ...).

You don't need any request for num_proc here either, because the allocation rule already ensures that you get exactly this number of cores per machine. You just need to request the total number of cores in the `qsub` request, not the number of machines; SGE's parallel environment thinks in slots, not nodes. I.e.:

$ qsub -pe mpi_10 50 ...

and in the pe "mpi_10" you defined "10" as "allocation_rule". So you will get 5 machines.


>     In the examples that you can see in the previous document we modified the qrsub command to reserve the minimum number of processors, so that we could run tests without reserving a lot of processors, because we have a lot of jobs in the queue (in the initial test we tried to reserve 10 nodes, so we used num_proc=16 and "-pe mpi 10").
> 
>     When I submit the job with "-w n" the job is queued but it waits forever. The AR has finished and the job didn't start to run. This is the output of the "qstat -j" command:
> 
> (-l h_fsize=20G,num_proc=1,s_rt=3600,s_vmem=10G) cannot run in queue "medium_queue@cn020.null" because it offers only qf:s_rt=00:00:00
> (-l h_fsize=20G,num_proc=1,s_rt=3600,s_vmem=10G) cannot run in queue "medium_queue@cn021.null" because it offers only qf:s_rt=00:00:00
> cannot run in PE "mpi_1p" because it only offers 1 slots
>     
>     I am a little confused with this issue, but I think that the observed message 'cannot run in queue "medium_queue@cn017.null" because it offers only qf:s_rt=00:00:00'

I would assume this is a second (independent) issue, which I also noticed when you request "-l s_rt=..." in combination with "-ar". Here it seems "-w n" is necessary to bypass any check at all, and it forces the job to run (as it should also do with verification enabled).
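
As a sketch (duration and job script are placeholders, and <ar_id> is whatever qrsub reported; note that the `qsub` itself requests no s_rt):

$ qrsub -d 3600 -pe mpi_1p 10
$ qsub -ar <ar_id> -w n -pe mpi_1p 10 job.sh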

-- Reuti


> is not really relevant. I think that the problem is not related to s_rt. For me the problem is that the AR works properly and reserves all the requested nodes, but we cannot use them. Even more: I have done some tests, and if I have an AR like number 96 in the previous document, which has granted slots on several nodes, we can submit several jobs requesting only 1 MPI slot (-pe mpi_1p 1), and all the jobs run on the same node (the first node of the list). It is not possible to run more than 1 job at the same time, independently of having several nodes reserved. It seems that only the first node of the granted slots list can be used.
> 
>     Thanks again for your help. It is very important for us to solve this problem, so any help is very much appreciated.
> 
>     Regards,
>     Pablo
> 
> 
> 
> 
> On 13/10/2010 14:52, reuti wrote:
>> Hi,
>> 
>> Am 13.10.2010 um 10:57 schrieb pablorey:
>> 
>> 
>>>     Hi Reuti,
>>> 
>>>     Yes, the "mpi_1p" has a fixed allocation rule of 1. We use it to be sure that GE assigns 1 MPI slot per node.
>>> 
>>>     In the attached document you can see the configuration of the mpi and mpi_1p parallel environments. We have also checked other parallel environments with different allocation rules (round_robin, 2, 4, ...) with the same results.
>>> 
>>>     You can also find two examples in the attached document. The first of them uses "mpi_1p" to reserve several nodes, and we cannot submit the jobs (except if we request only 1 slot with "-pe mpi_1p 1"). In the second example we use "mpi", so only 1 node is reserved. In this case everything works properly.
>>> 
>>>     Regarding the .sge_request file, we don't use it, so we don't request any queue by default. We specify the queues in the qrsub command because we only want to use nodes belonging to those queues.
>>> 
>> What I can see in the attached document: did you change num_proc's relation? By default it is "==", and as it is just a feature of a machine it shouldn't be touched, apart from requesting the exact amount for certain machines. You are requesting "num_proc=1", which would mean single-core machines.
>> 
>> Near the end of page 2 of the document: you can submit the job with "-w n" but get "no suitable queue(s)" for "-w v". For a normal `qsub`, bypassing the verification would lead to a job which waits forever. But in my test the job started to run inside the AR when I requested far too large amounts for s_rt; did you observe the same? I would judge it to be a bug, although I'm not sure for now how to phrase it in IssueZilla.
>> 
>> To the real problem: don't request any s_rt or the like in the real `qsub`, or specify "-w n".
>> 
>> -- Reuti
>> 
>> 
>> 
>> 
>>>     Thank you very much by your help,
>>>     Pablo
>>> 
>>> 
>>> 
>>> On 11/10/2010 19:32, reuti wrote:
>>> 
>>>> Am 11.10.2010 um 16:23 schrieb pablorey:
>>>> 
>>>> 
>>>> 
>>>>>     Hi Reuti,
>>>>> 
>>>>>     Yes, when I submit jobs I always request the same parallel environment that was used to submit the AR (mpi_1p or mpi). The first test job is always submitted requesting the same resources used in the qrsub command. As it doesn't work, I then change the requirements (num_proc, s_rt, s_vmem, ...) or the number of slots, but I always use the PE requested in the qrsub command.
>>>>> 
>>>>> 
>>>> And the "mpi_1p" has a fixed allocation rule of 1 then?
>>>> 
>>>> For now I can't reproduce this. Can you force the execution with "-w n" instead of "-w v"?
>>>> 
>>>> Do you request any queues by an .sge_request by default?
>>>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>> 
>>> -- 
>>> Pablo Rey Mayo
>>> Tecnico de Sistemas
>>> Centro de Supercomputacion de Galicia (CESGA)
>>> Avda. de Vigo s/n (Campus Sur)
>>> 15705 Santiago de Compostela (Spain)
>>> Tel: +34 981 56 98 10 ext. 233; Fax: +34 981 59 46 16
>>> email: 
>>> prey at cesga.es; http://www.cesga.es/
>>> 
>>> ------------------------------------------------
>>> NOTE: This message has been intentionally written without using
>>> accents or special characters, so that it can be displayed
>>> correctly in any mail client and system.
>>> ------------------------------------------------
>>> <AR_problem.pdf>
>>> 
>> 
> 
> -- 
> Pablo Rey Mayo
> Tecnico de Sistemas
> Centro de Supercomputacion de Galicia (CESGA)
> Avda. de Vigo s/n (Campus Sur)
> 15705 Santiago de Compostela (Spain)
> Tel: +34 981 56 98 10 ext. 233; Fax: +34 981 59 46 16
> email: prey at cesga.es; http://www.cesga.es/
> ------------------------------------------------
> NOTE: This message has been intentionally written without using
> accents or special characters, so that it can be displayed
> correctly in any mail client and system.
> ------------------------------------------------

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=286900

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].



More information about the gridengine-users mailing list