[GE users] comprehensive -l limit documentation

Reuti reuti at staff.uni-marburg.de
Thu Jan 24 11:03:15 GMT 2008


Hi,

On 23.01.2008 at 22:20, Alexandre Racine wrote:

>>> --- Q2 Is the configuration above optimal for my needs? I don't
>>> want any limit by default, but obviously I want to be able to
>>> request some reserved memory for some tasks.
>
>> To me it looks okay. But you will need a default limit - otherwise
>> SGE can't decrease the amount of remaining memory by the already
>> running jobs on a node. If you check only the actual consumption,
>> this may vary over the runtime of the job (and by later submitted
>> jobs) and is no reliable indicator of what's really left.
>
>
>
> Is putting a default limit of, let's say, 3G in qconf -mc for
> h_vmem the same as putting -l h_vmem=3G on the command line?

yes.

> Because the latter will make the job see only a maximum of 3G, and I
> don't really want that, since most jobs will not work correctly if I
> do that.

Some programs (like Gaussian) don't cope well when h_stack equals
h_vmem (h_stack and h_data are set in addition if you set h_vmem).
They will work again if you limit h_stack further, to around 128M.
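
A minimal sketch of such a request, assuming a 3G overall limit as in the question. The file name job.sh and the program ./my_program are placeholders, not anything from this thread; the block only writes the job script and prints its resource line, so it can be tried without a cluster:

```shell
# Write a small job script; job.sh and ./my_program are placeholder names.
cat > job.sh <<'EOF'
#!/bin/sh
# Overall limit of 3G, with the stack capped separately at 128M.
#$ -l h_vmem=3G,h_stack=128M
./my_program
EOF

# Show the embedded resource request.
grep '^#\$ -l' job.sh

# On a cluster this would be submitted with: qsub job.sh
```

The same request can be given on the command line instead of in the script: qsub -l h_vmem=3G,h_stack=128M job.sh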

> The other point is that I don't currently know how much memory each  
> program will take while running. I guess I could do an array with  
> qacct -j $JOBARRAY | grep maxvmem :)
>
>
> You were saying that putting -l h_vmem=10G will change the ulimit
> in Linux for that job, but will it also reserve that amount of
> memory for the job, so that other jobs won't be able to use it?

If you made the complex consumable and defined it for every exec  
host: yes
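
As a rough illustration of the bookkeeping a consumable implies: this is plain shell arithmetic, not SGE's scheduler code. The function name remaining and the 12G and 10G job sizes are made up for the example; the 32G capacity is the complex_values h_vmem=32G of server2 quoted below.

```shell
# remaining CAPACITY_G [REQUEST_G...]
# Prints what is left of a host's h_vmem after subtracting each job's request.
remaining() {
    left=$1
    shift
    for req in "$@"; do
        left=$((left - req))
    done
    echo "$left"
}

# server2 is configured with complex_values h_vmem=32G.
# Two hypothetical running jobs requested 12G and 10G.
left=$(remaining 32 12 10)
echo "left on server2: ${left}G"

# A new job asking for -l h_vmem=15G only starts if its request still fits.
if [ 15 -le "$left" ]; then
    echo "15G job can start"
else
    echo "15G job stays pending"
fi
```

On a real cluster the remaining amount per host is what qhost -F reports.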

-- Reuti


>
>
> Thanks.
>
>
>
>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Wed 2008-01-23 15:30
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] comprehensive -l limit documentation
>
> Hi,
>
> On 23.01.2008 at 20:28, Alexandre Racine wrote:
>
>> --- Q1 : So I have done this.
>> qconf -mc
>> #name   shortcut type   relop requestable consumable default  urgency
>> h_vmem  h_vmem   MEMORY <=    YES         YES        NONE     0
>>
>> qconf -me server1
>> complex_values        h_vmem=16G
>>
>> qconf -me server2
>> complex_values        h_vmem=32G
>>
>> ... and in the script I added -l h_vmem=20G and the job will run on
>> server2.
>
> Great!
>
>> I have played around with these. Is it a bug that I can put
>> h_vmem=200G in qconf -me server1 and get no error message?
>> (The server only has 32G of memory and no swap.)
>
> This is because the default of the -w switch is n (none) for qsub.
> You can use -w e and should see something like: No suitable queues.
> If you like, you can put this in the sge_request file as default.
>
>
>> --- Q2 Is the configuration above optimal for my needs? I don't
>> want any limit by default, but obviously I want to be able to
>> request some reserved memory for some tasks.
>
> To me it looks okay. But you will need a default limit - otherwise
> SGE can't decrease the amount of remaining memory by the already
> running jobs on a node. If you check only the actual consumption,
> this may vary over the runtime of the job (and by later submitted
> jobs) and is no reliable indicator what's really left.
>
>
>> --- Q3 Also, let's say that I have a couple of programs running on a
>> 32G RAM server and that only 10G are free. If I ask for 15G with
>> "-l h_vmem=15G" and the host limit is "complex_values
>> h_vmem=32G", will SGE see this and wait for the free memory before
>> launching the job?
>
> Yes. To check what is left on the host you can use: qhost -F
>
> -- Reuti
>
>
>> -----Original Message-----
>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>> Sent: Wed 2008-01-23 10:53
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] comprehensive -l limit documentation
>>
>> Hi,
>>
>> On 23.01.2008 at 16:09, Alexandre Racine wrote:
>>
>>> Mmm, well is my syntax correct?
>>>
>>> In the bash file I have put this, which should ask for 20GB of memory.
>>> #$-l h_vmem=20G
>>>
>>> When launching the job, SGE sent the job to a machine with 14G
>>> free...
>>>
>>> qstat
>>> all.q at server1.com   BIP   1/3       0.12     lx24-amd64
>>>     400 0.56000 Merli racine      r     01/23/2008 10:10:05     1
>>>
>>> $ qhost
>>> HOSTNAME  ARCH        NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
>>> ----------------------------------------------------------------
>>> global    -              -     -       -       -       -       -
>>> SERVER2   lx24-amd64     8  1.13   30.4G    4.4G    1.9G     0.0
>>> SERVER1   lx24-amd64     4  0.12   14.6G  769.1M    2.0G     0.0
>>>
>>>
>>>
>>> Can SGE use memory from another machine?
>>
>> Of course not. You would need something like
>> http://www.kerrighed.org/wiki/index.php/Main_Page if you had a need
>> for it.
>>
>> In your setup h_vmem is for now only a limit per job, but not a
>> consumable per host which SGE will decrease and increase depending on
>> the submitted jobs on this machine. To do so, you would need to:
>>
>> - make h_vmem consumable with a proper default consumption in the
>> complex definition (qconf -mc)
>> - give every machine a sensible value for its built-in memory
>> (qconf -me <node>)
>>
>> If you have it defined this way as a queue limit and an exec host
>> limit, the smaller of the two values will be taken for each job.
>>
>> -- Reuti
>>
>>
>>> Thanks
>>>
>>>
>>>
>>> Alexandre Racine
>>> 514-461-1300 ext. 3304
>>> alexandre.racine at mhicc.org
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>>> Sent: Tue 2008-01-22 17:58
>>> To: users at gridengine.sunsource.net
>>> Subject: Re: [GE users] comprehensive -l limit documentation
>>>
>>> Hi,
>>>
>>> On 22.01.2008 at 23:15, Alexandre Racine wrote:
>>>
>>>> (Bump)
>>>> For example, if my job absolutely needs 20G of memory (reservation),
>>>> which parameter should I use from this list?
>>>>
>>>> -l s_data=20G
>>>> -l h_data=20G
>>>> -l s_rss=20G
>>>> -l h_rss=20G
>>>> -l s_vmem=20G
>>>> -l h_vmem=20G
>>>
>>> You will need just h_vmem. This will set:
>>>
>>> data seg size
>>> stack size
>>> virtual memory
>>>
>>> in the ulimit of the kernel and besides this enable the memory
>>> control in SGE to observe the job's memory consumption.
>>>
>>> -- Reuti
>>>
>>>
>>>>
>>>> Alexandre Racine
>>>> 514-461-1300 ext. 3304
>>>> alexandre.racine at mhicc.org
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Alexandre Racine [mailto:Alexandre.Racine at mhicc.org]
>>>> Sent: Wed 2008-01-16 10:59
>>>> To: users at gridengine.sunsource.net
>>>> Subject: RE: [GE users] comprehensive -l limit documentation
>>>>
>>>> It seems the document you point to is more about machine statistics
>>>> than about -l limit reservation. It's all good :) but what I
>>>> really need is a comprehensive guide on resource reservation.
>>>>
>>>> For example, if my job absolutely needs 10G of memory (reservation),
>>>> which parameter should I use from this list?
>>>>
>>>> -l s_data=20G
>>>> -l h_data=20G
>>>> -l s_rss=20G
>>>> -l h_rss=20G
>>>> -l s_vmem=20G
>>>> -l h_vmem=20G
>>>>
>>>>
>>>>
>>>> Alexandre Racine
>>>> Special Projects
>>>> 514-461-1300 ext. 3304
>>>> alexandre.racine at mhicc.org
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Rayson Ho [mailto:rayrayson at gmail.com]
>>>> Sent: Tue 2008-01-15 16:46
>>>> To: users at gridengine.sunsource.net
>>>> Subject: Re: [GE users] comprehensive -l limit documentation
>>>>
>>>> Did you read $SGE_ROOT/doc/load_parameters.asc before??
>>>>
>>>> Rayson
>>>>
>>>>
>>>>
>>>> On Jan 15, 2008 3:44 PM, Alexandre Racine
>>>> <Alexandre.Racine at mhicc.org> wrote:
>>>>> Hi all,
>>>>>
>>>>> Somehow, we now have to use some limit/resource reservation for
>>>>> some jobs. Looking around man qsub, man complex, man queue_conf
>>>>> and a little bit on the SGE website, I can't really find any
>>>>> comprehensive documentation about the subject.
>>>>>
>>>>> For example, I saw the -l mem_total=6G on the web, but can't find
>>>>> it in the official documentation.
>>>>>
>>>>> Searching for "mem_total" in the administration guide gives 0
>>>>> results (for 6.0), and in the user guide there is the listing of
>>>>> "qconf -sc", but no descriptions.
>>>>>
>>>>> I use SGE 6.0.
>>>>>
>>>>> Is there a comprehensive guide on resource reservation somewhere?
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>>
>>>>> Alexandre Racine
>>>>> Special Projects
>>>>> 514-461-1300 ext. 3304
>>>>> alexandre.racine at mhicc.org
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net