[GE users] SGE large memory jobs

mbay2002 jeff at haferman.com
Wed Jul 15 21:53:54 BST 2009

Great, this has all been very helpful.  I really am a rookie at
configuring SGE, so I want to make sure I get everything right.
Here are 3 additional questions:

1) when I run the qconf -mc, the mem_free line now looks like

#name      shortcut   type      relop requestable consumable default  urgency
mem_free   mf         MEMORY    <=    YES         NO          0        0

Question: do I simply want to change "NO" to "YES" for consumable?  Are 
defaults of 0 and 0 for default and urgency okay in general?
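(For reference, here is what I think the line should look like after the
change, with consumable flipped to YES. My understanding is that "default"
is the amount charged to jobs that don't request mem_free explicitly, and
"urgency" is a priority weight, so 0 and 0 should be fine to start with:

```
#name      shortcut   type      relop requestable consumable default  urgency
mem_free   mf         MEMORY    <=    YES         YES        0        0
```

Please correct me if that reading of the columns is wrong.)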

2) for the qconf -me <execthost>, the file now looks like

hostname              compute-0-0.local
load_scaling          NONE
complex_values        NONE
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         NONE
report_variables      NONE

So, I'll change complex_values to "mem_free=8G"  (because we have 8G
available - so the guy submitting the 5G job will use 5G of the 8G
available so there will still be resources left for others).  Where does 
this file live so I can do a sed for all 144 compute nodes that we have, 
or is there a way with qmon to set this for all exechosts?
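(In case it helps anyone answering: I gather the exec host config lives in
the qmaster's spool rather than a plain file, so instead of sed, something
like the following loop might work. This is an untested sketch that assumes
every host should get the same 8G value, using qconf -sel to list exec
hosts and qconf -mattr to modify one attribute non-interactively:

```shell
#!/bin/sh
# For each execution host known to the qmaster, set the mem_free
# consumable to 8G. Untested sketch; qconf -mattr replaces the named
# entry in the complex_values list for that host.
for host in $(qconf -sel); do
    qconf -mattr exechost complex_values mem_free=8G "$host"
done
```

Would that be the right approach, or is qmon preferred here?)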

3) Any "unintended consequences" that I should be aware of when making
mem_free a consumable?  e.g., might other qsub jobs break?  When I used
other schedulers in the past, the amount of memory needed was always front
and center in the job submission scripts.  I'm surprised that this topic
isn't more prominent in the documentation.
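(One more thought: once this is in place, I assume I can sanity-check it
by watching the consumable go down as jobs land. If I understand the man
pages right, qhost -F shows the current value of a named resource per
host, something like:

```shell
# Show the mem_free consumable on every exec host; with the consumable
# enabled, the reported value should drop by each running job's request.
qhost -F mem_free

# Submit with an explicit memory request so the consumable is charged:
qsub -l mem_free=5G java.sge
```

Is that the right way to verify it?)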


dom wrote:
> Hi,
> you have to set  the complex_values for each exechost.
> In your case set complex_values to mem_free=5G (using qconf -me <exechost>)
> Set the mem_free complex to consumable to yes (qconf -mc) and submit 
> your jobs like this:
> qsub -l mem_free=5G java.sge
> Now only one job per host will be scheduled. If you set complex_values
> to mem_free=8G, you will have 8G of memory available for consumption.
> Marco
> On 07/12/09 04:43, mbay2002 wrote:
>> A user is running a java based script that requires 5 GB of memory. Each of our nodes has 8 GB of RAM available.
>> He is trying something along the lines of
>> qsub -l mem_free=5G java.sge
>> sleep 10
>> qsub -l mem_free=5G java.sge
>> However, even though we have plenty of open nodes, after submitting several jobs like this, a few end up on the same node, and the jobs start paging.
>> We have a pretty vanilla install of SGE, we haven't done any special configuration.  I honestly do not know SGE well enough to know if there is a simple way to ensure that these jobs get assigned one per node.
>> I've done a bit of RTFM'ing, but could use a hint at this point.
>> This is SGE 6.2


To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
