[GE users] Dynamic queue -- sge schedule policy

John Tseng jtseng at montalvosystems.com
Thu Aug 24 08:26:56 BST 2006


We've done this using a single queue and just requesting mem_free=16G for the jobs that need it.

The more correct setup is a 'mem' consumable per host, with the total amount of memory on the host set by your custom load_sensor (via qconf -mattr exechost complex_values mem=16G <hostname>).  (You could also set this manually if that suits you better.)

Then you would submit:

qsub -l mem=16G,mem_free=16G ...

Here mem_free is a check against the free memory the host currently reports, and 'mem' is an accounting check against the consumable.
If no other jobs (e.g. interactive jobs) are running, the two should match.
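
As a rough sketch of the per-host step, the script below derives a whole-gigabyte value from /proc/meminfo and prints the qconf command rather than running it. It assumes Linux, assumes a consumable complex named 'mem' has already been defined (qconf -mc), and rounds up because a "16G" host usually reports slightly less than 16G. The function names kb_to_gb and mem_cmd are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print (do not run) the qconf command that would set the
# 'mem' consumable for this host. Assumes Linux and an existing 'mem'
# complex; kb_to_gb and mem_cmd are hypothetical helper names.

# Convert a MemTotal value in kB to whole gigabytes, rounding up.
kb_to_gb() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

# Build the qconf command for host $1 with MemTotal $2 (in kB).
mem_cmd() {
    echo "qconf -mattr exechost complex_values mem=$(kb_to_gb "$2")G $1"
}

# On a Linux host, read MemTotal and show the command for this machine.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    mem_cmd "$(uname -n)" "$total_kb"
fi
```

Review the printed command before running it by hand or from your own tooling; whether to round up or down is a site decision.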


Another idea would be to have the host load_sensor add itself to a hostgroup
based on memory (you are probably doing this for your separate queues)

For example:
@host_16G
@host_4G

Then just add @host_16G to the @host_4G group.  More manual work, but might be better if you have to maintain separate queues.


Submitting with
qsub -q all.q@@host_4G   should also pick up the 16G hosts.

qsub -q all.q@@host_16G  should only go to 16G hosts.
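
A minimal sketch of the hostgroup bookkeeping follows. It only prints the qconf commands, so nothing is changed until you run them yourself. The group names come from the text above; the 16 GB threshold, the host names node01 and node02, and the function names are illustrative assumptions, and both hostgroups are assumed to exist already.

```shell
#!/bin/sh
# Sketch only: print (do not run) the qconf commands for the hostgroup
# approach. Threshold, host names and function names are assumptions.

# Map a host's memory size in whole GB to the hostgroup it should join.
pick_group() {
    if [ "$1" -ge 16 ]; then
        echo "@host_16G"
    else
        echo "@host_4G"
    fi
}

# Print the command that adds host $1 (with $2 GB of memory) to its group.
join_cmd() {
    echo "qconf -aattr hostgroup hostlist $1 $(pick_group "$2")"
}

# Print the one-time command that nests @host_16G inside @host_4G.
nest_cmd() {
    echo "qconf -aattr hostgroup hostlist @host_16G @host_4G"
}

join_cmd node01 16
join_cmd node02 4
nest_cmd
```

With the nesting in place, a job sent to all.q@@host_4G can land on any host, while all.q@@host_16G stays restricted to the large-memory hosts.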


-john

On Thu, Aug 24, 2006 at 10:46:10AM +0800, Eric Zhang wrote:
> 
> Hi, users:
> 
>     I have used sge for a while and I found a problem here.
> 
>     I have a 32-node cluster. Some nodes have 4G of memory and some
> have 16G. Some jobs need at least 16G of memory and the other jobs
> have no such requirement. In order to run the "16G memory" jobs
> correctly, I created a queue (call it queue A) which contains all the
> 16G memory nodes, and another queue (call it queue B) which contains
> the remaining nodes. Here comes the problem:
> 
>     When I submit a "non 16G memory need" job, it can only run in
> queue B even if queue A has no load, which is obviously a waste of
> resources. On the other hand, if I don't define queue A and queue B,
> a "non 16G memory need" job may run on the 16G memory nodes, which
> means a "16G memory need" job cannot run there and has to wait.
> 
>     Does SGE have a dynamic scheduling policy? I mean, if queue A has
> no load, a "non 16G memory need" job can run on it; if a "16G memory
> need" job is then submitted, the "non 16G memory need" jobs are
> paused and migrated to queue B or, more simply, restarted or
> rescheduled.
> 
>     Thanks for any suggestions. If my meaning is unclear, I'd be
> happy to explain in more detail.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 
