[GE users] 16GB and 4GB Queue?

reuti reuti at staff.uni-marburg.de
Tue Dec 9 16:15:35 GMT 2008


Hi,

On 09.12.2008 at 16:28, seca2 at freenet.de wrote:

> I have a question. We have a system with 9 hosts; every host has a
> quad-core CPU and 16 GB RAM. Now we want to have 2 queues, so that
> users can submit to one queue if they want all 4 CPUs and 16 GB RAM;
> no other job should start on a machine where such a job is running.
> If users submit to the other queue, the job should use one of the
> CPUs and 4 GB RAM, so that it is possible to run 4 jobs on one
> machine.
>
> Is this possible? I can't make sense of the consumables and complex
> configuration.
>
> I have set h_vmem for every host to 16gb. I have made h_vmem
> requestable in the complex configuration, default 4gb. I have created
> 2 queues, nodes_fullmem.q and nodes_light.q. I have set the slots to 1
> for nodes_fullmem.q and to 4 for nodes_light.q. Now I thought that
> someone who requests h_vmem=16gb should automatically get a slot of
> nodes_fullmem.q, and someone who requests h_vmem=4gb should get
> nodes_light.q.

perfect, but you don't need two queues. With this setup in place, you
just need to do one of the following:

a)

submit with:

$ qsub -l h_vmem=16gb myjob.sh

and as all memory on that host is used up, nothing else can start
there anymore (you need only one queue, but this does not express your
intended logic of running a parallel job).
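
For reference, the setup described in the question would look roughly like the fragments below. This is only a sketch: it assumes h_vmem was also made consumable (the question only says "requestable"), and it writes the values with an uppercase G, which is an assumption on my side and not what was literally entered.

```text
# complex configuration, edited with `qconf -mc`
#name    shortcut  type    relop  requestable  consumable  default  urgency
h_vmem   h_vmem    MEMORY  <=     YES          YES         4G       0

# execution host configuration, edited with `qconf -me <hostname>` for each host
complex_values     h_vmem=16G
```

With h_vmem consumable, every running job subtracts its request from the 16G of its host, which is why a 16G job leaves no room for anything else.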

or b)

set up a PE (parallel environment). The per-slot resource request is
multiplied by the number of slots, so by requesting 4 slots with the
4gb default you also get the full 16GB (as you wrote gb: G = base
1024, g = base 1000 [`man sge_types`]).

$ qsub -pe smp 4 myjob.sh
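
The multiplication and the difference between the two suffixes can be checked with plain shell arithmetic. This is only a sketch of the numbers, it does not talk to SGE; the variable names are made up for illustration, and the values (4 slots, 4 per slot) are the ones from this thread.

```shell
# Values from the thread: 4 slots requested, h_vmem default of 4 (giga) per slot.
slots=4
per_slot=4

# A per-slot request is multiplied by the number of granted slots.
total=$((slots * per_slot))
echo "h_vmem charged for the whole job: ${total} (giga)"

# Suffix semantics per `man sge_types`: g = 1000^3 bytes, G = 1024^3 bytes.
bytes_g=$((total * 1000 * 1000 * 1000))
bytes_G=$((total * 1024 * 1024 * 1024))
echo "${total}g = ${bytes_g} bytes"
echo "${total}G = ${bytes_G} bytes"
```

So 16g is 16000000000 bytes while 16G is 17179869184 bytes; it is worth using the same suffix in the host setting, the default and the requests.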

Often a PE which will give you slots from only one node is called
"smp". After starting "qconf -ap smp" you can stay with the default
settings (just change the slots entry to the complete number, i.e. 36)
and attach it to a queue with `qconf -mq your.q` in the entry
"pe_list". You need only one queue for this setup also. The idea in
SGE is to request the resources you need for your job, and not (like
Torque) to submit into a queue.
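
As a sketch of what this could look like: the PE definition below shows the template defaults as I remember them, with only the slots entry changed. The exact list of fields may differ between SGE versions, so take it as an illustration and not as verbatim output.

```text
# PE definition, as shown by `qconf -sp smp`
pe_name            smp
slots              36
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min

# queue configuration, edited with `qconf -mq your.q`
pe_list            make smp
```

The allocation_rule of $pe_slots is what keeps all slots of a job on one node.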

If you want to run a real parallel job, you will need a PE anyway to
support the various parallel libraries like Open MPI, PVM, MPICH
(1/2), OpenMP... to get a so-called Tight Integration of your
parallel jobs, where SGE controls the spawned processes, and to get
correct accounting.

===

The next step would be to avoid starvation of a parallel job by serial
jobs always slipping in. For this you can set "max_reservation  10"
in the scheduler configuration (`qconf -msconf`) and submit the jobs
with:

$ qsub -pe smp 4 -R y myjob.sh

(this could also be put in a sge_request file, see `man sge_request`)
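
A sketch of the two settings, assuming the usual file locations described in `man sge_request` (cluster-wide in $SGE_ROOT/$SGE_CELL/common/sge_request, per user in $HOME/.sge_request):

```text
# scheduler configuration, edited with `qconf -msconf`
max_reservation    10

# default request file, e.g. $HOME/.sge_request
# request a reservation for every submitted job
-R y
```

Note that a default request file applies to all jobs it covers, so it may be better to put only "-R y" there and keep "-pe smp 4" on the command line of the jobs that really need it.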

-- Reuti


> hope someone can help!
>
> thx in advance
>
> bubbas
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=91958
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=91966
