[GE users] Problem with Complexes and disabling queues

Richard Hobbs richard.hobbs at crl.toshiba.co.uk
Thu Dec 15 12:30:47 GMT 2005


Hello,

Output as requested:

============================================================
[root@stg2 root]# qhost -h stg-tts1 -F
HOSTNAME             ARCH       NPROC  LOAD   MEMTOT   MEMUSE   SWAPTO   SWAPUS
-------------------------------------------------------------------------------
global               -              -     -        -        -        -        -
   hv:arch=none
   hv:num_proc=1.000000
   hv:load_avg=99.990000
   hv:load_short=99.990000
   hv:load_medium=99.990000
   hv:load_long=99.990000
   hv:np_load_avg=99.990000
   hv:np_load_short=99.990000
   hv:np_load_medium=99.990000
   hv:np_load_long=99.990000
   hv:mem_free=0.000000
   hv:mem_total=0.000000
   hv:swap_free=0.000000
   hv:swap_total=0.000000
   hv:virtual_free=0.000000
   hv:virtual_total=0.000000
   hv:mem_used=infinity
   hv:swap_used=infinity
   hv:virtual_used=infinity
   hv:swap_rsvd=0.000000
   hv:swap_rate=0.000000
   hv:slots=0.000000
   hv:s_vmem=0.000000
   hv:h_vmem=0.000000
   hv:s_fsize=0.000000
   hv:h_fsize=0.000000
   hv:cpu=0.000000
stg-tts1             glinux         4  0.00  1005.8M   203.2M     2.0G   756.0K
   hl:arch=glinux
   hl:num_proc=4.000000
   hl:load_avg=0.000000
   hl:load_short=0.000000
   hl:load_medium=0.000000
   hl:load_long=0.000000
   hl:np_load_avg=0.000000
   hl:np_load_short=0.000000
   hl:np_load_medium=0.000000
   hl:np_load_long=0.000000
   hl:mem_free=802.59M
   hl:mem_total=1005.83M
   hl:swap_free=2.00G
   hl:swap_total=2.00G
   hl:virtual_free=2.78G
   hl:virtual_total=2.98G
   hl:mem_used=203.23M
   hl:swap_used=756.00K
   hl:virtual_used=203.97M
   hv:swap_rsvd=0.000000
   hv:swap_rate=0.000000
   hv:slots=0.000000
   hv:s_vmem=0.000000
   hv:h_vmem=0.000000
   hv:s_fsize=0.000000
   hv:h_fsize=0.000000
   hl:cpu=0.100000
   hc:mem_slot=4.000000
[root@stg2 root]#
============================================================

Am I to understand that the default "slots" complex is designed to do
exactly what we are trying to do with "mem_slot" (our current setup is
sketched in the P.S. below)? Is it a hard maximum number of slots per
machine, one which GridEngine will *never* exceed?

Also, given that our value for "slots" is currently set to zero, how
would I start using this feature if I set it to 4?
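
Just to check that I understand, is this roughly what the change would
look like? (Pieced together from the complex(5) man page and untested
here, so please correct me if the syntax is wrong.)

============================================================
# Attach the built-in "slots" consumable to the exec host,
# capping it at 4 (one per CPU):
qconf -me stg-tts1
# ...then change the complex_values line in the editor to:
#   complex_values    slots=4
#
# If I read complex(5) correctly, jobs then need no explicit
# "-l slots=1": each job implicitly consumes one slot, so at
# most 4 jobs would run on the host across all its queues.
============================================================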

Thanks again,
Richard.

-- 
Richard Hobbs (Systems Administrator)
Toshiba Research Europe Ltd. - Speech Technology Group
Web: http://www.toshiba-europe.com/research/
Normal Email: richard.hobbs at crl.toshiba.co.uk
Mobile Email: mobile at mongeese.co.uk
Tel: +44 1223 376964        Mobile: +44 7811 803377 

> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de] 
> Sent: 14 December 2005 21:07
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Problem with Complexes and disabling queues
> 
> Hi,
> 
> On 14.12.2005 at 17:22, Richard Hobbs wrote:
> 
> > Hello,
> >
> > We have various queues configured on various hosts. Each host has
> > a complex set up as a consumable resource, named "mem_slot". The
> > value of "mem_slot" is 4. Basically, we have many queues on each
> > machine, but only 4 CPUs, and this consumable is therefore designed
> > to stop too many jobs running on one host.
> >
> > Each queue (using 'qconf -mq queuename') then has a value for
> > "mem_slot", which is 1.
> >
> > Also, each submitted job uses "-l mem_slot=1" to request one
> > mem_slot.
> >
> > This works fine.
> >
> > However, if I disable a queue with a running job in order to stop
> > more jobs being submitted to this queue, it releases the mem_slot,
> > and a 5th job will enter the machine even if the previous jobs are
> > all still running.
> >
> > It's almost as if disabling a queue releases the resources even
> > though the job is still active and running.
> >
> > This seems like a bug...
> >
> > Can anyone confirm having seen this? Is there a fix? Is there a
> > workaround?
> 
> we are also using complexes, but I don't see this behavior in u6
> (which version are you running?). Can you check this by issuing:
> 
> qhost -h <nodename> -F
> 
> But anyway, I don't think you need this mem_slot at all. If I
> understand you correctly, you could just attach the default complex
> "slots" to your exec nodes with the value set to 4.
> 
> Cheers - Reuti
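
P.S. For completeness, this is how our "mem_slot" consumable is
currently set up, in case it matters. (Written from memory rather than
pasted from our cluster, so the shortcut and urgency values in
particular are placeholders.)

============================================================
# "qconf -sc" entry defining the consumable:
#name      shortcut  type  relop  requestable  consumable  default  urgency
mem_slot   ms        INT   <=     YES          YES         0        0

# Attached to each exec host ("qconf -me <hostname>"):
#   complex_values    mem_slot=4

# Each queue on the host also carries a value
# ("qconf -mq <queuename>"):
#   complex_values    mem_slot=1

# And every job requests one at submit time:
#   qsub -l mem_slot=1 myjob.sh
============================================================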


