[GE users] resource management (resending to the list)

Ron Chen ron_chen_123 at yahoo.com
Sun Apr 25 03:08:46 BST 2004


Sorry, I overlooked the fact that load sensors are not
attached to queues.

So, for SGE 5.3, you can use the load sensor on each
host to close/open the queues, depending on the
availability of disk space (though the value is
hard-coded).
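
To make the load sensor part concrete, here is a minimal sketch of a
sensor that reports free space for two disks under distinct names. It
follows the load sensor protocol as I understand it: wait for a line on
stdin, answer with a begin / host:name:value / end block, and exit on
"quit". The mount points /disk_1 and /disk_2, the attribute names
disk_1_free and disk_2_free, and the file name disk_sensor.sh are
assumptions for illustration, not part of the setup discussed here.

```shell
#!/bin/sh
# Sketch of a load sensor reporting free space on two disks.
# Assumed (hypothetical) names: /disk_1, /disk_2, disk_1_free, disk_2_free.

HOST=$(uname -n)

# Print the available space, in 1024-byte blocks, of the filesystem holding $1.
free_kb() {
    df -kP "$1" 2>/dev/null | awk 'NR == 2 { print $4 }'
}

# Emit one report block: begin, one host:name:value line per disk, end.
# The "K" suffix is meant as a 1024 multiplier for MEMORY-type values;
# check the complex documentation for your SGE version.
report() {
    echo "begin"
    echo "$HOST:disk_1_free:$(free_kb "${DISK_1:-/disk_1}")K"
    echo "$HOST:disk_2_free:$(free_kb "${DISK_2:-/disk_2}")K"
    echo "end"
}

# Answer every line received on stdin with a report; stop on "quit".
sensor_loop() {
    while read -r line; do
        [ "$line" = "quit" ] && exit 0
        report
    done
}

# Start the loop only when run under the assumed file name, so the
# functions above can also be sourced and tried by hand.
case "$0" in
    disk_sensor.sh|*/disk_sensor.sh) sensor_loop ;;
esac
```

The reported names still need matching complex entries, and closing a
queue when space runs low is then a matter of the queue's load
thresholds, which is where the hard-coded value comes in.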

For SGE 6.0, you will get the "or" operator, and thus
you can request "disk_a > x || disk_b > y". As for
finding out which disk to use, I think running
"qstat -r" inside the job should show which resource
SGE allocated to that job.
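
If the job only needs to know which disk has room, another option is to
check the disks from inside the job script. This is a different approach
from the one above: it does not use the "or" operator and it is not
something SGE does for you. The function name pick_disk and the paths in
the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: choose a scratch disk from inside the job script.

# Print the available space, in 1024-byte blocks, of the filesystem holding $1.
free_kb() {
    df -kP "$1" 2>/dev/null | awk 'NR == 2 { print $4 }'
}

# pick_disk NEED_KB DIR...
# Print the first DIR with at least NEED_KB free; return 1 if none qualifies.
pick_disk() {
    need_kb=$1
    shift
    for dir in "$@"; do
        avail=$(free_kb "$dir")
        if [ -n "$avail" ] && [ "$avail" -ge "$need_kb" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Usage inside a job script (paths are placeholders):
#   scratch=$(pick_disk 1048576 /disk_1 /disk_2) || exit 1
```

Note that this is only a check, not a reservation: two jobs starting at
the same moment could both pick the same disk.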

(So, if you use SGE 6.0, you don't need to have 1
queue per disk. And in fact, you can have only 1 queue
per cluster using "cluster queue"!)

 -Ron

--- "Ara.T.Howard" <Ara.T.Howard at noaa.gov> wrote:
> ok.  i've set up two queues for each machine (one for
> each machine:disk_{1,2} combo):
> 
>   eg.
> 
>     foreach machine in $machines
>       setup queue ${machine}_disk_1
>       setup queue ${machine}_disk_2
>     end
> 
> then i created two complexes, disk_1 and disk_2; both complexes have the
> following attributes
>   
>   ------------ -------- ------- ------ ----- ------ ---------- -------
>   NAME         SHORTCUT TYPE    VALUE  RELOP REQ    CONSUMABLE DEFAULT
>   ------------ -------- ------- ------ ----- ------ ---------- -------
>   njob         nj       INT     1      ==    FORCED YES        1
>   disk_free    df       MEMORY  0      <=    YES    YES        0
>   disk_tot     dt       MEMORY  0      <=    YES    YES        0
>   disk_used    du       MEMORY  0      >=    YES    YES        0
> 
> 
> for each of the created queues, complex 'disk_1' is attached to
> ${machine}_disk_1 and complex 'disk_2' is attached to ${machine}_disk_2
> 
> 
> my aim here, is obviously to be able to say
> 
>   qsub -l "disk_free=${required_space}" script
> 
> and have a job be scheduled, one at a time, in a queue with enough free
> disk.  remember, there are two such possible disks per machine and either
> one will suffice.
> 
> here is where i'm running into problems: i have a load monitor which reports
> on the various disk quantities (modified from the doc's example 'tmpspace.sh').
> the trouble is, it seems you can only associate a load_monitor with a host,
> not a host and a queue or only a queue.  so i seem to have two alternatives,
> neither of which would work:
> 
>   0) there are two load_monitors per machine, one which reports on disk_1 and
>   one which reports on disk_2, each could output key=val pairs like
> 
>     ...
>       disk_1_free=123435
>       disk_1_used=54321
>     ...
> 
>   or
> 
>     ...
>       disk_2_free=123435
>       disk_2_used=54321
>     ...
> 
>   depending on which disk.  note, however, that they both cannot output key=val
>   pairs like
> 
>     ...
>       disk_free=123435
>       disk_used=54321
>     ...
> 
>   since they would be clobbering each other's output!  additionally it seems
>   you cannot actually run two load monitors on a machine...
> 
> 
>   1) one load monitor per machine, it reports on both disk_1 AND disk_2 with
>   key=val pairs like
> 
>     ...
>       disk_1_free=123435
>       disk_1_used=54321
>       disk_2_free=123435
>       disk_2_used=54321
>     ...
> 
>   using this method there is no way to de-multiplex these values into the
>   'disk_free', 'disk_used', etc. attributes associated with each complex that
>   sge will be looking for in the output of the load monitors...
> 
> 
> so, perhaps i am being dense (note that i have very little experience with
> sge), but i've re-read your original post and most of the docs and it seems
> like this again boils down to the inability to do OR'ing of resources:
> 
> this is my current understanding :
> 
>   - exactly ONE load monitor may be configured per host
> 
>   - the load monitor must output unique key=val pairs
> 
>     eg
> 
>       ...
>         disk_1_free=12345
>         disk_2_free=12345
>       ...
> 
>   - there is no way to merge (OR) key=val pairs, either via setting up
>     multiple queues, multiple complexes, or from the qsub command line
> 
> i.e.  i think i'm stuck.  hopefully you can enlighten me! ;-)
> 
> -a
> 
> > 
> > --- "Ara.T.Howard" <Ara.T.Howard at noaa.gov> wrote:
> > > if i understand you correctly, this is problematic
> > > for two reasons:
> > > 
> > >   - i cannot know a priori which host to run on
> > >   - i cannot know a priori which disk of a particular
> > >     host to run on
> > > 
> > > 
> > > in other words, given the above, how would i say
> > > 
> > >   qsub -l 'any host' -l 'any of two disks' job
> > > 
> > > it seems i would need to know both the host and disk
> > > to submit to, wouldn't i?
> > > 
> > > i would like to associate each of the disks with a
> 
=== message truncated ===



