[GE users] contemporary of -l slots=#

Raphael Y. Rubin rafi at cs.caltech.edu
Wed Mar 30 21:15:57 BST 2005


I guess I'm a little unclear as to what the "logic behind" is; I
wouldn't mind a short explanation.

So are you saying the PE warning basically assumes the user intended to
use a PE and simply forgot?  And creating a PE to specify the number of
slots to reserve is bad, at least if there's no other driving purpose?

I'm not so sure "virtual_free" is a safe alternative.  It seems to vary
too wildly, even on unloaded systems.

Furthermore, it doesn't seem to guarantee grid exclusivity (for the host),
which is what our users occasionally need.


I guess this gets back to the question of intent?  Is the slots thing
specifically discouraged because there is an assumption that users
shouldn't have the right to monopolize a machine?  Or is there something
else going on?

Rafi
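
For readers following the thread, here is a minimal sketch of how the two
approaches quoted below might be driven from the submit side. It assumes the
admin has already made virtual_free consumable and set
"complex_values virtual_free=3.5G" on each host (Reuti's suggestion), and that
the "slots" PE from the original message is attached to the queues. The
function names, the NODE_MEM and QSUB variables, and job.sh are illustrative
assumptions, not part of Grid Engine.

```shell
#!/bin/sh
# Sketch only. Assumed admin-side setup (not performed by this script):
#   complex (qconf -mc):   virtual_free  vf  MEMORY  <=  YES  YES  1G  0
#   each host (qconf -me): complex_values   virtual_free=3.5G
#   PE "slots" as quoted below, listed in the queue's pe_list (SGE 6)

# Must equal the per-host complex_values entry; 3.5G is taken from the thread.
NODE_MEM="${NODE_MEM:-3.5G}"

# Request all consumable memory on a host, so that no other job which
# consumes virtual_free (every job, given a non-zero default) fits there.
qsub_exclusive() {
    "${QSUB:-qsub}" -l "virtual_free=$NODE_MEM" "$@"
}

# Count a heavy job as N ordinary jobs by taking N slots on one host
# through the "slots" PE (allocation_rule $pe_slots).
qsub_weighted() {
    n="$1"; shift
    "${QSUB:-qsub}" -pe slots "$n" "$@"
}
```

Usage would look like "qsub_exclusive job.sh" or "qsub_weighted 2 job.sh".
Setting QSUB=echo prints the arguments instead of submitting, which is handy
for checking what a wrapper would send.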

On Wed, Mar 30, 2005 at 09:53:37PM +0200, Reuti wrote:
> Hi Raphael,
> 
> it's personal taste, but I wouldn't use either of the two options you offered:
> neither reflects the logic behind. That said, what I suggest is similar
> to the second:
> 
> - make the complex "virtual_free" consumable and requestable - default 1GB
>                                                        (or what you like)
> 
> - attach this to each node with a value of the built in memory
>                                     (complex_values   virtual_free=3.5GB)
> 
> - request 3.5GB in your qsub command -> single job on the node
> 
> 
> With 4GB built in, only 3.5GB are usable, I think. Yes, it's nearly the same as 
> your vslots, but this can also be used for a real request of just 2GB for 
> larger jobs.
> 
> Cheers - Reuti
> 
> PS: IMO it's good to disallow the request for slots, to remind users to request 
> a PE - maybe they forgot it by accident.
> 
> 
> Quoting "Raphael Y. Rubin" <rafi at cs.caltech.edu>:
> 
> > I would like to configure my grid so that mortal users can grab exclusive
> > access to a machine, using the normal submission commands with little extra
> > work.
> > 
> > Occasionally, a user wants to run a job exclusively for benchmarking,
> > whether that's to test a class of machine or a specific job.
> > 
> > Also some jobs we know will be resource hogs, and we'd like to annotate them
> > to indicate they are the equivalent of two or more normal jobs.
> > 
> > And of course there are various other needs that arise, but the above two are
> > the most common and important.  In the past we just used to specify "-l
> > slots=n".  But as of sge5.3 that was discouraged.
> > 
> > 	error: denied: use parallel environments instead of requesting slots explicitly
> > 	
> > 		- from sge6
> > 
> > In sge 5.3, I had created a slots pe, after we first noticed the messages 
> > about -l slots.  Here is an updated version of that pe (in a form for sge 
> > 6).
> > 
> > pe_name           slots
> > slots             999
> > user_lists        NONE
> > xuser_lists       NONE
> > start_proc_args   /bin/true
> > stop_proc_args    /bin/true
> > allocation_rule   $pe_slots
> > control_slaves    FALSE
> > job_is_first_task TRUE
> > urgency_slots     min
> > 
> > Alternatively, one can use a consumable complex, as described  in:
> > 
> > http://gridengine.sunsource.net/servlets/BrowseList?list=users&by=thread&from=2530
> > 
> > or more simply:
> > #name               shortcut   type        relop requestable consumable default  urgency
> > #----------------------------------------------------------------------------------------
> > vslots              vs         INT         <=    YES         YES        1        1000
> > 
> > Which is of course just the normal complex slots copied to a different 
> > name to get around the explicit "-l slots" block.  Somehow this seems 
> > wrong, a reincarnation of a technique deliberately killed for some reason 
> > unknown to me.
> > 
> > 
> > 
> > Which style is preferred and why?
> > What are the ramifications?
> > Are there any behavioral differences?
> > 
> > 
> > As for the preferred option, any suggestions to improve the above
> > configurations?
> > Also what's the best way to deploy, either globally, or to a set of queues?
> > 
> > 
> > I know with sge 5.3, I was able to use:
> > queue_list        all
> > To deploy to my whole cell.
> > 
> > 
> > 
> > Also, on a not-quite-complete tangent: does anyone have advice, or has anyone 
> > written guidelines to optimize configuration of queues?  We are mostly 
> > using dual-CPU Xeons with hyperthreading and 4 GB of RAM.
> > 
> > Our jobs are mostly Java, C, and Lisp, single threaded (except the JVM, 
> > which forks its own stuff).  Jobs mostly run in a few hundred MB or less, 
> > with an occasional memory hog which will eat a gig or two.
> > 
> > 
> > Rafi Rubin
> > California Institute of Technology
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > 