[GE users] Maintaining good memory usage

Andreas Haas Andreas.Haas at Sun.COM
Fri Apr 30 13:53:09 BST 2004


You could use "h_vmem" as a host-based consumable. To oversubscribe
your SF6800 memory-wise, specify e.g. h_vmem=50G rather than 44G.
To enforce an upper limit for particular users, you could add a
queue-based "h_vmem" consumable and give different users access
through different access lists. "h_vmem" is recommended because
the Grid Engine execd tracks and enforces memory consumption for it
on a per-job basis rather than only on a per-process basis.

To keep jobs with large memory requests from starving, however, you
need the resource reservation feature of 6.0.
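
In 6.0 terms this would look roughly as follows. It is a sketch only:
the value 20 and the script name are placeholders, and the details
should be checked against the 6.0 documentation.

```shell
# Scheduler configuration: allow reservations (0 disables them).
# In the editor opened by "qconf -msconf", set e.g.:
#        max_reservation   20
qconf -msconf

# Submit the large-memory job with a reservation requested, so that
# smaller jobs cannot keep taking the memory it is waiting for.
qsub -R y -l h_vmem=16G bigjob.sh
```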

Cheers,
Andreas


On Fri, 30 Apr 2004, Aaron Turner wrote:

> Hello,
>
> We have a shared memory (SunFire 6800) machine with
> 20 processors and 44GB RAM.
>
> With a series of queues of varying allowable run
> times, appropriate subordination and suspend thresholds
> we are getting a good, constant CPU load with few
> problems.
>
> However, we are having some problems keeping memory
> load down to acceptable levels as users are running
> large jobs.
>
> What we would like to do is:-
>     1. Continue to maximise CPU load as much as possible
>     2. Keep memory usage within limits
>     3. Allow users as much flexibility as possible (i.e.
>        try to accommodate those wanting to run large memory
>        jobs)
>
>
> Currently most queues have a 4GB stack size limit, which
> accommodates most users nicely without having to have too
> great a plethora of queues and too complex subordination.
> However # slots * stack size is greater than 44GB. Typically
> many users run jobs with a memory footprint of rather less
> than 4GB, though.
>
> We have a single slot queue enabled for large memory jobs
> up to 16GB, and so the total memory usage possible is very
> much greater than the available memory.
>
> Queue selection for users is via -l h_rss etc, with the users
> suggesting what they think their memory usage will be for
> that job.
>
> What is the simplest way of keeping on top of the memory
> usage issue, both from my point of view and for that of
> the users? My initial thought is to create a consumable
> resource for memory, for the host, that users can request.
> However there is no guarantee that users will actually
> request an amount of memory that is accurate, and so users
> may be effectively locked out by users requesting more
> memory than they need. This would then reduce throughput.
> Also I need a mechanism to prevent most users from requesting
> more than 4GB so I can control the users allowed to submit
> very large memory jobs, again to ensure that throughput is
> maintained.
>
> Any hints?
>
> Thanks
>
>     Aaron Turner
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
