[GE users] Jobs Rushing the Slots

John Coldrick jc at axyzfx.com
Thu Jun 8 21:07:34 BST 2006

	Apologies if this is a FAQ; I couldn't find it, but it seems like a common problem.

	We have various systems on the grid (sge-6.0u6), with varying amounts of 
memory and CPUs/slots per machine.  We typically submit jobs hundreds at a 
time.  When working with large datasets, we need to request a memory 
resource of, say, 0.9G.  In a perfect world, a system with 2 CPUs and 1G of 
memory would only allow one such job to start.  However, in that initial 
rush, and during the startup period of each job, that machine appears to the 
scheduler to have 2 slots, each with a gig of memory free, which of course 
isn't true.  Two jobs start up simultaneously, and boom: the jobs eventually 
crash.
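
	To make the failure mode concrete, here is a toy Python model of it. It is only a sketch and not how SGE's scheduler is actually implemented: the Host class and both dispatch functions are made up for illustration. The first function decides on the free memory the host last reported, which has not dropped yet while jobs are still starting up. The second, shown for contrast, subtracts each request from the host's total as soon as a job is dispatched.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical stand-in for an execution host; not an SGE data structure."""
    slots: int
    mem_gb: float            # physical memory on the machine
    reported_free_gb: float  # free memory as last reported, before new jobs ramp up
    running: int = 0
    reserved_gb: float = 0.0


def rush_on_reported_memory(host: Host, request_gb: float, pending: int) -> int:
    """Start jobs while a slot is free and the *reported* free memory covers the request.

    The reported value is never updated inside the loop, which models the
    startup window in which new jobs have not yet allocated their memory.
    """
    started = 0
    while pending > 0 and host.running < host.slots and host.reported_free_gb >= request_gb:
        host.running += 1
        pending -= 1
        started += 1
    return started


def dispatch_with_bookkeeping(host: Host, request_gb: float, pending: int) -> int:
    """Start jobs while a slot is free and the *unreserved* memory covers the request.

    Each dispatched job immediately reserves its request against the host total.
    """
    started = 0
    while (pending > 0 and host.running < host.slots
           and host.mem_gb - host.reserved_gb >= request_gb):
        host.running += 1
        host.reserved_gb += request_gb
        pending -= 1
        started += 1
    return started


# 2 slots, 1G of memory, jobs asking for 0.9G each, hundreds pending.
small = Host(slots=2, mem_gb=1.0, reported_free_gb=1.0)
print(rush_on_reported_memory(small, 0.9, pending=100))    # 2: both slots get filled

small = Host(slots=2, mem_gb=1.0, reported_free_gb=1.0)
print(dispatch_with_bookkeeping(small, 0.9, pending=100))  # 1: second job is held back

big = Host(slots=2, mem_gb=4.0, reported_free_gb=4.0)
print(dispatch_with_bookkeeping(big, 0.9, pending=100))    # 2: both slots are usable
```

	The only difference between the two functions is whether the memory figure is updated at dispatch time, which is the gap the rush falls into.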

	Now, we have some PEs set up; one, for example, is called dual_hog, which 
will run ahead of non-dual hog systems and chew up both slots.  That's fine, 
but now the larger-memory systems (4G with 2 CPUs) have wasted slots: 
they could easily run 2 jobs at once.

	With such a heterogeneous environment, is there a way to essentially ask SGE:

	1. Run me alone on systems with less than xx G of memory.

	2. Run me alongside other jobs on systems with more than xx G of memory.

	This works fine when there's already one job running on each system, but 
in that initial submission rush, or when one job finishes on a small-memory 
system, SGE will rush two jobs in.
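
	Put another way, what I'm after per host is just "as many jobs as fit in both the slots and the memory". A minimal Python sketch of that arithmetic follows; the function name and the host table are made up for illustration and are not SGE settings.

```python
def jobs_that_fit(slots: int, mem_gb: float, request_gb: float) -> int:
    """Concurrent jobs a host can hold: limited by its slots and by its memory."""
    if request_gb <= 0:
        raise ValueError("request_gb must be positive")
    return min(slots, int(mem_gb // request_gb))


# Hypothetical host table: name -> (slots, memory in G)
hosts = {"small": (2, 1.0), "large": (2, 4.0)}

for name, (slots, mem_gb) in hosts.items():
    print(name, jobs_that_fit(slots, mem_gb, request_gb=0.9))
# small 1
# large 2
```

	With a 0.9G request, the 1G machines take one job and the 4G machines take two, without needing a separate PE for each case.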

	Thanks for any suggestions...



John Coldrick                  www.axyzfx.com        Axyz Animation
416-504-0425                                         425 Adelaide St W
                                                     Toronto, ON Canada
jc at axyzfx.com                                        M5V 1S4
The trouble with being punctual is that people think you have nothing
more important to do.
