[GE users] pe_slots issues

John Coldrick jc at axyzfx.com
Fri Nov 2 13:32:24 GMT 2007



On Thursday 01 November 2007 17:43, Reuti wrote:
> Am 01.11.2007 um 21:16 schrieb John Coldrick:
> > 	I've got a PE set up:
> > ***
> > 	pe_name           m9_1
> > 	slots             999
> > 	user_lists        NONE
> > 	xuser_lists       NONE
> > 	start_proc_args   /bin/true
> > 	stop_proc_args    /bin/true
> > 	allocation_rule   $pe_slots
>
> You might want to try $round_robin. All possible values are listed in
> man sge_pe.

	Thanks for getting back...my apps can't run cross-system, they all must run 
on a single system, which is why I'm using $pe_slots.  I assume that's the 
one I should be using, right?  If I do use round_robin, it splits the 4 slots 
up over multiple systems, as it should.
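
	To make the difference concrete, here is a rough Python model of the two 
allocation rules as described in this thread.  It is only a sketch and not 
Grid Engine's actual scheduler logic: the function names and host names are 
made up for illustration, and the three hosts with 8 free slots each mirror 
the setup discussed here.

```python
# Illustrative model only; this is NOT Grid Engine's scheduler code.
# Function names and host names are hypothetical.

def allocate_pe_slots(free_slots, requested):
    """$pe_slots: every requested slot must come from a single host."""
    for host, free in free_slots.items():
        if free >= requested:
            return {host: requested}
    return None  # no single host can hold the whole job


def allocate_round_robin(free_slots, requested):
    """$round_robin: hand out one slot per host in turn until satisfied."""
    if sum(free_slots.values()) < requested:
        return None  # not enough free slots in total
    remaining = dict(free_slots)
    granted = {}
    while requested > 0:
        for host in remaining:
            if requested == 0:
                break
            if remaining[host] > 0:
                remaining[host] -= 1
                granted[host] = granted.get(host, 0) + 1
                requested -= 1
    return granted


# Assumed setup: three nodes with 8 free slots each, and a 4-slot request.
free = {"node1": 8, "node2": 8, "node3": 8}
pe_slots_result = allocate_pe_slots(free, 4)        # all 4 slots on one host
round_robin_result = allocate_round_robin(free, 4)  # slots spread over hosts
```

	Under this model a 4-slot request with $pe_slots lands entirely on one 
host, while $round_robin spreads it across hosts, which matches what I see.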

> No. With $pe_slots all slots must come from one node, and maybe there
> is already something else running, so you can't get more. You defined
> 8 also for the slot count in the queue configuration for these three
> nodes?

	Correct.  I've got complete control of the grid, so nothing else is running 
when I'm testing, and all the systems have all their slots open and 
available.


> Often advisable with parallel jobs is to request reservation with "-R
> y" in qsub and set a sensible value for "max_reservation" in the
> scheduler configuration.

	That makes no difference - I've tried reserving along with variations of 0-8 
for the max reservation, and the behaviour is the same.  

	Is there anywhere else where a maximum ceiling of three (or 'n') slots could 
exist?  SGE6.0 worked fine with this; I like to keep current if I can, 
though.  :)  I can't help but think there's a new default somewhere that 
didn't exist in 6.0 that I'm getting caught up on.  Just to check - getting 
that message that the PE only offers '0' slots - isn't that indicative of 
something being very wrong?  If I qalter the existing job to 3 slots, off it 
goes, and it runs using the PE.  It seems fundamentally wrong to me that this 
message shows up at all given that, unless it's more generic than I'm 
assuming and it's a catchall for numerous variables failing, such as load or 
mem (which, btw, is fine, as shown by a 3-slot request running fine).

	Any suggestions appreciated...

	Cheers,

	J.C.
-- 
John Coldrick                  www.axyzfx.com        Axyz Animation
416-504-0425                                         477 Richmond St W
                                                     Toronto, ON Canada
jc at axyzfx.com                                        M5V 3E7
-----------------------------------------------------------------------
"Life is too important to take seriously."
		-- Corky Siegel
