[GE users] Best practices for redundant farm?

Ron Chen ron_chen_123 at yahoo.com
Thu Jun 17 14:21:22 BST 2004


If you have one big SGE qmaster that manages 2
different sites and the link between the 2 goes
down, a shadow master at the other site won't be able
to take over, since the job spool files are not
accessible from that side.

The other problems are not difficult to solve. There
are settings like "reschedule_unknown", "rerun", and
others that cover the case where one site can't see
the other, so you can still handle the situation:
either rerun the jobs after X minutes or ignore the
problem.
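
As a rough illustration of how those two settings interact, here
is a small Python model. It is only a sketch of the behaviour as
described in sge_conf(5) and queue_conf(5), not qmaster code:
"reschedule_unknown" is taken to be an hh:mm:ss timeout in the
cluster configuration, with 00:00:00 meaning "never reschedule",
and only jobs flagged as rerunnable (qsub -r y, or "rerun TRUE"
on the queue) are eligible. The names Job, parse_hms and
action_for_job are made up for this example, and real-world
special cases (parallel, interactive or checkpointing jobs) are
left out.

```python
from dataclasses import dataclass


def parse_hms(value: str) -> int:
    """Convert an hh:mm:ss string (e.g. "00:10:00") to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass
class Job:
    job_id: int
    # Hypothetical flag standing in for "qsub -r y" or queue-level "rerun TRUE".
    rerunnable: bool


def action_for_job(job: Job, host_unknown_seconds: int,
                   reschedule_unknown: str) -> str:
    """Return "wait", "reschedule" or "ignore" for a job on an unreachable host."""
    timeout = parse_hms(reschedule_unknown)
    if timeout == 0:
        # 00:00:00 is assumed to disable rescheduling entirely.
        return "ignore"
    if host_unknown_seconds < timeout:
        # The host has not been unknown long enough yet.
        return "wait"
    if not job.rerunnable:
        # Only jobs marked as rerunnable are sent to another host.
        return "ignore"
    return "reschedule"
```

With a 10 minute timeout, a rerunnable job on a host that has been
unreachable for 15 minutes would be rescheduled in this model,
while a non-rerunnable job would be left alone.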

 -Ron

--- John Ross <jhr at fenks.org> wrote:
> Hello.
> 
> I will soon be setting up a processing farm using
> Grid Engine 6.
> 
> One of the things I'm trying to figure out is how to
> set up the primary and
> backup sites.
> 
> The problem is a bit more than simply handling when
> the master goes down -
> we need to handle a situation when the master and a
> good portion of the
> farm disappear.
> 
> We'll have enough CPU at each site to finish the job
> should the other site
> go down, but we would still like to use all the
> resources whenever
> possible.
> 
> If I use a shadow master at the backup site, what
> would it do to any jobs
> that were running on the machines that it lost
> visibility to?
> 
> Would it be a better idea to build 2 plexes, with a
> global master (and
> shadow master)?
> Again, how does the global master deal with any jobs
> that were running on
> the plex that just disappeared?
> 
> Any other ideas or thoughts?
> 
> -- 
> John Ross
> jhr at fenks.org
> 
> There's plenty of room for all God's creatures.
> Right next to the mashed potatoes.
> 	- Billboard ad for Saskatoon Restaurant
> 		Greenville, SC
> 



		

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net




More information about the gridengine-users mailing list