[GE users] Fixing a broken Berkeley database?

Orlando Richards orlando.richards at ed.ac.uk
Thu Oct 16 12:44:16 BST 2008


skip at pobox.com wrote:
>     Orlando> I'm guessing that there is a corrupt entry in the database - is
>     Orlando> there any simple way to repair it?
> 
> If you installed Berkeley DB from source (or perhaps installed a dev RPM on
> a Linux system) you should have a program called db_recover (or maybe
> dbXY_recover where X and Y are the major and minor version numbers).  If
> your database is corrupt that will quite possibly fix it.
> 
> Longer term the more important question to answer is how it got corrupted in
> the first place.  Make sure multi-program access to the database is either
> mediated by a single program your clients communicate with or that your
> programs use some sort of file locking scheme to prevent multiple
> simultaneous accesses.
> 

Thanks for that, Skip. I've installed the db4-utils package, which 
includes db_recover, on our Red Hat box. However, we're not running a 
separate Berkeley DB server; we use the file-based option that SGE 
bundles, so I'm a bit confused as to how to recover the database. From 
what I can tell, db_recover rebuilds a database by replaying its log 
files. Unfortunately, the corruption seems to have happened some time 
ago (around three months ago, judging by the logs), and the database 
logs only cover the recent past.

Do you know if there's a way to check and/or repair the integrity of 
the database? I'm slightly suspicious that the database entries might 
look fine to an integrity checker that knows nothing about SGE, and 
that only SGE (or a human, if we could see the contents) would be able 
to tell that there is a problem.
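
For reference, the Berkeley DB utilities that ship alongside db_recover 
normally include db_verify (a structural check of a database file) and 
db_dump (which can salvage readable records with -r). The sketch below 
only builds and optionally runs those two commands. The spool directory 
path and the database file name "sge" are assumptions for illustration, 
not taken from this installation, and it would be safest to run this 
against a copy of the spool directory with qmaster stopped.

```python
import shutil
import subprocess


def verify_cmd(db_home, db_file, tool="db_verify"):
    """Build a db_verify command line for one database file."""
    # -h names the Berkeley DB environment (home) directory.
    return [tool, "-h", db_home, db_file]


def salvage_cmd(db_home, db_file, out_file, tool="db_dump"):
    """Build a db_dump command that salvages whatever records it can read."""
    # -r: salvage mode, -p: printable output, -f: write to out_file.
    return [tool, "-r", "-p", "-h", db_home, "-f", out_file, db_file]


def run_if_available(cmd):
    """Run cmd if its executable is on PATH; return its exit code, else None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    # Hypothetical locations: point these at a COPY of the real spool dir.
    home = "/path/to/copy/of/spooldb"
    db_file = "sge"  # assumed name of the SGE spool database file

    for cmd in (verify_cmd(home, db_file),
                salvage_cmd(home, db_file, "sge_salvaged.dump")):
        print("command:", " ".join(cmd))
        print("exit code:", run_if_available(cmd))
```

db_verify only checks the on-disk structure (pages, keys, ordering), so 
a clean result would not rule out an entry that is well-formed to 
Berkeley DB but invalid to SGE. The salvaged dump at least makes the 
keys visible for inspection, although the record values are SGE's own 
packed format and may not be readable by eye.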

We suspect that the corruption was down to an unclean SGE qmaster crash 
(or several of them) - we had a spate of crashes due to a memory leak 
and some oddly formatted user jobs. We use GPFS for the file system 
storing the database files, which implements fully POSIX-compliant 
locking.


--
Orlando.

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

