[GE users] Fixing a broken Berkeley database?
orlando.richards at ed.ac.uk
Thu Oct 16 12:44:16 BST 2008
skip at pobox.com wrote:
> Orlando> I'm guessing that there is a corrupt entry in the database - is
> Orlando> there any simple way to repair it?
> If you installed Berkeley DB from source (or perhaps installed a dev RPM on
> a Linux system) you should have a program called db_recover (or maybe
> dbXY_recover where X and Y are the major and minor version numbers). If
> your database is corrupt that will quite possibly fix it.
> Longer term the more important question to answer is how it got corrupted in
> the first place. Make sure multi-program access to the database is either
> mediated by a single program your clients communicate with or that your
> programs use some sort of file locking scheme to prevent multiple
> simultaneous accesses.
Thanks for that Skip - I've installed the db4-utils package on our
RedHat box that includes db_recover. However - we're not running a
separate Berkeley DB server, but instead the file-based option that SGE
has "bundled" up with it, so I'm a bit confused as to how to recover the
database. From what I can tell, db_recover will replay log files to
rebuild a database. Unfortunately, the corruption seems to have happened
some time ago (around 3 months ago from what we can tell from the logs),
and the database logs don't exist for anything before the immediate past.
Do you know if there's a way for the integrity of the database to be
checked and/or repaired? I'm slightly suspicious that the database
entries might look fine to an unaware integrity checker, and only SGE is
able to tell (or a human if we could see the contents) that there is a
problem.

We suspect that the corruption was down to an unclean SGE qmaster crash
(or several of them) - we had a spate of crashes due to a memory leak
and some oddly formatted user jobs. We use GPFS for the file system
storing the database files, which implements fully POSIX-compliant locking.
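
On the integrity-check question: the Berkeley DB utilities normally include
db_verify alongside db_recover, and db_dump -r can attempt to salvage
readable records from a damaged file so they can be reloaded with db_load.
The following is a minimal Python sketch that only assembles those command
lines. The directory paths and the database file name "sge" are assumptions
for illustration, not details confirmed in this thread. It is meant to be
pointed at a copy of the spool directory taken while the qmaster is stopped.
db_verify checks on-disk structure only, so records that are well formed but
wrong from SGE's point of view would probably still pass.

```python
import shutil
import subprocess
from pathlib import Path


def salvage_commands(db_home, db_file, out_dir):
    """Build (not run) a verify / salvage-dump / reload command sequence.

    db_home: directory holding a COPY of the Berkeley DB files (assumed path)
    db_file: database file name inside db_home (assumed, e.g. "sge")
    out_dir: directory where the dump and rebuilt file are written
    """
    home = str(db_home)
    dump = str(Path(out_dir) / (db_file + ".dump"))
    rebuilt = str(Path(out_dir) / (db_file + ".rebuilt"))
    return [
        # Structural check of the database file.
        ["db_verify", "-h", home, db_file],
        # -r: salvage whatever records can be read from a possibly corrupt file.
        ["db_dump", "-r", "-h", home, "-f", dump, db_file],
        # Load the salvaged records into a fresh database file.
        ["db_load", "-f", dump, rebuilt],
    ]


def run_all(commands, dry_run=True):
    """Run each command in order; return a list of (command, exit code).

    With dry_run=True nothing is executed and the exit code is None.
    A failing db_verify does not stop the sequence, since the salvage
    dump is only interesting when verification fails.
    """
    results = []
    for cmd in commands:
        if dry_run:
            results.append((cmd, None))
            continue
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(cmd[0] + " not found on PATH")
        results.append((cmd, subprocess.run(cmd).returncode))
    return results


if __name__ == "__main__":
    # Hypothetical locations; adjust to wherever the copy was made.
    cmds = salvage_commands("/tmp/spooldb-copy", "sge", "/tmp/spooldb-out")
    for cmd, _ in run_all(cmds, dry_run=True):
        print(" ".join(cmd))
```

Running it as is only prints the three commands, so they can be reviewed
before anything touches the database copy; passing dry_run=False executes
them.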