Thread: Corrupted database - how to recover
Two disks on our RAID-5 system failed, causing a filesystem failure.

We could get one of them online again and the filesystem is usable at present, but I cannot start PostgreSQL at the moment:

$ /usr/lib/postgresql/8.3/bin/postgres --single -D /etc/postgresql/8.3/main/
2010-05-27 09:22:48 SAST FATAL: could not stat directory "base/16400": Structure needs cleaning
2010-05-27 09:22:48 SAST CONTEXT: xlog redo insert: rel 1663/16400/16438; tid 10960/41

In 'base' I see:

drwx------ 10 postgres postgres 102 2010-04-22 15:09 .
drwx------ 10 postgres postgres 4096 2010-05-27 09:29 ..
drwx------ 2 postgres postgres 4096 2010-05-20 02:11 1
drwx------ 2 postgres postgres 4096 2010-04-22 15:07 11510
?????????? ? ? ? ? ? 16388
drwx------ 2 postgres postgres 4096 2010-05-20 02:11 16389
drwx------ 2 postgres postgres 4096 2010-05-20 02:11 16390
drwx------ 2 postgres postgres 4096 2010-05-20 02:11 16391
?????????? ? ? ? ? ? 16400

I can just move the data directory, run initdb and use an older dump to restore the database as it was some time ago. Fortunately the data is not mission critical. But what if it were?

How do I recover from a situation like this?

Regards
Johann
--
Johann Spies Telefoon: 021-808 4599
Informasietegnologie, Universiteit van Stellenbosch

"All that the Father giveth me shall come to me; and him that cometh to me I will in no wise cast out." John 6:37
Hi Johann,

maybe you should consider using point-in-time recovery (PITR) if the database is mission critical.

http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html

Regards
Andreas

Johann Spies wrote:
> [...]
> I can just move the data-directory, do an initdb and use an older dump
> to repair the database as it was some time ago. Fortunately the data is
> not mission critical. But what if it was.
>
> How do I recover from a situation like this?
On Thu, May 27, 2010 at 09:54:00AM +0200, Andreas Schmitz wrote:
> maybe you should consider using point in time recovery (pitr) if the
> database is mission critical.
>
> http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html

Thanks. This is the first time I have experienced this type of problem. Fortunately the data was not that critical. I agree that PITR should be part of any mission-critical setup.

Regards
Johann
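To make the continuous-archiving suggestion a little more concrete: with archiving enabled, PostgreSQL passes each completed WAL segment to the shell command configured as archive_command, and the documentation linked above asks that this command report success (exit status zero) only when the file is safely stored, and that it refuse to overwrite a file already in the archive. The Python script below is only a sketch of such a command under those assumptions; it was not posted by anyone in this thread, and the names archive_wal_segment and main, the script path and the archive directory are hypothetical placeholders.

```python
import os
import shutil
import tempfile


def archive_wal_segment(segment_path, archive_dir):
    """Copy one WAL segment into archive_dir without overwriting anything.

    Returns the destination path. Raises FileExistsError if a file with the
    same name has already been archived.
    """
    file_name = os.path.basename(segment_path)
    destination = os.path.join(archive_dir, file_name)
    if os.path.exists(destination):
        raise FileExistsError(destination)

    # Copy under a temporary name first and rename afterwards, so that a
    # crash in the middle of the copy never leaves a truncated file under
    # the final segment name.
    fd, temp_path = tempfile.mkstemp(dir=archive_dir, prefix=file_name + ".")
    try:
        with os.fdopen(fd, "wb") as temp_file, open(segment_path, "rb") as source:
            shutil.copyfileobj(source, temp_file)
            temp_file.flush()
            os.fsync(temp_file.fileno())
        os.rename(temp_path, destination)
    except BaseException:
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
    return destination


def main(argv):
    """Return an exit status suitable for archive_command.

    Hypothetical postgresql.conf entries (paths are placeholders):
      archive_mode = on
      archive_command = 'python3 /path/to/archive_wal.py %p /path/to/archive'
    %p is replaced by the path of the segment to archive.
    """
    if len(argv) != 3:
        return 2
    try:
        archive_wal_segment(argv[1], argv[2])
    except OSError:
        # Any non-zero status tells PostgreSQL the segment was not archived,
        # so it keeps the file and retries later.
        return 1
    return 0
```

To run it as a script, call sys.exit(main(sys.argv)) under the usual __main__ guard. The archive directory should live on different storage from the data directory, otherwise a failure like the one described above takes the archive down with it. Archived WAL is only useful together with a base backup, as the linked documentation explains.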