Thread: Continuous archiving fails

From: David Darville
Hello everybody

While testing a continuous archiving setup using PostgreSQL 8.2.3, on Debian
Etch amd64, I found out that the slave database crashed when I did a 'DROP
DATABASE' on the master.

I was trying to stress-test our setup by repeatedly restoring a dump of
our database, dropping the database, restoring it again, and so on.
But whenever I dropped the database, the slave database crashed,
leaving log messages like these:

....
LOG: restored log file "000000010000004F000000F0" from archive
LOG: restored log file "000000010000004F000000F1" from archive
LOG: restored log file "000000010000004F000000F2" from archive
LOG: restored log file "000000010000004F000000F3" from archive
LOG: could not fsync segment 0 of relation 19820534/105758957/125593540: No such file or directory
CONTEXT: xlog redo checkpoint: redo 4F/F3859B60; undo 0/0; tli 1; xid 0/84778; oid 125601021; multi 1; offset 0; online
FATAL: storage sync failed on magnetic disk: No such file or directory
CONTEXT: xlog redo checkpoint: redo 4F/F3859B60; undo 0/0; tli 1; xid 0/84778; oid 125601021; multi 1; offset 0; online
LOG:  startup process (PID 16101) exited with exit code 1
LOG: aborting startup due to startup process failure



---
David Darville

Re: Continuous archiving fails

From: Tom Lane
David Darville <ml@darville.vm.bytemark.co.uk> writes:
> While testing a continuous archiving setup using PostgreSQL 8.2.3, on Debian
> Etch amd64, I found out that the slave database crashed when I did a 'DROP
> DATABASE' on the master.

Thanks for the report.  I believe this will fix it:

Index: dbcommands.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/dbcommands.c,v
retrieving revision 1.187.2.1
diff -c -r1.187.2.1 dbcommands.c
*** dbcommands.c    27 Jan 2007 20:15:47 -0000    1.187.2.1
--- dbcommands.c    12 Apr 2007 14:40:40 -0000
***************
*** 1438,1443 ****
--- 1438,1446 ----
          /* Also, clean out any entries in the shared free space map */
          FreeSpaceMapForgetDatabase(xlrec->db_id);

+         /* Also, clean out any fsync requests that might be pending in md.c */
+         ForgetDatabaseFsyncRequests(xlrec->db_id);
+
          /* Clean out the xlog relcache too */
          XLogDropDatabase(xlrec->db_id);



            regards, tom lane
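
[Editorial illustration, not part of the original message.] Judging from the log output and the comment in the patch, the failure appears to work like this: replaying writes on the slave queues fsync requests, replaying DROP DATABASE removes the files, and the next replayed checkpoint then tries to fsync files that no longer exist. The toy model below is only a sketch of that sequence. The struct, the fixed-size queue, and every function name in it are hypothetical and do not reflect the real code in md.c or dbcommands.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one queued fsync request.
 * The real bookkeeping in md.c is more involved. */
typedef struct {
    unsigned int db_id;       /* database the relation file belongs to */
    unsigned int rel_id;      /* relation file identifier */
    bool         file_exists; /* simulates whether the file is still on disk */
    bool         pending;     /* true until synced or forgotten */
} PendingFsync;

#define MAX_PENDING 8
static PendingFsync queue[MAX_PENDING];
static size_t queue_len = 0;

/* Replaying a write queues an fsync for the touched file. */
static void remember_fsync(unsigned int db_id, unsigned int rel_id)
{
    assert(queue_len < MAX_PENDING);
    queue[queue_len++] = (PendingFsync){ db_id, rel_id, true, true };
}

/* Replaying DROP DATABASE deletes every file of that database. */
static void remove_database_files(unsigned int db_id)
{
    for (size_t i = 0; i < queue_len; i++)
        if (queue[i].db_id == db_id)
            queue[i].file_exists = false;
}

/* Toy analogue of the call the patch adds: cancel queued requests. */
static void forget_database_fsyncs(unsigned int db_id)
{
    for (size_t i = 0; i < queue_len; i++)
        if (queue[i].db_id == db_id)
            queue[i].pending = false;
}

/* Replaying a checkpoint syncs everything still queued.
 * Returns the number of requests that fail because the file is gone. */
static int sync_pending(void)
{
    int failures = 0;
    for (size_t i = 0; i < queue_len; i++) {
        if (!queue[i].pending)
            continue;
        if (!queue[i].file_exists)
            failures++;            /* "No such file or directory" */
        queue[i].pending = false;
    }
    return failures;
}

/* Replays two writes, a DROP DATABASE of database 1, then a checkpoint. */
int replay_drop_database(bool with_fix)
{
    queue_len = 0;
    remember_fsync(1, 100);   /* file in the database about to be dropped */
    remember_fsync(2, 200);   /* file in an unrelated database */
    remove_database_files(1);
    if (with_fix)
        forget_database_fsyncs(1);
    return sync_pending();
}
```

In this model, replay_drop_database(false) reports one failed sync, which corresponds to the FATAL error in the log above, while replay_drop_database(true) reports none because the stale request was cancelled before the checkpoint ran.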

Re: Continuous archiving fails

From: David Darville
On Thu, Apr 12, 2007 at 11:05:40AM -0400, Tom Lane wrote:
> David Darville <ml@darville.vm.bytemark.co.uk> writes:
> > While testing a continuous archiving setup using PostgreSQL 8.2.3, on Debian
> > Etch amd64, I found out that the slave database crashed when I did a 'DROP
> > DATABASE' on the master.
>
> Thanks for the report.  I believe this will fix it:

The patch did indeed fix it, and my test setup has now been running for 2
days straight without any problems ;-)


---
David Darville