Re: PITR failing to stop before DROP DATABASE - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: PITR failing to stop before DROP DATABASE
Date
Msg-id 5474B848.3060909@vmware.com
Whole thread Raw
In response to PITR failing to stop before DROP DATABASE  (Christoph Berg <cb@df7cb.de>)
Responses Re: PITR failing to stop before DROP DATABASE  (Tomas Vondra <tv@fuzzy.cz>)
Re: PITR failing to stop before DROP DATABASE  (Christoph Berg <cb@df7cb.de>)
Re: PITR failing to stop before DROP DATABASE  (Christoph Berg <cb@df7cb.de>)
List pgsql-hackers
On 11/25/2014 06:06 PM, Christoph Berg wrote:
> In 9.3.5, if I set up archiving, create a database, pull a base
> backup, look at the clock, drop database, stop the server, rm -rf
> datadir, put back the backup, edit recovery.conf:
>
> cd /tmp; initdb foo
> edit postgresql.conf with archive_mode = on, archive_command,
> max_wal_senders = 1, wal_level = hot_standby
> edit pg_hba.conf
> pg_ctl start
> psql -c 'create database db1'
> psql -c 'checkpoint'
> pg_basebackup -D /tmp/back1
> date
> Di 25. Nov 17:00:31 CET 2014
> psql -c 'drop database db1'
> pg_ctl stop
> rm -rf foo; cp -a back1 foo
>
> restore_command = 'cp /tmp/%f %p'
> standby_mode = on
> recovery_target_time = '2014-11-25 17:00:31'
>
> ... and then restart the server, I get:
>
> postgres=# \c db1
> FATAL:  database "db1" does not exist
> DETAIL:  The database subdirectory "base/70262" is missing.
>
> 2014-11-25 16:11:22 CET [15225-1] LOG:  database system was interrupted; last known up at 2014-11-25 15:56:51 CET
> 2014-11-25 16:11:22 CET [15225-2] LOG:  creating missing WAL directory "pg_xlog/archive_status"
> 2014-11-25 16:11:22 CET [15225-3] LOG:  entering standby mode
> 2014-11-25 16:11:22 CET [15225-4] LOG:  restored log file "000000010000000D0000007B" from archive
> 2014-11-25 16:11:22 CET [15225-5] LOG:  redo starts at D/7B000028
> 2014-11-25 16:11:22 CET [15225-6] LOG:  consistent recovery state reached at D/7B0000F0
> 2014-11-25 16:11:22 CET [15225-7] LOG:  restored log file "000000010000000D0000007C" from archive
> 2014-11-25 16:11:22 CET [15225-8] LOG:  recovery stopping before commit of transaction 60681, time 2014-11-25
15:59:57.071137+01
> 2014-11-25 16:11:22 CET [15225-9] LOG:  redo done at D/7C001110
> cp: cannot stat '/tmp/00000002.history': No such file or directory
> 2014-11-25 16:11:22 CET [15225-10] LOG:  selected new timeline ID: 2
> cp: cannot stat '/tmp/00000001.history': No such file or directory
> 2014-11-25 16:11:23 CET [15225-11] LOG:  archive recovery complete
> 2014-11-25 16:11:23 CET [15236-1] [unknown]@[unknown] LOG:  incomplete startup packet
> 2014-11-25 16:11:23 CET [15238-1] LOG:  autovacuum launcher started
> 2014-11-25 16:11:23 CET [15224-1] LOG:  database system is ready to accept connections
> 2014-11-25 16:11:41 CET [15268-1] postgres@db1 FATAL:  database "db1" does not exist
> 2014-11-25 16:11:41 CET [15268-2] postgres@db1 DETAIL:  The database subdirectory "base/70262" is missing.
>
> db1 is registered in pg_database, but the directory is missing on
> disk.

Yeah, DROP DATABASE cheats. It deletes all the files first, and commits 
the transaction only after that. There's this comment at the end of 
dropdb() function:

>     /*
>      * Force synchronous commit, thus minimizing the window between removal of
>      * the database files and commital of the transaction. If we crash before
>      * committing, we'll have a DB that's gone on disk but still there
>      * according to pg_database, which is not good.
>      */

So you could see the same after crash recovery, but it's a lot easier to 
reproduce with PITR.

This could be fixed by doing DROP DATABASE the same way we do DROP 
TABLE. At the DROP DATABASE command, just memorize the OID of the 
dropped database, but don't delete anything yet. Perform the actual 
deletion after flushing the commit record to disk. But then you would 
have the opposite problem - you might be left with a database that's 
dropped according to pg_database, but the files are still present on disk.

- Heikki




pgsql-hackers by date:

Previous
From: Adam Brightwell
Date:
Subject: Re: Role Attribute Bitmask Catalog Representation
Next
From: Stephen Frost
Date:
Subject: Re: Additional role attributes && superuser review