Thread: Recovering a deleted database problem

Recovering a deleted database problem

From

"Andy Shellam (Mailing Lists)"

Date:

05 January 2007, 07:12:55

Earlier this evening I made the usual mistake someone makes at some
point in their lives - and dropped a database thinking I didn't need it,
then realised later I did.
So, because I have DDL statement logging turned on, I could find the
exact time/date it happened, and attempted to restore from my
file-system level backup taken at 2am this morning, and rolled forward
all my WAL logs archived throughout the day (98 files.)

In the recovery.conf, I specified the date/time from the log file that
the database was dropped and set recovery_target_inclusive so it would
not include this transaction.

However the restore has finished, and PostgreSQL thinks the database is
there, but the relevant data directory in "base" is missing - so it's
removed the file-system objects but not the system database entry.
I've checked the base backup, and this directory is in the backup, hence
it has been removed at some point during the restore.

What I'm going to do now is to set the recovery target to about a minute
earlier to make sure the transaction has not started when the recovery
finishes - but I'm just asking if I'm missing something obvious, as this
is the first time I've done a restore from WAL logs.

(Note, after writing this, I tried restoring to a minute earlier (ie.
18:57:40) and still have the same problem.
As a quick fix, I copied the base/35290 directory from the backup before
I had run the recovery - does anyone know any caveats to doing this, as
the DB seems to be working OK?)

My recovery.conf is:

# PostgreSQL database recovery config file

restore_command = 'cp /path/to/wal/archive/%f "%p"'
recovery_target_time = '"2007-01-04 18:58:40 -00:00"'
recovery_target_inclusive = 'false'

The log entry where I discovered the date/time is:

2007-01-04 18:58:40 GMT 84.45.66.158 postgres postgresql - LOG:
statement: DROP DATABASE [dbname];

The error I get when I try to connect to [dbname] after the restore is:

FATAL: database "[dbname]" does not exist
DETAIL: The database subdirectory "base/35290" is missing.

But the [dbname] database is still in the system catalogues:

/usr/local/pgsql/bin/psql -U postgresql -d postgres -c "select datname
from pg_database;"
    datname
----------------
 postgres
 [db1]
 template1
 template0
 [dbname]
 [db2]
 [db3]
 [db4]
(8 rows)


This is PostgreSQL 8.1.5 on FreeBSD 6.1.

Many thanks,

--
Andy Shellam
NetServe Support Team

the Mail Network
"an alternative in a standardised world"

Re: Recovering a deleted database problem

From

Tom Lane

Date:

05 January 2007, 11:09:10

"Andy Shellam (Mailing Lists)" <andy.shellam-lists@mailnetwork.co.uk> writes:
> (Note, after writing this, I tried restoring to a minute earlier (ie.
> 18:57:40) and still have the same problem.

The PITR recovery process in effect rolls forward until it finds
a transaction-commit record >= the specified time.  Now for normal
database operations, stopping just short of the commit of the
transaction is enough to ensure that the transaction has no effect.
But for the XLOG_DBASE_DROP record, not so --- replaying that means
"rm -rf base/whatever".  So you've got to make sure the replay stops
before it reaches that record, and that means you need a stop time
<= the commit time of some *prior* transaction.  I suppose this was
a slow time of day and you didn't have any other commits in the prior
minute :-( ... so take another look in the log and see what was the
last commit before that, and use that time.

            regards, tom lane

Re: Recovering a deleted database problem

From

"Andy Shellam (Mailing Lists)"

Date:

05 January 2007, 11:26:00

Thanks for the info Tom, too much data will have been entered into the
other databases in the cluster by now so I cannot give it another shot
on that server, plus all of yesterday's WAL logs will have been purged
by now by the daily backup routine.

Is it enough to simply have re-copied in the base/xxx directory from the
base backup, after the PITR recovery had completed (obviously any
changes made to that database since the base backup won't have been
restored but thankfully it's backed up nightly and doesn't change too
often :-) )  All CRUD operations seem to be working on that database OK
and the app that (I now know) uses it hasn't complained.

What I'll probably do is try to simulate the same process again on a
different machine to get myself a bit more familiar.  Is there any other
situations you can think of where this may also be relevant, or is it
just when dropping a complete database?

Many thanks,

Andy.

Tom Lane wrote:
> "Andy Shellam (Mailing Lists)" <andy.shellam-lists@mailnetwork.co.uk> writes:
>
>> (Note, after writing this, I tried restoring to a minute earlier (ie.
>> 18:57:40) and still have the same problem.
>>
>
> The PITR recovery process in effect rolls forward until it finds
> a transaction-commit record >= the specified time.  Now for normal
> database operations, stopping just short of the commit of the
> transaction is enough to ensure that the transaction has no effect.
> But for the XLOG_DBASE_DROP record, not so --- replaying that means
> "rm -rf base/whatever".  So you've got to make sure the replay stops
> before it reaches that record, and that means you need a stop time
> <= the commit time of some *prior* transaction.  I suppose this was
> a slow time of day and you didn't have any other commits in the prior
> minute :-( ... so take another look in the log and see what was the
> last commit before that, and use that time.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>
> !DSPAM:37,459e6a32137101648020742!
>
>
>

--
Andy Shellam
NetServe Support Team

the Mail Network
"an alternative in a standardised world"

Re: Recovering a deleted database problem

From

Tom Lane

Date:

05 January 2007, 11:50:02

"Andy Shellam (Mailing Lists)" <andy.shellam-lists@mailnetwork.co.uk> writes:
> Is it enough to simply have re-copied in the base/xxx directory from the
> base backup, after the PITR recovery had completed (obviously any
> changes made to that database since the base backup won't have been
> restored but thankfully it's backed up nightly and doesn't change too
> often :-) )

Well, I'd be a little worried about whether the base backup was
self-consistent, but if it was taken at a time where the DB was idle
then you can probably get away with this.

> What I'll probably do is try to simulate the same process again on a
> different machine to get myself a bit more familiar.  Is there any other
> situations you can think of where this may also be relevant, or is it
> just when dropping a complete database?

AFAIK the only operations that have non-rollbackable side effects are
CREATE/DROP DATABASE and CREATE/DROP TABLESPACE.  For any of these,
you'd end up with inconsistent state if you try to stop replay just
before the commit record.

            regards, tom lane

Re: Recovering a deleted database problem

From

"Andy Shellam (Mailing Lists)"

Date:

05 January 2007, 14:44:13

> Well, I'd be a little worried about whether the base backup was
> self-consistent, but if it was taken at a time where the DB was idle
> then you can probably get away with this.
>

It gets backed up at 2am in the morning and AFAIK there'd be very few
(if any) transactions going through until about 5am, so it should be OK.
> AFAIK the only operations that have non-rollbackable side effects are
> CREATE/DROP DATABASE and CREATE/DROP TABLESPACE.  For any of these,
> you'd end up with inconsistent state if you try to stop replay just
> before the commit record.
>

OK thanks for that Tom, after glancing through the online documentation,
I've enabled logging of checkpoints etc. by setting "log_min_messages"
to "log" - is this enough to log the times that transactions are
committed, so I can find this info should I need it again in the future?

Thanks,

--
Andy Shellam
NetServe Support Team

the Mail Network
"an alternative in a standardised world"