Re: [GENERAL] Recovery Assistance - Mailing list pgsql-general

From Adrian Klaver
Subject Re: [GENERAL] Recovery Assistance
Date
Msg-id d5fd4dfc-74f4-59ea-9288-054a0cbc3ce1@aklaver.com
Whole thread Raw
In response to Re: [GENERAL] Recovery Assistance  (Brian Mills <brian@trybooking.com>)
List pgsql-general
On 01/29/2017 03:12 PM, Brian Mills wrote:
> OK. I think I'm on to something here, I first reset back to my file
> level backup.
> I created a recovery.conf file in the root of the data directory like this:
> ---------------------
> restore_command = 'cp /mnt/archive/%f %p'
> recovery_target_time = '2017-01-24 02:08:00.023064+11'
> recovery_target_inclusive = 'true'
> pause_at_recovery_target = 'false'
> ---------------------
> Note, the archive directory had no files in it, I left the WAL files in
> the pg_xlog directory.
>
> Then I started up the database again:
> postgres@atlassian:~/9.3/main$ /usr/lib/postgresql/9.3/bin/pg_ctl -D
> /etc/postgresql/9.3/main start
> server starting
> postgres@atlassian:~/9.3/main$ 2017-01-30 10:07:28 AEDT LOG:  database
> system was interrupted while in recovery at 2017-01-27 20:13:26 AEDT
> 2017-01-30 10:07:28 AEDT HINT:  This probably means that some data is
> corrupted and you will have to use the last backup for recovery.
> 2017-01-30 10:07:28 AEDT LOG:  starting point-in-time recovery to
> 2017-01-24 02:08:00.023064+11
> 2017-01-30 10:07:28 AEDT LOG:  database system was not properly shut
> down; automatic recovery in progress
> 2017-01-30 10:07:28 AEDT WARNING:  WAL was generated with
> wal_level=minimal, data may be missing

This would be a problem:

https://www.postgresql.org/docs/9.3/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS

"But minimal WAL does not contain enough information to reconstruct the
data from a base backup and the WAL logs, so either archive or
hot_standby level must be used to enable WAL archiving (archive_mode)
and streaming replication. "

> 2017-01-30 10:07:28 AEDT HINT:  This happens if you temporarily set
> wal_level=minimal without taking a new base backup.
> 2017-01-30 10:07:28 AEDT LOG:  redo starts at 5/528B4558
> 2017-01-30 10:07:40 AEDT LOG:  consistent recovery state reached at
> 5/A3FFFA30
> cp: cannot stat ‘/mnt/archive/0000000100000005000000A3’: No such file or
> directory
> cp: cannot stat ‘/mnt/archive/0000000100000005000000A4’: No such file or
> directory
> 2017-01-30 10:07:40 AEDT LOG:  redo done at 5/A3FFF9E8
> 2017-01-30 10:07:40 AEDT LOG:  last completed transaction was at log
> time 2017-01-24 02:08:00.023064+11
> cp: cannot stat ‘/mnt/archive/0000000100000005000000A3’: No such file or
> directory
> cp: cannot stat ‘/mnt/archive/00000002.history’: No such file or directory
> 2017-01-30 10:07:40 AEDT LOG:  selected new timeline ID: 2
> cp: cannot stat ‘/mnt/archive/00000001.history’: No such file or directory
> 2017-01-30 10:07:40 AEDT LOG:  archive recovery complete
> 2017-01-30 10:08:55 AEDT FATAL:  the database system is starting up
> 2017-01-30 10:08:57 AEDT FATAL:  the database system is starting up
>
> This time it looks like it has actually finished the startup and
> recovery. However I think I might have something wrong about the process.
> Any thoughts on the above log?

See above.

>
>
> *Brian Mills*
> CTO
>
>
> *Mob: *0410660003 <tel:0410660003>
> *Melbourne* 03 9012 3460 <tel:03%209012%203460> or 03 8376 6327
> <tel:03%208376%206327> *|* * **Sydney* 02 8064 3600
> <tel:02%208064%203600> *|*  *Brisbane* 07 3173 1570 <tel:07%203173%201570>
> Level 1 *|*  600 Chapel Street *|* South
> Yarra*|*  VIC *|*  3141 *|*  Australia
>
> <https://www.facebook.com/TryBooking/> <https://twitter.com/trybooking>
<https://www.linkedin.com/company/trybooking-com>
>
> On 30 January 2017 at 04:49, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 01/28/2017 11:23 PM, Brian Mills wrote:
>
>         I presume this is a binary log file for the database.
>
>         Am I able to recover to a point in time using this log file?
>
>         What I would do in SQL Server would be recover to a point in
>         time, say a
>         bit before the last completed transaction time the log mentions,
>         then
>         take a backup. Is that possible in postgres?
>
>
>     Had another thought. If I remember correctly you are using this as
>     an exercise in Postgres recovery. If that is indeed the case you
>     might try:
>
>     1) Stop the Postgres instance you have running now.
>
>     2) Move the WAL file that Postgres is currently stalled on,
>     0000000100000005000000A3, out of pg_xlog.
>
>     3) Restart the Postgres instance.
>
>     My guess it it will not bring it back to the exact point you want,
>     but close. You can get a general idea by running(before and after
>     removing the WAL), as the postgres user:
>
>     pg_controldata -D /etc/postgresql/9.3/main
>
>
>         The log mentions this:
>         2017-01-27 20:36:18 AEDT LOG:  last completed transaction was at log
>         time 2017-01-24 02:08:00.023064+11
>
>         (which is moments before, or possibly as the disk filled up
>         doing a db
>         backup dump)
>
>         *Brian Mills*
>         CTO
>
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: Nikolai Zhubr
Date:
Subject: [GENERAL] Causeless CPU load waves in backend, on windows, 9.5.5 (EDB binary).
Next
From: Michael Paquier
Date:
Subject: Re: [GENERAL] Causeless CPU load waves in backend, on windows, 9.5.5(EDB binary).