Re: Recovery from WAL archives not advancing timeline? - Mailing list pgsql-admin

From Ian Lawrence Barwick
Subject Re: Recovery from WAL archives not advancing timeline?
Date
Msg-id CAB8KJ=jsEKz6GY0d+bi6Q2_U5_GDSJkaOO8Mh2V+rQ-JHM6_vg@mail.gmail.com
Whole thread Raw
In response to Recovery from WAL archives not advancing timeline?  (Don Seiler <don@seiler.us>)
Responses Re: Recovery from WAL archives not advancing timeline?  (Don Seiler <don@seiler.us>)
List pgsql-admin
2020年8月9日(日) 0:12 Don Seiler <don@seiler.us>:
>
> On Sat, Aug 8, 2020 at 9:14 AM Ian Lawrence Barwick <barwick@gmail.com> wrote:
>>
>>
>> After the failure to fetch the expected WAL file, the standby
>> (replica) should be
>> attempting to fetch the 00000002.history file; do you see any wal-g
>> error messages
>> about that?
>
>
> No. I when I first start the new replica, it shows:
>
> 2020-08-08 06:16:47.536 GMT [30368] LOG:  database system was shut down at 2020-08-08 05:29:12 GMT
> 2020-08-08 06:16:47.768 GMT [30379] [unknown]@[unknown] - [unknown] LOG:  connection received: host=[local]
> 2020-08-08 06:16:47.768 GMT [30379] postgres@postgres - [unknown] FATAL:  the database system is starting up
> 2020-08-08 06:16:47.769 GMT [30380] [unknown]@[unknown] - [unknown] LOG:  connection received: host=[local]
> 2020-08-08 06:16:47.769 GMT [30380] postgres@banno_login - [unknown] FATAL:  the database system is starting up
> 2020-08-08 06:16:47.849 GMT [30381] [unknown]@[unknown] - [unknown] LOG:  connection received: host=[local]
> 2020-08-08 06:16:47.849 GMT [30381] [unknown]@[unknown] - [unknown] LOG:  incomplete startup packet
> WARNING: 2020/08/08 06:16:47.857808 You are using deprecated functionality that uses an external gpg library.
> It will be removed in next major version.
> Please set GPG key using environment variables WALG_PGP_KEY or WALG_PGP_KEY_PATH.
> ERROR: 2020/08/08 06:16:48.041147 openpgp: invalid argument: no armored data found
> 2020-08-08 06:16:48.042 GMT [30368] LOG:  entering standby mode
> ERROR: 2020/08/08 06:16:48.423220 Archive '000000010000142E0000005B' does not exist.
> 2020-08-08 06:16:48.428 GMT [30368] LOG:  consistent recovery state reached at 142E/5B000098
> 2020-08-08 06:16:48.428 GMT [30368] LOG:  invalid record length at 142E/5B000098: wanted 24, got 0
> 2020-08-08 06:16:48.429 GMT [30366] LOG:  database system is ready to accept read only connections
>
> From there I see these messages repeated:
>
> ERROR: 2020/08/08 06:16:49.029836 Archive '000000010000142E0000005B' does not exist.
> WARNING: 2020/08/08 06:16:49.510460 You are using deprecated functionality that uses an external gpg library.
> It will be removed in next major version.
> Please set GPG key using environment variables WALG_PGP_KEY or WALG_PGP_KEY_PATH.
> ERROR: 2020/08/08 06:16:49.513288 openpgp: invalid argument: no armored data found
>
> There's no attempt to look for 00000002.history that I would normally expect when a replica doing WAL
restore/recoveryruns of out of files to restore. 

In that case are you
a) absolutely sure recovery.conf contains "recovery_target_timeline = 'latest'"?
b) the standby was restarted *after* the change was made?

I ask because not attempting to fetch a history file would be a sign
that "recovery_target_timeline"
is not set (or set but not to "latest").

(...)

> There are WAL files in that new replica from when it was still the old primary, would those be causing confusion?:
>
> postgres:~/9.6/main/pg_xlog$ ls -lh
> total 273M
> -rw------- 1 postgres postgres  340 Aug  7 22:21 000000010000142C000000AD.00000028.backup
> -rw------- 1 postgres postgres  16M Aug  8 05:24 000000010000142E00000055
> -rw------- 1 postgres postgres  16M Aug  8 05:25 000000010000142E00000056
> -rw------- 1 postgres postgres  16M Aug  8 05:26 000000010000142E00000057
> -rw------- 1 postgres postgres  16M Aug  8 05:27 000000010000142E00000058
> -rw------- 1 postgres postgres  16M Aug  8 05:28 000000010000142E00000059
> -rw------- 1 postgres postgres  16M Aug  8 05:29 000000010000142E0000005A
> -rw------- 1 postgres postgres  16M Aug  8 05:29 000000010000142E0000005B
> -rw------- 1 postgres postgres  16M Aug  8 05:16 000000010000142E0000005C
> -rw------- 1 postgres postgres  16M Aug  8 05:15 000000010000142E0000005D
> -rw------- 1 postgres postgres  16M Aug  8 05:18 000000010000142E0000005E
> -rw------- 1 postgres postgres  16M Aug  8 05:14 000000010000142E0000005F
> -rw------- 1 postgres postgres  16M Aug  8 05:17 000000010000142E00000060
> -rw------- 1 postgres postgres  16M Aug  8 05:19 000000010000142E00000061
> -rw------- 1 postgres postgres  16M Aug  8 05:22 000000010000142E00000062
> -rw------- 1 postgres postgres  16M Aug  8 05:21 000000010000142E00000063
> -rw------- 1 postgres postgres  16M Aug  8 05:20 000000010000142E00000064
> -rw------- 1 postgres postgres  16M Aug  8 05:23 000000010000142E00000065

From the timestamps, 000000010000142E0000005C and later are files
which were "recycled"
for future use and as such are perfectly normal.


Regards

Ian Barwick

--
Ian Barwick                   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgsql-admin by date:

Previous
From: Hotmail
Date:
Subject: Re: PostgreSQL JDBC Driver ERROR: invalid byte sequence for encoding "UTF8": 0x00 on Select stmt
Next
From: Don Seiler
Date:
Subject: Re: Recovery from WAL archives not advancing timeline?