Thread: pg_rewind restore_command issue in PG12
Hi,
pg_rewind fails with an error because a WAL file is missing. I have set restore_command so that the required WAL files can be read from the archive (I don't want to copy the WAL files manually), but it doesn't work. What am I missing? Does restore_command not work with pg_rewind in PG12? Or how do I get pg_rewind to use restore_command?
Thank you.
On 03/01/2021 20:13, Amine Tengilimoglu wrote:
> In a situation where pg_rewind gets an error due to a missing
> WAL file, I have set restore_command so that the needed WAL files can
> be read from the archive (I don't want to manually copy the WAL files),
> but I see it doesn't work. What am I missing? Is restore_command not
> really working with pg_rewind in PG12? Or how should I trigger
> pg_rewind to use restore_command?

Using restore_command is a new feature in pg_rewind in PostgreSQL 13. It
doesn't work on earlier versions.

- Heikki
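To make the version dependence concrete, here is a minimal Python sketch that only builds the pg_rewind command line and adds --restore-target-wal (the long form of -c) when the target is PostgreSQL 13 or later. The helper name, the data directory, and the connection string are hypothetical; restore_command is assumed to be set in the target cluster's configuration.

```python
def build_pg_rewind_argv(major_version, target_pgdata, source_server):
    """Return the argument list for a pg_rewind run (nothing is executed)."""
    argv = [
        "pg_rewind",
        f"--target-pgdata={target_pgdata}",
        f"--source-server={source_server}",
    ]
    # pg_rewind can call restore_command itself only from PostgreSQL 13 on;
    # on 12 and earlier the missing segments must already be in pg_wal.
    if major_version >= 13:
        argv.append("--restore-target-wal")
    return argv


if __name__ == "__main__":
    # Hypothetical paths and connection string, for illustration only.
    print(build_pg_rewind_argv(
        12,
        "/var/lib/postgresql/12/main",
        "host=primary.example port=5432 user=postgres dbname=postgres",
    ))
```

The list can be passed to subprocess.run once the target server has been shut down cleanly.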
When I read the pg_rewind documentation for PG12, it says:
"... but if the target cluster ran for a long time after the divergence, the old WAL files might no longer be present. In that case, they can be manually copied from the WAL archive to the pg_wal directory, or fetched on startup by configuring primary_conninfo or restore_command."
So I thought we could use restore_command. But when I try it, I see that it doesn't work either.
Thanks.
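For PG12, the manual copy that the documentation mentions can be scripted. The following is a minimal sketch, assuming the archive is a plain directory of uncompressed segments and that the names of the missing segments are already known (for example from the pg_rewind error message). The function name and the directory layout are hypothetical.

```python
import shutil
from pathlib import Path


def copy_missing_wal(archive_dir, pg_wal_dir, segments):
    """Copy the named WAL segments from archive_dir into pg_wal_dir.

    Segments already present in pg_wal_dir are left untouched.
    Returns the names that were actually copied.
    """
    archive = Path(archive_dir)
    pg_wal = Path(pg_wal_dir)
    copied = []
    for name in segments:
        target = pg_wal / name
        if target.exists():
            continue  # never overwrite a segment the cluster already has
        source = archive / name
        if not source.is_file():
            raise FileNotFoundError(f"{name} not found in archive {archive}")
        shutil.copy2(source, target)
        copied.append(name)
    return copied
```

Run it as the operating system user that owns the data directory, with the target server stopped, and then rerun pg_rewind.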
On Mon, Jan 04, 2021 at 04:12:34PM +0300, Amine Tengilimoglu wrote:
> When I read the pg_rewind PG12 doc. It says:
>
> "... but if the target cluster ran for a long time after the divergence,
> the old WAL files might no longer be present. In that case, they can be
> manually copied from the WAL archive to the pg_wal directory, or fetched
> on startup by configuring primary_conninfo
> <https://www.postgresql.org/docs/12/runtime-config-replication.html#GUC-PRIMARY-CONNINFO>
> or restore_command
> <https://www.postgresql.org/docs/12/runtime-config-wal.html#GUC-RESTORE-COMMAND>
> .".
>
> So I thought we could use restore_command. But when I try to use it , I
> see it doesn't work either.

I agree with your point that the docs of 9.6~12 are confusing here.
It makes no sense to mention restore_command or primary_conninfo to
fetch WAL segments for the target to allow pg_rewind to find the point
of divergence because the target is already offline when we look at
that. Mentioning restore_command/primary_conninfo for recovery
purposes could make sense in the context in the follow-up paragraph
though, where the target gets restarted, after the rewind. But the
uses are different.

The docs of 13~ got that right when -c has been introduced by
rewording this sentence as "or run pg_rewind with the -c option to
automatically retrieve them from the WAL archive". So let's get rid
of ", or fetched on startup by configuring primary_conninfo or
restore_command." ("or fetched on startup by configuring
recovery.conf" in some older branches). This confusion has been
introduced by 878bd9a, down to 9.6.

Heikki, what do you think?
--
Michael
Thank you, Michael. I agree with you. The relevant part can be removed from the documentation, which would at least eliminate the confusion.
On Tue, Jan 05, 2021 at 11:54:42AM +0300, Amine Tengilimoglu wrote:
> Thank you Michael. I agree with you. Relevant part can be removed from the
> document and eliminate the confusion at least.

Okay, I got around this stuff, and committed a fix for 9.6~12. Thanks
for the report, Amine!
--
Michael
You're welcome, Michael!