Re: [PATCH] Allow to specify restart_lsn inpg_create_physical_replication_slot() - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [PATCH] Allow to specify restart_lsn inpg_create_physical_replication_slot()
Date
Msg-id 20200619005926.GB453547@paquier.xyz
Whole thread Raw
In response to [PATCH] Allow to specify restart_lsn inpg_create_physical_replication_slot()  (Vyacheslav Makarov <v.makarov@postgrespro.ru>)
Responses Re: [PATCH] Allow to specify restart_lsn inpg_create_physical_replication_slot()
List pgsql-hackers
On Thu, Jun 18, 2020 at 03:39:09PM +0300, Vyacheslav Makarov wrote:
> If the WAL segment for the specified restart_lsn (STOP_LSN of the backup)
> exists, then the function will create a physical replication slot and will
> keep all the WAL segments required by the replica to catch up with the
> primary. Otherwise, it returns error, which means that the required WAL
> segments have been already utilised, so we do need to take a new backup.
> Without passing this newly added parameter
> pg_create_physical_replication_slot() works as before.
>
> What do you think about this?

I think that this was discussed in the past (perhaps one of the
threads related to WAL advancing actually?), and this stuff is full of
holes when it comes to think about error handling with checkpoints
running in parallel, potentially doing recycling of segments you would
expect to be around based on your input value for restart_lsn *while*
pg_create_physical_replication_slot() is still running and
manipulating the on-disk slot information.  I suspect that this also
breaks a couple of assumptions behind concurrent calls of the minimum
LSN calculated across slots when a caller sees fit to recompute the
thresholds (WAL senders mainly here, depending on the replication
activity).
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: Re: Extracting only the columns needed for a query
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: min_safe_lsn column in pg_replication_slots view