Re: Restrict copying of invalidated replication slots - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Restrict copying of invalidated replication slots
Date
Msg-id CAD21AoDTEoEZ880tZpXjude3GygR5sjKbGOJZ_7i902U8vWkoA@mail.gmail.com
In response to Re: Restrict copying of invalidated replication slots  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Restrict copying of invalidated replication slots
List pgsql-hackers
On Tue, Feb 25, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Feb 25, 2025 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've checked if this issue exists also on v15 or older, but IIUC it
> > doesn't exist, fortunately. Here is the summary:
> >
> > Scenario-1: the source gets invalidated before the copy function
> > fetches its contents for the first time. In this case, since the
> > source slot's restart_lsn is already an invalid LSN it raises an error
> > appropriately. In v15, we have only one slot invalidation reason, WAL
> > removal, therefore we always reset the slot's restart_lsn to
> > InvalidXLogRecPtr.
> >
> > Scenario-2: the source gets invalidated before the copied slot is
> > created (i.e., between first content copy and
> > create_logical/physical_replication_slot()). In this case, the copied
> > slot could have a valid restart_lsn value that nevertheless points to
> > a WAL segment that might already have been removed. However, since
> > copy_restart_lsn will be an invalid LSN (=0), it's caught by the
> > following if condition:
> >
> >         if (copy_restart_lsn < src_restart_lsn ||
> >             src_islogical != copy_islogical ||
> >             strcmp(copy_name, NameStr(*src_name)) != 0)
> >             ereport(ERROR,
> >                     (errmsg("could not copy replication slot \"%s\"",
> >                             NameStr(*src_name)),
> >                      errdetail("The source replication slot was modified incompatibly during the copy operation.")));
> >
> > Scenario-3: the source gets invalidated after creating the copied slot
> > (i.e. after create_logical/physical_replication_slot()). In this case,
> > since the newly copied slot has the same restart_lsn as the source
> > slot, both slots are invalidated.
> >
>
> Which part of the code will cover Scenario-3? Shouldn't we give ERROR
> for Scenario-3 as well?

In scenario-3, the backend process executing
pg_copy_logical/physical_replication_slot() already holds the newly
copied slot, whose restart_lsn is the same as or older than the source
slot's restart_lsn. Therefore, if the source slot is invalidated at
that point, the copied slot is invalidated too, and the backend raises
an error.
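
To make the reasoning for scenario-2 and scenario-3 concrete, here is a
rough standalone sketch in C. It is only a toy model, not the actual
slotfuncs.c code: ToySlot, copy_must_fail() and
invalidate_obsolete_slots() are hypothetical names, the typedef and
macro are simplified stand-ins for XLogRecPtr and InvalidXLogRecPtr, and
WAL removal is assumed to be the only invalidation reason, as in v15.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for PostgreSQL's XLogRecPtr / InvalidXLogRecPtr. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, trimmed-down slot; the real ReplicationSlot has far more fields. */
typedef struct ToySlot
{
    const char *name;
    bool        is_logical;
    XLogRecPtr  restart_lsn;
} ToySlot;

/*
 * Models the quoted if condition. first_read is the source slot as seen
 * before the new slot is created, second_read is the same source slot
 * re-read afterwards. Returns true when the copy has to be rejected.
 */
static bool
copy_must_fail(const ToySlot *first_read, const ToySlot *second_read)
{
    assert(first_read != NULL && second_read != NULL);

    return second_read->restart_lsn < first_read->restart_lsn ||
           second_read->is_logical != first_read->is_logical ||
           strcmp(second_read->name, first_read->name) != 0;
}

/*
 * Toy invalidation by WAL removal: every slot whose restart_lsn is older
 * than the oldest WAL still kept gets its restart_lsn reset to the invalid
 * LSN. Returns the number of slots invalidated by this call.
 */
static int
invalidate_obsolete_slots(ToySlot *slots, int nslots, XLogRecPtr oldest_kept_lsn)
{
    int invalidated = 0;

    for (int i = 0; i < nslots; i++)
    {
        if (slots[i].restart_lsn != InvalidXLogRecPtr &&
            slots[i].restart_lsn < oldest_kept_lsn)
        {
            slots[i].restart_lsn = InvalidXLogRecPtr;
            invalidated++;
        }
    }
    return invalidated;
}
```

In this model, scenario-2 corresponds to a second_read whose
restart_lsn is 0: it compares as less than any valid first-read value,
so copy_must_fail() returns true. Scenario-3 corresponds to calling
invalidate_obsolete_slots() on an array that holds both the source and
the copied slot with the same restart_lsn: either both are reset or
neither is.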

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


