Re: Restrict copying of invalidated replication slots - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Restrict copying of invalidated replication slots |
Date | |
Msg-id | CAD21AoDmzxF2vBiu1n_ceUgRoJ2u98JNUupQ=U84mAhg6Wv_wA@mail.gmail.com Whole thread Raw |
In response to | Re: Restrict copying of invalidated replication slots (Shlok Kyal <shlok.kyal.oss@gmail.com>) |
List | pgsql-hackers |
On Mon, Feb 24, 2025 at 10:09 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Tue, 25 Feb 2025 at 01:03, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I've checked if this issue exists also on v15 or older, but IIUC it > > doesn't exist, fortunately. Here is the summary: > > > > Scenario-1: the source gets invalidated before the copy function > > fetches its contents for the first time. In this case, since the > > source slot's restart_lsn is already an invalid LSN it raises an error > > appropriately. In v15, we have only one slot invaldation reason, WAL > > removal, therefore we always reset the slot's restart_lsn to > > InvalidXlogRecPtr. > > > > Scenario-2: the source gets invalidated before the copied slot is > > created (i.e., between first content copy and > > create_logical/physical_replication_slot()). In this case, the copied > > slot could have a valid restart_lsn value that however might point to > > a WAL segment that might have already been removed. However, since > > copy_restart_lsn will be an invalid LSN (=0), it's caught by the > > following if condition: > > > > if (copy_restart_lsn < src_restart_lsn || > > src_islogical != copy_islogical || > > strcmp(copy_name, NameStr(*src_name)) != 0) > > ereport(ERROR, > > (errmsg("could not copy replication slot \"%s\"", > > NameStr(*src_name)), > > errdetail("The source replication slot was > > modified incompatibly during the copy operation."))); > > > > Scenario-3: the source gets invalidated after creating the copied slot > > (i.e. after create_logical/physical_replication_slot()). In this case, > > since the newly copied slot have the same restart_lsn as the source > > slot, both slots are invalidated. > > > > If the above analysis is right, I think the patches are mostly ready. > > I've made some changes to the patches: > > > > - removed src_isinvalidated variable as it's not necessarily necessary. > > - updated the commit message. > > > > Please review them. If there are no further comments on these patches, > > I'm going to push them. > > > I have verified the above scenarios and I confirm the behaviour described. > > I have a small doubt. > In PG_17 and PG_16 we can invalidate physical slots only for > 'wal_removed' case [1]. And copying of such slot already give error > 'cannot copy a replication slot that doesn't reserve WAL'. So, in PG17 > and PG16 should we check for invalidated source slot only if we are > copying logical slots? Although your analysis is correct, I believe we should retain the validation check. Even though invalidated physical slots in PostgreSQL 16 and 17always have an invalid restart_lsn, maintaining this check would be harmless. Furthermore, I prefer to maintain consistency in the codebase across back branches and the master branch rather than introducing variations. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: