Thread: Re: Restrict copying of invalidated replication slots
On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> Hi,
>
> Currently, we can copy an invalidated slot using the function
> 'pg_copy_logical_replication_slot'. As per the suggestion in the
> thread [1], we should prohibit copying of such slots.
>
> I have created a patch to address the issue.

This patch does not fix all the copy_replication_slot scenarios
completely; there is a corner-case concurrency scenario in which an
invalidated slot still gets copied:

+       /* We should not copy invalidated replication slots */
+       if (src_isinvalidated)
+               ereport(ERROR,
+                               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                                errmsg("cannot copy an invalidated replication slot")));

Consider the following scenario:
step 1) Set up streaming replication between the primary and standby nodes.
step 2) Create a logical replication slot (test1) on the standby node.
step 3) Set a breakpoint in InvalidatePossiblyObsoleteSlot for the case
where cause is RS_INVAL_WAL_LEVEL (there is no need to hold the other
invalidation causes), or add a sleep in InvalidatePossiblyObsoleteSlot
like below:
    if (cause == RS_INVAL_WAL_LEVEL)
    {
        while (bsleep)
            sleep(1);
    }
step 4) Reduce wal_level on the primary to replica and restart the
primary node.
step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1', 'test2');
-- While trying to create the slot, this waits until the lock held by
InvalidatePossiblyObsoleteSlot is released.
step 6) Increase wal_level back to logical on the primary node and
restart the primary.
step 7) Now allow the invalidation to happen (continue from the
breakpoint set at step 3); the replication control lock will be released
and the invalidated slot will be copied.

After this:
postgres=# SELECT 'copy' FROM pg_copy_logical_replication_slot('test1', 'test2');
 ?column?
----------
 copy
(1 row)

-- The invalidated slot (test1) has been copied successfully:
postgres=# select * from pg_replication_slots ;
 slot_name |    plugin     | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase |          inactive_since          | conflicting |  invalidation_reason   | failover | synced
-----------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------+----------------------------------+-------------+------------------------+----------+--------
 test1     | test_decoding | logical   |      5 | postgres | f         | f      |            |      |          745 | 0/4029060   | 0/4029098           | lost       |               | f         | 2025-02-13 15:26:54.666725+05:30 | t           | wal_level_insufficient | f        | f
 test2     | test_decoding | logical   |      5 | postgres | f         | f      |            |      |          745 | 0/4029060   | 0/4029098           | reserved   |               | f         | 2025-02-13 15:30:30.477836+05:30 | f           |                        | f        | f
(2 rows)

-- A subsequent attempt to decode changes from the copied slot (test2) fails:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, NULL);
WARNING:  detected write past chunk end in TXN 0x5e77e6c6f300
ERROR:  logical decoding on standby requires "wal_level" >= "logical" on the primary

-- Alternatively, the following output may occur:
postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, NULL);
WARNING:  detected write past chunk end in TXN 0x582d1b2d6ef0
    data
------------
 BEGIN 744
 COMMIT 744
(2 rows)

This is an edge case that can occur under specific conditions involving
replication slot invalidation when there is a huge lag between the
primary and the standby.
There might be a similar concurrency case for wal_removed too.
Regards,
Vignesh
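To make the race concrete, here is a condensed sketch of the relevant flow
in copy_replication_slot() (slotfuncs.c). This is illustrative rather than
a literal excerpt from HEAD; variable names such as 'src' and
'first_slot_contents' follow slotfuncs.c:

    /*
     * Condensed sketch, not a literal excerpt.  The v1 check can only see
     * the snapshot taken under the control lock; the lock is released
     * before the new slot is created, which is the window exploited in
     * steps 3-7 above.
     */
    LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
    /* ... locate the source slot 'src' by name ... */
    SpinLockAcquire(&src->mutex);
    first_slot_contents = *src;     /* snapshot of the source slot */
    SpinLockRelease(&src->mutex);
    LWLockRelease(ReplicationSlotControlLock);  /* race window opens here */

    /* v1 check: tests only the snapshot taken above */
    if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot copy an invalidated replication slot")));

    /*
     * create_logical_replication_slot() then re-takes the control lock in
     * exclusive mode; if InvalidatePossiblyObsoleteSlot() invalidates the
     * source slot in the meantime, the copy still succeeds.
     */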
On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote:
> >
> > [...]
> >
> > This is an edge case that can occur under specific conditions involving
> > replication slot invalidation when there is a huge lag between the
> > primary and the standby.
> > There might be a similar concurrency case for wal_removed too.
>
> Hi Vignesh,
>
> Thanks for reviewing the patch.

Thanks for updating the patch. I have a question related to it.

> I have tested the above scenario and was able to reproduce it. I have
> fixed it in the v2 patch.
> Currently we are taking a shared lock on ReplicationSlotControlLock.
> This issue can be resolved if we take an exclusive lock instead.
> Thoughts?

It's not clear to me why increasing the lock level can solve it; could you
elaborate a bit more on this?

Besides, do we need one more invalidated check in the following code after
creating the slot?

        /*
         * Check if the source slot still exists and is valid. We regard it as
         * invalid if the type of replication slot or name has been changed,
         * or the restart_lsn either is invalid or has gone backward. (The
        ...

Best Regards,
Hou zj
On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
>
> [...]
>
> > I have tested the above scenario and was able to reproduce it. I have
> > fixed it in the v2 patch.
> > Currently we are taking a shared lock on ReplicationSlotControlLock.
> > This issue can be resolved if we take an exclusive lock instead.
> > Thoughts?
>
> It's not clear to me why increasing the lock level can solve it; could you
> elaborate a bit more on this?

In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on
'ReplicationSlotControlLock'.
Also, in the function 'copy_replication_slot' we take a SHARED lock on
'ReplicationSlotControlLock' while fetching the source slot.

So, for the case described by Vignesh in [1], InvalidateObsoleteReplicationSlots
is called first and holds a SHARED lock on 'ReplicationSlotControlLock'. We then
hold the function there using the sleep:
    if (cause == RS_INVAL_WAL_LEVEL)
    {
        while (bsleep)
            sleep(1);
    }

Now we start copying the slot. Since 'copy_replication_slot' takes a SHARED
lock on 'ReplicationSlotControlLock', it can acquire the lock and fetch the
info of the source slot (which is not invalidated yet). Then
'copy_replication_slot' calls 'create_logical_replication_slot', which takes
an EXCLUSIVE lock on ReplicationSlotControlLock, so it waits for
InvalidateObsoleteReplicationSlots to release the lock. Once
'InvalidateObsoleteReplicationSlots' releases the lock, the execution of
'create_logical_replication_slot' continues and creates a copy of the (now
invalidated) source slot.

With the patch, 'copy_replication_slot' takes an EXCLUSIVE lock on
'ReplicationSlotControlLock' to fetch the slot info. Hence, it waits for
'InvalidateObsoleteReplicationSlots' to release the lock, then fetches the
source slot info and tries to create the copied slot (which fails, because
the source slot was invalidated before we fetched its info).

> Besides, do we need one more invalidated check in the following code after
> creating the slot?
>
>        /*
>         * Check if the source slot still exists and is valid. We regard it as
>         * invalid if the type of replication slot or name has been changed,
>         * or the restart_lsn either is invalid or has gone backward. (The
>        ...
>

This approach seems more feasible to me. It also resolves the issue
reported by Vignesh in [1]. I have made changes for the same in the v3
patch.

[1]: https://www.postgresql.org/message-id/CALDaNm2rrxO5mg6OKoScw84K5P1Tw_cbjniHm%2BGeyxme8Ei-nQ%40mail.gmail.com

Thanks and Regards,
Shlok Kyal
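In diff form, the lock-level change Shlok describes (the earlier v2
approach) amounts to roughly the following. This is a sketch of the idea,
not the literal v2 patch:

    /* In copy_replication_slot(), while reading the source slot: */
    -   LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
    +   /*
    +    * Take the control lock exclusively: an exclusive request conflicts
    +    * with the SHARED lock held by InvalidateObsoleteReplicationSlots(),
    +    * so this read waits until the invalidation has completed and then
    +    * observes the invalidated state of the source slot.
    +    */
    +   LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
        /* ... find the source slot and snapshot its contents ... */
        LWLockRelease(ReplicationSlotControlLock);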
On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> [...]
>
> > Besides, do we need one more invalidated check in the following code
> > after creating the slot?
>
> This approach seems more feasible to me. It also resolves the issue
> reported by Vignesh in [1]. I have made changes for the same in the v3
> patch.

I agree to check if the source slot got invalidated during the copy.
But why do we need to search the slot by the slot name again, as
follows?

+       /* Check if source slot was invalidated while copying of slot */
+       LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
+
+       for (int i = 0; i < max_replication_slots; i++)
+       {
+               ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
+
+               if (s->in_use && strcmp(NameStr(s->data.name), NameStr(*src_name)) == 0)
+               {
+                       /* Copy the slot contents while holding spinlock */
+                       SpinLockAcquire(&s->mutex);
+                       first_slot_contents = *s;
+                       SpinLockRelease(&s->mutex);
+                       src = s;
+                       break;
+               }
+       }
+
+       LWLockRelease(ReplicationSlotControlLock);

I think 'src' already points to the source slot.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
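Concretely, the suggestion amounts to something like the sketch below
(roughly what the later v4 patch ends up doing; the error wording mirrors
the hunk quoted in Peter Smith's review further down the thread):

    /*
     * Sketch of the suggested simplification: 'src' still points at the
     * source slot, so re-read its contents under the spinlock instead of
     * re-scanning the slot array by name.
     */
    SpinLockAcquire(&src->mutex);
    first_slot_contents = *src;
    SpinLockRelease(&src->mutex);

    /* Error out if the source slot became invalidated during the copy. */
    if (first_slot_contents.data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                (errmsg("could not copy replication slot \"%s\"",
                        NameStr(*src_name)),
                 errdetail("The source replication slot was invalidated during the copy operation.")));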
On Fri, 21 Feb 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> [...]
>
> I agree to check if the source slot got invalidated during the copy.
> But why do we need to search the slot by the slot name again, as
> follows?
>
> [...]
>
> I think 'src' already points to the source slot.
>

Hi Sawada san,

Thanks for reviewing the patch.
I have used 'src' instead of iterating again. I have attached the
updated v4 patch.

Thanks and Regards,
Shlok Kyal
On Fri, Feb 21, 2025 at 4:30 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> [...]
>
> Thanks for reviewing the patch.
> I have used 'src' instead of iterating again. I have attached the
> updated v4 patch.

Thank you for updating the patch! I have one comment:

+       /* Check if source slot was invalidated while copying of slot */
+       SpinLockAcquire(&src->mutex);
+       first_slot_contents = *src;
+       SpinLockRelease(&src->mutex);

We don't need to copy the source slot contents again since we already
do so as follows:

        /* Copy data of source slot again */
        SpinLockAcquire(&src->mutex);
        second_slot_contents = *src;
        SpinLockRelease(&src->mutex);

I think we can use second_slot_contents for that check.

I've investigated the slot invalidation and slot copying behaviors. We
cannot copy a slot if it doesn't reserve WAL, but IIUC the slot's
restart_lsn is not reset by slot invalidation for causes other than
RS_INVAL_WAL_REMOVED. Therefore, it's possible that we copy a slot
invalidated by, for example, RS_INVAL_IDLE_TIMEOUT, and the copied
slot's restart_lsn might have already been removed, which ultimately
causes an assertion failure in copy_replication_slot():

#ifdef USE_ASSERT_CHECKING
        /* Check that the restart_lsn is available */
        {
                XLogSegNo       segno;

                XLByteToSeg(copy_restart_lsn, segno, wal_segment_size);
                Assert(XLogGetLastRemovedSegno() < segno);
        }
#endif

I think this issue exists in v16 and later, though I've not tested it
yet. If my understanding is right, this patch has to be backpatched.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
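To illustrate the gap Sawada-san describes: the only WAL-reservation guard
in copy_replication_slot() keys off restart_lsn, so an invalidation cause
that leaves restart_lsn set slips past it. The sketch below is condensed
from HEAD, so treat the details as approximate:

    /*
     * Condensed from copy_replication_slot() (approximate).  A slot is
     * rejected only if its restart_lsn is invalid; invalidation causes
     * other than RS_INVAL_WAL_REMOVED (e.g. RS_INVAL_IDLE_TIMEOUT) leave
     * restart_lsn set, so the copy proceeds even though the WAL at that
     * LSN may already have been removed -- which is what trips the
     * assertion quoted above in assert-enabled builds.
     */
    copy_restart_lsn = first_slot_contents.data.restart_lsn;
    if (XLogRecPtrIsInvalid(copy_restart_lsn))
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot copy a replication slot that doesn't reserve WAL")));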
Some review comments for patch v2-0001.

======
Commit message

1.
Currently we can copy an invalidated logical and physical replication slot
using functions 'pg_copy_logical_replication_slot' and
'pg_copy_physical_replication_slot' respectively.
With this patch we will throw an error in such cases.

/we can copy an invalidated logical and physical replication slot/we
can copy invalidated logical and physical replication slots/

======
doc/src/sgml/func.sgml

pg_copy_physical_replication_slot:

2.
-        is omitted, the same value as the source slot is used.
+        is omitted, the same value as the source slot is used. Copy of an
+        invalidated physical replication slot in not allowed.

Typo /in/is/

Also, IMO you don't need to say "physical replication slot" because it
is clear from the function's name.

SUGGESTION
Copy of an invalidated slot is not allowed.

~~~

pg_copy_logical_replication_slot:

3.
+        Copy of an invalidated logical replication slot in not allowed.

Typo /in/is/

Also, IMO you don't need to say "logical replication slot" because it
is clear from the function's name.

SUGGESTION
Copy of an invalidated slot is not allowed.

======
src/backend/replication/slotfuncs.c

copy_replication_slot:

4.
+       /* Check if source slot was invalidated while copying of slot */
+       SpinLockAcquire(&src->mutex);
+       first_slot_contents = *src;
+       SpinLockRelease(&src->mutex);
+
+       src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE);
+
+       if (src_isinvalidated)
+               ereport(ERROR,
+                               (errmsg("could not copy replication slot \"%s\"",
+                                               NameStr(*src_name)),
+                                errdetail("The source replication slot was invalidated during the copy operation.")));

4a.
We already know that it was not invalid the FIRST time we looked at it,
so IMO we only need to confirm that the SECOND look gives the same
answer. IOW, I thought the code should be like below. (AFAICT
Sawada-san's review says the same as this.)

Also, I think it is better to say "became invalidated" instead of "was
invalidated", to imply the state was known to be ok before the copy.

SUGGESTION

+       /* Check if source slot became invalidated during the copy operation */
+       if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
+               ereport(ERROR,

~

4b.
Unnecessary parentheses in the ereport.

~

4c.
There seems to be some weird mix of tense, "cannot copy" versus "could
not copy", already in this file. But maybe at least you can be
consistent within the patch and always say "cannot".

======
Kind Regards,
Peter Smith.
Fujitsu Australia
On Sat, 22 Feb 2025 at 04:49, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> [...]
>
> Thank you for updating the patch! I have one comment:
>
> +       /* Check if source slot was invalidated while copying of slot */
> +       SpinLockAcquire(&src->mutex);
> +       first_slot_contents = *src;
> +       SpinLockRelease(&src->mutex);
>
> We don't need to copy the source slot contents again since we already
> do so as follows:
>
>         /* Copy data of source slot again */
>         SpinLockAcquire(&src->mutex);
>         second_slot_contents = *src;
>         SpinLockRelease(&src->mutex);
>
> I think we can use second_slot_contents for that check.
>

I agree. I have updated the v5 patch to use second_slot_contents.

> I've investigated the slot invalidation and slot copying behaviors. We
> cannot copy a slot if it doesn't reserve WAL, but IIUC the slot's
> restart_lsn is not reset by slot invalidation for causes other than
> RS_INVAL_WAL_REMOVED. Therefore, it's possible that we copy a slot
> invalidated by, for example, RS_INVAL_IDLE_TIMEOUT, and the copied
> slot's restart_lsn might have already been removed, which ultimately
> causes an assertion failure in copy_replication_slot():
>
> [...]
>
> I think this issue exists in v16 and later, though I've not tested it
> yet. If my understanding is right, this patch has to be backpatched.
>

I have tested the above on HEAD, PG 17, and PG 16 and found that we can
hit the above assert in all three branches, with the following steps:
1. Create a physical replication setup.
2. On the standby, create a logical replication slot.
3. Change wal_level on the primary to 'replica' and restart the primary.
   The slot is invalidated with 'wal_level_insufficient'.
4. Change wal_level on the primary to 'logical' and restart the primary.
5. On the primary, insert some records and run a checkpoint. Also run a
   checkpoint on the standby, so some initial WAL files are removed.
6. Now copy the logical replication slot created in step 2. We hit the
   assert.

I agree that backpatching the patch can resolve this, as it prevents
copying of invalidated slots.

I have attached the following patches:
v5-0001 : for HEAD
v5_PG_17_PG_16-0001 : for PG17 and PG16

Thanks and Regards,
Shlok Kyal
On Sun, 23 Feb 2025 at 06:46, Peter Smith <smithpb2250@gmail.com> wrote: > > Some review comments for patch v2-0001. > > ====== > Commit message > > 1. > Currently we can copy an invalidated logical and physical replication slot > using functions 'pg_copy_logical_replication_slot' and > 'pg_copy_physical_replication_slot' respectively. > With this patch we will throw an error in such cases. > > /we can copy an invalidated logical and physical replication slot/we > can copy invalidated logical and physical replication slots/ > Updated the commit message > ====== > doc/src/sgml/func.sgml > > pg_copy_physical_replication_slot: > > 2. > - is omitted, the same value as the > source slot is used. > + is omitted, the same value as the source slot is used. Copy of an > + invalidated physical replication slot in not allowed. > > Typo /in/is/ > > Also, IMO you don't need to say "physical replication slot" because it > is clear from the function's name. > > SUGGESTION > Copy of an invalidated slot is not allowed. > Fixed > ~~~ > > pg_copy_logical_replication_slot: > > 3. > + Copy of an invalidated logical replication slot in not allowed. > > Typo /in/is/ > > Also, IMO you don't need to say "logical replication slot" because it > is clear from the function's name. > > SUGGESTION > Copy of an invalidated slot is not allowed. > > Fixed > ====== > src/backend/replication/slotfuncs.c > > copy_replication_slot: > > 4. > + /* Check if source slot was invalidated while copying of slot */ > + SpinLockAcquire(&src->mutex); > + first_slot_contents = *src; > + SpinLockRelease(&src->mutex); > + > + src_isinvalidated = (first_slot_contents.data.invalidated != RS_INVAL_NONE); > + > + if (src_isinvalidated) > + ereport(ERROR, > + (errmsg("could not copy replication slot \"%s\"", > + NameStr(*src_name)), > + errdetail("The source replication slot was invalidated during the > copy operation."))); > > 4a. > We already know that it was not invalid the FIRST time we looked at > it, so IMO we only need to confirm that the SECOND look gives the same > answer. IOW, I thought the code should be like below. (AFAICT > Sawada-san's review says the same as this). > > Also, I think it is better to say "became invalidated" instead of "was > invalidated", to imply the state was known to be ok before the copy. > > SUGGESTION > > + /* Check if source slot became invalidated during the copy operation */ > + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) > + ereport(ERROR, > > ~ > > 4b. > Unnecessary parentheses in the ereport. > > ~ > > 4c. > There seems some weird mix of tense "cannot copy" versus "could not > copy" already in this file. But, maybe at least you can be consistent > within the patch and always say "cannot". > Fixed. I have addressed the above comments in v5 patch [1]. [1]: https://www.postgresql.org/message-id/CANhcyEUHp6cRfaKf0ZqHCppCqpqzmsf5swpbnYGyRU%2BN%2Bihi%3DQ%40mail.gmail.com Thanks and Regards, Shlok Kyal
On Mon, Feb 24, 2025 at 3:06 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Sat, 22 Feb 2025 at 04:49, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Fri, Feb 21, 2025 at 4:30 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > On Fri, 21 Feb 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > > > > > On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu) > > > > > <houzj.fnst@fujitsu.com> wrote: > > > > > > > > > > > > On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > > > > > > > > > On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > > > > > On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > Currently, we can copy an invalidated slot using the function > > > > > > > > > 'pg_copy_logical_replication_slot'. As per the suggestion in the > > > > > > > > > thread [1], we should prohibit copying of such slots. > > > > > > > > > > > > > > > > > > I have created a patch to address the issue. > > > > > > > > > > > > > > > > This patch does not fix all the copy_replication_slot scenarios > > > > > > > > completely, there is a very corner concurrency case where an > > > > > > > > invalidated slot still gets copied: > > > > > > > > + /* We should not copy invalidated replication slots */ > > > > > > > > + if (src_isinvalidated) > > > > > > > > + ereport(ERROR, > > > > > > > > + > > > > > > > > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > > > > > > + errmsg("cannot copy an invalidated > > > > > > > > replication slot"))); > > > > > > > > > > > > > > > > Consider the following scenario: > > > > > > > > step 1) Set up streaming replication between the primary and standby nodes. > > > > > > > > step 2) Create a logical replication slot (test1) on the standby node. > > > > > > > > step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause > > > > > > > > is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or > > > > > > > > add a sleep in InvalidatePossiblyObsoleteSlot function like below: > > > > > > > > if (cause == RS_INVAL_WAL_LEVEL) > > > > > > > > { > > > > > > > > while (bsleep) > > > > > > > > sleep(1); > > > > > > > > } > > > > > > > > step 4) Reduce wal_level on the primary to replica and restart the primary > > > > > > > node. > > > > > > > > step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1', > > > > > > > > 'test2'); -- It will wait till the lock held by > > > > > > > > InvalidatePossiblyObsoleteSlot is released while trying to create a > > > > > > > > slot. > > > > > > > > step 6) Increase wal_level back to logical on the primary node and > > > > > > > > restart the primary. > > > > > > > > step 7) Now allow the invalidation to happen (continue the breakpoint > > > > > > > > held at step 3), the replication control lock will be released and the > > > > > > > > invalidated slot will be copied > > > > > > > > > > > > > > > > After this: > > > > > > > > postgres=# SELECT 'copy' FROM > > > > > > > > pg_copy_logical_replication_slot('test1', 'test2'); ?column? 
> > > > > > > > ---------- > > > > > > > > copy > > > > > > > > (1 row) > > > > > > > > > > > > > > > > -- The invalidated slot (test1) is copied successfully: > > > > > > > > postgres=# select * from pg_replication_slots ; > > > > > > > > slot_name | plugin | slot_type | datoid | database | temporary > > > > > > > > | active | active_pid | xmin | catalog_xmin | restart_lsn | > > > > > > > > confirmed_flush_lsn | wal_status | safe_wal_size | two_phas > > > > > > > > e | inactive_since | conflicting | > > > > > > > > invalidation_reason | failover | synced > > > > > > > > > > > > > > > -----------+---------------+-----------+--------+----------+-----------+ > > > > > > > --------+------------+------+--------------+-------------+--------------- > > > > > > > ------+------------+---------------+--------- > > > > > > > > > > > > > > > --+----------------------------------+-------------+---------------------- > > > > > > > --+----------+-------- > > > > > > > > test1 | test_decoding | logical | 5 | postgres | f > > > > > > > > | f | | | 745 | 0/4029060 | 0/4029098 > > > > > > > > | lost | | f > > > > > > > > | 2025-02-13 15:26:54.666725+05:30 | t | > > > > > > > > wal_level_insufficient | f | f > > > > > > > > test2 | test_decoding | logical | 5 | postgres | f > > > > > > > > | f | | | 745 | 0/4029060 | 0/4029098 > > > > > > > > | reserved | | f > > > > > > > > | 2025-02-13 15:30:30.477836+05:30 | f | > > > > > > > > | f | f > > > > > > > > (2 rows) > > > > > > > > > > > > > > > > -- A subsequent attempt to decode changes from the invalidated slot > > > > > > > > (test2) fails: > > > > > > > > postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, > > > > > > > > NULL); > > > > > > > > WARNING: detected write past chunk end in TXN 0x5e77e6c6f300 > > > > > > > > ERROR: logical decoding on standby requires "wal_level" >= "logical" > > > > > > > > on the primary > > > > > > > > > > > > > > > > -- Alternatively, the following error may occur: > > > > > > > > postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, > > > > > > > > NULL); > > > > > > > > WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0 > > > > > > > > data > > > > > > > > ------------ > > > > > > > > BEGIN 744 > > > > > > > > COMMIT 744 > > > > > > > > (2 rows) > > > > > > > > > > > > > > > > This is an edge case that can occur under specific conditions > > > > > > > > involving replication slot invalidation when there is a huge lag > > > > > > > > between primary and standby. > > > > > > > > There might be a similar concurrency case for wal_removed too. > > > > > > > > > > > > > > > > > > > > > > Hi Vignesh, > > > > > > > > > > > > > > Thanks for reviewing the patch. > > > > > > > > > > > > Thanks for updating the patch. I have a question related to it. > > > > > > > > > > > > > > > > > > > > I have tested the above scenario and was able to reproduce it. I have fixed it in > > > > > > > the v2 patch. > > > > > > > Currently we are taking a shared lock on ReplicationSlotControlLock. > > > > > > > This issue can be resolved if we take an exclusive lock instead. > > > > > > > Thoughts? > > > > > > > > > > > > It's not clear to me why increasing the lock level can solve it, could you > > > > > > elaborate a bit more on this ? 
> > > > > > > > > > > In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on > > > > > 'ReplicationSlotControlLock' > > > > > Also in function 'copy_replication_slot' we take a SHARED lock on > > > > > 'ReplicationSlotControlLock' during fetching of source slot. > > > > > > > > > > So, for the case described by Vignesh in [1], first > > > > > InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock > > > > > on 'ReplicationSlotControlLock'. We are now holding the function using > > > > > the sleep > > > > > if (cause == RS_INVAL_WAL_LEVEL) > > > > > { > > > > > while (bsleep) > > > > > sleep(1); > > > > > } > > > > > > > > > > Now we create a copy of the slot since 'copy_replication_slot' takes > > > > > a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock > > > > > and fetch the info of the source slot (the slot is not invalidated > > > > > till now). and the function 'copy_replication_slot' calls function > > > > > 'create_logical_replication_slot' which takes a EXCLUSIVE lock on > > > > > ReplicationSlotControlLock and hence it will wait for function > > > > > InvalidateObsoleteReplicationSlot to release lock. Once the function > > > > > 'InvalidateObsoleteReplicationSlot' releases the lock, the execution > > > > > of 'create_logical_replication_slot' continues and creates a copy of > > > > > the source slot. > > > > > > > > > > Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE > > > > > lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence, > > > > > it will wait for the 'InvalidateObsoleteReplicationSlot' to release > > > > > the lock and then fetch the source slot info and try to create the > > > > > copied slot (which will fail as source slot is invalidated before we > > > > > fetch its info) > > > > > > > > > > > Besides, do we need one more invalidated check in the following codes after > > > > > > creating the slot ? > > > > > > > > > > > > /* > > > > > > * Check if the source slot still exists and is valid. We regard it as > > > > > > * invalid if the type of replication slot or name has been changed, > > > > > > * or the restart_lsn either is invalid or has gone backward. (The > > > > > > ... > > > > > > > > > > > > > > > > This approach seems more feasible to me. It also resolves the issue > > > > > suggested by Vignesh in [1]. I have made changes for the same in v3 > > > > > patch. > > > > > > > > > > > > > I agree to check if the source slot got invalidated during the copy. > > > > But why do we need to search the slot by the slot name again as > > > > follows? > > > > > > > > + /* Check if source slot was invalidated while copying of slot */ > > > > + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); > > > > + > > > > + for (int i = 0; i < max_replication_slots; i++) > > > > + { > > > > + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; > > > > + > > > > + if (s->in_use && strcmp(NameStr(s->data.name), > > > > NameStr(*src_name)) == 0) > > > > + { > > > > + /* Copy the slot contents while holding spinlock */ > > > > + SpinLockAcquire(&s->mutex); > > > > + first_slot_contents = *s; > > > > + SpinLockRelease(&s->mutex); > > > > + src = s; > > > > + break; > > > > + } > > > > + } > > > > + > > > > + LWLockRelease(ReplicationSlotControlLock); > > > > > > > > I think 'src' already points to the source slot. > > > > > > > Hi Sawada san, > > > > > > Thanks for reviewing the patch. > > > I have used the 'src' instead of iterating again. I have attached the > > > updated v4 patch. 
> > > > Thank you for updating the patch! I have one comment: > > > > + /* Check if source slot was invalidated while copying of slot */ > > + SpinLockAcquire(&src->mutex); > > + first_slot_contents = *src; > > + SpinLockRelease(&src->mutex); > > > > We don't need to copy the source slot contents again since we already > > do as follows: > > > > /* Copy data of source slot again */ > > SpinLockAcquire(&src->mutex); > > second_slot_contents = *src; > > SpinLockRelease(&src->mutex); > > > > I think we can use second_slot_contents for that check. > > > I agree. I have updated the v5 patch to use second_slot_contents > > > I've investigated the slot invalidation and copying slots behaviors. > > We cannot copy a slot if it doesn't reserve WAL, but IIUC the slot's > > restart_lsn is not reset by slot invalidation due to other than > > RS_INVAL_WAL_REMOVED. Therefore, it's possible that we copy a slot > > invalidated by for example RS_INVAL_IDLE_TIMEOUT, and the copied > > slot's restart_lsn might have already been removed, which ultimately > > causes an assertion failure in copy_replication_slot(): > > > > #ifdef USE_ASSERT_CHECKING > > /* Check that the restart_lsn is available */ > > { > > XLogSegNo segno; > > > > XLByteToSeg(copy_restart_lsn, segno, wal_segment_size); > > Assert(XLogGetLastRemovedSegno() < segno); > > } > > #endif > > > > I think this issue exists from v16 or later, I've not tested yet > > though. If my understanding is right, this patch has to be > > backpatched. > > > I have tested the above in HEAD, PG 17 and PG 16 and found that we can > hit the above ASSERT condition in all three branches. With the > following steps: > 1. create a physical replication setup > 2. In standby create a logical replication slot. > 3. change wal_level of primary to 'replica' and restart primary. The > slot is invalidated with 'wal_level_insufficient' > 4. change wal_level of primary to 'logical' and restart primary. > 5. In primary insert some records and run checkpoint. Also run a > checkpoint on standby. So, some initial wal files are removed. > 6. Now copy the logical replication slot created in step 2. Then we > can hit the assert. > > I agree that backpatching the patch can resolve this as it prevents > copying of invalidated slots. > > I have attached the following patches: > v5-0001 : for HEAD > v5_PG_17_PG_16-0001 : for PG17 and PG16 I've checked if this issue exists also on v15 or older, but IIUC it doesn't exist, fortunately. Here is the summary: Scenario-1: the source gets invalidated before the copy function fetches its contents for the first time. In this case, since the source slot's restart_lsn is already an invalid LSN, it raises an error appropriately. In v15, we have only one slot invalidation reason, WAL removal, therefore we always reset the slot's restart_lsn to InvalidXLogRecPtr. Scenario-2: the source gets invalidated before the copied slot is created (i.e., between first content copy and create_logical/physical_replication_slot()). In this case, the copied slot could have a valid restart_lsn value that nevertheless might point to a WAL segment that has already been removed.
However, since copy_restart_lsn will be an invalid LSN (=0), it's caught by the following if condition: if (copy_restart_lsn < src_restart_lsn || src_islogical != copy_islogical || strcmp(copy_name, NameStr(*src_name)) != 0) ereport(ERROR, (errmsg("could not copy replication slot \"%s\"", NameStr(*src_name)), errdetail("The source replication slot was modified incompatibly during the copy operation."))); Scenario-3: the source gets invalidated after creating the copied slot (i.e., after create_logical/physical_replication_slot()). In this case, since the newly copied slot has the same restart_lsn as the source slot, both slots are invalidated. If the above analysis is right, I think the patches are mostly ready. I've made some changes to the patches: - removed the src_isinvalidated variable as it's not strictly necessary. - updated the commit message. Please review them. If there are no further comments on these patches, I'm going to push them. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
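Scenario-1 leans on a guard that already exists near the top of copy_replication_slot(). Roughly, as a sketch of the idea rather than an exact quote of slotfuncs.c:

    /*
     * An invalidated source slot on v15 (where the only invalidation
     * cause is WAL removal) has restart_lsn == InvalidXLogRecPtr, so
     * the copy fails here before anything else happens.
     */
    if (XLogRecPtrIsInvalid(src_restart_lsn))
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot copy a replication slot that doesn't reserve WAL")));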
On Tue, 25 Feb 2025 at 01:03, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Feb 24, 2025 at 3:06 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > On Sat, 22 Feb 2025 at 04:49, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Fri, Feb 21, 2025 at 4:30 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > > > On Fri, 21 Feb 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > On Wed, Feb 19, 2025 at 3:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > > > > > > > On Tue, 18 Feb 2025 at 15:26, Zhijie Hou (Fujitsu) > > > > > > <houzj.fnst@fujitsu.com> wrote: > > > > > > > > > > > > > > On Monday, February 17, 2025 7:31 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > > > > > > > > > > > On Thu, 13 Feb 2025 at 15:54, vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > > > > > > > > > > > On Tue, 4 Feb 2025 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > Currently, we can copy an invalidated slot using the function > > > > > > > > > > 'pg_copy_logical_replication_slot'. As per the suggestion in the > > > > > > > > > > thread [1], we should prohibit copying of such slots. > > > > > > > > > > > > > > > > > > > > I have created a patch to address the issue. > > > > > > > > > > > > > > > > > > This patch does not fix all the copy_replication_slot scenarios > > > > > > > > > completely, there is a very corner concurrency case where an > > > > > > > > > invalidated slot still gets copied: > > > > > > > > > + /* We should not copy invalidated replication slots */ > > > > > > > > > + if (src_isinvalidated) > > > > > > > > > + ereport(ERROR, > > > > > > > > > + > > > > > > > > > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > > > > > > > > > + errmsg("cannot copy an invalidated > > > > > > > > > replication slot"))); > > > > > > > > > > > > > > > > > > Consider the following scenario: > > > > > > > > > step 1) Set up streaming replication between the primary and standby nodes. > > > > > > > > > step 2) Create a logical replication slot (test1) on the standby node. > > > > > > > > > step 3) Have a breakpoint in InvalidatePossiblyObsoleteSlot if cause > > > > > > > > > is RS_INVAL_WAL_LEVEL, no need to hold other invalidation causes or > > > > > > > > > add a sleep in InvalidatePossiblyObsoleteSlot function like below: > > > > > > > > > if (cause == RS_INVAL_WAL_LEVEL) > > > > > > > > > { > > > > > > > > > while (bsleep) > > > > > > > > > sleep(1); > > > > > > > > > } > > > > > > > > > step 4) Reduce wal_level on the primary to replica and restart the primary > > > > > > > > node. > > > > > > > > > step 5) SELECT 'copy' FROM pg_copy_logical_replication_slot('test1', > > > > > > > > > 'test2'); -- It will wait till the lock held by > > > > > > > > > InvalidatePossiblyObsoleteSlot is released while trying to create a > > > > > > > > > slot. > > > > > > > > > step 6) Increase wal_level back to logical on the primary node and > > > > > > > > > restart the primary. > > > > > > > > > step 7) Now allow the invalidation to happen (continue the breakpoint > > > > > > > > > held at step 3), the replication control lock will be released and the > > > > > > > > > invalidated slot will be copied > > > > > > > > > > > > > > > > > > After this: > > > > > > > > > postgres=# SELECT 'copy' FROM > > > > > > > > > pg_copy_logical_replication_slot('test1', 'test2'); ?column? 
> > > > > > > > > ---------- > > > > > > > > > copy > > > > > > > > > (1 row) > > > > > > > > > > > > > > > > > > -- The invalidated slot (test1) is copied successfully: > > > > > > > > > postgres=# select * from pg_replication_slots ; > > > > > > > > > slot_name | plugin | slot_type | datoid | database | temporary > > > > > > > > > | active | active_pid | xmin | catalog_xmin | restart_lsn | > > > > > > > > > confirmed_flush_lsn | wal_status | safe_wal_size | two_phas > > > > > > > > > e | inactive_since | conflicting | > > > > > > > > > invalidation_reason | failover | synced > > > > > > > > > > > > > > > > > -----------+---------------+-----------+--------+----------+-----------+ > > > > > > > > --------+------------+------+--------------+-------------+--------------- > > > > > > > > ------+------------+---------------+--------- > > > > > > > > > > > > > > > > > --+----------------------------------+-------------+---------------------- > > > > > > > > --+----------+-------- > > > > > > > > > test1 | test_decoding | logical | 5 | postgres | f > > > > > > > > > | f | | | 745 | 0/4029060 | 0/4029098 > > > > > > > > > | lost | | f > > > > > > > > > | 2025-02-13 15:26:54.666725+05:30 | t | > > > > > > > > > wal_level_insufficient | f | f > > > > > > > > > test2 | test_decoding | logical | 5 | postgres | f > > > > > > > > > | f | | | 745 | 0/4029060 | 0/4029098 > > > > > > > > > | reserved | | f > > > > > > > > > | 2025-02-13 15:30:30.477836+05:30 | f | > > > > > > > > > | f | f > > > > > > > > > (2 rows) > > > > > > > > > > > > > > > > > > -- A subsequent attempt to decode changes from the invalidated slot > > > > > > > > > (test2) fails: > > > > > > > > > postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, > > > > > > > > > NULL); > > > > > > > > > WARNING: detected write past chunk end in TXN 0x5e77e6c6f300 > > > > > > > > > ERROR: logical decoding on standby requires "wal_level" >= "logical" > > > > > > > > > on the primary > > > > > > > > > > > > > > > > > > -- Alternatively, the following error may occur: > > > > > > > > > postgres=# SELECT data FROM pg_logical_slot_get_changes('test2', NULL, > > > > > > > > > NULL); > > > > > > > > > WARNING: detected write past chunk end in TXN 0x582d1b2d6ef0 > > > > > > > > > data > > > > > > > > > ------------ > > > > > > > > > BEGIN 744 > > > > > > > > > COMMIT 744 > > > > > > > > > (2 rows) > > > > > > > > > > > > > > > > > > This is an edge case that can occur under specific conditions > > > > > > > > > involving replication slot invalidation when there is a huge lag > > > > > > > > > between primary and standby. > > > > > > > > > There might be a similar concurrency case for wal_removed too. > > > > > > > > > > > > > > > > > > > > > > > > > Hi Vignesh, > > > > > > > > > > > > > > > > Thanks for reviewing the patch. > > > > > > > > > > > > > > Thanks for updating the patch. I have a question related to it. > > > > > > > > > > > > > > > > > > > > > > > I have tested the above scenario and was able to reproduce it. I have fixed it in > > > > > > > > the v2 patch. > > > > > > > > Currently we are taking a shared lock on ReplicationSlotControlLock. > > > > > > > > This issue can be resolved if we take an exclusive lock instead. > > > > > > > > Thoughts? > > > > > > > > > > > > > > It's not clear to me why increasing the lock level can solve it, could you > > > > > > > elaborate a bit more on this ? 
> > > > > > > > > > > > > In HEAD, InvalidateObsoleteReplicationSlots acquires a SHARED lock on > > > > > > 'ReplicationSlotControlLock' > > > > > > Also in function 'copy_replication_slot' we take a SHARED lock on > > > > > > 'ReplicationSlotControlLock' during fetching of source slot. > > > > > > > > > > > > So, for the case described by Vignesh in [1], first > > > > > > InvalidateObsoleteReplicationSlot is called and we hold a SHARED lock > > > > > > on 'ReplicationSlotControlLock'. We are now holding the function using > > > > > > the sleep > > > > > > if (cause == RS_INVAL_WAL_LEVEL) > > > > > > { > > > > > > while (bsleep) > > > > > > sleep(1); > > > > > > } > > > > > > > > > > > > Now we create a copy of the slot since 'copy_replication_slot' takes > > > > > > a SHARED lock on 'ReplicationSlotControlLock'. It will take the lock > > > > > > and fetch the info of the source slot (the slot is not invalidated > > > > > > till now). and the function 'copy_replication_slot' calls function > > > > > > 'create_logical_replication_slot' which takes a EXCLUSIVE lock on > > > > > > ReplicationSlotControlLock and hence it will wait for function > > > > > > InvalidateObsoleteReplicationSlot to release lock. Once the function > > > > > > 'InvalidateObsoleteReplicationSlot' releases the lock, the execution > > > > > > of 'create_logical_replication_slot' continues and creates a copy of > > > > > > the source slot. > > > > > > > > > > > > Now with the patch, 'copy_replication_slot' will take an EXCLUSIVE > > > > > > lock on 'ReplicationSlotControlLock'. to fetch the slot info. Hence, > > > > > > it will wait for the 'InvalidateObsoleteReplicationSlot' to release > > > > > > the lock and then fetch the source slot info and try to create the > > > > > > copied slot (which will fail as source slot is invalidated before we > > > > > > fetch its info) > > > > > > > > > > > > > Besides, do we need one more invalidated check in the following codes after > > > > > > > creating the slot ? > > > > > > > > > > > > > > /* > > > > > > > * Check if the source slot still exists and is valid. We regard it as > > > > > > > * invalid if the type of replication slot or name has been changed, > > > > > > > * or the restart_lsn either is invalid or has gone backward. (The > > > > > > > ... > > > > > > > > > > > > > > > > > > > This approach seems more feasible to me. It also resolves the issue > > > > > > suggested by Vignesh in [1]. I have made changes for the same in v3 > > > > > > patch. > > > > > > > > > > > > > > > > I agree to check if the source slot got invalidated during the copy. > > > > > But why do we need to search the slot by the slot name again as > > > > > follows? > > > > > > > > > > + /* Check if source slot was invalidated while copying of slot */ > > > > > + LWLockAcquire(ReplicationSlotControlLock, LW_SHARED); > > > > > + > > > > > + for (int i = 0; i < max_replication_slots; i++) > > > > > + { > > > > > + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i]; > > > > > + > > > > > + if (s->in_use && strcmp(NameStr(s->data.name), > > > > > NameStr(*src_name)) == 0) > > > > > + { > > > > > + /* Copy the slot contents while holding spinlock */ > > > > > + SpinLockAcquire(&s->mutex); > > > > > + first_slot_contents = *s; > > > > > + SpinLockRelease(&s->mutex); > > > > > + src = s; > > > > > + break; > > > > > + } > > > > > + } > > > > > + > > > > > + LWLockRelease(ReplicationSlotControlLock); > > > > > > > > > > I think 'src' already points to the source slot. 
> > > > > > > > > Hi Sawada san, > > > > > > > > Thanks for reviewing the patch. > > > > I have used the 'src' instead of iterating again. I have attached the > > > > updated v4 patch. > > > > > > Thank you for updating the patch! I have one comment: > > > > > > + /* Check if source slot was invalidated while copying of slot */ > > > + SpinLockAcquire(&src->mutex); > > > + first_slot_contents = *src; > > > + SpinLockRelease(&src->mutex); > > > > > > We don't need to copy the source slot contents again since we already > > > do as follows: > > > > > > /* Copy data of source slot again */ > > > SpinLockAcquire(&src->mutex); > > > second_slot_contents = *src; > > > SpinLockRelease(&src->mutex); > > > > > > I think we can use second_slot_contents for that check. > > > > > I agree. I have updated the v5 patch to use second_slot_contents > > > > > I've investigated the slot invalidation and copying slots behaviors. > > > We cannot copy a slot if it doesn't reserve WAL, but IIUC the slot's > > > restart_lsn is not reset by slot invalidation due to other than > > > RS_INVAL_WAL_REMOVED. Therefore, it's possible that we copy a slot > > > invalidated by for example RS_INVAL_IDLE_TIMEOUT, and the copied > > > slot's restart_lsn might have already been removed, which ultimately > > > causes an assertion failure in ocpy_replication_slot(): > > > > > > #ifdef USE_ASSERT_CHECKING > > > /* Check that the restart_lsn is available */ > > > { > > > XLogSegNo segno; > > > > > > XLByteToSeg(copy_restart_lsn, segno, wal_segment_size); > > > Assert(XLogGetLastRemovedSegno() < segno); > > > } > > > #endif > > > > > > I think this issue exists from v16 or later, I've not tested yet > > > though. If my understanding is right, this patch has to be > > > backpatched. > > > > > I have tested the above in HEAD, PG 17 and PG 16 and found that we can > > hit the above ASSERT condition in all three branches. With the > > following steps: > > 1. create a physical replication setup > > 2. In standby create a logical replication slot. > > 3. change wal_level of primary to 'replica' and restart primary. The > > slot is invalidated with 'wal_level_insufficient' > > 4. change wal_level of primary to 'logical' and restart primary. > > 5. In primary insert some records and run checkpoint. Also run a > > checkpoint on standby. So, some initial wal files are removed. > > 6. Now copy the logical replication slot created in step 2. Then we > > can hit the assert. > > > > I agree that backpatching the patch can resolve this as it prevents > > copying of invalidated slots. > > > > I have attached the following patches: > > v5-0001 : for HEAD > > v5_PG_17_PG_16-0001 : for PG17 and PG16 > > I've checked if this issue exists also on v15 or older, but IIUC it > doesn't exist, fortunately. Here is the summary: > > Scenario-1: the source gets invalidated before the copy function > fetches its contents for the first time. In this case, since the > source slot's restart_lsn is already an invalid LSN it raises an error > appropriately. In v15, we have only one slot invaldation reason, WAL > removal, therefore we always reset the slot's restart_lsn to > InvalidXlogRecPtr. > > Scenario-2: the source gets invalidated before the copied slot is > created (i.e., between first content copy and > create_logical/physical_replication_slot()). In this case, the copied > slot could have a valid restart_lsn value that however might point to > a WAL segment that might have already been removed. 
> However, since > copy_restart_lsn will be an invalid LSN (=0), it's caught by the > following if condition: > > if (copy_restart_lsn < src_restart_lsn || > src_islogical != copy_islogical || > strcmp(copy_name, NameStr(*src_name)) != 0) > ereport(ERROR, > (errmsg("could not copy replication slot \"%s\"", > NameStr(*src_name)), > errdetail("The source replication slot was > modified incompatibly during the copy operation."))); > > Scenario-3: the source gets invalidated after creating the copied slot > (i.e. after create_logical/physical_replication_slot()). In this case, > since the newly copied slot have the same restart_lsn as the source > slot, both slots are invalidated. > > If the above analysis is right, I think the patches are mostly ready. > I've made some changes to the patches: > > - removed src_isinvalidated variable as it's not necessarily necessary. > - updated the commit message. > > Please review them. If there are no further comments on these patches, > I'm going to push them. > I have verified the above scenarios and I confirm the behaviour described. I have a small doubt. In PG_17 and PG_16 we can invalidate physical slots only for the 'wal_removed' case [1]. And copying such a slot already gives the error 'cannot copy a replication slot that doesn't reserve WAL'. So, in PG17 and PG16, should we check for an invalidated source slot only if we are copying logical slots? For HEAD the changes look fine to me, as in HEAD we can invalidate physical slots for both 'wal_removed' and 'idle_timeout'. [1]: https://github.com/postgres/postgres/blob/7c906c5b46f8189a04e67bc550cba33dd3851b6e/src/backend/replication/slot.c#L1600 Thanks and Regards, Shlok Kyal
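For readers tracking the per-branch differences: the invalidation causes in PG16/PG17 are roughly as declared in slot.h (a sketch from memory, not an exact copy; HEAD additionally has the idle-timeout cause mentioned above):

    typedef enum ReplicationSlotInvalidationCause
    {
        RS_INVAL_NONE,
        RS_INVAL_WAL_REMOVED,   /* required WAL has been removed; the only
                                 * cause that applies to physical slots on
                                 * these branches */
        RS_INVAL_HORIZON,       /* required rows have been removed (logical
                                 * slots on a standby) */
        RS_INVAL_WAL_LEVEL,     /* wal_level insufficient on the primary
                                 * (logical slots on a standby) */
    } ReplicationSlotInvalidationCause;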
On Tue, Feb 25, 2025 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > I've checked if this issue exists also on v15 or older, but IIUC it > doesn't exist, fortunately. Here is the summary: > > Scenario-1: the source gets invalidated before the copy function > fetches its contents for the first time. In this case, since the > source slot's restart_lsn is already an invalid LSN it raises an error > appropriately. In v15, we have only one slot invaldation reason, WAL > removal, therefore we always reset the slot's restart_lsn to > InvalidXlogRecPtr. > > Scenario-2: the source gets invalidated before the copied slot is > created (i.e., between first content copy and > create_logical/physical_replication_slot()). In this case, the copied > slot could have a valid restart_lsn value that however might point to > a WAL segment that might have already been removed. However, since > copy_restart_lsn will be an invalid LSN (=0), it's caught by the > following if condition: > > if (copy_restart_lsn < src_restart_lsn || > src_islogical != copy_islogical || > strcmp(copy_name, NameStr(*src_name)) != 0) > ereport(ERROR, > (errmsg("could not copy replication slot \"%s\"", > NameStr(*src_name)), > errdetail("The source replication slot was > modified incompatibly during the copy operation."))); > > Scenario-3: the source gets invalidated after creating the copied slot > (i.e. after create_logical/physical_replication_slot()). In this case, > since the newly copied slot have the same restart_lsn as the source > slot, both slots are invalidated. > Which part of the code will cover Scenario-3? Shouldn't we give ERROR for Scenario-3 as well? -- With Regards, Amit Kapila.
On Tue, Feb 25, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Feb 25, 2025 at 1:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I've checked if this issue exists also on v15 or older, but IIUC it > > doesn't exist, fortunately. Here is the summary: > > > > Scenario-1: the source gets invalidated before the copy function > > fetches its contents for the first time. In this case, since the > > source slot's restart_lsn is already an invalid LSN it raises an error > > appropriately. In v15, we have only one slot invaldation reason, WAL > > removal, therefore we always reset the slot's restart_lsn to > > InvalidXlogRecPtr. > > > > Scenario-2: the source gets invalidated before the copied slot is > > created (i.e., between first content copy and > > create_logical/physical_replication_slot()). In this case, the copied > > slot could have a valid restart_lsn value that however might point to > > a WAL segment that might have already been removed. However, since > > copy_restart_lsn will be an invalid LSN (=0), it's caught by the > > following if condition: > > > > if (copy_restart_lsn < src_restart_lsn || > > src_islogical != copy_islogical || > > strcmp(copy_name, NameStr(*src_name)) != 0) > > ereport(ERROR, > > (errmsg("could not copy replication slot \"%s\"", > > NameStr(*src_name)), > > errdetail("The source replication slot was > > modified incompatibly during the copy operation."))); > > > > Scenario-3: the source gets invalidated after creating the copied slot > > (i.e. after create_logical/physical_replication_slot()). In this case, > > since the newly copied slot have the same restart_lsn as the source > > slot, both slots are invalidated. > > > > Which part of the code will cover Scenario-3? Shouldn't we give ERROR > for Scenario-3 as well? In scenario-3, the backend process executing pg_copy_logical/physical_replication_slot() already holds the new copied slot and its restart_lsn is the same or older than the source slot's restart_lsn. Therefore, if the source slot is invalidated at that timing, the copied slot is invalidated too, resulting in an error by the backend. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Mon, Feb 24, 2025 at 10:09 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Tue, 25 Feb 2025 at 01:03, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > I've checked if this issue exists also on v15 or older, but IIUC it > > doesn't exist, fortunately. Here is the summary: > > > > Scenario-1: the source gets invalidated before the copy function > > fetches its contents for the first time. In this case, since the > > source slot's restart_lsn is already an invalid LSN it raises an error > > appropriately. In v15, we have only one slot invalidation reason, WAL > > removal, therefore we always reset the slot's restart_lsn to > > InvalidXLogRecPtr. > > > > Scenario-2: the source gets invalidated before the copied slot is > > created (i.e., between first content copy and > > create_logical/physical_replication_slot()). In this case, the copied > > slot could have a valid restart_lsn value that however might point to > > a WAL segment that might have already been removed. However, since > > copy_restart_lsn will be an invalid LSN (=0), it's caught by the > > following if condition: > > > > if (copy_restart_lsn < src_restart_lsn || > > src_islogical != copy_islogical || > > strcmp(copy_name, NameStr(*src_name)) != 0) > > ereport(ERROR, > > (errmsg("could not copy replication slot \"%s\"", > > NameStr(*src_name)), > > errdetail("The source replication slot was > > modified incompatibly during the copy operation."))); > > > > Scenario-3: the source gets invalidated after creating the copied slot > > (i.e. after create_logical/physical_replication_slot()). In this case, > > since the newly copied slot have the same restart_lsn as the source > > slot, both slots are invalidated. > > > > If the above analysis is right, I think the patches are mostly ready. > > I've made some changes to the patches: > > > > - removed src_isinvalidated variable as it's not necessarily necessary. > > - updated the commit message. > > > > Please review them. If there are no further comments on these patches, > > I'm going to push them. > > > I have verified the above scenarios and I confirm the behaviour described. > > I have a small doubt. > In PG_17 and PG_16 we can invalidate physical slots only for > 'wal_removed' case [1]. And copying of such slot already give error > 'cannot copy a replication slot that doesn't reserve WAL'. So, in PG17 > and PG16 should we check for invalidated source slot only if we are > copying logical slots? Although your analysis is correct, I believe we should retain the validation check. Even though invalidated physical slots in PostgreSQL 16 and 17 always have an invalid restart_lsn, maintaining this check would be harmless. Furthermore, I prefer to maintain consistency in the codebase across back branches and the master branch rather than introducing variations. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Tue, Feb 25, 2025 at 11:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Tue, Feb 25, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > Scenario-3: the source gets invalidated after creating the copied slot > > > (i.e. after create_logical/physical_replication_slot()). In this case, > > > since the newly copied slot have the same restart_lsn as the source > > > slot, both slots are invalidated. > > > > > > > Which part of the code will cover Scenario-3? Shouldn't we give ERROR > > for Scenario-3 as well? > > In scenario-3, the backend process executing > pg_copy_logical/physical_replication_slot() already holds the new > copied slot and its restart_lsn is the same or older than the source > slot's restart_lsn. Therefore, if the source slot is invalidated at > that timing, the copied slot is invalidated too, resulting in an error > by the backend. > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with this operation. So, isn't it possible that the source slot exists at a later position in ReplicationSlotCtl->replication_slots and the loop traversing the slots is already past the point where the newly copied slot is created? If so, the newly created slot won't be marked as invalid whereas the source slot will be marked as invalid. I agree that even in such a case, at a later point, the newly created slot will also be marked as invalid. -- With Regards, Amit Kapila.
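A simplified C sketch of the array walk behind this race may help (the loop shape follows InvalidateObsoleteReplicationSlots(); details are omitted, so treat it as an illustration rather than the actual code):

    /* The invalidation pass walks the shared slot array once, by index. */
    for (int i = 0; i < max_replication_slots; i++)
    {
        ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

        if (!s->in_use)
            continue;

        /* ... check s and possibly mark it invalidated ... */
    }

    /*
     * If copy_replication_slot() installs the new slot in a free array
     * entry at index j while the pass is already at index i > j, the pass
     * never revisits j: the source slot can be marked invalid while the
     * freshly copied slot is not.
     */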
On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Feb 25, 2025 at 11:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Tue, Feb 25, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > Scenario-3: the source gets invalidated after creating the copied slot > > > > (i.e. after create_logical/physical_replication_slot()). In this case, > > > > since the newly copied slot have the same restart_lsn as the source > > > > slot, both slots are invalidated. > > > > > > > > > > Which part of the code will cover Scenario-3? Shouldn't we give ERROR > > > for Scenario-3 as well? > > > > In scenario-3, the backend process executing > > pg_copy_logical/physical_replication_slot() already holds the new > > copied slot and its restart_lsn is the same or older than the source > > slot's restart_lsn. Therefore, if the source slot is invalidated at > > that timing, the copied slot is invalidated too, resulting in an error > > by the backend. > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with > this operation. So, isn't it possible that the source slot exists at > the later position in ReplicationSlotCtl->replication_slots and the > loop traversing slots is already ahead from the point where the newly > copied slot is created? Good point. I think it's possible. > If so, the newly created slot won't be marked > as invalid whereas the source slot will be marked as invalid. I agree > that even in such a case, at a later point, the newly created slot > will also be marked as invalid. The wal_status of the newly created slot would immediately become 'lost' and the next checkpoint will invalidate it. Do we need to do something to deal with this case? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
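The immediate 'lost' status comes from the WAL-availability computation used by pg_replication_slots. Roughly, as a sketch of the idea in GetWALAvailability() (not the exact code; slot_restart_lsn stands in for the slot's restart_lsn):

    XLogSegNo   targetSeg;

    /* Segment that the slot's restart_lsn lives in. */
    XLByteToSeg(slot_restart_lsn, targetSeg, wal_segment_size);

    /* If that segment has already been removed, the slot's WAL is gone. */
    if (targetSeg <= XLogGetLastRemovedSegno())
        return WALAVAIL_REMOVED;    /* shown as wal_status = 'lost' */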
On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with > > this operation. So, isn't it possible that the source slot exists at > > the later position in ReplicationSlotCtl->replication_slots and the > > loop traversing slots is already ahead from the point where the newly > > copied slot is created? > > Good point. I think it's possible. > > > If so, the newly created slot won't be marked > > as invalid whereas the source slot will be marked as invalid. I agree > > that even in such a case, at a later point, the newly created slot > > will also be marked as invalid. > > The wal_status of the newly created slot would immediately become > 'lost' and the next checkpoint will invalidate it. Do we need to do > something to deal with this case? > + /* Check if source slot became invalidated during the copy operation */ + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) + ereport(ERROR, + errmsg("cannot copy replication slot \"%s\"", + NameStr(*src_name)), + errdetail("The source replication slot was invalidated during the copy operation.")); How about adding a similar test as above for MyReplicationSlot? That should suffice, because we have already acquired the new slot by this time and invalidation should signal this process before marking the new slot as invalid. -- With Regards, Amit Kapila.
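A sketch of what that additional test could look like (hypothetical code, assuming it sits near the existing source-slot check in copy_replication_slot(); the copied slot is still acquired at this point, so MyReplicationSlot points at it):

    ReplicationSlotInvalidationCause cause;

    /* Read the copied slot's invalidation state under its spinlock. */
    SpinLockAcquire(&MyReplicationSlot->mutex);
    cause = MyReplicationSlot->data.invalidated;
    SpinLockRelease(&MyReplicationSlot->mutex);

    /* Check if the copied slot became invalidated during the copy operation */
    if (cause != RS_INVAL_NONE)
        ereport(ERROR,
                errmsg("cannot copy replication slot \"%s\"",
                       NameStr(*src_name)),
                errdetail("The copied replication slot was invalidated during the copy operation."));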
On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with > > > this operation. So, isn't it possible that the source slot exists at > > > the later position in ReplicationSlotCtl->replication_slots and the > > > loop traversing slots is already ahead from the point where the newly > > > copied slot is created? > > > > Good point. I think it's possible. > > > > > If so, the newly created slot won't be marked > > > as invalid whereas the source slot will be marked as invalid. I agree > > > that even in such a case, at a later point, the newly created slot > > > will also be marked as invalid. > > > > The wal_status of the newly created slot would immediately become > > 'lost' and the next checkpoint will invalidate it. Do we need to do > > something to deal with this case? > > > > + /* Check if source slot became invalidated during the copy operation */ > + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) > + ereport(ERROR, > + errmsg("cannot copy replication slot \"%s\"", > + NameStr(*src_name)), > + errdetail("The source replication slot was invalidated during the > copy operation.")); > > How about adding a similar test as above for MyReplicationSlot? That > should suffice the need because we have already acquired the new slot > by this time and invalidation should signal this process before > marking the new slot as invalid. IIUC in the scenario you mentioned, the loop traversing slots already passed the position of newly created slot in ReplicationSlotCtl->replication_slots array, so MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with > > > > this operation. So, isn't it possible that the source slot exists at > > > > the later position in ReplicationSlotCtl->replication_slots and the > > > > loop traversing slots is already ahead from the point where the newly > > > > copied slot is created? > > > > > > Good point. I think it's possible. > > > > > > > If so, the newly created slot won't be marked > > > > as invalid whereas the source slot will be marked as invalid. I agree > > > > that even in such a case, at a later point, the newly created slot > > > > will also be marked as invalid. > > > > > > The wal_status of the newly created slot would immediately become > > > 'lost' and the next checkpoint will invalidate it. Do we need to do > > > something to deal with this case? > > > > > > > + /* Check if source slot became invalidated during the copy operation */ > > + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) > > + ereport(ERROR, > > + errmsg("cannot copy replication slot \"%s\"", > > + NameStr(*src_name)), > > + errdetail("The source replication slot was invalidated during the > > copy operation.")); > > > > How about adding a similar test as above for MyReplicationSlot? That > > should suffice the need because we have already acquired the new slot > > by this time and invalidation should signal this process before > > marking the new slot as invalid. > > IIUC in the scenario you mentioned, the loop traversing slots already > passed the position of newly created slot in > ReplicationSlotCtl->replication_slots array, so > MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no? > > Right, I don't have a simpler solution for this apart from somehow serializing this operation with InvalidateObsoleteReplicationSlots(). I don't think we need to go that far to handle the scenario being discussed. It is better to add a comment explaining this race and why it won't cause harm. -- With Regards, Amit Kapila.
On Thu, Feb 27, 2025 at 7:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with > > > > > this operation. So, isn't it possible that the source slot exists at > > > > > the later position in ReplicationSlotCtl->replication_slots and the > > > > > loop traversing slots is already ahead from the point where the newly > > > > > copied slot is created? > > > > > > > > Good point. I think it's possible. > > > > > > > > > If so, the newly created slot won't be marked > > > > > as invalid whereas the source slot will be marked as invalid. I agree > > > > > that even in such a case, at a later point, the newly created slot > > > > > will also be marked as invalid. > > > > > > > > The wal_status of the newly created slot would immediately become > > > > 'lost' and the next checkpoint will invalidate it. Do we need to do > > > > something to deal with this case? > > > > > > > > > > + /* Check if source slot became invalidated during the copy operation */ > > > + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) > > > + ereport(ERROR, > > > + errmsg("cannot copy replication slot \"%s\"", > > > + NameStr(*src_name)), > > > + errdetail("The source replication slot was invalidated during the > > > copy operation.")); > > > > > > How about adding a similar test as above for MyReplicationSlot? That > > > should suffice the need because we have already acquired the new slot > > > by this time and invalidation should signal this process before > > > marking the new slot as invalid. > > > > IIUC in the scenario you mentioned, the loop traversing slots already > > passed the position of newly created slot in > > ReplicationSlotCtl->replication_slots array, so > > MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no? > > > > Right, I don't have a simpler solution for this apart from making it > somehow serialize this operation with > InvalidateObsoleteReplicationSlots(). I don't think we need to go that > far to handle the scenario being discussed. Agreed. > It is better to add a > comment for this race and why it won't harm. +1 Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Fri, 28 Feb 2025 at 08:56, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with > > > > > this operation. So, isn't it possible that the source slot exists at > > > > > the later position in ReplicationSlotCtl->replication_slots and the > > > > > loop traversing slots is already ahead from the point where the newly > > > > > copied slot is created? > > > > > > > > Good point. I think it's possible. > > > > > > > > > If so, the newly created slot won't be marked > > > > > as invalid whereas the source slot will be marked as invalid. I agree > > > > > that even in such a case, at a later point, the newly created slot > > > > > will also be marked as invalid. > > > > > > > > The wal_status of the newly created slot would immediately become > > > > 'lost' and the next checkpoint will invalidate it. Do we need to do > > > > something to deal with this case? > > > > > > > > > > + /* Check if source slot became invalidated during the copy operation */ > > > + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) > > > + ereport(ERROR, > > > + errmsg("cannot copy replication slot \"%s\"", > > > + NameStr(*src_name)), > > > + errdetail("The source replication slot was invalidated during the > > > copy operation.")); > > > > > > How about adding a similar test as above for MyReplicationSlot? That > > > should suffice the need because we have already acquired the new slot > > > by this time and invalidation should signal this process before > > > marking the new slot as invalid. > > > > IIUC in the scenario you mentioned, the loop traversing slots already > > passed the position of newly created slot in > > ReplicationSlotCtl->replication_slots array, so > > MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no? > > > > Right, I don't have a simpler solution for this apart from making it > somehow serialize this operation with > InvalidateObsoleteReplicationSlots(). I don't think we need to go that > far to handle the scenario being discussed. It is better to add a > comment for this race and why it won't harm. I tried to reproduce the scenario described, using the following steps: 1. Set max_replication_slots = 2. 2. Create two logical replication slots, let's say slot1 and slot2. Drop the first slot (slot1). 3. Now run the pg_copy_logical_replication_slot function on slot2. Let's say the copied slot is 'slot_cp'. In copy_replication_slot, add a breakpoint just after the call to 'create_logical_replication_slot'. slot_cp will be created. In the array ReplicationSlotCtl->replication_slots, slot_cp will come before slot2. 4. Now invalidate 'slot2'. Function 'InvalidateObsoleteReplicationSlots' will be called. It will loop through ReplicationSlotCtl->replication_slots and will reach 'slot_cp' first. Since that slot is acquired by copy_replication_slot, it will wait for the slot to be released. Once the slot is released, 'slot_cp' and 'slot2' will both be invalidated. I have tried it with 'wal_removal' invalidation, so it is an issue in PG_13 to HEAD. I also see a kill signal sent to the backend running 'copy_replication_slot'.
Terminal output: postgres=# select pg_copy_logical_replication_slot('slot2', 'slot_cp'); FATAL: terminating connection due to administrator command server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. 5. Execute 'SELECT * from pg_replication_slots'. slot_cp is present in the list with wal_status 'lost'. I have added the comment to the existing patches for REL_16 through master, and created a new patch that adds only the comments for REL_13 through REL_15. Thanks and Regards, Shlok Kyal
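The FATAL above is expected. When the invalidation pass finds the slot still acquired by another process, it signals that process and waits for the slot to be released. A rough sketch of that part of InvalidatePossiblyObsoleteSlot() in slot.c, simplified from memory rather than quoted exactly:

    if (active_pid != 0)
    {
        /*
         * The slot is still acquired (here, by the backend running
         * copy_replication_slot()), so ask that process to exit and
         * then wait for the slot to be released before retrying the
         * invalidation.
         */
        if (last_signaled_pid != active_pid)
        {
            (void) kill(active_pid, SIGTERM);
            last_signaled_pid = active_pid;
        }
        /* ... wait on the slot's active_cv, then re-check ... */
    }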
On Mon, Mar 10, 2025 at 11:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > I tried to reproduce the scenario described using the following steps: > > Here are the steps: > > 1. set max_replication_slots = 2 > > 2. create two logical replication slot, lets say slot1 and slot2. drop > the first slot (slot1). > > 3. Now run pg_copy_logical_replication_slot function on slot2. Lets > say copied slot is 'slot_cp'. > In function copy_replication_slot add breakpoint just after function > 'create_logical_replication_slot'. > slot_cp will be created. In array > ReplicationSlotCtl->replication_slots, slot_cp will come before slot2. > > 4. Now invalidate the 'slot2'. > Function 'InvalidateObsoleteReplicationSlots' will be called. Now it > will loop through ReplicationSlotCtl->replication_slots and will get > 'slot_cp' first. Since the slot is acquired by copy_replication_slot, > It will wait for the slot to be released. Once slot is released, > 'slot_cp' and 'slot2' will be invalidated. > It is also possible that InvalidateObsoleteReplicationSlots() has already passed slot_cp's location, in which case it doesn't need to wait for the backend. I have changed the comments for the master branch patch. Do let me know what you think of the attached change. If you agree with this, then please prepare the patches for back-branches that reflect this change. -- With Regards, Amit Kapila.
On Sun, Mar 9, 2025 at 11:16 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Fri, 28 Feb 2025 at 08:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with
> > > > > > this operation. So, isn't it possible that the source slot exists at
> > > > > > a later position in ReplicationSlotCtl->replication_slots and the
> > > > > > loop traversing slots is already ahead of the point where the newly
> > > > > > copied slot was created?
> > > > >
> > > > > Good point. I think it's possible.
> > > > >
> > > > > > If so, the newly created slot won't be marked
> > > > > > as invalid whereas the source slot will be marked as invalid. I agree
> > > > > > that even in such a case, at a later point, the newly created slot
> > > > > > will also be marked as invalid.
> > > > >
> > > > > The wal_status of the newly created slot would immediately become
> > > > > 'lost' and the next checkpoint will invalidate it. Do we need to do
> > > > > something to deal with this case?
> > > >
> > > > + /* Check if source slot became invalidated during the copy operation */
> > > > + if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
> > > > +     ereport(ERROR,
> > > > +             errmsg("cannot copy replication slot \"%s\"",
> > > > +                    NameStr(*src_name)),
> > > > +             errdetail("The source replication slot was invalidated during the copy operation."));
> > > >
> > > > How about adding a similar test as above for MyReplicationSlot? That
> > > > should suffice the need because we have already acquired the new slot
> > > > by this time and invalidation should signal this process before
> > > > marking the new slot as invalid.
> > >
> > > IIUC in the scenario you mentioned, the loop traversing slots has
> > > already passed the position of the newly created slot in the
> > > ReplicationSlotCtl->replication_slots array, so
> > > MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no?
> > >
> >
> > Right, I don't have a simpler solution for this apart from somehow
> > serializing this operation with
> > InvalidateObsoleteReplicationSlots(). I don't think we need to go that
> > far to handle the scenario being discussed. It is better to add a
> > comment for this race and why it won't cause harm.
>
> I tried to reproduce the scenario described using the following steps:
>
> [reproduction steps 1-4 snipped; see the previous message]
>
> I have tried it with 'wal_removal' invalidation, so the issue exists
> from PG 13 to HEAD.
> I also see a kill signal sent to the backend running 'copy_replication_slot'.
>
> Terminal output:
> postgres=# select pg_copy_logical_replication_slot('slot2', 'slot_cp');
> FATAL:  terminating connection due to administrator command
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
>
> 5. Execute 'SELECT * from pg_replication_slots'. slot_cp is present in
> the list with wal_status 'lost'.
>
> I have added the comment to the existing patches for REL_16 to master,
> and created a new patch that adds only comments for REL_13 to REL_15.

I think we don't need a patch adding only comments to v15 or older. We
can either note in the commit message that v15 and older are fortunately
not affected, or add an explicit check that the copied slot's
restart_lsn is not invalid.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
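To make the second option concrete: on v15 and older, a
wal_removed-invalidated slot is recognizable by its cleared restart_lsn,
so the copy path could re-read the source's restart_lsn after the copy
and error out if WAL reservation was lost. A hypothetical sketch only
(variable names are illustrative and this is not from any posted patch):

    /*
     * Hypothetical sketch of the alternative check for v15 and older,
     * where a wal_removed-invalidated slot has a cleared restart_lsn.
     */
    SpinLockAcquire(&src->mutex);
    copy_restart_lsn = src->data.restart_lsn;
    SpinLockRelease(&src->mutex);

    if (XLogRecPtrIsInvalid(copy_restart_lsn))
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot copy a replication slot that doesn't reserve WAL")));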
On Mon, 17 Mar 2025 at 12:18, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> [reproduction steps snipped; see the previous messages]
>
> It is also possible that InvalidateObsoleteReplicationSlots() has
> already passed slot_cp's location, in which case it doesn't need to
> wait for the backend.
>
> I have changed the comments for the master branch patch. Do let me
> know what you think of the attached change. If you agree with it, then
> please prepare the patches for the back-branches that reflect this
> change.
>
This change looks good to me. I have prepared the patches with the
suggested changes for PG16, PG17, and master. I have also addressed
the comments from Sawada-san in [1] and added a line to the commit
message for PG15 and earlier versions.

[1]: https://www.postgresql.org/message-id/CAD21AoBtjj_xoH%2BpgHP_CKykj00fayjFqQ60W6az-Drj2%2BmOdQ%40mail.gmail.com

Thanks and Regards,
Shlok Kyal
Attachment
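For reference, the shape of the proposed check, reconstructed as a
sketch around the hunk quoted upthread. The spinlock re-read of the
source slot is abridged context, assuming a second_slot_contents
variable of type ReplicationSlot in copy_replication_slot:

    /* Re-read the source slot's contents under its spinlock. */
    SpinLockAcquire(&src->mutex);
    second_slot_contents = *src;
    SpinLockRelease(&src->mutex);

    /* Check if source slot became invalidated during the copy operation */
    if (second_slot_contents.data.invalidated != RS_INVAL_NONE)
        ereport(ERROR,
                errmsg("cannot copy replication slot \"%s\"",
                       NameStr(*src_name)),
                errdetail("The source replication slot was invalidated during the copy operation."));

Because the new slot has already been acquired by this backend at this
point, an invalidation that does reach the copied slot must signal this
process first, which is why re-checking the source here closes most of
the window discussed above.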
On Mon, 17 Mar 2025 at 22:57, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> [earlier discussion snipped; see the previous messages]
>
> I think we don't need a patch adding only comments to v15 or older. We
> can either note in the commit message that v15 and older are fortunately
> not affected, or add an explicit check that the copied slot's
> restart_lsn is not invalid.
>
I felt that a note in the commit message would be sufficient. I have
made that change in [1].

[1]: https://www.postgresql.org/message-id/CANhcyEV%2BwwhquX3J9J5gKVdhirGFhNxDAXJArz3udp94vNsvPA%40mail.gmail.com

Thanks and Regards,
Shlok Kyal
On Tue, Mar 18, 2025 at 2:28 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> [earlier discussion snipped; see the previous messages]
>
> I felt that a note in the commit message would be sufficient. I have
> made that change in [1].

Thank you for updating the patches. These patches look good to me. I'm
going to push these fixes to the master, v17, and v16 branches tomorrow
unless there are further comments.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Apr 2, 2025 at 2:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> [earlier discussion snipped; see the previous messages]
>
> Thank you for updating the patches. These patches look good to me. I'm
> going to push these fixes to the master, v17, and v16 branches tomorrow
> unless there are further comments.

Pushed (yesterday).

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com