Thread: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

Hello hackers, I found that we currently don't track txns committed in
BUILDING_SNAPSHOT state because of the code in xact_decode():
/*
* If the snapshot isn't yet fully built, we cannot decode anything, so
* bail out.
*/
if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
return;

This can cause a txn to take an incorrect historic snapshot and result in an
interruption of logical replication. Consider the following scenario:
(pub)create table t1 (id int primary key);
(pub)insert into t1 values (1);
(pub)create publication pub for table t1;
(sub)create table t1 (id int primary key);
(pub)begin; insert into t1 values (2); (txn1 in session1)
(sub)create subscription sub connection 'hostaddr=127.0.0.1 port=5432 user=xxx dbname=postgres' publication pub; (pub will switch to BUILDING_SNAPSHOT state soon)
(pub)begin; insert into t1 values (3); (txn2 in session2)
(pub)create table t2 (id int primary key); (session3)
(pub)commit; (commit txn1, and pub will switch to FULL_SNAPSHOT state soon)
(pub)begin; insert into t2 values (1); (txn3 in session3)
(pub)commit; (commit txn2, and pub will switch to CONSISTENT state soon)
(pub)commit; (commit txn3, and replay txn3 will failed because its snapshot cannot see table t2)

The output of pub's log:
ERROR: could not map filenumber "base/5/16395" to relation OID

Is this a bug? Should we also track the txns committed in BUILDING_SNAPSHOT state?

--
Regards,
ChangAo Chen
On Sun, Jun 09, 2024 at 11:21:52PM +0800, cca5507 wrote:
> Hello hackers, I found that we currently don't track txns committed in
> BUILDING_SNAPSHOT state because of the code in xact_decode():
>     /*
>      * If the snapshot isn't yet fully built, we cannot decode anything, so
>      * bail out.
>      */
>     if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
>         return;
>
> The output of pub's log:
> ERROR:  could not map filenumber "base/5/16395" to relation OID
>
> Is this a bug? Should we also track the txns committed in BUILDING_SNAPSHOT state?

Clearly, this is not an error you should be able to see as a user.  So
yes, that's something that needs to be fixed.
--
Michael

Attachment
Thank you for reply!
I am trying to fix it. This patch (pass check-world) will track txns
committed in BUILDING_SNAPSHOT state and can fix this bug.

--
Regards,
ChangAo Chen
Attachment
Hi,

On Mon, Jun 10, 2024 at 10:04:31PM +0800, cca5507 wrote:
> Thank you for reply!I am trying to fix it. This patch (pass check-world) will track txns
> committed in BUILDING_SNAPSHOT state and can fix this bug.

Thanks for the report and the patch!

I did not look at the patch in detail but I can see that it does no contain a test
related to this issue.

Would you mind to add a test in say contrib/test_decoding? (see catalog_change_snapshot
for example).

FWIW, to ease the writing of the test, I think you don't need pub/sub to produce
the issue. I think you can reproduce with a single instance and multiple sessions
: replace in your repro the "(sub)create subscription" by a "logical slot creation"
and finish the test by "pg_logical_slot_get_changes" on the slot created above.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Hi,

Thanks for pointing it out!

Here are the new version patches with a test case.

--
Regards,
ChangAo Chen

------------------ Original ------------------
From: "Bertrand Drouvot" <bertranddrouvot.pg@gmail.com>;
Date: Wed, Aug 7, 2024 06:33 PM
To: "cca5507"<cca5507@qq.com>;
Cc: "Michael Paquier"<michael@paquier.xyz>;"pgsql-hackers"<pgsql-hackers@lists.postgresql.org>;
Subject: Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state

Hi,

On Mon, Jun 10, 2024 at 10:04:31PM +0800, cca5507 wrote:
> Thank you for reply!I am trying to fix it. This patch (pass check-world) will track txns
> committed in BUILDING_SNAPSHOT state and can fix this bug.

Thanks for the report and the patch!

I did not look at the patch in detail but I can see that it does no contain a test
related to this issue.

Would you mind to add a test in say contrib/test_decoding? (see catalog_change_snapshot
for example).

FWIW, to ease the writing of the test, I think you don't need pub/sub to produce
the issue. I think you can reproduce with a single instance and multiple sessions
: replace in your repro the "(sub)create subscription" by a "logical slot creation"
and finish the test by "pg_logical_slot_get_changes" on the slot created above.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment
Hi,

On Thu, Aug 08, 2024 at 03:53:29PM +0800, cca5507 wrote:
> Hi,
> 
> 
> Thanks for pointing it out!
> 
> 
> Here are the new version patches with a test case.

Thanks!

I think the approach that the patch implements makes sense and that we should
track the transactions that have been commmitted while building the snapshot.

A few random comments:

1 ===

+        * that the xlog in BUILDING_SNAPSHOT is only useful for build

s/for build/to build? (same comment for the commit message)

2 ===

+        * snapshot and will not be decoded.

worth to mention DecodeTXNNeedSkip() here?

3 ===

+        * point in decoding changes. Note that we only handle XLOG_HEAP2_NEW_CID
+        * which mark a transaction as catalog modifying in BUILDING_SNAPSHOT

do you mean? Note that during BUILDING_SNAPSHOT, we only handle XLOG_HEAP2_NEW_CID
as it marks a transaction as catalog modifying. If so, what about making the 

-       if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
+       if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||

more clear about that? (to avoid any work when it's not needed)

4 ===

+        * useful for build the snapshot.

s/for/to/?

5 ===

+        * point in decoding changes. Note that we only handle XLOG_HEAP_INPLACE
+        * which mark a transaction as catalog modifying in BUILDING_SNAPSHOT, it's

s/which mark/which might mark/?

do you mean? Note that during BUILDING_SNAPSHOT, we only handle XLOG_HEAP_INPLACE
as it might mark a transaction as catalog modifying. If so, what about making the 

-       if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
+       if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||

more clear about that? (to avoid any work when it's not needed)

Idea of 3 === and 5 === is to proceed further in the SNAPBUILD_BUILDING_SNAPSHOT
case only if we know that the transaction is a catalog changing one (or might
be one).

6 ===

+        * useful for build the snapshot.

s/for/to/?

7 ===

+# Test snapshot build correctly
+

what about? Test tracking of committed transactions during BUILDING_SNAPSHOT

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Hi,

Thanks for the comments!

Here are the new version patches, I think it will be more clear.

--
Regards,
ChangAo Chen
Attachment
Hi,

On Sat, Aug 10, 2024 at 06:07:30PM +0800, cca5507 wrote:
> Hi,
> 
> 
> Thanks for the comments!
> 
> 
> Here are the new version patches, I think it will be more clear.

Thanks!

1 ===

When applying I get:

Applying: Track transactions committed in BUILDING_SNAPSHOT.
.git/rebase-apply/patch:71: space before tab in indent.
                 */
.git/rebase-apply/patch:94: space before tab in indent.
                 */
warning: 2 lines add whitespace errors.

2 ===

+        * have snapshot and the transaction will not be tracked by snapshot

s/have snapshot/have a snapshot/?

3 ===

+        * snapshot and will not be decoded

s/snapshot/a snapshot/?


4 ===

        if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
                ctx->fast_forward)
+       {
+               /*
+                * Note that during or after BUILDING_SNAPSHOT, we need handle the xlog
+                * that might mark a transaction as catalog modifying because the snapshot
+                * only tracks catalog modifying transactions. The transaction before
+                * BUILDING_SNAPSHOT will not be tracked anyway(see SnapBuildCommitTxn()
+                * for details), so just return.
+                */
+               if (SnapBuildCurrentState(builder) >= SNAPBUILD_BUILDING_SNAPSHOT)
+               {
+                       /* Currently only XLOG_HEAP2_NEW_CID means a catalog modifying */
+                       if (info == XLOG_HEAP2_NEW_CID && TransactionIdIsValid(xid))

What about?

if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||
    (SnapBuildCurrentState(builder) == SNAPBUILD_BUILDING_SNAPSHOT && info != XLOG_HEAP2_NEW_CID) ||
    ctx->fast_forward)
    return;

That way we'd still rely on what's being done in the XLOG_HEAP2_NEW_CID case (
should it change in the future).

5 ===

        if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
                ctx->fast_forward)
+       {
+               /*
+                * Note that during or after BUILDING_SNAPSHOT, we need handle the xlog
+                * that might mark a transaction as catalog modifying because the snapshot
+                * only tracks catalog modifying transactions. The transaction before
+                * BUILDING_SNAPSHOT will not be tracked anyway(see SnapBuildCommitTxn()
+                * for details), so just return.
+                */
+               if (SnapBuildCurrentState(builder) >= SNAPBUILD_BUILDING_SNAPSHOT)
+               {

What about?

if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||
    (SnapBuildCurrentState(builder) == SNAPBUILD_BUILDING_SNAPSHOT && info != XLOG_HEAP_INPLACE) ||
    ctx->fast_forward)
    return;

That way we'd still rely on what's being done in the XLOG_HEAP_INPLACE case (
should it change in the future).

6 ===

v3-0002 LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Hi,

4, 5 ===

> if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||
>     (SnapBuildCurrentState(builder) == SNAPBUILD_BUILDING_SNAPSHOT && info != XLOG_HEAP_INPLACE) ||
>     ctx->fast_forward)
>     return;

I think during fast forward, we also need handle the xlog that marks a transaction
as catalog modifying, or the snapshot might lose some transactions?

> That way we'd still rely on what's being done in the XLOG_HEAP_INPLACE case

+ if (SnapBuildCurrentState(builder) >= SNAPBUILD_BUILDING_SNAPSHOT)
+ {
+ /* Currently only XLOG_HEAP_INPLACE means a catalog modifying */
+ if (info == XLOG_HEAP_INPLACE && TransactionIdIsValid(xid))
+ ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+ }

We only call ReorderBufferXidSetCatalogChanges() for the xlog that marks a transaction as catalog
modifying, and we don't care about the other steps being done in the xlog, so I think the current
approach is ok.

--
Regards,
ChangAo Chen
Hi,

On Mon, Aug 12, 2024 at 04:34:25PM +0800, cca5507 wrote:
> Hi,
> 
> 
> 4, 5 ===
> 
> 
> > if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||
> >     (SnapBuildCurrentState(builder) == SNAPBUILD_BUILDING_SNAPSHOT && info !=
XLOG_HEAP_INPLACE)||
 
> >     ctx->fast_forward)
> >     return;
> 
> 
> 
> I think during fast forward, we also need handle the xlog that marks a transaction
> as catalog modifying, or the snapshot might lose some transactions?

I think it's fine to skip during fast forward as we are not generating logical
changes. It's done that way in master, in your proposal and in my "if" proposals.
Note that my proposals related to the if conditions are for heap2_decode and
heap_decode (not xact_decode).

> > That way we'd still rely on what's being done in the XLOG_HEAP_INPLACE case
> 
> 
> +        if (SnapBuildCurrentState(builder) >= SNAPBUILD_BUILDING_SNAPSHOT)
> +        {
> +            /* Currently only XLOG_HEAP_INPLACE means a catalog modifying */
> +            if (info == XLOG_HEAP_INPLACE && TransactionIdIsValid(xid))
> +                ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
> +        }
> 
> 
> 
> We only call ReorderBufferXidSetCatalogChanges() for the xlog that marks a transaction as catalog
> modifying, and we don't care about the other steps being done in the xlog, so I think the current
> approach is ok.

Yeah, I think your proposal does not do anything wrong. I just prefer to put
everything in a single if condition (as per my proposal) so that we can jump
directly in the appropriate case. I think that way the code is easier to maintain
instead of having to deal with the same info values in distinct places.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Hi,

Thanks for the comments!

- I think it's fine to skip during fast forward as we are not generating logical
- changes. It's done that way in master, in your proposal and in my "if" proposals.
- Note that my proposals related to the if conditions are for heap2_decode and
- heap_decode (not xact_decode).

But note that in xact_decode(), case XLOG_XACT_INVALIDATIONS, we call
ReorderBufferXidSetCatalogChanges() even if we are fast-forwarding, it might
be better to be consistent.

In addition, we don't decode anything during fast forward, but the snapshot might
serialize to disk. If we skip calling ReorderBufferXidSetCatalogChanges(), the snapshot
may be wrong on disk.

--
Regards,
ChangAo Chen
Hi,

I refactor the code and fix the git apply warning according to [1].

Here are the new version patches.

--
Regards,
ChangAo Chen

[1] https://www.postgresql.org/message-id/Zrmh7X8jYCbFYXjH%40ip-10-97-1-34.eu-west-3.compute.internal
Attachment
Hi,

On Tue, Aug 13, 2024 at 12:23:04PM +0800, cca5507 wrote:
> Hi,
> 
> I refactor the code and fix the git apply warning according to [1].
> 
> 
> Here are the new version patches.

Thanks!

1 ===

+       /* True if the xlog marks the transaction as containing catalog changes */
+       bool    set_catalog_changes = (info == XLOG_HEAP2_NEW_CID);

        if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
                ctx->fast_forward)
+       {
+               /*
+                * If the transaction contains catalog changes, we need mark it in
+                * reorder buffer before return as the snapshot only tracks catalog
+                * modifying transactions. The transaction before BUILDING_SNAPSHOT
+                * won't be tracked anyway(see SnapBuildCommitTxn), so skip it.
+                */
+               if (set_catalog_changes && TransactionIdIsValid(xid) &&
+                       SnapBuildCurrentState(builder) >= SNAPBUILD_BUILDING_SNAPSHOT)
+                       ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+
                return;
+       }

I still prefer to replace the above with:

if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||
    (SnapBuildCurrentState(builder) == SNAPBUILD_BUILDING_SNAPSHOT && info != XLOG_HEAP2_NEW_CID) ||
    ctx->fast_forward)
    return;

Let's see what others think.

2 ===

+       /* True if the xlog marks the transaction as containing catalog changes */
+       bool    set_catalog_changes = (info == XLOG_HEAP_INPLACE);

        if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
                ctx->fast_forward)
+       {
+               /*
+                * If the transaction contains catalog changes, we need mark it in
+                * reorder buffer before return as the snapshot only tracks catalog
+                * modifying transactions. The transaction before BUILDING_SNAPSHOT
+                * won't be tracked anyway(see SnapBuildCommitTxn), so skip it.
+                */
+               if (set_catalog_changes && TransactionIdIsValid(xid) &&
+                       SnapBuildCurrentState(builder) >= SNAPBUILD_BUILDING_SNAPSHOT)
+                       ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+
                return;
+       }

I still prefer to replace the above with:

if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||
    (SnapBuildCurrentState(builder) == SNAPBUILD_BUILDING_SNAPSHOT && info != XLOG_HEAP_INPLACE) ||
    ctx->fast_forward)
    return;

Let's see what others think.

3 ===

I re-read your comments in [0] and it looks like you've concern about
the 2 "if" I'm proposing above and the fast forward handling. Is that the case
or is your fast forward concern unrelated to my proposals?



Not sure what happened but it looks like your reply in [0] is not part of the
initial thread [1], but created a new thread instead, making the whole
conversation difficult to follow.

[0]: https://www.postgresql.org/message-id/tencent_8DEC9842690A9B6AFD52D4659EF0700E9409%40qq.com
[1]: https://www.postgresql.org/message-id/flat/tencent_6AAF072A7623A11A85C0B5FD290232467808%40qq.com

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Hi,

- I re-read your comments in [0] and it looks like you've concern about
- the 2 "if" I'm proposing above and the fast forward handling. Is that the case
- or is your fast forward concern unrelated to my proposals?

In your proposals, we will just return when fast forward. But I think we need
handle XLOG_HEAP2_NEW_CID or XLOG_HEAP_INPLACE even if we are fast
forwarding as it decides whether the snapshot will track the transaction or not.

During fast forward, if there is a transaction that generates XLOG_HEAP2_NEW_CID
but no XLOG_XACT_INVALIDATIONS(I'm not sure), the snapshot won't track this
transaction in your proposals, I think it's wrong from a build snapshot perspective.

Although we don't decode anything during fast forward, the snapshot might be
serialized to disk when CONSISTENT, it would be better to keep the snapshot correct.

- Not sure what happened but it looks like your reply in [0] is not part of the
- initial thread [1], but created a new thread instead, making the whole
- conversation difficult to follow.

I'm not sure what happened but I attach the new thread to the CF:

https://commitfest.postgresql.org/49/5029

--
Regards,
ChangAo Chen
Hi,

On Tue, Aug 13, 2024 at 03:32:42PM +0800, cca5507 wrote:
> Hi,
> 
> - I re-read your comments in [0] and it looks like you've concern about
> - the 2 "if" I'm proposing above and the fast forward handling. Is that the case
> - or is your fast forward concern unrelated to my proposals?
> 
> 
> 
> In your proposals, we will just return when fast forward. But I think we need
> handle XLOG_HEAP2_NEW_CID or XLOG_HEAP_INPLACE even if we are fast
> forwarding as it decides whether the snapshot will track the transaction or not.
> 
> 
> During fast forward, if there is a transaction that generates XLOG_HEAP2_NEW_CID
> but no XLOG_XACT_INVALIDATIONS(I'm not sure), the snapshot won't track this
> transaction in your proposals, I think it's wrong from a build snapshot perspective.
> 
> 
> Although we don't decode anything during fast forward, the snapshot might be
> serialized to disk when CONSISTENT, it would be better to keep the snapshot correct.

IIUC your "fast forward" concern is not related to this particular thread but you
think it's already an issue on the master branch (outside of the BUILDING_SNAPSHOT
handling we are discussing here), is that correct? (that's also what your coding
changes makes me think of). If so, I'd suggest to open a dedicated thread for that
particular "fast forward" point and do the coding in the current thread as if the
fast forward is not an issue.

Does that make sense?

> 
> - Not sure what happened but it looks like your reply in [0] is not part of the
> - initial thread [1], but created a new thread instead, making the whole
> - conversation difficult to follow.
> 
> I'm not sure what happened but I attach the new thread to the CF:

Unfortunately your last reply did start a new email thread again.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Hi,

- IIUC your "fast forward" concern is not related to this particular thread but you
- think it's already an issue on the master branch (outside of the BUILDING_SNAPSHOT
- handling we are discussing here), is that correct? (that's also what your coding
- changes makes me think of). If so, I'd suggest to open a dedicated thread for that
- particular "fast forward" point and do the coding in the current thread as if the
- fast forward is not an issue.

- Does that make sense?

Yes.

But I think the v4-0001 in [1] is fine.

Let's see what others think.

--
Regards,
ChangAo Chen

[1]: https://www.postgresql.org/message-id/tencent_925A991463194F3C97830C3BB7D0A2C2BD07%40qq.com