Re: test_decoding assertion failure for the loss of top-sub transaction relationship - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: test_decoding assertion failure for the loss of top-sub transaction relationship
Date
Msg-id CAA4eK1+A37ocd+2G__hiPdeKxyR1odwcBT7sk1CaRAZBVCJdEg@mail.gmail.com
Whole thread Raw
In response to Re: test_decoding assertion failure for the loss of top-sub transaction relationship  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Fri, Sep 2, 2022 at 12:25 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Fri, 2 Sep 2022 11:27:23 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in
> > On Fri, Sep 2, 2022 at 11:25 AM kuroda.hayato@fujitsu.com
> > <kuroda.hayato@fujitsu.com> wrote:
> > > How about following?
> > >
> > > diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
> > > index bf72ad45ec..a630522907 100644
> > > --- a/src/backend/replication/logical/snapbuild.c
> > > +++ b/src/backend/replication/logical/snapbuild.c
> > > @@ -1086,8 +1086,17 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
> > >                 }
> > >         }
> > >
> > > -       /* if top-level modified catalog, it'll need a snapshot */
> > > -       if (SnapBuildXidHasCatalogChanges(builder, xid, xinfo))
> > > +       /*
> > > +        * if top-level or one of sub modified catalog, it'll need a snapshot.
> > > +        *
> > > +        * Normally the second check is not needed because the relation between
> > > +        * top-sub transactions is tracked on the ReorderBuffer layer, and the top
> > > +        * transaction is marked as containing catalog changes if its children are.
> > > +        * But in some cases the relation may be missed, in which case only the sub
> > > +        * transaction may be marked as containing catalog changes.
> > > +        */
> > > +       if (SnapBuildXidHasCatalogChanges(builder, xid, xinfo)
> > > +               || sub_needs_timetravel)
> > >         {
> > >                 elog(DEBUG2, "found top level transaction %u, with catalog changes",
> > >                          xid);
> > > @@ -1095,11 +1104,6 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
> > >                 needs_timetravel = true;
> > >                 SnapBuildAddCommittedTxn(builder, xid);
> > >         }
> > > -       else if (sub_needs_timetravel)
> > > -       {
> > > -               /* track toplevel txn as well, subxact alone isn't meaningful */
> > > -               SnapBuildAddCommittedTxn(builder, xid);
> > > -       }
> > >         else if (needs_timetravel)
> > >         {
> > >                 elog(DEBUG2, "forced transaction %u to do timetravel", xid);
> >
> > Yeah, I am fine with this as well.
>
> I'm basically fine, too. But this is a bug that needs back-patching
> back to 10.
>

I have not verified but I think we need to backpatch this till 14
because prior to that in DecodeCommit, we use to set the top-level txn
as having catalog changes based on the if there are invalidation
messages in the commit record. So, in the current scenario shared by
Osumi-San, before SnapBuildCommitTxn(), the top-level txn will be
marked as having catalog changes.

> This change changes the condition for the DEBUG2 message.
> So we need to add an awkward if() condition for the DEBUG2 message.
> Looking that the messages have different debug-level, I doubt there
> have been a chance they are useful.  If we remove the two DEBUGx
> messages, I'm fine with the change.
>

I think these DEBUG2 messages could be useful, so instead of removing
these, I suggest we should follow Dilip's proposed fix and maybe add a
new DEBUG2 message on the lines of (("forced transaction %u to do
timetravel due to one of its subtransaction", xid) in the else if
(sub_needs_timetravel) condition if we think that will be useful too
but I am fine leaving the addition of new DEBUG2 message.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Clarify restriction on partitioned tables primary key / unique indexes
Next
From: vignesh C
Date:
Subject: Re: Handle infinite recursion in logical replication setup