RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date
Msg-id OSBPR01MB48886B287F714BF3B0F219A1EDB79@OSBPR01MB4888.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Monday, October 11, 2021 3:28 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Fri, Oct 8, 2021 at 4:50 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
> >
> > At Thu, 7 Oct 2021 13:20:14 +0900, Masahiko Sawada
> > <sawada.mshk@gmail.com> wrote in
> > > Another idea to fix this problem would be that before calling
> > > SnapBuildCommitTxn() we create transaction entries in ReorderBuffer
> > > for (sub)transactions whose COMMIT record has
> XACT_XINFO_HAS_INVALS,
> > > and then mark all of them as catalog-changed by calling
> > > ReorderBufferXidSetCatalogChanges(). I've attached a PoC patch for
> > > this idea. What the patch does is essentially the same as what the
> > > proposed patch does. But the patch doesn't modify the
> > > SnapBuildCommitTxn(). And we remember the list of last running
> > > transactions in reorder buffer and the list is periodically purged
> > > during decoding RUNNING_XACTS records, eventually making it empty.
> >
> > I came up with the third way.  SnapBuildCommitTxn already properly
> > handles the case where a ReorderBufferTXN with
> > RBTXN_HAS_CATALOG_CHANGES.  So this issue can be resolved by
> create
> > such ReorderBufferTXNs in SnapBuildProcessRunningXacts.
> 
> Thank you for the idea and patch!
> 
> It's much simpler than mine. I think that creating an entry of a catalog-changed
> transaction in the reorder buffer before
> SunapBuildCommitTxn() is the right direction.
> 
> After more thought, given DDLs are not likely to happen than DML in practice,
> probably we can always mark both the top transaction and its subtransactions
> as containing catalog changes if the commit record has
> XACT_XINFO_HAS_INVALS? I believe this is not likely to lead to overhead in
> practice. That way, the patch could be more simple and doesn't need the
> change of AssertTXNLsnOrder().
> 
> I've attached another PoC patch. Also, I've added the tests for this issue in
> test_decoding.
I also felt that your patch addresses the problem in a good way.
Even without setting xid by NEW_CID decoding like in the original scenario,
we can set catalog change flag.

One really minor comment I have is,
in DecodeCommit(), you don't need to declar i. It's defined at the top of the function.

+               for (int i = 0; i < parsed->nsubxacts; i++)


Best Regards,
    Takamichi Osumi


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [RFC] building postgres with meson
Next
From: Amul Sul
Date:
Subject: Re: when the startup process doesn't (logging startup delays)