
From Amit Kapila
Subject Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date
Msg-id CAA4eK1LefopSMb6XQ7b6hw+GmZw-wDAOWhq+YXM9iC4Y6Mdtjw@mail.gmail.com
In response to Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Tue, Jul 26, 2022 at 7:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jul 25, 2022 at 7:57 PM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
> >
> > Hi,
> >
> > I did some performance tests for the master-branch patch (based on the v6
> > patch) to see whether the bsearch() added by this patch causes any overhead.
>
> Thank you for doing performance tests!
>
> >
> > I ran each case three times and took the average.
> >
> > The results are as follows; a bar chart is attached.
> >
> > case 1
> > ---------
> > No catalog modifying transaction.
> > Decode 800k pgbench transactions. (8 clients, 100k transactions per client)
> >
> > master      7.5417
> > patched     7.4107
> >
> > case 2
> > ---------
> > There's one catalog modifying transaction.
> > Decode 100k/500k/1M transactions.
> >
> >             100k        500k        1M
> > master      0.0576      0.1491      0.4346
> > patched     0.0586      0.1500      0.4344
> >
> > case 3
> > ---------
> > There are 64 catalog modifying transactions.
> > Decode 100k/500k/1M transactions.
> >
> >             100k        500k        1M
> > master      0.0600      0.1666      0.4876
> > patched     0.0620      0.1653      0.4795
> >
> > (Because the result of case 3 shows an overhead of about 3% when decoding
> > 100k transactions with 64 catalog modifying transactions, I also tested a
> > subsequent run of 100k xacts, with and without catalog modifying
> > transactions, to see whether it affects the subsequent decoding.)
> >
> > case 4.1
> > ---------
> > After the test steps in case 3 (64 catalog modifying transactions, decode 100k
> > transactions), run 100k xacts and then decode.
> >
> > master      0.3699
> > patched     0.3701
> >
> > case 4.2
> > ---------
> > After the test steps in case 3 (64 catalog modifying transactions, decode 100k
> > transactions), run 64 DDLs (without a checkpoint) and 100k xacts, then decode.
> >
> > master      0.3687
> > patched     0.3696
> >
> > Summary of the tests:
> > After applying this patch, there is an overhead of about 3% in the case of
> > decoding 100k transactions with 64 catalog modifying transactions. This is
> > an extreme case, so maybe it's okay.
>
> Yes. If we're worried about the overhead and bsearch() is the cause, we
> could probably try simplehash instead of the array.
>

I am not sure we need to go that far for such an extreme corner case. Let's
first try your idea below.
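
(That said, if we ever do want to go the simplehash route, I imagine it would
look roughly like the sketch below. This is only a rough, untested sketch; the
type and function prefix names are mine, not from any patch.)

/* Illustrative only: a simplehash keyed by the catalog-modifying xids. */
typedef struct CatChangeXidEntry
{
    TransactionId xid;      /* hash key */
    char          status;   /* entry status, required by simplehash */
} CatChangeXidEntry;

#define SH_PREFIX       catchangexid
#define SH_ELEMENT_TYPE CatChangeXidEntry
#define SH_KEY_TYPE     TransactionId
#define SH_KEY          xid
#define SH_HASH_KEY(tb, key)    murmurhash32((uint32) (key))
#define SH_EQUAL(tb, a, b)      ((a) == (b))
#define SH_SCOPE        static inline
#define SH_DECLARE
#define SH_DEFINE
#include "lib/simplehash.h"     /* murmurhash32 comes from common/hashfn.h */

/* Build the table once, when the catalog-modifying xids are known. */
catchangexid_hash *ht = catchangexid_create(builder->context, xcnt, NULL);
for (int i = 0; i < xcnt; i++)
{
    bool    found;

    catchangexid_insert(ht, xids[i], &found);
}

/* The per-commit check then becomes an O(1) lookup instead of a bsearch(). */
if (catchangexid_lookup(ht, xid) != NULL)
    return true;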

> One improvement idea is to pass parsed->xinfo down to
> SnapBuildXidHasCatalogChanges() and return from that function before
> doing the bsearch() if parsed->xinfo doesn't have
> XACT_XINFO_HAS_INVALS. That would save calling bsearch() for
> non-catalog-modifying transactions. Is it worth trying?
>

I think this is worth trying, and it might also reduce some of the overhead
in the case presented by Shi-San.
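
To be a bit more concrete, I am imagining something along the lines of the
sketch below. It is rough and untested, and I am guessing at the names the
patch uses for the xids restored from the serialized snapshot
(builder->catchange.xip / xcnt), so please read it as an illustration only:

/* Sketch only: assumes the patch's SnapBuildXidHasCatalogChanges() keeps the
 * restored catalog-modifying xids in a sorted array in the builder. */
bool
SnapBuildXidHasCatalogChanges(SnapBuild *builder, TransactionId xid,
                              uint32 xinfo)
{
    /* Fast path: the reorder buffer already knows this xid changed catalogs. */
    if (ReorderBufferXidHasCatalogChanges(builder->reorder, xid))
        return true;

    /*
     * A transaction that modified catalogs must have queued invalidation
     * messages, so if the commit record's xinfo lacks XACT_XINFO_HAS_INVALS
     * we can skip the bsearch() entirely.
     */
    if (!(xinfo & XACT_XINFO_HAS_INVALS))
        return false;

    /* Nothing restored from the serialized snapshot to search. */
    if (builder->catchange.xcnt == 0)
        return false;

    return bsearch(&xid, builder->catchange.xip, builder->catchange.xcnt,
                   sizeof(TransactionId), xidComparator) != NULL;
}

That way the common case of plain DML commits never pays for the array lookup.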

-- 
With Regards,
Amit Kapila.


