Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id 20230228183918.t5csokpirbh4evju@awork3.anarazel.de
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
List pgsql-hackers

Hi,

On 2023-02-25 16:00:05 +0530, Amit Kapila wrote:
> On Tue, Feb 21, 2023 at 7:55 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
> >> I think this overhead seems to be mostly due to the need to perform
> >> tuples_equal multiple times for duplicate values.

I think more work needs to be done to determine the source of the
overhead. It's not clear to me why there'd be an increase in tuples_equal()
calls in the tests upthread.


> Wouldn't a table-level option like 'apply_index_scan' be better than a
> subscription-level option with a default value as false? Anyway, the
> bigger point is that we don't see a better way to proceed here than to
> introduce some option to control this behavior.

I don't think this should default to false. The quadratic apply performance
that the sequential scans cause is a much bigger hazard for users than some
apply performance regression.
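
To make the "quadratic" point concrete, here is a toy model in Python. It is
only a sketch and not the apply worker's code: the function names and the
cost measure (tuple comparisons per applied change) are assumptions made up
for illustration. It compares finding each changed row by scanning the whole
table with finding it through an index, which is modelled here as a dict.

```python
def apply_with_seq_scan(table, changes):
    """Count tuple comparisons when every change is located by a full scan."""
    comparisons = 0
    for changed_row in changes:
        for row in table:
            comparisons += 1
            if row == changed_row:
                break  # found the matching tuple, stop scanning
    return comparisons


def apply_with_index(table, changes):
    """Count lookups when every change is located via an index (a dict here)."""
    index = {row: pos for pos, row in enumerate(table)}
    lookups = 0
    for changed_row in changes:
        lookups += 1
        _ = index[changed_row]  # one lookup instead of a table scan
    return lookups


n = 1000
table = list(range(n))    # hypothetical table with n distinct rows
changes = list(range(n))  # one change per row, e.g. an UPDATE touching all rows

seq_cost = apply_with_seq_scan(table, changes)
idx_cost = apply_with_index(table, changes)
print(seq_cost, idx_cost)
```

In this model the sequential-scan cost grows as n * (n + 1) / 2 while the
index cost grows as n, so the gap widens quickly as the table gets larger.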


> I see this as a way to provide this feature for users but I would
> prefer to proceed with this if we can get some more buy-in from senior
> community members (at least one more committer) and some user(s) if
> possible. So, I once again request others to chime in and share their
> opinion.

I'd prefer not having an option at all, with us instead figuring out the cause
of the performance regression and reducing it until it is small enough not to
care about. My second choice would be an option that defaults to using
indexes. I don't think an option defaulting to false makes sense.

I don't care whether it's a subscription-level or a relation-level option.

Greetings,

Andres Freund


