Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CAA4eK1KDAt59R0dN8rF4JHrFhWz5Avz2u1DKtq-yewkoDJ4PVw@mail.gmail.com
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Andres Freund <andres@anarazel.de>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Mar 7, 2023 at 1:34 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-03-01 14:10:07 +0530, Amit Kapila wrote:
> > On Wed, Mar 1, 2023 at 12:09 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > > I see this as a way to provide this feature for users but I would
> > > > prefer to proceed with this if we can get some more buy-in from senior
> > > > community members (at least one more committer) and some user(s) if
> > > > possible. So, I once again request others to chime in and share their
> > > > opinion.
> > >
> > > I'd prefer not having an option, because we figure out the cause of the
> > > performance regression (reducing it to be small enough to not care). After
> > > that an option defaulting to using indexes.
> > >
> >
> > Sure, if we can reduce regression to be small enough then we don't
> > need to keep the default as false, otherwise, also, we can consider it
> > to keep an option defaulting to using indexes depending on the
> > investigation for regression. Anyway, the main concern was whether it
> > is okay to have an option for this which I think we have an agreement
> > on, now I will continue my review.
>
> I think even as-is it's reasonable to just use it. The sequential scan
> approach is O(N^2), which, uh, is not good. And having an index over thousands
> of non-differing values will generally perform badly, not just in this
> context.
>

Yes, it is true that, in general, an index scan over a lot of
duplicates may not perform well, but for regular scans we do costing to
guard against such cases and may prefer another index or a sequential
scan. We also have the "enable_indexscan" GUC that the user can use if
required. So, I feel it is better to have a knob to disallow the usage
of such indexes, with the default being to use an index, if available.
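
To make the two cost claims quoted above concrete, here is a rough toy
model in Python. It is only a sketch and not PostgreSQL code: the table
is a list of (indexed_col, payload) tuples, the "index" is a plain dict
on the first column, and every function name is made up for this
example. It counts how many full-row comparisons are needed to locate
each changed row, once by scanning from the start of the table and once
through the index.

```python
# Toy model, NOT PostgreSQL code. All names here are hypothetical.
# A "row" is a tuple (indexed_col, payload); a "change" is the full old row,
# as it would be sent with REPLICA IDENTITY FULL.

def make_table(n, distinct_keys):
    """Build n rows whose indexed column takes distinct_keys different values."""
    return [(i % distinct_keys, i) for i in range(n)]


def seq_scan_comparisons(table, changes):
    """Find each changed row by scanning the table from the start."""
    comparisons = 0
    for old_row in changes:
        for row in table:
            comparisons += 1
            if row == old_row:
                break
    return comparisons


def index_scan_comparisons(table, changes):
    """Find each changed row via an index on the first column.

    The index only narrows the search to rows with the same indexed value;
    each candidate still needs a full-row comparison.
    """
    index = {}
    for pos, row in enumerate(table):
        index.setdefault(row[0], []).append(pos)

    comparisons = 0
    for old_row in changes:
        for pos in index.get(old_row[0], []):
            comparisons += 1
            if table[pos] == old_row:
                break
    return comparisons
```

In this model, changing every row of a 1000-row table costs 500500
comparisons with the sequential scan and 1000 with an index whose
values are all distinct. With an index whose values are all identical
(make_table(1000, 1)), the index path also costs 500500, which is the
"non-differing values" case mentioned above.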

--
With Regards,
Amit Kapila.


