Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CAA4eK1L6P8hM+17fNyWogSnueTJebvZUX7YseL54HSFpX_0m_A@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
List pgsql-hackers
On Fri, Jan 27, 2023 at 6:32 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
>>
>> I suppose the options are:
>> 1. use regular planner uniformly
>> 2. use regular planner only when there's no replica identity (or configurable?)
>> 3. only use low-level functions
>> 4. keep using sequential scans for every single updated row
>> 5. introduce a hidden logical row identifier in the heap that is guaranteed unique within a table and can be used as
areplica identity when no unique index exists 
>
>
> One other option I considered was to ask the index explicitly on the subscriber side from the user when REPLICA
IDENTITYis FULL. But, it is a pretty hard choice for any user, even a planner sometimes fails to pick the right index
:) Also, it is probably controversial to change any of the APIs for this purpose? 
>

I agree that it won't be a very convenient option for the user but how
about along with asking for an index from the user (when the user
didn't provide an index), we also allow to make use of any unique
index over a subset of the transmitted columns, and if there's more
than one candidate index pick any one. Additionally, we can allow
disabling the use of an index scan for this particular case. If we are
too worried about API change for allowing users to specify the index
then we can do that later or as a separate patch.

> I'd be happy to hear from more experienced hackers on the trade-offs for the above, and I'd be open to work on that
ifthere is a clear winner. For me (3) is a decent solution for the problem. 
>

From the discussion above it is not very clear that adding maintenance
costs in this area is worth it even though that can give better
results as far as this feature is concerned.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: dynamic result sets support in extended query protocol
Next
From: Filipp Krylov
Date:
Subject: Re: JSONPath Child Operator?