Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CAA4eK1KP-sV4aER51J-2mELjNzq_zVSLf1+W90Vu0feo-thVNA@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
List pgsql-hackers
On Thu, Mar 2, 2023 at 6:50 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
>
>>
>>
>> In the above profile number of calls to index_fetch_heap(),
>> heapam_index_fetch_tuple() explains the reason for the regression you
>> are seeing with the index scan. Because the update will generate dead
>> tuples in the same transaction and those dead tuples won't be removed,
>> we get those from the index and then need to perform
>> index_fetch_heap() to find out whether the tuple is dead or not. Now,
>> for sequence scan also we need to scan those dead tuples but there we
>> don't need to do back-and-forth between index and heap.
>
>
> Thanks for the insights, I think what you describe makes a lot of sense.
>
...
...
>
> I think we figured out the cause of the performance regression. I think it is not small
> enough for some scenarios like the above. But those scenarios seem like synthetic
> test cases, with not much user impacting implications. Still, I think you are better suited
> to comment on this.
>
> If you consider that this is a significant issue,  we could consider the second patch as well
> such that for this unlikely scenario users could disable index scans.
>

I think we can't completely ignore this regression because the key
point of this patch is to pick one of the non-unique indexes to
perform scan and now it will be difficult to predict how many
duplicates (and or dead rows) some index has without more planner
support. Personally, I feel it is better to have a table-level option
for this so that users have some knob to avoid regressions in
particular cases. In general, I agree that it will be a win in more
number of cases than it can regress.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Rework LogicalOutputPluginWriterUpdateProgress
Next
From: David Rowley
Date:
Subject: Re: Making empty Bitmapsets always be NULL