Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CAA4eK1+B+dcieiyafG2yXo0Y-a1Sj6BSSbf9paUR300jdowajg@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Thu, Mar 2, 2023 at 2:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Mar 2, 2023 at 1:37 PM shiy.fnst@fujitsu.com
> <shiy.fnst@fujitsu.com> wrote:
> > - results of `gprof`
> > case1:
> > master
> >   %   cumulative   self              self     total
> >  time   seconds   seconds    calls  ms/call  ms/call  name
> >   1.37      0.66     0.01   654312     0.00     0.00  LWLockAttemptLock
> >   0.00      0.73     0.00   573358     0.00     0.00  LockBuffer
> >   0.00      0.73     0.00    10014     0.00     0.06  heap_getnextslot
> >
> > patched
> >   %   cumulative   self              self     total
> >  time   seconds   seconds    calls  ms/call  ms/call  name
> >   9.70      1.27     0.36 50531459     0.00     0.00  LWLockAttemptLock
> >   3.23      2.42     0.12 100259200     0.00     0.00  LockBuffer
> >   6.20      1.50     0.23 50015101     0.00     0.00  heapam_index_fetch_tuple
> >   4.04      2.02     0.15 50015101     0.00     0.00  index_fetch_heap
> >   1.35      3.21     0.05    10119     0.00     0.00  index_getnext_slot
> >
>
> In the above profile number of calls to index_fetch_heap(),
> heapam_index_fetch_tuple() explains the reason for the regression you
> are seeing with the index scan. Because the update will generate dead
> tuples in the same transaction and those dead tuples won't be removed,
> we get those from the index and then need to perform
> index_fetch_heap() to find out whether the tuple is dead or not. Now,
> for sequence scan also we need to scan those dead tuples but there we
> don't need to do back-and-forth between index and heap. I think we can
> once check with more number of tuples (say with 20000, 50000, etc.)
> for case-1.
>

Andres, do you have any thoughts on this? We seem to have figured out
the cause of regression in the case Shi-San has reported and others
also agree with it. We can't think of doing anything better than what
the patch currently is doing, so thought of going with an option to
allow users to disable index scans. The current understanding is that
the patch will be a win in much more cases than the cases where one
can see regression but still having a knob could be useful in those
few cases.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Önder Kalacı
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: [Proposal] Add foreign-server health checks infrastructure