RE: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Conflict detection for update_deleted in logical replication
Date
Msg-id OS0PR01MB57164C9A65F29875AE63F0BD94132@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to Re: Conflict detection for update_deleted in logical replication  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Thursday, January 9, 2025 9:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Hi,

> 
> On Wed, Jan 8, 2025 at 3:00 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> wrote:
> >
> > On Wednesday, January 8, 2025 6:33 PM Masahiko Sawada
> <sawada.mshk@gmail.com> wrote:
> >
> > Hi,
> >
> > > On Wed, Jan 8, 2025 at 1:53 AM Amit Kapila <amit.kapila16@gmail.com>
> > > wrote:
> > > > On Wed, Jan 8, 2025 at 3:02 PM Masahiko Sawada
> > > <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Thu, Dec 19, 2024 at 11:11 PM Nisha Moond
> > > <nisha.moond412@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > [3] Test with pgbench run on both publisher and subscriber.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Test setup:
> > > > > >
> > > > > > - Tests performed on pgHead + v16 patches
> > > > > >
> > > > > > - Created a pub-sub replication system.
> > > > > >
> > > > > > - Parameters for both instances were:
> > > > > >
> > > > > >
> > > > > >
> > > > > >    shared_buffers = 30GB
> > > > > >
> > > > > >    min_wal_size = 10GB
> > > > > >
> > > > > >    max_wal_size = 20GB
> > > > > >
> > > > > >    autovacuum = false
> > > > >
> > > > > Since you disabled autovacuum on the subscriber, dead tuples
> > > > > created by non-HOT updates accumulate anyway regardless of the
> > > > > detect_update_deleted setting, is that right?
> > > > >
> > > >
> > > > I think hot-pruning mechanism during the update operation will
> > > > remove dead tuples even when autovacuum is disabled.
> > >
> > > True, but why was autovacuum disabled? It seems that
> > > case1-2_setup.sh doesn't specify fillfactor, which makes HOT updates less
> > > likely to happen.
> >
> > IIUC, we disable autovacuum as a general practice in read-write tests
> > for stable TPS numbers.
> 
> Okay. TBH I'm not sure what we can say with these results. At a glance, in a
> typical bi-directional-like setup, we can interpret these results as showing
> that if users turn retain_conflict_info on, TPS drops by 50%. But I'm not sure
> this 50% dip is the worst case users could face. It could be better in
> practice thanks to autovacuum, or it could go even worse due to further
> bloat if we run the test longer.

I think it shouldn't get worse, because ideally the amount of bloat would not
increase beyond what we see here due to this patch, unless some
misconfiguration leads to one of the nodes not working properly (say, it is
down). However, my colleague is running longer tests and we will share the
results soon.
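For the longer runs, dead-tuple retention on the subscriber could be observed
with the standard catalog views; something like the following (illustrative
queries, not part of the shared test scripts) would show whether bloat keeps
growing and how far the slot's xmin horizon lags:

```sql
-- Dead tuples retained per table on the subscriber:
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

-- The slot's xmin horizon, which bounds what VACUUM/HOT pruning may remove:
SELECT slot_name, xmin, catalog_xmin
FROM pg_replication_slots;
```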

> Suppose users face a 50% performance dip due to dead tuple retention for
> update_deleted detection; is there any way for them to improve the situation?
> For example, trying to advance slot.xmin more frequently might help to reduce
> dead tuple accumulation. I think it would be good if we could have a way to
> balance the publisher performance against the subscriber performance.

AFAICS, most of the time in each xid advancement is spent waiting for the
target remote_lsn to be applied and flushed, so increasing the frequency would
not help. Testcase 4 shared by Nisha[1] supports this: in that test we do not
request a remote_lsn but simply wait for the commit_ts of the incoming
transaction to exceed candidate_xid_time, and the regression is still the
same. I think this indicates that we indeed need to wait this long before
applying all the transactions that have earlier commit timestamps. IOW, the
performance impact on the subscriber side is reasonable behavior if we want to
detect the update_deleted conflict reliably.

[1] https://www.postgresql.org/message-id/CABdArM4OEwmh_31dQ8_F__VmHwk2ag_M%3DYDD4H%2ByYQBG%2BbHGzg%40mail.gmail.com
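As a rough illustration of the waiting behavior described above (all names here
are hypothetical, not the patch's actual code), the xid advancement can be
thought of as gated on applied remote commit timestamps, so polling more often
only re-checks while the same wait elapses:

```python
# Hypothetical sketch: slot.xmin may only advance once every remote
# transaction with an earlier commit timestamp has been applied, i.e. once the
# latest applied commit_ts passes the candidate xid's timestamp.

def can_advance_xmin(candidate_xid_time, applied_commit_ts):
    # The candidate xid becomes safe to use as the new xmin only after the
    # newest applied remote commit timestamp exceeds candidate_xid_time.
    return applied_commit_ts > candidate_xid_time

# Simulated stream of remote commit timestamps, applied in order:
applied_commit_timestamps = [100, 150, 190, 210]
candidate_xid_time = 200

# The wait ends only at the first commit_ts past the candidate's timestamp:
advanced_at = next(ts for ts in applied_commit_timestamps
                   if can_advance_xmin(candidate_xid_time, ts))
print(advanced_at)  # → 210
```

However frequently `can_advance_xmin` is polled, `advanced_at` stays the same,
which matches the observation that a higher advancement frequency does not
reduce the regression.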

Best Regards,
Hou zj
