Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CAD21AoDLN52WCFG+qEx=3VUrpnP6sViZdBvxT0MyOM6gGrAa6w@mail.gmail.com
Whole thread Raw
In response to RE: Conflict detection for update_deleted in logical replication  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Thu, Feb 6, 2025 at 9:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Feb 7, 2025 at 2:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I'd like to confirm what users would expect of this
> > max_conflict_retention_duration option and it works as expected. IIUC
> > users would want to use this option when they want to balance between
> > the reliable update_deleted conflict detection and the performance. I
> > think they want to detect updated_deleted reliably as much as possible
> > but, at the same time, would like to avoid a huge performance dip
> > caused by it. IOW, once the apply lag becomes larger than the limit,
> > they would expect to prioritize the performance (recovery) over the
> > reliable update_deleted conflict detection.
> >
>
> Yes, this understanding is correct.
>
> > With the subscription-level max_conflict_retention_duration, users can
> > set it to '5min' to a subscription, SUB1, while not setting it to
> > another subscription, SUB2, (assuming here that both subscriptions set
> > retain_conflict_info = true). This setting works fine if SUB2 could
> > easily catch up while SUB1 is delaying, because in this case, SUB1
> > would stop updating its xmin when delaying for 5 min or longer so the
> > slot's xmin can advance based only on SUB2's xmin. Which is good
> > because it ultimately allow vacuum to remove dead tuples and
> > contributes to better performance. On the other hand, in cases where
> > SUB2 is as delayed as or more than SUB1, even if SUB1 stopped updating
> > its xmin, the slot's xmin would not be able to advance. IIUC
> > pg_conflict_detection slot won't be invalidated as long as there is at
> > least one subscription that sets retain_conflict_info = true and
> > doesn't set max_conflict_retention_duration, even if other
> > subscriptions set max_conflict_retention_duration.
> >
>
> Right.
>
> > I'm not really sure that these behaviors are the expected behavior of
> > users who set max_conflict_retention_duration to some subscriptions.
> > Or I might have set the wrong expectation or assumption on this
> > parameter. I'm fine with a subscription-level
> > max_conflict_retention_duration if it's clear this option works as
> > expected by users who want to use it.
> >
>
> It seems you are not convinced to provide this parameter at the
> subscription level and anyway providing it as GUC will simplify the
> implementation and it would probably be easier for users to configure
> rather than giving it at the subscription level for all subscriptions
> that have set retain_conflict_info set to true. I guess in the future
> we can provide it at the subscription level as well if there is a
> clear use case for it. Does that make sense to you?

Yes, that makes sense to me.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Fix for Extra Parenthesis in pgbench progress message
Next
From: James Hunter
Date:
Subject: Re: should we have a fast-path planning for OLTP starjoins?