Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Conflict detection for update_deleted in logical replication |
Date | |
Msg-id | CAD21AoCbjVTjejQxBkyo9kop2HMw85wSJqpB=JapsSE+Kw_iRg@mail.gmail.com |
In response to | Re: Conflict detection for update_deleted in logical replication (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Tue, Feb 4, 2025 at 10:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Feb 5, 2025 at 6:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Jan 31, 2025 at 9:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > >
> > > > I was not sure of the point of
> > > > making the max_conflict_retention_duration a per-subscription
> > > > parameter.
> > > >
> > >
> > > The idea is to keep it at the same level as the other related
> > > parameter 'retain_conflict_info'. It could be useful for cases where
> > > publishers are from two different nodes (NP1 and NP2) and we have
> > > separate subscriptions for both nodes. Now, it is possible that users
> > > won't expect conflicts on the tables from one of the nodes NP1 then
> > > she could choose to enable 'retain_conflict_info' and
> > > 'max_conflict_retention_duration' only for the subscription pointing
> > > to publisher NP2.
> > >
> > > Now, say the publisher node that can generate conflicts (NP2) has
> > > fewer writes and the corresponding apply worker could easily catch up
> > > and almost always be in sync with the publisher. In contrast, the
> > > other node that has no conflicts has a large number of writes. In such
> > > cases, giving new options at the subscription level will be helpful.
> > >
> > > If we want to provide it at the global level, then the performance or
> > > dead tuple control may not be any better than the current patch but
> > > won't allow the provision for the above kinds of cases. Second, adding
> > > two new GUCs is another thing I want to prevent. But OTOH, the
> > > implementation could be slightly simpler if we provide these options
> > > as GUC though I am not completely sure of that point. Having said
> > > that, I am open to changing it to a non-subscription level. Do you
> > > think it would be better to provide one or both of these parameters as
> > > GUCs or do you have something else in mind?
> >
> > It makes sense to me to have the retain_conflict_info as a
> > subscription-level parameter. I was thinking of making only
> > max_conflict_retention_duration a global parameter, but I might be
> > missing something. With a subscription-level
> > max_conflict_retention_duration, how can users choose the setting
> > values for each subscription, and is there a case that can be covered
> > only by a subscription-level max_conflict_retention_duration?
> >
>
> Users can configure depending on the workload of the publisher
> considering the publishers are different nodes as explained in my
> previous response. Also, I think it will help in resolutions where the
> worker for which the duration for updating the worker_level xmin has
> not exceeded the max_conflict_retention_duration can reliably detect
> update_deleted. Then this parameter will only be required for
> subscriptions that have enabled retain_conflict_info. I am not
> completely sure if these are reasons enough to keep at the
> subscription level but OTOH Dilip also seems to favor keeping
> max_conflict_retention_duration at subscription-level.

I'd like to confirm what users would expect of this
max_conflict_retention_duration option and whether it works as
expected. IIUC users would want to use this option when they want to
balance reliable update_deleted conflict detection against
performance. I think they want to detect update_deleted as reliably
as possible but, at the same time, would like to avoid a huge
performance dip caused by it. IOW, once the apply lag becomes larger
than the limit, they would expect to prioritize performance
(recovery) over reliable update_deleted conflict detection.

With the subscription-level max_conflict_retention_duration, users
can set it to '5min' for one subscription, SUB1, while not setting it
for another subscription, SUB2 (assuming here that both subscriptions
set retain_conflict_info = true).
This setting works fine if SUB2 can easily catch up while SUB1 is
delayed, because in this case SUB1 would stop updating its xmin once
it has been delayed for 5 min or longer, so the slot's xmin can
advance based only on SUB2's xmin. That is good because it ultimately
allows vacuum to remove dead tuples and contributes to better
performance. On the other hand, in cases where SUB2 is delayed as
much as or more than SUB1, even if SUB1 stopped updating its xmin,
the slot's xmin would not be able to advance. IIUC the
pg_conflict_detection slot won't be invalidated as long as there is
at least one subscription that sets retain_conflict_info = true and
doesn't set max_conflict_retention_duration, even if other
subscriptions set max_conflict_retention_duration.

I'm not really sure that these behaviors are what users who set
max_conflict_retention_duration on some subscriptions would expect.
Or I might have the wrong expectation or assumption about this
parameter. I'm fine with a subscription-level
max_conflict_retention_duration if it's clear that this option works
as expected by the users who want to use it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
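
For illustration only, the two SUB1/SUB2 scenarios described above can be modelled with a small Python sketch. This is not the patch's implementation: the `Sub` class, its fields, and `slot_xmin()` are hypothetical names, and the model simply assumes that the slot's xmin is the minimum xmin over the subscriptions that are still retaining conflict information, and that a subscription stops retaining once its delay reaches its max_conflict_retention_duration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sub:
    """Hypothetical model of a subscription with retain_conflict_info = true."""
    name: str
    xmin: int                                  # oldest xid the apply worker still needs
    delay_sec: float                           # how long the worker has been delayed
    max_retention_sec: Optional[float] = None  # None: no max_conflict_retention_duration

    def retaining(self) -> bool:
        # Assumption: with no limit the subscription always holds its xmin;
        # with a limit it stops once the delay reaches the limit.
        if self.max_retention_sec is None:
            return True
        return self.delay_sec < self.max_retention_sec


def slot_xmin(subs: List[Sub]) -> Optional[int]:
    """Minimum xmin over subscriptions still retaining (None if there are none)."""
    xmins = [s.xmin for s in subs if s.retaining()]
    return min(xmins) if xmins else None


# SUB1 has a 5 min limit and has been delayed for 10 min in both cases.
sub1 = Sub("SUB1", xmin=100, delay_sec=600.0, max_retention_sec=300.0)

# Case A: SUB2 (no limit) has caught up, so the slot's xmin follows SUB2.
case_a = slot_xmin([sub1, Sub("SUB2", xmin=900, delay_sec=0.0)])

# Case B: SUB2 (no limit) is delayed even more than SUB1, so the slot's
# xmin stays held back although SUB1 has stopped retaining.
case_b = slot_xmin([sub1, Sub("SUB2", xmin=90, delay_sec=900.0)])

print(case_a, case_b)  # 900 90
```

In case B the limit set on SUB1 has no visible effect on dead tuple removal, which is the behavior questioned above.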