Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Conflict detection for update_deleted in logical replication |
Date | |
Msg-id | CAD21AoDuGu4UwpRDtrmGkvjYt+U3RBseU8uGTx6CYtCN8MMNuA@mail.gmail.com Whole thread Raw |
In response to | Re: Conflict detection for update_deleted in logical replication (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Conflict detection for update_deleted in logical replication
|
List | pgsql-hackers |
On Fri, Jan 31, 2025 at 9:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sat, Feb 1, 2025 at 2:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Jan 30, 2025 at 10:39 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Jan 31, 2025 at 4:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > I have one question about the 0004 patch; it implemented > > > > max_conflict_retntion_duration as a subscription parameter. But the > > > > launcher invalidates the pg_conflict_detection slot only if all > > > > subscriptions with retain_conflict_info stopped retaining dead tuples > > > > due to the max_conflict_retention_duration parameter. Therefore, even > > > > if users set the parameter to a low value to avoid table bloats, it > > > > would not make sense if other subscriptions set it to a larger value. > > > > Is my understanding correct? > > > > > > > > > > Yes, your understanding is correct. I think this could be helpful > > > during resolution because the worker for which the duration has > > > exceeded cannot detect conflicts reliably but others can. So, this > > > info can be useful while performing resolutions. Do you have an > > > opinion/suggestion on this matter? > > > > I imagined a scenario like where two apply workers are running and > > have different max_conflict_retention_duration values (say '5 min' and > > '15 min'). Suppose both workers are roughly the same behind the > > publisher(s), when both workers cannot advance the workers' xmin > > values for 5 min or longer, one worker stops retaining dead tuples. > > However, the pg_conflict_detection slot is not invalidated yet since > > another worker is still using it, so both workers would continue to be > > getting slower. The subscriber would end up retaining dead tuples > > until both workers are behind for 15 min or longer, before > > invalidating the slot. In this case, stopping dead tuple retention on > > the first worker would help neither advance the slot's xmin nor > > improve another worker's performance. > > Won't the same be true for 'retain_conflict_info' option as well? I > mean even if one worker is retaining dead tuples, the performance of > others will also be impacted. I guess the situation might be a bit different. It's a user's choice to disable retain_conflict_info, and it should be done manually. That is, in this case, I think users will be able to figure out that both apply workers are the same behind the publishers and they need to disable retain_conflict_info on both subscriptions in order to remove accumulated dead tuples (which is the cause of performance dip). On the other hand, ISTM max_conflict_retentation_duration is something like a switch to recover the system performance by automatically disabling retain_conflict_info (and it will automatically go back to be enabled again). I guess users who use the max_conflict_retention_duration would expect that the system performance will tend to recover by automatically disabling reatin_conflict_info if the apply worker is lagging for longer than the specified value. However, there are cases where this cannot be expected. > > > > > I was not sure of the point of > > making the max_conflict_retention_duration a per-subscription > > parameter. > > > > The idea is to keep it at the same level as the other related > parameter 'retain_conflict_info'. It could be useful for cases where > publishers are from two different nodes (NP1 and NP2) and we have > separate subscriptions for both nodes. Now, it is possible that users > won't expect conflicts on the tables from one of the nodes NP1 then > she could choose to enable 'retain_conflict_info' and > 'max_conflict_retention_duration' only for the subscription pointing > to publisher NP2. > > Now, say the publisher node that can generate conflicts (NP2) has > fewer writes and the corresponding apply worker could easily catch up > and almost always be in sync with the publisher. In contrast, the > other node that has no conflicts has a large number of writes. In such > cases, giving new options at the subscription level will be helpful. > > If we want to provide it at the global level, then the performance or > dead tuple control may not be any better than the current patch but > won't allow the provision for the above kinds of cases. Second, adding > two new GUCs is another thing I want to prevent. But OTOH, the > implementation could be slightly simpler if we provide these options > as GUC though I am not completely sure of that point. Having said > that, I am open to changing it to a non-subscription level. Do you > think it would be better to provide one or both of these parameters as > GUCs or do you have something else in mind? It makes sense to me to have the retain_conflict_info as a subscription-level parameter. I was thinking of making only max_conflict_retention_duration a global parameter, but I might be missing something. With a subscription-level max_conflict_retention_duration, how can users choose the setting values for each subscription, and is there a case that can be covered only by a subscription-level max_conflict_retention_duration? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: