RE: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: Conflict detection for update_deleted in logical replication |
Date | |
Msg-id | OS0PR01MB57162AD809E3DB77290C05219422A@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Conflict detection for update_deleted in logical replication (shveta malik <shveta.malik@gmail.com>) |
List | pgsql-hackers |
On Monday, August 4, 2025 2:16 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Fri, Aug 1, 2025 at 9:16 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> > wrote: > > > > > > Thanks for confirming. Here is V56 patch set which addressed all the > > comments including the comments from Amit[1] and Shveta[2]. > > > > I have merged V55-0002 into 0001 and updated the list of author and > > reviewers based on my knowledge. > > > > Thank You Hou-San for the patches. Please find a few initial comments on 002: > > > 1) > src/sgml/system-views.sgml: > + <para> > + <literal>conflict_retention_exceeds_max_duration</literal> > means that > + the duration for retaining conflict information, which is used > + in logical replication conflict detection, has exceeded the > maximum > + allowable limit. It is set only for the slot > + <literal>pg_conflict_detection</literal>, which is created when > + <link > linkend="sql-createsubscription-params-with-retain-dead-tuples"><literal > >retain_dead_tuples</literal></link> > + is enabled. > + </para> > > We can mention 'max_conflict_retention_duration' here i.e: > ...has exceeded the maximum allowable limit of > max_conflict_retention_duration. Added. > > > 2) > Shall we rename 'conflict_retention_exceeds_max_duration' as > 'conflict_info_retention_exceeds_limit'? It is better to incorporate 'info' keyword, > but then 'conflict_info_retention_exceeds_max_duration' becomes too long > and thus I suggest 'conflict_info_retention_exceeds_limit'. Thoughts? I will think on this. > > 3) > src/sgml/monitoring.sgml: > + <row> > + <entry role="catalog_table_entry"><para role="column_definition"> > + <structfield>retain_dead_tuples</structfield> > <type>boolean</type> > + </para> > + <para> > + True if <link > linkend="sql-createsubscription-params-with-retain-dead-tuples"><literal > >retain_dead_tuples</literal></link> > + is enabled and the duration for which conflict information is > + retained for conflict detection by this apply worker does not exceed > + <link > + linkend="guc-max-conflict-retention-duration"><literal>max_conflict_re > + tention_duration</literal></link>; > NULL for > + parallel apply workers and table synchronization workers. > + </para></entry> > + </row> > > a) > In the html file, the link does not take me to 'max_conflict_retention_duration' > GUC. It takes to that page but to some other location. > > b) > + the duration for which conflict information is > + retained for conflict detection by this apply worker > > Shall this be better: 'the duration for which information useful for conflict > detection is retained by this apply worker' Changed. > > 4) > src/sgml/config.sgml: > > a) > + Maximum duration for which each apply worker can request to retain > the > + information useful for conflict detection when > + <literal>retain_dead_tuples</literal> is enabled for the associated > + subscriptions. > > Shall it be : > "Maximum duration for which each apply worker is allowed to retain.." > or "can retain"? Changed to "is allowed to" > > b) > src/sgml/config.sgml > + subscriptions. The default value is <literal>0</literal>, indicating > + that conflict information is retained until it is no longer needed for > + detection purposes. If this value is specified without units, it is > + taken as milliseconds. > > 'that conflict information is retained' --> 'that information useful for conflict > detection is retained' I changed to "the information" because the nearby texts have already mentioned the usage of this information. > > c) > src/sgml/config.sgml > + The replication slot > + <quote><literal>pg_conflict_detection</literal></quote> that > used to > + retain conflict information will be invalidated if all apply workers > + associated with the subscriptions, where > > 'that used' --> 'that is used' Fixed. > > 5) > ApplyLauncherMain(): > + /* > + * Stop the conflict information retention only if all workers > + * for subscriptions with retain_dead_tuples enabled have > + * requested it. > + */ > + stop_retention &= sub->enabled; > > This comment is not clear. By enabling or disabling subscription, how can it > request for 'stop or continue conflict info retention'? > > Do you mean we can not 'invalidate the slot' and thus stop retention if a sub > with rdt=ON is disabled? Yes, there is no apply worker for disabled subscription, thus no way to request that. > If so, we can pair it up with the previous comment > itself where we have mentioned that we can not advance xmin when sub is > disabled, as that comment indicates a clear reason too. Changed. > > 6) > Above brings me to a point that in doc, shall we mention that if a sub with > rdt=on is disabled, even 'max_conflict_retention_duration' is not considered > for other subs which have rdt=ON. I think the documentation specifies that only active apply workers can make such requests, which appears sufficient to me. > > 7) > Shall we rename 'max_conflict_retention_duration' to > 'max_conflict_info_retention_duration' as the latter one is more clear? I will think on it. > > 8) > + * nonremovable xid. Similarly, stop the conflict information > + * retention only if all workers for subscriptions with > + * retain_dead_tuples enabled have requested it. > > Shall we rephrase to: > Similarly, can't stop the conflict information retention unless all such workers > are running. Changed. Here is V57 patch set which addressed most of comments. In this version, I also fixed a bug that the apply worker continued to find dead tuples even if it has already stop retaining dead tuples. Best Regards, Hou zj
Attachment
pgsql-hackers by date: