Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From shveta malik
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CAJpy0uBsWZhEZziLg2PpmVKegOcWfie4gXU3nW2cfCj7cs2t_g@mail.gmail.com
Whole thread Raw
In response to RE: Conflict detection for update_deleted in logical replication  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Responses Re: Conflict detection for update_deleted in logical replication
List pgsql-hackers
On Fri, Aug 1, 2025 at 9:16 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
>
> Thanks for confirming. Here is V56 patch set which addressed all the
> comments including the comments from Amit[1] and Shveta[2].
>
> I have merged V55-0002 into 0001 and updated the list of author
> and reviewers based on my knowledge.
>

Thank You Hou-San for the patches. Please find a few initial comments on 002:


1)
src/sgml/system-views.sgml:
+         <para>
+          <literal>conflict_retention_exceeds_max_duration</literal> means that
+          the duration for retaining conflict information, which is used
+          in logical replication conflict detection, has exceeded the maximum
+          allowable limit. It is set only for the slot
+          <literal>pg_conflict_detection</literal>, which is created when
+          <link
linkend="sql-createsubscription-params-with-retain-dead-tuples"><literal>retain_dead_tuples</literal></link>
+          is enabled.
+         </para>

We can mention 'max_conflict_retention_duration' here i.e:
...has exceeded the maximum allowable limit of max_conflict_retention_duration.


2)
Shall we rename 'conflict_retention_exceeds_max_duration' as
'conflict_info_retention_exceeds_limit'? It is better to incorporate
'info' keyword, but then
'conflict_info_retention_exceeds_max_duration' becomes too long and
thus I suggest 'conflict_info_retention_exceeds_limit'. Thoughts?


3)
src/sgml/monitoring.sgml:
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>retain_dead_tuples</structfield> <type>boolean</type>
+      </para>
+      <para>
+       True if <link
linkend="sql-createsubscription-params-with-retain-dead-tuples"><literal>retain_dead_tuples</literal></link>
+       is enabled and the duration for which conflict information is
+       retained for conflict detection by this apply worker does not exceed
+       <link linkend="guc-max-conflict-retention-duration"><literal>max_conflict_retention_duration</literal></link>;
NULL for
+       parallel apply workers and table synchronization workers.
+      </para></entry>
+     </row>

a)
In the html file, the link does not take me to
'max_conflict_retention_duration' GUC. It takes to that page but to
some other location.

b)
+       the duration for which conflict information is
+       retained for conflict detection by this apply worker

Shall this be better:  'the duration for which information useful for
conflict detection is retained by this apply worker'

4)
src/sgml/config.sgml:

a)
+        Maximum duration for which each apply worker can request to retain the
+        information useful for conflict detection when
+        <literal>retain_dead_tuples</literal> is enabled for the associated
+        subscriptions.

Shall it be :
"Maximum duration for which each apply worker is allowed to retain.."
or "can retain"?

b)
src/sgml/config.sgml
+        subscriptions. The default value is <literal>0</literal>, indicating
+        that conflict information is retained until it is no longer needed for
+        detection purposes. If this value is specified without units, it is
+        taken as milliseconds.

'that conflict information is retained' --> 'that information useful
for conflict detection is retained'

c)
src/sgml/config.sgml
+        The replication slot
+        <quote><literal>pg_conflict_detection</literal></quote> that used to
+        retain conflict information will be invalidated if all apply workers
+        associated with the subscriptions, where

'that used' --> 'that is used'

5)
ApplyLauncherMain():
+ /*
+ * Stop the conflict information retention only if all workers
+ * for subscriptions with retain_dead_tuples enabled have
+ * requested it.
+ */
+ stop_retention &= sub->enabled;

This comment is not clear. By enabling or disabling subscription, how
can it request for 'stop or continue conflict info retention'?

Do you mean we can not 'invalidate the slot' and thus stop retention
if a sub with rdt=ON is disabled? If so, we can pair it up with the
previous comment itself where we have mentioned that we can not
advance xmin when sub is disabled, as that comment indicates a clear
reason too.

6)
Above brings me to a point that in doc, shall we mention that if a sub
with rdt=on is disabled, even 'max_conflict_retention_duration' is not
considered for other subs which have rdt=ON.

7)
Shall we rename 'max_conflict_retention_duration' to
'max_conflict_info_retention_duration' as the latter one is more
clear?

8)
+ * nonremovable xid. Similarly, stop the conflict information
+ * retention only if all workers for subscriptions with
+ * retain_dead_tuples enabled have requested it.

Shall we rephrase to:
 Similarly, can't stop the conflict information retention unless all
such workers are running.

thanks
Shveta



pgsql-hackers by date:

Previous
From: Yugo Nagata
Date:
Subject: Doc: Add note for running ANALYZE after ALTER TABLE ... SET EXPRESSION
Next
From: Naga Appani
Date:
Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring