RE: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: Conflict detection for update_deleted in logical replication
Date
Msg-id OS0PR01MB57162AD809E3DB77290C05219422A@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Conflict detection for update_deleted in logical replication  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
On Monday, August 4, 2025 2:16 PM shveta malik <shveta.malik@gmail.com> wrote:
> 
> On Fri, Aug 1, 2025 at 9:16 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> wrote:
> >
> >
> > Thanks for confirming. Here is V56 patch set which addressed all the
> > comments including the comments from Amit[1] and Shveta[2].
> >
> > I have merged V55-0002 into 0001 and updated the list of author and
> > reviewers based on my knowledge.
> >
> 
> Thank You Hou-San for the patches. Please find a few initial comments on 002:
> 
> 
> 1)
> src/sgml/system-views.sgml:
> +         <para>
> +          <literal>conflict_retention_exceeds_max_duration</literal>
> means that
> +          the duration for retaining conflict information, which is used
> +          in logical replication conflict detection, has exceeded the
> maximum
> +          allowable limit. It is set only for the slot
> +          <literal>pg_conflict_detection</literal>, which is created when
> +          <link
> linkend="sql-createsubscription-params-with-retain-dead-tuples"><literal
> >retain_dead_tuples</literal></link>
> +          is enabled.
> +         </para>
> 
> We can mention 'max_conflict_retention_duration' here i.e:
> ...has exceeded the maximum allowable limit of
> max_conflict_retention_duration.

Added.

> 
> 
> 2)
> Shall we rename 'conflict_retention_exceeds_max_duration' as
> 'conflict_info_retention_exceeds_limit'? It is better to incorporate 'info' keyword,
> but then 'conflict_info_retention_exceeds_max_duration' becomes too long
> and thus I suggest 'conflict_info_retention_exceeds_limit'. Thoughts?

I will think on this.

> 
> 3)
> src/sgml/monitoring.sgml:
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>retain_dead_tuples</structfield>
> <type>boolean</type>
> +      </para>
> +      <para>
> +       True if <link
> linkend="sql-createsubscription-params-with-retain-dead-tuples"><literal
> >retain_dead_tuples</literal></link>
> +       is enabled and the duration for which conflict information is
> +       retained for conflict detection by this apply worker does not exceed
> +       <link
> + linkend="guc-max-conflict-retention-duration"><literal>max_conflict_re
> + tention_duration</literal></link>;
> NULL for
> +       parallel apply workers and table synchronization workers.
> +      </para></entry>
> +     </row>
> 
> a)
> In the html file, the link does not take me to 'max_conflict_retention_duration'
> GUC. It takes to that page but to some other location.
> 
> b)
> +       the duration for which conflict information is
> +       retained for conflict detection by this apply worker
> 
> Shall this be better:  'the duration for which information useful for conflict
> detection is retained by this apply worker'

Changed.

> 
> 4)
> src/sgml/config.sgml:
> 
> a)
> +        Maximum duration for which each apply worker can request to retain
> the
> +        information useful for conflict detection when
> +        <literal>retain_dead_tuples</literal> is enabled for the associated
> +        subscriptions.
> 
> Shall it be :
> "Maximum duration for which each apply worker is allowed to retain.."
> or "can retain"?

Changed to "is allowed to"

> 
> b)
> src/sgml/config.sgml
> +        subscriptions. The default value is <literal>0</literal>, indicating
> +        that conflict information is retained until it is no longer needed for
> +        detection purposes. If this value is specified without units, it is
> +        taken as milliseconds.
> 
> 'that conflict information is retained' --> 'that information useful for conflict
> detection is retained'

I changed to "the information" because the nearby texts have already
mentioned the usage of this information.

> 
> c)
> src/sgml/config.sgml
> +        The replication slot
> +        <quote><literal>pg_conflict_detection</literal></quote> that
> used to
> +        retain conflict information will be invalidated if all apply workers
> +        associated with the subscriptions, where
> 
> 'that used' --> 'that is used'

Fixed.

> 
> 5)
> ApplyLauncherMain():
> + /*
> + * Stop the conflict information retention only if all workers
> + * for subscriptions with retain_dead_tuples enabled have
> + * requested it.
> + */
> + stop_retention &= sub->enabled;
> 
> This comment is not clear. By enabling or disabling subscription, how can it
> request for 'stop or continue conflict info retention'?
> 
> Do you mean we can not 'invalidate the slot' and thus stop retention if a sub
> with rdt=ON is disabled?

Yes, there is no apply worker for disabled
subscription, thus no way to request that.

> If so, we can pair it up with the previous comment
> itself where we have mentioned that we can not advance xmin when sub is
> disabled, as that comment indicates a clear reason too.

Changed.

> 
> 6)
> Above brings me to a point that in doc, shall we mention that if a sub with
> rdt=on is disabled, even 'max_conflict_retention_duration' is not considered
> for other subs which have rdt=ON.

I think the documentation specifies that only active apply workers can make such
requests, which appears sufficient to me.

> 
> 7)
> Shall we rename 'max_conflict_retention_duration' to
> 'max_conflict_info_retention_duration' as the latter one is more clear?

I will think on it.

> 
> 8)
> + * nonremovable xid. Similarly, stop the conflict information
> + * retention only if all workers for subscriptions with
> + * retain_dead_tuples enabled have requested it.
> 
> Shall we rephrase to:
>  Similarly, can't stop the conflict information retention unless all such workers
> are running.

Changed.

Here is V57 patch set which addressed most of comments.

In this version, I also fixed a bug that the apply worker continued to
find dead tuples even if it has already stop retaining dead tuples.

Best Regards,
Hou zj



Attachment

pgsql-hackers by date:

Previous
From: Erik Wienhold
Date:
Subject: psql: Count all table footer lines in pager setup
Next
From: Sami Imseih
Date:
Subject: Re: Improve LWLock tranche name visibility across backends