RE: pgsql: Preserve conflict-relevant data during logical replication. - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: pgsql: Preserve conflict-relevant data during logical replication.
Date
Msg-id TY4PR01MB16907A6D1F3A6E1C333FF1BF29401A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: pgsql: Preserve conflict-relevant data during logical replication.  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tuesday, September 2, 2025 11:03 PM Robert Haas <robertmhaas@gmail.com> wrote:
> 
> On Wed, Jul 23, 2025 at 11:44 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > The fix looks good to me. I'll push your patch in sometime.
> 
> The tests in this patch are insufficient to prove that this logic works properly. I
> tried with this patch:
> 
> diff --git a/src/backend/access/transam/twophase.c
> b/src/backend/access/transam/twophase.c
> index 7918176fc58..7c1d0ef07b5 100644
> --- a/src/backend/access/transam/twophase.c
> +++ b/src/backend/access/transam/twophase.c
> @@ -2325,6 +2325,16 @@ RecordTransactionCommitPrepared(TransactionId
> xid,
>      TimestampTz committs;
>      bool        replorigin;
> 
> +    /*
> +     * Note it is important to set committs value after marking ourselves as
> +     * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is
> because
> +     * we want to ensure all transactions that have acquired commit
> timestamp
> +     * are finished before we allow the logical replication client to advance
> +     * its xid which is used to hold back dead rows for conflict detection.
> +     * See comments atop worker.c.
> +     */
> +    committs = GetCurrentTimestamp();
> +
>      /*
>       * Are we using the replication origins feature?  Or, in other words, are
>       * we replaying remote actions?
> @@ -2344,16 +2354,6 @@ RecordTransactionCommitPrepared(TransactionId
> xid,
>       */
>      pg_write_barrier();
> 
> -    /*
> -     * Note it is important to set committs value after marking ourselves as
> -     * in the commit critical section (DELAY_CHKPT_IN_COMMIT). This is
> because
> -     * we want to ensure all transactions that have acquired commit
> timestamp
> -     * are finished before we allow the logical replication client to advance
> -     * its xid which is used to hold back dead rows for conflict detection.
> -     * See comments atop worker.c.
> -     */
> -    committs = GetCurrentTimestamp();
> -
>      /*
>       * Emit the XLOG commit record. Note that we mark 2PC commits as
>       * potentially having AccessExclusiveLocks since we don't know whether
> or
> 
> If it is in fact important to acquire the commit timestamp after setting
> delayChkptFlags, you'd hope this would lead to a test failure, but it doesn't for
> me. I understand it probably requires an injection point to be certain of hitting
> the race condition, but I think that would be worth doing. Otherwise, if
> something gets broken here by accident, it might be a long time before anyone
> notices.

Thanks for pointing it out!

I agree that adding a test is valuable to mitigate the risk of future code
changes. We added a similar safeguard for the RecordTransactionCommit() function
by adding Assert(xactStopTimestamp == 0) after marking the DELAY_CHKPT_xxx flag,
and did not do any precautionary check for RecordTransactionCommitPrepared as
the code to acquire the timestamp and setting flag was close by and had explicit
comments.

I'll prepare a test case that uses the injection point and share it in the
original thread.

Best Regards,
Hou zj

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Solaris compiler status
Next
From: Daniel Gustafsson
Date:
Subject: Re: Solaris compiler status