Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CAFiTN-sUsh86PQi=SgUKba8Z+2NBXFs9XWihyzKobZ8KDeYsBA@mail.gmail.com
Whole thread Raw
In response to Re: Conflict detection for update_deleted in logical replication  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses RE: Conflict detection for update_deleted in logical replication
List pgsql-hackers
On Wed, Dec 11, 2024 at 2:32 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> Attach the V16 patch set which addressed above comments.
>
> There is a new 0002 patch where I tried to dynamically adjust the interval for
> advancing the transaction ID. Instead of always waiting for
> wal_receiver_status_interval, we can start with a short interval and increase
> it if there is no activity (no xid assigned on subscriber), but not beyond
> wal_receiver_status_interval.
>
> The intention is to more effectively advance xid to avoid retaining too much
> dead tuples. My colleague will soon share detailed performance data and
> analysis related to this enhancement.

I am starting to review the patches, and trying to understand the
concept that how you are preventing vacuum to remove the dead tuple
which might required by the concurrent remote update, so I was looking
at the commit message which explains the idea quite clearly but I have
one question

The process of advancing the non-removable transaction ID in the apply worker
involves:

== copied from commit message of 0001 start==
1) Call GetOldestActiveTransactionId() to take oldestRunningXid as the
candidate xid.
2) Send a message to the walsender requesting the publisher status, which
includes the latest WAL write position and information about transactions
that are in the commit phase.
3) Wait for the status from the walsender. After receiving the first status, do
not proceed if there are concurrent remote transactions that are still in the
commit phase. These transactions might have been assigned an earlier commit
timestamp but have not yet written the commit WAL record. Continue to request
the publisher status until all these transactions have completed.
4) Advance the non-removable transaction ID if the current flush location has
reached or surpassed the last received WAL position.
== copied from commit message of 0001 start==

So IIUC in step 2) we send the message and get the list of all the
transactions which are in the commit phase? What do you exactly mean
by a transaction which is in the commit phase?  Can I assume
transactions which are currently running on the publisher?  And in
step 3) we wait for all the transactions to get committed which we saw
running (or in the commit phase) and we anyway don't worry about the
newly started transactions as they would not be problematic for us.
And in step 4) we would wait for all the flush location to reach "last
received WAL position", here my question is what exactly will be the
"last received WAL position" I assume it would be the position
somewhere after the position of the commit WAL of all the transaction
we were interested on the publisher?

At high level the overall idea looks promising to me but wanted to put
more thought on lower level details about what transactions exactly we
are waiting for and what WAL LSN we are waiting to get flushed.


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: remove pgrminclude?
Next
From: Nishant Sharma
Date:
Subject: Re: on_error table, saving error info to a table