Re: logical replication restrictions - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: logical replication restrictions
Date
Msg-id 594ad346-da36-4f53-91ad-da366787c67f@www.fastmail.com
Whole thread Raw
In response to Re: logical replication restrictions  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: logical replication restrictions
List pgsql-hackers
On Tue, Jul 5, 2022, at 9:29 AM, Amit Kapila wrote:
I wonder why we don't apply the delay on commit/commit_prepared
records only similar to physical replication. See recoveryApplyDelay.
One more advantage would be then we don't need to worry about
transactions that we are going to skip due SKIP feature for
subscribers.
I added an explanation at the top of apply_delay(). I didn't read the "parallel
apply" patch yet. I'll do soon to understand how the current design for
streamed transactions conflicts with the parallel apply patch.

+ * It applies the delay for the next transaction but before starting the
+ * transaction. The main reason for this design is to avoid a long-running
+ * transaction (which can cause some operational challenges) if the user sets a
+ * high value for the delay. This design is different from the physical
+ * replication (that applies the delay at commit time) mainly because write
+ * operations may allow some issues (such as bloat and locks) that can be
+ * minimized if it does not keep the transaction open for such a long time.
+ */
+static void
+apply_delay(TimestampTz ts)

Regarding the skip transaction feature, we could certainly skip the
transactions combined with the apply delay. However, it introduces complexity
for a rare use case IMO. Besides that, the skip transaction code path is fast,
hence, it is very unlikely that the current patch will impose some issues to
the skip transaction feature. (Remember that the main goal for this feature is
to provide an old state of the database.)

One more thing that might be worth discussing is whether introducing a
new subscription parameter for this feature is a better idea or can we
use guc (either an existing or a new one). Users may want to set this
only for a particular subscription or set of subscriptions in which
case it is better to have this as a subscription level parameter.
OTOH, I was slightly worried that if this will be used for all
subscriptions on a subscriber then it will be burdensome for users.
That's a good point. Logical replication is per database and it is slightly
different from physical replication that is per cluster. In physical
replication, you have no choice but to have a GUC. It is very unlikely that
someone wants to delay all logical replicas. Therefore, the benefit of having a
GUC is quite small.


--
Euler Taveira

pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: Typo in pg_db_role_setting.h
Next
From: David Geier
Date:
Subject: Re: Reducing planning time on tables with many indexes