Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Minimal logical decoding on standbys
Date
Msg-id CALj2ACXFkMOOD2PwMVLw4vb2cZHn-dd60ooiZay2jTdA-XfoYg@mail.gmail.com
Whole thread Raw
In response to Re: Minimal logical decoding on standbys  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Responses Re: Minimal logical decoding on standbys  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
Re: Minimal logical decoding on standbys  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Jan 10, 2023 at 2:03 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> Please find attached, V37 taking care of:

Thanks. I started to digest the design specified in the commit message
and these patches. Here are some quick comments:

1. Does logical decoding on standby work without any issues if the
standby is set for cascading replication?

2. Do logical decoding output plugins work without any issues on the
standby with decoding enabled, say, when the slot is invalidated?

3. Is this feature still a 'minimal logical decoding on standby'?
Firstly, why is it 'minimal'?

4. What happens in case of failover to the standby that's already
decoding for its clients? Will the decoding work seamlessly? If not,
what are recommended things that users need to take care of
during/after failovers?

0002:
1.
-    if (InvalidateObsoleteReplicationSlots(_logSegNo))
+    InvalidateObsoleteOrConflictingLogicalReplicationSlots(_logSegNo,
&invalidated, InvalidOid, NULL);

Isn't the function name too long and verbose? How about just
InvalidateLogicalReplicationSlots() let the function comment talk
about what sorts of replication slots it invalides?

2.
+                                errdetail("Logical decoding on
standby requires wal_level to be at least logical on master"));
+ *     master wal_level is set back to replica, so existing logical
slots need to
invalidate such slots. Also do the same thing if wal_level on master

Can we use 'primary server' instead of 'master' like elsewhere? This
comment also applies for other patches too, if any.

3. Can we show a new status in pg_get_replication_slots's wal_status
for invalidated due to the conflict so that the user can monitor for
the new status and take necessary actions?

4. How will the user be notified when logical replication slots are
invalidated due to conflict with the primary server? How can they
(firstly, is there a way?) repair/restore such replication slots? Or
is recreating/reestablishing logical replication only the way out for
them? What users can do to avoid their logical replication slots
getting invalidated and run into these situations? Because
recreating/reestablishing logical replication comes with cost
(sometimes huge) as it involves building another instance, syncing
tables etc. Isn't it a good idea to touch up on all these aspects in
the documentation at least as to why we're doing this and why we can't
do this?

5.
@@ -1253,6 +1253,14 @@ StartLogicalReplication(StartReplicationCmd *cmd)

     ReplicationSlotAcquire(cmd->slotname, true);

+    if (!TransactionIdIsValid(MyReplicationSlot->data.xmin)
+         && !TransactionIdIsValid(MyReplicationSlot->data.catalog_xmin))
+        ereport(ERROR,
+                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                 errmsg("cannot read from logical replication slot \"%s\"",
+                        cmd->slotname),
+                 errdetail("This slot has been invalidated because it
was conflicting with recovery.")));

Having the invalidation check in StartLogicalReplication() looks fine,
however, what happens if the slot gets invalidated when the
replication is in-progress? Do we need to error out in WalSndLoop() or
XLogSendLogical() too? Or is it already taken care of somewhere?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: SQL/JSON revisited
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: [Proposal] Add foreign-server health checks infrastructure