Re: Possible bug in logical replication. - Mailing list pgsql-hackers

From Arseny Sher
Subject Re: Possible bug in logical replication.
Date
Msg-id 87y3fgoyrn.fsf@ars-thinkpad
Whole thread Raw
In response to Re: Possible bug in logical replication.  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Possible bug in logical replication.
List pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> writes:

> Can somebody (Arseny, Konstantin, horiguti, Sawada) please confirm that
> Michaël's commit fixes the reported bug?

I confirm that starting reading WAL since restart_lsn as implemented in
f731cfa fixes this issue, as well as the second issue tushar mentioned
at [1]. I think that the code still can be improved a bit though --
consider the attached patch:
* pg_logical_replication_slot_advance comment was not very informative
  and even a bit misleading: it said that we use confirmed_flush_lsn and
  restart_lsn, but didn't explain why.
* Excessive check in its main loop.
* Copy-paste comment fix.

[1] https://www.postgresql.org/message-id/5f85bf41-098e-c4e1-7332-9171fef57a0a%40enterprisedb.com

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From d8ed8ae3eec54b716d7dbb35379d0047a96c6c75 Mon Sep 17 00:00:00 2001
From: Arseny Sher <sher-ars@yandex.ru>
Date: Fri, 15 Jun 2018 18:11:17 +0300
Subject: [PATCH] Cosmetic review for f731cfa.

* pg_logical_replication_slot_advance comment was not very informative and even
  a bit misleading: it said that we use confirmed_flush_lsn and restart_lsn, but
  didn't explain why.
* Excessive check in its main loop.
* Copy-paste comment fix.
---
 src/backend/replication/slotfuncs.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2806e1076c..597e81f4d9 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -341,24 +341,26 @@ pg_physical_replication_slot_advance(XLogRecPtr moveto)
 
 /*
  * Helper function for advancing logical replication slot forward.
- * The slot's restart_lsn is used as start point for reading records,
- * while confirmed_lsn is used as base point for the decoding context.
- * The LSN position to move to is checked by doing a per-record scan and
- * logical decoding which makes sure that confirmed_lsn is updated to a
- * LSN which allows the future slot consumer to get consistent logical
- * changes.
+ * While we could just do LogicalConfirmReceivedLocation updating
+ * confirmed_flush_lsn, we'd better digest WAL to advance restart_lsn allowing
+ * to recycle WAL and old catalog tuples. We do it in special fast_forward
+ * mode without actual replay.
  */
 static XLogRecPtr
 pg_logical_replication_slot_advance(XLogRecPtr moveto)
 {
     LogicalDecodingContext *ctx;
     ResourceOwner old_resowner = CurrentResourceOwner;
+    /*
+     * Start reading WAL at restart_lsn, which certainly points to the valid
+     * record.
+     */
     XLogRecPtr    startlsn = MyReplicationSlot->data.restart_lsn;
     XLogRecPtr    retlsn = MyReplicationSlot->data.confirmed_flush;
 
     PG_TRY();
     {
-        /* restart at slot's confirmed_flush */
+        /* start_lsn doesn't matter here, we don't replay xacts at all */
         ctx = CreateDecodingContext(InvalidXLogRecPtr,
                                     NIL,
                                     true,
@@ -388,17 +390,10 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto)
              */
             startlsn = InvalidXLogRecPtr;
 
-            /*
-             * The {begin_txn,change,commit_txn}_wrapper callbacks above will
-             * store the description into our tuplestore.
-             */
+            /* Changes are not actually produced in fast_forward mode. */
             if (record != NULL)
                 LogicalDecodingProcessRecord(ctx, ctx->reader);
 
-            /* Stop once the moving point wanted by caller has been reached */
-            if (moveto <= ctx->reader->EndRecPtr)
-                break;
-
             CHECK_FOR_INTERRUPTS();
         }
 
-- 
2.11.0


pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: WAL prefetch
Next
From: Tom Lane
Date:
Subject: Re: BUG #15237: I got "ERROR: source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression"