Re: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Time delayed LR (WAS Re: logical replication restrictions)
Date
Msg-id CAFiTN-uiOgS6G3=Uq6QdKskm2_iPmv9Z8x7wxmEfnsMqyo7dOg@mail.gmail.com
Whole thread Raw
In response to RE: Time delayed LR (WAS Re: logical replication restrictions)  ("Takamichi Osumi (Fujitsu)" <osumi.takamichi@fujitsu.com>)
Responses Re: Time delayed LR (WAS Re: logical replication restrictions)  (Amit Kapila <amit.kapila16@gmail.com>)
RE: Time delayed LR (WAS Re: logical replication restrictions)  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
On Fri, Dec 23, 2022 at 9:16 PM Takamichi Osumi (Fujitsu)
<osumi.takamichi@fujitsu.com> wrote:
>
> On Thursday, December 22, 2022 3:02 PM Takamichi Osumi (Fujitsu) <osumi.takamichi@fujitsu.com> wrote:
> > Attached the updated patch.
> > Again, I used one basic patch in another thread to wake up logical replication
> > worker shared in [2] for TAP tests.
> The v11 caused a cfbot failure in [1]. But, failed tests looked irrelevant
> to the feature to me at present.
>

I have done some review for the patch and I have a few comments.

1.
A.
+       <literal>wal_sender_timeout</literal> on the publisher. Otherwise, the
+       walsender repeatedly terminates due to timeout during the delay of
+       the subscriber.


B.
+/*
+ * In order to avoid walsender's timeout during time delayed replication,
+ * it's necessaary to keep sending feedbacks during the delay from the worker
+ * process. Meanwhile, the feature delays the apply before starting the
+ * transaction and thus we don't write WALs for the suspended changes during
+ * the wait. Hence, in the case the worker process sends a feedback during the
+ * delay, avoid having positions of the flushed and apply LSN overwritten by
+ * the latest LSN.
+ */

- Seems like these two statements are conflicting, I mean if we are
sending feedback then why the walsender will timeout?

- Typo /necessaary/necessary


2.
+     *
+     * During the time delayed replication, avoid reporting the suspeended
+     * latest LSN are already flushed and written, to the publisher.
      */
Typo /suspeended/suspended

3.
+        if (wal_receiver_status_interval > 0
+            && diffms > wal_receiver_status_interval)
+        {
+            WaitLatch(MyLatch,
+                      WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+                      (long) wal_receiver_status_interval,
+                      WAIT_EVENT_RECOVERY_APPLY_DELAY);
+            send_feedback(last_received, true, false);
+        }
+        else
+            WaitLatch(MyLatch,
+                      WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+                      diffms,
+                      WAIT_EVENT_RECOVERY_APPLY_DELAY);

I think here we should add some comments to explain about sending
feedback, something like what we have explained at the time of
defining the "in_delaying_apply" variable.

4.

+     * Although the delay is applied in BEGIN messages, streamed transactions
+     * apply the delay in a STREAM COMMIT message. That's ok because no
+     * changes have been applied yet (apply_spooled_messages() will do it).
+     * The STREAM START message would be a natural choice for this delay but
+     * there is no commit time yet (it will be available when the in-progress
+     * transaction finishes), hence, it was not possible to apply a delay at
+     * that time.
+     */
+    maybe_delay_apply(commit_data.committime);

I am wondering how this will interact with the parallel apply worker
where we do not spool the data in file?  How are we going to get the
commit time of the transaction without applying the changes?

5.
+    /*
+     * The following operations use these special functions to detect
+     * overflow. Number of ms per informed days.
+     */

This comment doesn't make much sense, I think this needs to be rephrased.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Force streaming every change in logical decoding
Next
From: Bharath Rupireddy
Date:
Subject: Re: Improve WALRead() to suck data directly from WAL buffers when possible