Re: Replication & recovery_min_apply_delay - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Replication & recovery_min_apply_delay
Date
Msg-id 201901301432.p3utg64hum27@alvherre.pgsql
In response to Replication & recovery_min_apply_delay  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: Replication & recovery_min_apply_delay  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi

On 2019-Jan-30, Konstantin Knizhnik wrote:

> One of our customers was faced with the following problem: he has
> setup  physical primary-slave replication but for some reasons
> specified very large (~12 hours) recovery_min_apply_delay.

We also came across this exact same problem some time ago.  It's pretty
nasty.  I wrote a quick TAP reproducer, attached (it also needed a quick
patch to PostgresNode itself).

I tried a couple of strategies, both of which failed:
1. setting lastSourceFailed just before sleeping for the apply delay, with
   the idea that the next fetch would then try streaming.  But this
   doesn't work because WaitForWalToBecomeAvailable is not executed.

2. splitting WaitForWalToBecomeAvailable into two pieces, so that we can
   call the first half in the restore loop.  But this causes 1s of wait
   between segments (error recovery) and we never actually catch up.
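
To put rough numbers on why a fixed wait per segment can prevent catching
up, here is a minimal sketch of a toy model.  It is not PostgreSQL code:
the function name, the parameters and the rates are all hypothetical, and
it assumes the only effect of the wait is to cap replay throughput at one
segment per (replay time + wait).

```python
def segments_behind(produce_per_sec, replay_secs, wait_secs,
                    elapsed_secs, initial_backlog=0.0):
    """Toy model: WAL segments of backlog on the standby after elapsed_secs.

    produce_per_sec -- segments the primary generates per second (assumed)
    replay_secs     -- time to replay one segment (assumed)
    wait_secs       -- fixed wait added between segments (1s in the case above)
    """
    # Each segment costs its replay time plus the fixed wait.
    replay_per_sec = 1.0 / (replay_secs + wait_secs)
    # Backlog changes by the difference between production and replay rates.
    backlog = initial_backlog + (produce_per_sec - replay_per_sec) * elapsed_secs
    # A standby cannot be "ahead" of the primary.
    return max(backlog, 0.0)


# Hypothetical figures: 2 segments/s produced, 0.1s to replay each one.
no_wait = segments_behind(2.0, 0.1, 0.0, 60)
with_wait = segments_behind(2.0, 0.1, 1.0, 60)
```

With these made-up figures the standby keeps up when there is no wait, but
with a 1s wait it replays less than one segment per second, so the backlog
grows for as long as the primary keeps producing faster than that.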

What I thought back then was the *real* solution, but never got around
to implementing, is the idea you describe: starting a walreceiver at an
earlier point.

> I wonder if it can be considered as acceptable solution of the problem or
> there can be some better approach?

I didn't find one.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

