Re: Replication & recovery_min_apply_delay - Mailing list pgsql-hackers

From	Michael Paquier
Subject	Re: Replication & recovery_min_apply_delay
Date	September 10, 2019 06:23:25
Msg-id	20190910062325.GD11737@paquier.xyz
In response to	Re: Replication & recovery_min_apply_delay (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses	Re: Replication & recovery_min_apply_delay
List	pgsql-hackers

On Tue, Sep 10, 2019 at 12:46:49AM +0300, Alexander Korotkov wrote:
> On Wed, Sep 4, 2019 at 4:37 PM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> receivedUpto is just a static variable in xlog.c, maintained by the WAL
>> receiver.  But as I mentioned above, the WAL receiver is not started at
>> the moment when we need to know the LSN of the last record.
>>
>> Certainly it should be possible to somehow persist receivedUpto, so we
>> do not need to scan WAL to determine the last LSN at next start.
>> But persisting the last LSN introduces a lot of questions and problems.
>> For example, when does it need to be flushed to disk?  If that is done
>> after each received transaction, then performance can suffer
>> significantly.
>> If it is done more or less asynchronously, then there is a risk that we
>> request streaming from the wrong position.
>> In any case it will significantly complicate the patch and make it more
>> prone to various errors.
>
> I think we don't necessarily need the exact value of receivedUpto.  But it
> could be some place to start scanning WAL from.  We currently call
> UpdateControlFile() in a lot of places.  In particular, we call it at each
> checkpoint.  Even if we started scanning WAL from a value of receivedUpto
> that is one checkpoint old, we could still save a lot of resources.

A minimum to set would be the minimum consistency LSN, but there are a
lot of gotchas to take into account when it comes to crash recovery.

> As I understand it, this patch fixes a problem with a very large recovery
> apply delay.  In this case, the amount of accumulated WAL corresponding to
> that delay could also be huge.  Scanning all of that WAL could be
> costly, and it would be nice to avoid.

Yes, I suspect that it could be very costly in some configurations if
there is a large gap between the last replayed LSN and the last LSN
the WAL receiver has flushed.
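
As a rough illustration of why that gap matters, here is a minimal
sketch, assuming the default 16MB WAL segment size and treating LSNs as
plain 64-bit byte positions.  The helper name segments_to_scan() is
hypothetical and is not part of the patch or of PostgreSQL; it only
counts how many segments a scan between two positions would touch.

```c
#include <stdint.h>

/* Assumption: default WAL segment size of 16MB (it can be changed at initdb). */
#define WAL_SEGMENT_SIZE ((uint64_t) 16 * 1024 * 1024)

/*
 * Hypothetical helper: number of WAL segments touched when scanning
 * from start_lsn (inclusive) up to end_lsn (exclusive).  LSNs are
 * modelled here as plain byte offsets into the WAL stream.
 */
static uint64_t
segments_to_scan(uint64_t start_lsn, uint64_t end_lsn)
{
	if (end_lsn <= start_lsn)
		return 0;				/* nothing to scan */

	/* Segment holding the last byte, minus segment holding the first, plus one. */
	return (end_lsn - 1) / WAL_SEGMENT_SIZE
		- start_lsn / WAL_SEGMENT_SIZE + 1;
}
```

With these assumptions, segments_to_scan(0, 100 * WAL_SEGMENT_SIZE)
returns 100, while starting the scan from a position recorded closer to
the end of the received WAL shrinks the count accordingly.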

There is an extra thing, which has not been mentioned yet on this
thread, that we need to be very careful about:
   <para>
       When the standby is started and <varname>primary_conninfo</varname> is set
       correctly, the standby will connect to the primary after replaying all
       WAL files available in the archive. If the connection is established
       successfully, you will see a walreceiver process in the standby, and
       a corresponding walsender process in the primary.
   </para>
This is a long-standing behavior, and based on the first patch
proposed we would start a WAL receiver once consistency has been
reached if there is any delay defined, even if restore_command is
enabled.  We cannot assume either that everybody will want to start a
WAL receiver in this configuration if there is an archive behind it with
a lot of segments, which allows for a larger catchup window.
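
To make the difference concrete, here is a minimal sketch of the two
start conditions as described above.  The struct and both function names
are hypothetical and exist only for illustration; they do not match the
actual code in xlog.c or in the proposed patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the standby's recovery state. */
typedef struct StandbyState
{
	bool		reached_consistency;	/* recovery has reached a consistent state */
	bool		archive_exhausted;		/* all WAL available in the archive was replayed */
	int			min_apply_delay_ms;		/* recovery_min_apply_delay, 0 means no delay */
} StandbyState;

/* Documented behavior: connect only after replaying all archived WAL. */
static bool
start_walreceiver_documented(const StandbyState *s)
{
	return s->archive_exhausted;
}

/*
 * Behavior of the first proposed patch, as described in this message:
 * also start once consistency is reached if any delay is defined,
 * regardless of whether the archive still has segments to replay.
 */
static bool
start_walreceiver_first_patch(const StandbyState *s)
{
	return s->archive_exhausted ||
		(s->reached_consistency && s->min_apply_delay_ms > 0);
}
```

The two predicates differ only for a consistent standby with a delay
defined and archived segments still left to replay, which is the
configuration in question here.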
--
Michael
