Re: Replication & recovery_min_apply_delay - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Replication & recovery_min_apply_delay
Date
Msg-id 20190910062325.GD11737@paquier.xyz
In response to Re: Replication & recovery_min_apply_delay  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: Replication & recovery_min_apply_delay
List pgsql-hackers
On Tue, Sep 10, 2019 at 12:46:49AM +0300, Alexander Korotkov wrote:
> On Wed, Sep 4, 2019 at 4:37 PM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> receivedUpto is just a static variable in xlog.c, maintained by the WAL
>> receiver. But as I mentioned above, the WAL receiver is not started at
>> the moment when we need to know the LSN of the last record.
>>
>> Certainly it should be possible to somehow persist receivedUpto, so we
>> do not need to scan WAL to determine the last LSN at next start.
>> But persisting the last LSN introduces a lot of questions and problems.
>> For example, when does it need to be flushed to disk? If it is done
>> after each received transaction, then performance can suffer
>> significantly.
>> If it is done more or less asynchronously, then there is a risk that we
>> request streaming from the wrong position.
>> In any case it will significantly complicate the patch and make it more
>> susceptible to various errors.
>
> I think we don't necessarily need the exact value of receivedUpto.  But it
> could be some place to start scanning WAL from.  We currently call
> UpdateControlFile() in a lot of places.  In particular, we call it at each
> checkpoint.  Even if we started scanning WAL from a value of receivedUpto
> that is one checkpoint old, we could still save a lot of resources.

A minimum to set would be the minimum consistency LSN, but there are a
lot of gotchas to take into account when it comes to crash recovery.

> As I understand it, this patch fixes a problem with a very large recovery
> apply delay.  In this case, the amount of accumulated WAL corresponding to
> that delay could also be huge.  Scanning all of this WAL could be
> costly, and it would be nice to avoid that.

Yes, I suspect that it could be very costly in some configurations if
there is a large gap between the last replayed LSN and the last LSN
the WAL receiver has flushed.
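
To make the cost argument concrete, here is a toy sketch in C.  It is
not PostgreSQL code: ToyLsn, ToyWal and toy_find_last_lsn() are
hypothetical names, and the "WAL" is just an in-memory array of record
end positions.  It only assumes that the last LSN is found by walking
records forward from some starting point, and shows how a persisted
hint (even a stale one) reduces the number of records visited.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an LSN; this is NOT PostgreSQL's XLogRecPtr. */
typedef uint64_t ToyLsn;

/* Hypothetical in-memory "WAL": record end LSNs in increasing order. */
typedef struct
{
    const ToyLsn *record_end;
    size_t        nrecords;
} ToyWal;

/*
 * Return the end LSN of the last record, walking forward from
 * start_hint.  Records ending at or before the hint are skipped without
 * being counted, modelling a direct seek to the hinted position.
 * *scanned is set to the number of records actually visited.
 */
ToyLsn
toy_find_last_lsn(const ToyWal *wal, ToyLsn start_hint, size_t *scanned)
{
    ToyLsn last = start_hint;
    size_t i = 0;

    assert(wal != NULL && scanned != NULL);
    *scanned = 0;

    /* Seek: skip everything already covered by the hint. */
    while (i < wal->nrecords && wal->record_end[i] <= start_hint)
        i++;

    /* Scan: every remaining record has to be visited. */
    for (; i < wal->nrecords; i++)
    {
        last = wal->record_end[i];
        (*scanned)++;
    }
    return last;
}
```

With six records ending at 100..600, a hint of 0 (nothing persisted)
visits all six, while a hint of 400 (say, a value saved at the previous
checkpoint) visits only two and still finds 600 as the last LSN.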

There is an extra thing, which has not been mentioned yet on this
thread, that we need to be very careful about:
   <para>
       When the standby is started and <varname>primary_conninfo</varname> is set
       correctly, the standby will connect to the primary after replaying all
       WAL files available in the archive. If the connection is established
       successfully, you will see a walreceiver process in the standby, and
       a corresponding walsender process in the primary.
   </para>
This is a long-standing behavior, and based on the first patch
proposed we would start a WAL receiver once consistency has been
reached if there is any delay defined, even if restore_command is
enabled.  We cannot assume either that everybody will want to start a
WAL receiver in this configuration if there is an archive behind it with
a lot of segments, which allows for a larger catchup window.
--
Michael
