Re: Keepalive for max_standby_delay - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Keepalive for max_standby_delay
Msg-id 4C04E2CB.90304@enterprisedb.com
In response to Re: Keepalive for max_standby_delay  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Keepalive for max_standby_delay  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 27/05/10 20:26, Simon Riggs wrote:
> On Wed, 2010-05-26 at 16:22 -0700, Josh Berkus wrote:
>>> Just this second posted about that, as it turns out.
>>>
>>> I have a v3 *almost* ready of the keepalive patch. It still makes sense
>>> to me after a few days reflection, so is worth discussion and review. In
>>> or out, I want this settled within a week. Definitely need some R&R
>>> here.
>>
>> Does the keepalive fix all the issues with max_standby_delay?  Tom?
>
> OK, here's v4.
>
> Summary
>
> * WALSender adds a timestamp onto the header of every WAL chunk sent.
>
> * Each WAL record now has a conceptual "send timestamp" that remains
> constant while that record is replayed. This is used as the basis from
> which max_standby_delay is calculated when required during replay.
>
> * Send timestamp is calculated as the later of the timestamp of chunk in
> which WAL record was sent and the latest XLog time.
>
> * WALSender sends an empty message as a keepalive when nothing else to
> send. (No longer a special message type for the keepalive).
>
> I think its close, but if there's a gaping hole here somewhere then I'll
> punt for this release.

This certainly alleviates some of the problems. You still need to ensure 
that master and standby have synchronized clocks, and you still get zero 
grace time after a long period of inactivity when not using streaming 
replication, however.

Sending a keep-alive message every 100ms seems overly aggressive to me.


If we really want to try to salvage max_standby_delay with a meaning 
similar to what it has now, I think we should go with the idea some 
people bashed around earlier and define the grace period as the time 
between a WAL record becoming available to the standby for replay and 
the record actually being replayed. An approximation of that is to do 
"lastIdle = gettimeofday()" in XLogPageRead() whenever it needs to wait 
for new WAL to arrive, whether that's via streaming replication or by a 
success return code from restore_command, and to compare that against 
the current timestamp in WaitExceedsMaxStandbyDelay().

That's very simple, doesn't require synchronized clocks, and works the 
same with file- and stream-based setups.
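
To make the idea concrete, here is a minimal sketch of it as standalone C. 
It is not PostgreSQL source: usec_t, now_usec(), note_wal_wait() and 
wait_exceeds_max_standby_delay() are hypothetical names, and timestamps 
are passed in as arguments so the logic is easy to follow. Only the 
standby's own clock is read, which is why no clock synchronization with 
the master would be needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical names throughout; an illustration, not PostgreSQL code. */
typedef int64_t usec_t;

/* Standby-local time at which replay last had to wait for new WAL. */
static usec_t last_idle_usec = 0;

/* Read the standby's own clock in microseconds. */
usec_t
now_usec(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (usec_t) tv.tv_sec * 1000000 + (usec_t) tv.tv_usec;
}

/*
 * Call this at the point where replay runs out of WAL and must wait,
 * whether the next WAL arrives by streaming or from restore_command.
 */
void
note_wal_wait(usec_t now)
{
    last_idle_usec = now;
}

/*
 * Return true once replay has been behind for longer than the allowed
 * delay, measured from the last time it was caught up and waiting.
 */
bool
wait_exceeds_max_standby_delay(usec_t now, usec_t max_delay_usec)
{
    return (now - last_idle_usec) > max_delay_usec;
}
```

In this sketch the recovery loop would call note_wal_wait(now_usec()) 
each time it blocks waiting for WAL, and conflict resolution would call 
wait_exceeds_max_standby_delay(now_usec(), max_delay) before cancelling 
queries. After a long idle period the timer restarts from the moment new 
WAL shows up, so the standby would still get its full grace time.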

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

