Thanks for the review.
On Tue, 2010-06-01 at 13:36 +0300, Heikki Linnakangas wrote:
> If we really want to try to salvage max_standby_delay with a meaning
> similar to what it has now, I think we should go with the idea some
> people bashed around earlier and define the grace period as the
> difference between a WAL record becoming available to the standby for
> replay, and between replaying it. An approximation of that is to do
> "lastIdle = gettimeofday()" in XLogPageRead() whenever it needs to wait
> for new WAL to arrive, whether that's via streaming replication or by a
> success return code from restore_command, and compare the difference of
> that with current timestamp in WaitExceedsMaxStandbyDelay().
That wouldn't cope with a continuous stream of records arriving, unless
you also include the second half of the patch.
> That's very simple, doesn't require synchronized clocks, and works the
> same with file- and stream-based setups.
Nor does it provide a mechanism for monitoring SR. standby_delay is
explicitly defined in terms of the gap between two servers, so it is a
useful real-world concept. apply_delay is somewhat less interesting.
I'm sure most people would rather have monitoring, and therefore the
requirement for synchronised-ish clocks, than no monitoring. You may
think no monitoring is OK; I don't. But there are other ways, so it's
not a point to fight about.
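As a rough illustration of why a gap between two servers needs
synchronised-ish clocks, consider the sketch below. It assumes the
standby knows a master-side timestamp for the latest WAL it has seen;
the function names and the clamping of negative values are assumptions
for illustration, not the patch code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: gap between the standby's clock and a timestamp
 * taken on the master. Any clock skew is added directly to the result.
 */
static int64_t
standby_delay_ms(int64_t standby_now_ms, int64_t master_stamp_ms)
{
    int64_t delay = standby_now_ms - master_stamp_ms;

    /* Clamp to zero when the standby clock is behind the master's. */
    return delay < 0 ? 0 : delay;
}

/* True when the measured gap is larger than the allowed delay. */
static bool
standby_delay_exceeded(int64_t standby_now_ms, int64_t master_stamp_ms,
                       int64_t max_delay_ms)
{
    assert(max_delay_ms >= 0);
    return standby_delay_ms(standby_now_ms, master_stamp_ms) > max_delay_ms;
}
```

With a real lag of 500 ms and a standby clock 5 seconds ahead, the
helper reports 5500 ms, so the same number that is useful for
monitoring is only as good as the clock synchronisation behind it.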
> This certainly alleviates some of the problems. You still need to ensure
> that master and standby have synchronized clocks, and you still get zero
> grace time after a long period of inactivity when not using streaming
> replication, however.
The second issue can be addressed once we approve the rest of this, if
you like.
> Sending a keep-alive message every 100ms seems overly aggressive to me.
It's sent every wal_sender_delay. Why is that a negative?
-- Simon Riggs www.2ndQuadrant.com