Re: Summary and Plan for Hot Standby - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Summary and Plan for Hot Standby
Date
Msg-id 1258297496.14054.1543.camel@ebony
Whole thread Raw
In response to Re: Summary and Plan for Hot Standby  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
On Sun, 2009-11-15 at 14:47 +0000, Greg Stark wrote:
> On Sun, Nov 15, 2009 at 2:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> - The "standby delay" is measured as current timestamp - timestamp of
> >> last replayed commit record. If there's little activity in the master,
> >> that can lead to surprising results. For example, imagine that
> >> max_standby_delay is set to 8 hours. The standby is fully up-to-date
> >> with the master, and there's no write activity in master.  After 10
> >> hours, a long reporting query is started in the standby. Ten minutes
> >> later, a small transaction is executed in the master that conflicts with
> >> the reporting query. I would expect the reporting query to be canceled 8
> >> hours after the conflicting transaction began, but it is in fact
> >> canceled immediately, because it's over 8 hours since the last commit
> >> record was replayed.
> >
> > An issue that will be easily fixable with streaming, since it
> > effectively needs a heartbeat to listen to. Adding a regular stream of
> > WAL records is also possible, but there is no need, unless streaming is
> > somehow in doubt. Again, there is work to do once both are in.
> 
> I don't think you need a heartbeat to solve this particular case. You
> just need to define the "standby delay" to be "current timestamp -
> timestamp of the conflicting candidate commit record".

That's not possible unfortunately.

We only have times for commits and aborts. So there could be untimed WAL
records ahead of the last timed record.

The times of events we know from the log records give us no clue as to
when the last non-commit/abort record arrived. We can only do that by

(i) specifically augmenting the log with regular, timed WAL records, or
(ii) asking WALreceiver directly when it last spoke with the master

(ii) is the obvious way to do this when we have streaming replication,
and HS assumes this will be available. It need not, and we can do (i)

Heikki's case is close to one I would expect to see in many cases: a
database that is only active during day feeds a system that runs queries
24x7. Run a VACUUM on the master at night and you could get conflicts
that follow the pattern described.

-- Simon Riggs           www.2ndQuadrant.com



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Summary and Plan for Hot Standby
Next
From: Simon Riggs
Date:
Subject: Re: Summary and Plan for Hot Standby