Re: Replication Node Identifiers and crashsafe Apply Progress - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Replication Node Identifiers and crashsafe Apply Progress
Date
Msg-id 20131121111227.GI7240@alap2.anarazel.de
Whole thread Raw
In response to Re: Replication Node Identifiers and crashsafe Apply Progress  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On 2013-11-19 14:16:04 +0000, Greg Stark wrote:
> On Thu, Nov 14, 2013 at 5:26 PM, Andres Freund <andres@2ndquadrant.com>wrote:
> 
> > But for that the receiving side needs to know up to where changes have
> > been applied. One relatively easy solution for that is that the
> > receiving side does something like:
> > UPDATE replication_progress SET lsn = '0/10000600' WHERE source_id = ...;
> > before the end of every replayed transaction. But that obviously will
> > quickly cause bloat.
> >
> > Our solution to that is that a replaying process can tell the backend
> > that it is currently doing so and setup three variables for every
> > transaction:
> >
> 
> This is a pretty massive design decision to hinge on such a minor
> implementation detail of table bloat (which I don't think would actually be
> an issue anyway -- isn't that what we have HOT for?)

Not sure what HOT is going to help with? Even with HOT we can only
remove tuples that are invisible to everyone. If there are longrunning
queries on the standby - and running analytics on standbys is a rather
frequent use-case - that won't be the case for a long, long time.

> Fundamentally the question here is where to keep all the book-keeping state
> about replicas, in a central repository in the master or locally in each
> replica. At first blush it seems obvious to me that locally in each replica
> is the more flexible choice.

This really is about storing the state of apply on each replica
efficiently.

Imagine the standby just received data for a transaction x and has
replayed it locally. If it crashes in that moment, it needs to know
whether that transaction has successfully committed or not. And that has
to work even if the commit succeeded internally but hasn't yet returned
success!
So, what this provides is a facility to say 'hey, this local transaction
was at X on the source Y' and a way to get the last X for each Y at the
end of crash recovery.
Then the replication solution can restart replication from X onwards for
each Y.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Replication Node Identifiers and crashsafe Apply Progress
Next
From: Andres Freund
Date:
Subject: Re: Replication Node Identifiers and crashsafe Apply Progress