Re: Replication Node Identifiers and crashsafe Apply Progress - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Replication Node Identifiers and crashsafe Apply Progress
Msg-id CAM-w4HPEaQ0eCPLztryW=4dON0UCA_+YFnc5879TUeGk14ymDQ@mail.gmail.com
In response to Replication Node Identifiers and crashsafe Apply Progress  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Replication Node Identifiers and crashsafe Apply Progress
List pgsql-hackers

On Thu, Nov 14, 2013 at 5:26 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> But for that the receiving side needs to know up to where changes have
> been applied. One relatively easy solution for that is that the
> receiving side does something like:
> UPDATE replication_progress SET lsn = '0/10000600' WHERE source_id = ...;
> before the end of every replayed transaction. But that obviously will
> quickly cause bloat.
>
> Our solution to that is that a replaying process can tell the backend
> that it is currently doing so and setup three variables for every
> transaction:

This is a pretty massive design decision to hinge on such a minor implementation detail as table bloat (which I don't think would actually be an issue anyway -- isn't that what we have HOT for?).

Fundamentally, the question here is where to keep all the bookkeeping state about replicas: in a central repository on the master, or locally in each replica. At first blush it seems obvious to me that locally in each replica is the more flexible choice.

Replication systems become complex when you start restoring from old backups and not every node has the same view of the topology as every other node. I fear what will happen to a central repository when you fail over the master and it's out of sync with where the slaves have actually restored up to. Or where you fail over a slave to a standby of the slave and it needs to redo some of the logical replication to catch up. Or where you restore all your nodes, both master and slaves, from backups taken at different points in time (presumably with the master ahead of the slaves).

Having a central repository makes the whole system simpler, but it also makes it much more fragile. It's nice to have a single place to go to find out what the state of every replica is, but it should do that by actually asking the replicas, not by maintaining state that might be out of sync.
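
As a rough illustration of that last point, the sketch below models each replica as the sole owner of its apply progress, with the "single place to go" built by querying the replicas on demand rather than caching their state. Everything here (`Replica`, `record_applied`, `report_progress`, `cluster_status`) is hypothetical and exists only to show the shape of the idea, not any actual PostgreSQL interface.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A replica that keeps its own apply progress locally."""
    name: str
    # Maps a source node id to the last LSN applied from that source.
    progress: dict = field(default_factory=dict)

    def record_applied(self, source_id, lsn):
        """Remember the latest LSN applied from one source."""
        self.progress[source_id] = lsn

    def report_progress(self):
        """Answer a status query with a copy of the local state."""
        return dict(self.progress)


def cluster_status(replicas):
    """Build the overview by asking every replica, storing nothing."""
    return {r.name: r.report_progress() for r in replicas}


# Usage: replica "b" is restored from an older backup, and the overview
# reflects that immediately because it is derived from the replicas.
a = Replica("a")
b = Replica("b")
a.record_applied(1, "0/10000600")
b.record_applied(1, "0/10000600")
b.progress = {1: "0/10000200"}  # simulate restoring from an old backup
status = cluster_status([a, b])
```

Because the overview is recomputed from the replicas each time, there is no separate copy of the state to fall out of sync after a restore or failover.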


--
greg
