Re: Replication identifiers, take 4 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Replication identifiers, take 4
Date
Msg-id 20150216091839.GA20205@awork2.anarazel.de
In response to Re: Replication identifiers, take 4  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Replication identifiers, take 4  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On 2015-02-16 11:07:09 +0200, Heikki Linnakangas wrote:
> On 02/16/2015 02:21 AM, Andres Freund wrote:
> >Furthermore the fact that the origin of records is recorded allows to
> >avoid decoding them in logical decoding. That has both efficiency
> >advantages (we can do so before they are stored in memory/disk) and
> >functionality advantages. Imagine using a logical replication solution
> >to replicate inserts to a single table between two databases where
> >inserts are allowed on both - unless you prevent the replicated inserts
> >from being replicated again you obviously have a loop. This
> >infrastructure lets you avoid that.
> 
> That makes sense.
> 
> How does this work if you have multiple replication systems in use in the
> same cluster? You might use Slony to replicate one table to one system,
> and BDR to replicate another table to another system. Or the same
> replication software, but different hosts.

It should "just work". Replication identifiers are named by free-form
text, so each replication solution can put whatever
information/distinguishing data it needs in there.

BDR uses something like
#define BDR_NODE_ID_FORMAT "bdr_"UINT64_FORMAT"_%u_%u_%u_%s"
with
remote_sysid, remote_tlid, remote_dboid, MyDatabaseId, configurable_name
as parameters as a replication identifier name.
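To make the naming scheme concrete, here is a minimal standalone sketch of how such a name could be assembled. It is not BDR's actual code: build_node_id() is a hypothetical helper, UINT64_FORMAT is redefined locally as a stand-in for the PostgreSQL macro, and the plain unsigned parameters stand in for TimeLineID and Oid.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: local stand-in for PostgreSQL's UINT64_FORMAT macro. */
#define UINT64_FORMAT "%" PRIu64
#define BDR_NODE_ID_FORMAT "bdr_" UINT64_FORMAT "_%u_%u_%u_%s"

/*
 * Hypothetical helper: formats a replication identifier name into buf.
 * Returns the number of characters that the full name requires (as
 * snprintf does), so the caller can detect truncation by comparing the
 * result against buflen.
 */
static int
build_node_id(char *buf, size_t buflen,
              uint64_t remote_sysid, unsigned remote_tlid,
              unsigned remote_dboid, unsigned local_dboid,
              const char *configurable_name)
{
    return snprintf(buf, buflen, BDR_NODE_ID_FORMAT,
                    remote_sysid, remote_tlid,
                    remote_dboid, local_dboid,
                    configurable_name);
}
```

Since the remote system identifier, timeline and both database OIDs are all part of the name, two replication solutions (or the same one pointed at different hosts) end up with distinct identifiers without coordinating with each other.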

I've been wondering whether the bdr_ part in the above should be split
off into a separate field, similar to how the security label stuff does
it. But I don't think it'd really buy us much, especially as we did
not do that for logical slot names.

Each replication solution in use would probably ask its output
plugin to only stream locally generated (i.e. origin_id =
InvalidRepNodeId) changes, and possibly changes from a defined list of
other known hosts in the cascading case.
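Roughly, the filtering decision could look like the sketch below. This is only an illustration, not the patch's API: should_stream_change() is a hypothetical name, and the RepNodeId typedef and the value 0 for InvalidRepNodeId are assumptions made so the example compiles standalone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch: a small integer type, with 0 meaning "local". */
typedef uint16_t RepNodeId;
#define InvalidRepNodeId ((RepNodeId) 0)

/*
 * Hypothetical filter: returns true if a change with the given origin
 * should be streamed. Locally generated changes always pass; replayed
 * changes pass only if their origin is in the forward list (the
 * cascading case). An empty list means "local changes only".
 */
static bool
should_stream_change(RepNodeId origin_id,
                     const RepNodeId *forward_origins,
                     size_t n_forward_origins)
{
    if (origin_id == InvalidRepNodeId)
        return true;

    for (size_t i = 0; i < n_forward_origins; i++)
    {
        if (forward_origins[i] == origin_id)
            return true;
    }
    return false;
}
```

With an empty forward list, anything replayed from another node is dropped, which is what breaks the loop in the two-database insert example.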

Does that answer your question?

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


