Re: [PATCH 07/16] Log enough data into the wal to reconstruct logical changes from it if wal_level=logical - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH 07/16] Log enough data into the wal to reconstruct logical changes from it if wal_level=logical
Date
Msg-id 201206131757.33212.andres@2ndquadrant.com
Whole thread Raw
In response to Re: [PATCH 07/16] Log enough data into the wal to reconstruct logical changes from it if wal_level=logical  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Wednesday, June 13, 2012 05:27:06 PM Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
> > This adds a new wal_level value 'logical'
> > 
> > Missing cases:
> > - heap_multi_insert
> > - primary key changes for updates
> > - no primary key
> > - LOG_NEWPAGE
> 
> First, Wow!
Thanks ;) I hope you will still be convinced after reading some of the code :P

> I look forward to the point where we can replace our trigger-based
> replication with this!  Your "missing cases" for primary key issues
> would not cause us any pain for our current system, since we require
> a primary key and don't support updates to PKs for replicated
> tables. While I don't expect that the first cut of this will be able
> to replace our replication-related functionality, I'm interested in
> making sure it can be extended in that direction, so I have a couple
> things to consider:
Ok.

> (1)  For our usage, with dozens of source databases feeding into
> multiple aggregate databases and interfaces, DDL replication is not
> of much if any interest.  It should be easy enough to ignore as long
> as it is low volume, so that doesn't worry me too much; but if I'm
> missing something any you run across any logical WAL logging for DDL
> which does generate a lot of WAL traffic, it would be nice to have a
> way to turn that off at generation time rather than filtering it or
> ignoring it later.  (Probably won't be an issue, just a head-up.)
I don't really see a problem there. I don't yet have a mental image of the 
API/Parameters to START_LOGICAL_REPLICATION to specify filters on the source 
side, but this should be possible.

> (2)  To match the functionality we now have, we would need the
> logical stream to include the *before* image of the whole tuple for
> each row updated or deleted.  I understand that this is not needed
> for the use cases you are initially targeting; I just hope the
> design leaves this option open without needing to disturb other use
> cases.  Perhaps this would require yet another wal_level value.
> Perhaps rather than testing the current value directly for
> determining whether to log something, the GUC processing could set
> some booleans for faster testing and less code churn when the
> initial implementation is expanded to support other use cases (like
> ours).
Hm. I don't see a big problem implementing this although I have to say that I 
am a bit hesitant to do this without in-core users of it for fear of silent 
breakage. WAL is kind of a central thing... ;). But then, the implementation 
should be relatively easy.
I don't see a need to break the wal level down into some booleans: changing 
the test from wal_level >= WAL_LEVEL_LOGICAL into something else shouldn't 
result in any measurable difference in that codepath.

I definitely have the use-case of replicating into databases where you need to 
do some transformation in mind.

> (3)  Similar to point 2, it would be extremely desirable to be able
> to determine table name and columns names for the tuples in a stream
> from that stream, without needing to query a hot standby or similar
> digging into other sources of information.  Not only will the
> various source databases all have different OID values for the same
> objects, and the aggregate targets have different values from each
> other and the sources, but some targets don't have the tables at
> all.  I'm talking about our database transaction repository and the
> interfaces to business partners which we currently drive off of the
> same transaction stream which drives replication.
I don't forsee this as a realistic thing. I think the required changes would 
be way to intrusive for too little gain. The performance and space 
requirements would probably also be rather noticeable. I think you will have 
to live with the mentioned 'proxy' pg instances which only contain the catalog 
(which normally isn't very big anyway) doing the decoding into your target 
database (which then doesn't need the same oids).
For how I imagine those proxy instances check my mail to merlin earlier today, 
I described it in some more detail there.
If I can find the time I should possibly develop (another) prototype of such a 
proxy instance. I don't forsee it needing much more infrastructure/code.

Is that a problem for your use-case? If yes, why?

> Would it be helpful or just a distraction if I were to provide a
> more detailed description of our whole replication / transaction
> store / interface area?
If you have it ready, yes. Otherwise I think I have a good enough image 
already. I tried to listen to you at pgcon and I have developed similar things 
before.
To be honest, I would prefer you spending the time on checking some of the 
code if not ;)

> If it would be useful, I could also describe some other replication
> patterns I have seen over the years.  In particular, one which might
> be interesting is where subsets of the data are distributed to
> multiple standalone machines which have intermittent or unreliable
> connections to a central site, which periodically collects data from
> all the remote sites, recalculates distribution, and sends
> transactions back out to those remote sites to add, remove, and
> update rows based on the distribution rules and the new data.
I don't think its that relevant for now. I think that time is spent more 
wisely when we get to conflict resolution and user interface...

Thanks!

Andres

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: 9.3devel branch
Next
From: Robert Haas
Date:
Subject: Re: 9.3devel branch