Hi Merlin,
On Wednesday, June 13, 2012 04:21:12 PM Merlin Moncure wrote:
> On Wed, Jun 13, 2012 at 6:28 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > +synchronized catalog at the decoding site. That adds some complexity to use
> > +cases like replicating into a different database or cross-version
> > +replication. For those it is relatively straight-forward to develop a proxy pg
> > +instance that only contains the catalog and does the transformation to textual
> > +changes.
> wow. Anyways, could you elaborate a little on how this proxy
> instance concept would work?
To decode into another form you need an up-to-date catalog plus the correct
binaries. So the idea would be to have a minimal instance which is just a copy
of the database containing only the tables with an oid < FirstNormalObjectId,
i.e. only the catalog tables. You can then apply all xlog changes to system
tables using the existing infrastructure for HS (or use the command trigger
equivalent we need to build for BDR) and decode everything else into the
ApplyCache, just as done in the patch. You would then fill out the callbacks
for the ApplyCache (see patches 14/16 and 15/16 for an example) to do whatever
you want with the data, e.g. generate plain SQL statements or run some
transform procedure.
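To make the callback idea a bit more concrete, here is a rough sketch in
Python rather than the C the patch uses. The `Change` record and the
`change_to_sql` callback are hypothetical stand-ins and not the ApplyCache
API from the patch; the sketch only shows the shape of turning one decoded
row change into a plain SQL statement.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """Hypothetical decoded row change; not the patch's ApplyCache struct."""
    action: str                               # "INSERT", "UPDATE" or "DELETE"
    table: str                                # name already resolved via the catalog
    new: dict = field(default_factory=dict)   # column -> new value
    key: dict = field(default_factory=dict)   # column -> key value (UPDATE/DELETE)


def quote_ident(name):
    """Double-quote an identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Minimal literal quoting, sufficient for this sketch only."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def change_to_sql(change):
    """Callback: render one decoded change as a textual SQL statement."""
    table = quote_ident(change.table)
    if change.action == "INSERT":
        cols = ", ".join(quote_ident(c) for c in change.new)
        vals = ", ".join(quote_literal(v) for v in change.new.values())
        return f"INSERT INTO {table} ({cols}) VALUES ({vals});"
    # Key columns are assumed to be non-NULL, so "=" comparisons are safe.
    where = " AND ".join(
        f"{quote_ident(c)} = {quote_literal(v)}" for c, v in change.key.items()
    )
    if change.action == "UPDATE":
        sets = ", ".join(
            f"{quote_ident(c)} = {quote_literal(v)}" for c, v in change.new.items()
        )
        return f"UPDATE {table} SET {sets} WHERE {where};"
    if change.action == "DELETE":
        return f"DELETE FROM {table} WHERE {where};"
    raise ValueError(f"unknown action: {change.action}")
```

A proxy instance would call something like `change_to_sql` once per decoded
change and ship the resulting text to the target database.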
> Let's take the case where I have N small-ish schema identical database
> shards that I want to aggregate into a single warehouse -- something that
> HS/SR currently can't do.
> There's a lot of ways to do that obviously but assuming the warehouse
> would have to have a unique schema, could it be done in your
> architecture?
I'm not sure what you mean by the warehouse having a unique schema. Does it
have the same schema as the OLTP counterparts? That would obviously be the
easy case, provided you take care to guarantee uniqueness of keys upfront.
That would basically be trivial ;)
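One common way to guarantee key uniqueness upfront, offered here as an
assumption rather than anything prescribed in this thread, is to hand each of
the N shards an interleaved key range: shard i starts at i and increments by
N. The helper name `shard_keys` below is made up for illustration.

```python
def shard_keys(shard_index, shard_count, how_many):
    """Return the first `how_many` keys for one shard.

    shard_index is 1-based. Keys from different shards never collide
    because each shard only produces values congruent to its own index
    modulo shard_count.
    """
    if not 1 <= shard_index <= shard_count:
        raise ValueError("shard_index must be between 1 and shard_count")
    return [shard_index + n * shard_count for n in range(how_many)]
```

In PostgreSQL terms this corresponds to giving each shard a sequence created
with START WITH i INCREMENT BY N.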
It gets a bit more complex if you need to transform the data for the
warehouse. I don't plan to put in work to make that possible without some C
coding (filling out the callbacks and doing the work in there). It shouldn't
need much though.
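As a rough idea of how small such a transform can be, here is a sketch in
Python (the real callbacks would be C). The function `make_transform`, the
`source_shard` column and the rename mapping are all hypothetical; the point
is only that the per-row work done inside a callback can be a few lines.

```python
def make_transform(shard_id, renames):
    """Build a hypothetical per-shard transform callback.

    renames maps OLTP column names to warehouse column names; columns not
    listed keep their name. Every row is tagged with the shard it came from.
    """
    def transform(row):
        out = {renames.get(col, col): val for col, val in row.items()}
        out["source_shard"] = shard_id
        return out
    return transform
```

Each shard's decoding stream would get its own transform, e.g.
`make_transform(2, {"id": "order_id"})`, applied to every row before it is
written to the warehouse.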
Does that answer your question?
Andres
--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services