Re: [PATCH 16/16] current version of the design document - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [PATCH 16/16] current version of the design document
Msg-id 201206131640.32594.andres@2ndquadrant.com
In response to Re: [PATCH 16/16] current version of the design document  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: [PATCH 16/16] current version of the design document  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-hackers
Hi Merlin,

On Wednesday, June 13, 2012 04:21:12 PM Merlin Moncure wrote:
> On Wed, Jun 13, 2012 at 6:28 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > +synchronized catalog at the decoding site. That adds some complexity to use
> > +cases like replicating into a different database or cross-version
> > +replication. For those it is relatively straight-forward to develop a proxy pg
> > +instance that only contains the catalog and does the transformation to textual
> > +changes.
> wow.  Anyways, could you elaborate a little on how this proxy
> instance concept would work?
To decode into another form you need an up-to-date catalog and the correct
binaries. So the idea is to have a minimal instance that is just a copy of the
database containing only the tables with an oid < FirstNormalObjectId, i.e.
only the catalog tables. You can then apply all xlog changes to system tables
using the existing infrastructure for HS (or use the command trigger
equivalent we need to build for BDR) and decode everything else into the
ApplyCache, just as the patch does. Then you fill out the callbacks
for the ApplyCache (see patches 14/16 and 15/16 for an example) to do whatever
you want with the data, e.g. generate plain SQL statements or run some
transform procedure.
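
As a rough illustration of that split, here is a minimal sketch in C. Every
type and function name in it is hypothetical and made up for this example;
it is not the ApplyCache API from the patch. The only value taken from
PostgreSQL is 16384, which is what FirstNormalObjectId is defined as in
access/transam.h. Changes to catalog tables go to one callback (standing in
for "apply them to the proxy's catalog"), everything else goes to a callback
that emits textual SQL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same value as FirstNormalObjectId in PostgreSQL's access/transam.h. */
#define FIRST_NORMAL_OBJECT_ID 16384u

/* Hypothetical, heavily simplified decoded change (not the patch's struct). */
typedef enum { CHANGE_INSERT, CHANGE_DELETE } ChangeKind;

typedef struct DecodedChange
{
    unsigned int relid;     /* oid of the table the change touches */
    ChangeKind   kind;
    const char  *relname;   /* name looked up in the proxy's catalog */
    const char  *key_col;
    const char  *key_val;   /* value already converted to text */
} DecodedChange;

/* Hypothetical callback set that the consumer fills out. */
typedef struct ChangeCallbacks
{
    void (*apply_catalog)(const DecodedChange *c, void *arg);
    void (*emit_change)(const DecodedChange *c, void *arg);
} ChangeCallbacks;

/* Catalog tables are the ones with an oid below FirstNormalObjectId. */
static int is_catalog_relation(unsigned int relid)
{
    return relid < FIRST_NORMAL_OBJECT_ID;
}

/* Route one change: catalog changes are applied, the rest is handed on. */
static void dispatch_change(const ChangeCallbacks *cb,
                            const DecodedChange *c, void *arg)
{
    assert(cb != NULL && c != NULL);
    if (is_catalog_relation(c->relid))
        cb->apply_catalog(c, arg);
    else
        cb->emit_change(c, arg);
}

/* Example consumer state: collected SQL text plus a catalog-change counter. */
typedef struct SqlSink
{
    char buf[256];
    int  catalog_changes;
} SqlSink;

/* Stand-in for applying a catalog change; here it only counts them. */
static void count_catalog_change(const DecodedChange *c, void *arg)
{
    (void) c;
    ((SqlSink *) arg)->catalog_changes++;
}

/* Append a plain SQL statement. No quoting or escaping is done here;
 * real code would have to quote identifiers and literals properly. */
static void emit_sql(const DecodedChange *c, void *arg)
{
    SqlSink *sink = (SqlSink *) arg;
    size_t   used = strlen(sink->buf);

    if (c->kind == CHANGE_INSERT)
        snprintf(sink->buf + used, sizeof(sink->buf) - used,
                 "INSERT INTO %s (%s) VALUES ('%s');\n",
                 c->relname, c->key_col, c->key_val);
    else
        snprintf(sink->buf + used, sizeof(sink->buf) - used,
                 "DELETE FROM %s WHERE %s = '%s';\n",
                 c->relname, c->key_col, c->key_val);
}
```

A caller would fill a ChangeCallbacks with count_catalog_change and emit_sql
and pass each decoded change through dispatch_change(); swapping emit_sql for
a different function is where a custom transform would go.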

> Let's take the case where I have N small-ish schema identical database
> shards that I want to aggregate into a single warehouse -- something that
> HS/SR currently can't do.
> There's a lot of ways to do that obviously but assuming the warehouse
> would have to have a unique schema, could it be done in your
> architecture?
I'm not sure what you mean by the warehouse having a unique schema. Does it
have the same schema as its OLTP counterparts? That would obviously be the easy
case, provided you take care to guarantee uniqueness of keys upfront. That
would basically be trivial ;)
It gets a bit more complex if you need to transform the data for the
warehouse. I don't plan to put in work to make that possible without some C
coding (filling out the callbacks and doing the work in there). It shouldn't
need much, though.
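
To give a feel for how small such a transformation could be, here is a
sketch in C of one possible approach for the N-shards-into-one-warehouse
case: the callback prefixes every row with the id of the shard it came from,
so keys that collide across shards stay distinct in the warehouse. The
warehouse layout (an extra shard_id column), the struct names and the
function name are all assumptions made up for this example; none of them
come from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-shard state handed to the transform callback. */
typedef struct ShardContext
{
    int  shard_id;      /* which source shard the stream comes from */
    char out[256];      /* last generated statement */
} ShardContext;

/* Hypothetical decoded row: a key plus one payload column as text. */
typedef struct ShardRow
{
    const char *table;
    long        id;
    const char *payload;
} ShardRow;

/*
 * Turn an insert seen on one shard into an insert for the warehouse,
 * adding shard_id so that (shard_id, id) is unique across all shards.
 * Returns 1 on success, 0 if the statement did not fit into the buffer.
 * The payload is not escaped here; real code would have to do that.
 */
static int transform_insert(const ShardRow *row, ShardContext *ctx)
{
    int n;

    assert(row != NULL && ctx != NULL);
    n = snprintf(ctx->out, sizeof(ctx->out),
                 "INSERT INTO %s (shard_id, id, payload) VALUES (%d, %ld, '%s');",
                 row->table, ctx->shard_id, row->id, row->payload);
    return n > 0 && (size_t) n < sizeof(ctx->out);
}
```

If the shards already hand out non-overlapping keys, none of this is needed
and the rows can be passed through unchanged.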

Does that answer your question?

Andres

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

