Re: Changeset Extraction v7.6.1 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Changeset Extraction v7.6.1
Date
Msg-id CA+TgmoZEd4wbNPn-P6BrXXhs7_fPVfUWM_Nryo_XBM2RruKU_Q@mail.gmail.com
In response to Re: Changeset Extraction v7.6.1  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Changeset Extraction v7.6.1  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Tue, Feb 18, 2014 at 4:33 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-02-17 21:35:23 -0500, Robert Haas wrote:
>> What
>> I don't understand is why we're not taking the test_decoding module,
>> polishing it up a little to produce some nice, easily
>> machine-parseable output, calling it basic_decoding, and shipping
>> that.  Then people who want something else can build it, but people
>> who are happy with something basic will already have it.
>
> Because every project is going to need their own plugin
> *anyway*. Londiste, slony sure are going to ignore changes to relations
> they don't need. Querying their own metadata. They will want
> compatibility to the earlier formats as far as possible. Sometime not
> too far away they will want to optionally support binary output because
> it's so much faster.
> There's just not much chance that either of these will be able to agree
> on a format short term.

Ah, so part of what you're expecting the output plugin to do is
filtering.  I can certainly see where there might be considerable
variation between solutions in that area - but I think that's separate
from the question of formatting per se.  Although I think we should
have an in-core output plugin with filtering capabilities eventually,
I'm happy to define that as out of scope for 9.4.  But isn't there a
way that we can ship something that will do for people who want to
just see the database's entire change stream float by?

TBH, as compared to what you've got now, I think this mostly boils
down to a question of quoting and escaping.  I'm not really concerned
with whether we ship something that's perfectly efficient, or that has
filtering capabilities, or that has a lot of fancy bells and whistles.
What I *am* concerned about is that if the user updates a text field
that contains characters like " or ' or : or [ or ] or , that somebody
might be using as delimiters in the output format, that a program can
still parse that output format and reliably determine what the actual
change was.  I don't care all that much whether we use JSON or CSV or
something custom, but the data that gets spit out should not have
SQL-injection-like vulnerabilities.
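
To make the round-trip requirement concrete, here is a minimal sketch in
Python, assuming a hypothetical one-record-per-line JSON format.  The
names format_change and parse_change and the record layout are invented
for illustration; this is not what test_decoding emits, and JSON is just
one of the candidate formats mentioned above.

```python
import json


def format_change(table, op, columns):
    """Serialize one row change as a single line of JSON.

    The record layout (table/op/columns) is a hypothetical example,
    not the output format of any existing plugin.
    """
    record = {"table": table, "op": op, "columns": columns}
    # json.dumps escapes quotes, backslashes and control characters, so a
    # newline inside a value cannot break one-record-per-line framing.
    return json.dumps(record, sort_keys=True)


def parse_change(line):
    """Recover the change record from one line of output."""
    return json.loads(line)


# A text value containing the delimiter characters listed above.
tricky = 'he said "hi", it\'s [a]: b\nsecond line'
line = format_change("public.notes", "UPDATE", {"id": 1, "body": tricky})

assert "\n" not in line                                  # still one line
assert parse_change(line)["columns"]["body"] == tricky   # exact round trip
```

The point is only that a consumer recovers exactly the value that was
written, whatever characters it contains; a CSV or custom format would
do as well so long as it has a complete escaping rule.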

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


