Re: YAML Was: CommitFest status/management - Mailing list pgsql-hackers

From Robert Haas
Subject Re: YAML Was: CommitFest status/management
Msg-id 603c8f070912041733o5fcc43c1x6ca43799b9d5bf8e@mail.gmail.com
In response to Re: YAML Was: CommitFest status/management  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Fri, Dec 4, 2009 at 7:42 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> On top of that, if you did want YAML for easier readability, what
>> aspect of the output is more readable in YAML than it is in text
>> format?  The only answer I can think of is that you like having each
>> data element on a separate line, so that the plan is much longer but
>> somewhat narrower.  But if that's what you want, the JSON output is
>> almost as good - the only difference is a bit of extra punctuation.
>
> "almost as good" ... I agree with Kevin that it's more readable.
>
> The whole patch just adds 144 lines.  It doesn't look to me like there's
> significant maintenance burden involved, but of course I need to defer
> to the more experienced.  It's even possible that we could reduce the
> size of the patch still further if we really looked at it as just a
> differently punctuated JSON.
>
> Having compared the JSON and YAML output formats, I think having YAML as
> a 2nd human-readable format might be valuable, even though it adds
> nothing to machine-processing.
>
> Again, if there were a sensible way to do YAML as a contrib module, I'd
> go for that, but there isn't.

I don't think the maintenance burden is the issue, per se.  It's more
that I don't like the idea of putting in a bunch of formats that are
only trivially different from each other, and if there were ever a
case where we should reject a new format because it is too similar to
an existing one, this seems to be it.  If that's a bad reason for
rejecting a new format, then I don't have a second one, but we may end
up with a lot of formats - and that WILL be a maintenance burden,
especially if we ever want to make non-trivial extensions to the
output format.

Frankly, just adding new fields (perhaps controlled by new options) is
never going to be that big of a deal, but there will certainly come a
day when someone wants to do something more novel, like dumping
parse-tree representations of expressions or something along the lines
of Tom Raney's visual explain tool, which dumped out every path the
planner considered.  I don't really want to be the person who has to
tell the person who writes that patch "sorry, but we have to reject
your patch until it supports every one of our numerous slightly
different output formats".

One possibility for contrib-module-izing this, and perhaps other
output formats that people might like to see, is to write a function
that takes the JSON or XML output as input and does the appropriate
translation.  For something like YAML, whose semantics are so close to
JSON, this should be pretty simple.  One of the reasons I was keen
to get JSON support into the initial version of machine-readable
EXPLAIN is that JSON maps very cleanly onto the type of data
structures that are common in scripting languages: everything is lists
and hashes, nested inside each other, plus text and numeric scalars.
So you can read a JSON object into a script written in Perl, PHP,
Python, Ruby, JavaScript, and probably half a dozen other languages
and get a native object.  From there, it's very easy to write the data
back out in whatever format you happen to prefer just by walking the
data structure.  I suspect that a JSON-to-YAML converter in Perl would
be less than 50 lines.
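
To make the "walk the data structure" idea concrete, here is a minimal
sketch of such a converter.  It is written in Python rather than Perl, it
is not part of any patch discussed in this thread, and the function names
(to_yaml_lines, json_to_yaml) are made up for illustration.  It assumes
the input is ordinary JSON (no NaN or Infinity) and it double-quotes every
key and string, which is always valid YAML but less pretty than what a
real YAML emitter would produce.

```python
import json


def to_yaml_lines(node, depth=0):
    """Return the YAML lines for a parsed JSON value (hypothetical helper)."""
    pad = "  " * depth
    if isinstance(node, dict) and node:
        # Mapping: each entry starts with a quoted key and a colon.
        entries = [(json.dumps(key) + ":", value) for key, value in node.items()]
    elif isinstance(node, list) and node:
        # Sequence: each entry starts with a dash.
        entries = [("-", value) for value in node]
    else:
        # Scalars and empty containers: JSON notation is valid YAML as-is.
        return [pad + json.dumps(node)]

    lines = []
    for prefix, value in entries:
        if isinstance(value, (dict, list)) and value:
            # Non-empty container: put it on the following, deeper-indented lines.
            lines.append(pad + prefix)
            lines.extend(to_yaml_lines(value, depth + 1))
        else:
            lines.append(f"{pad}{prefix} {json.dumps(value)}")
    return lines


def json_to_yaml(text):
    """Parse a JSON document and return it as block-style YAML text."""
    return "\n".join(to_yaml_lines(json.loads(text))) + "\n"


if __name__ == "__main__":
    # Illustrative input only, shaped like EXPLAIN (FORMAT JSON) output.
    sample = '[{"Plan": {"Node Type": "Seq Scan", "Total Cost": 22.5, "Plans": []}}]'
    print(json_to_yaml(sample), end="")
```

The whole thing is a single recursive walk over lists and hashes, which
is why the translation can live outside the server.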

(The XML format can also be transformed using things like XSLT, but
I'm less familiar with those tools.)

...Robert

