From Elein
Subject Re: 7.2 - changed array_out() - quotes vs no quotes
Date
Msg-id 3C642EE1.2070103@nextbus.com
In response to 7.2 - changed array_out() - quotes vs no quotes  (David Gould <dg@nextbus.com>)
List pgsql-hackers

The issue is that changing a user interface will break people's
code.  It is clear that a user interface change was made and
it is also clear that it broke people's code.  In any organization
that is a bug.

I did this job of gating server user interfaces for commercial
postgres-based systems for several years. It is not always a nice job.  The
need to carefully ensure that interfaces were (almost) always backward
compatible was a given, as was the necessity of highlighting any unavoidable
changes in interfaces so that the users would be impacted in the smallest
possible way.

In this case, the container aspect of the datatype requires more studied
parsing than most others, with or without the changes.  Also, the array
handling routines in JDBC were only completed in 7.2, so we can assume
that Java users had to do their own parsing of the representation of
this container type.

The counterargument that there were no complaints
is not particularly relevant, since few people have been using the
beta versions; most have been waiting for 7.2 to stabilize, since it requires
a significant effort to do the initdb.  It is safe to say that the
masses will be a bit slower to adopt an initdb release than
others.

You all have been stellar at supporting postgresql and responding to users
in very helpful and immediate ways.  We have benefited directly from that.
Keep up the good work by not changing user interfaces for anything other
than unavoidable circumstances.

elein

Tom Lane wrote:

> David Gould <dg@nextbus.com> writes:
> 
>>Yes. I think it is not excessive to insist that types have stable,
>>predictable representations. The other types do, why should arrays be
>>even more special?
>>
> 
> The representation is stable and predictable.  You're simply hoping to
> avoid building smarts into your parser for it.  Unfortunately, some
> degree of smarts are *necessary* if you are going to deal with array
> items containing arbitrary text.  I can hardly believe that a client
> program that can deal with backslash-escapes is going to have trouble
> removing quotes.
> 
> 
>>Or, you don't even need the quotes, you could just promise never
>>to insert white space and to always escape embedded commas and curlys.
>>
> 
> No, we can't, because that would break applications that rely on the
> existing rules for array input: leading whitespace is insignificant
> unless quoted.  Besides, weren't you complaining because the quotes
> disappeared?  The above variant would still break your code.
> 
> 
>>So a dumb client could simply split on un-escaped commas and be done.
>>
> 
> I hardly think that a client that can tell the difference between an
> escaped comma and an un-escaped one qualifies as "dumb".
> 
> We could perhaps dispense with quotes on output if we escaped leading
> spaces.  For example, instead of
>     "  foo"
> emit
>     \  foo
> I don't think this is a step forward in readability, though.  And
> increased reliance on backslashes instead of double quotes won't really
> make anyone's life easier --- for example, you'd have to remember to
> double them when sending the same value back to the SQL parser.
> 
> 
>>>The only way I could see to make the behavior totally predictable at
>>>the datatype level (while not being broken) is to always quote every
>>>array element.
>>>
> 
>>Fine with me. That is what it did before.
>>
> 
> No, it has never done that.  In particular, I do not wish to change the
> longstanding no-quotes behavior for arrays of integers.  That *would*
> break other people's code.  (One of the things I hoped to accomplish
> with this change is to extend the same no-quotes behavior to floats and
> numerics.)
> 
> 
>>But to slip a client visible change late in a beta cycle to a specific
>>format that has been stable since UC Berkeley freed the code,
>>
> 
> It's been broken since Berkeley, too; the fact that no one complained
> till a month or two ago just indicates how little arrays are used, IMHO.
> I doubt you'd be any less annoyed no matter when in the development
> cycle we'd done this.
> 
> I do agree that it'd be better if this had been called out in the
> release notes.  We don't currently have any process for ensuring that
> minor incompatibilities get noted in the release notes.  Bruce makes up
> the notes based on scanning the CVS logs after the fact, and if he
> misses the significance of an entry, it's missed.  Maybe we can do
> better than that --- adding an entry to a release-notes-to-be file when
> the change is made might be more reliable.
> 
> It's also true that the SGML documentation is sadly deficient on this
> point; but then, its discussion of arrays is overly terse in just about
> every respect.  Someone want to volunteer to expand it?
> 
> 
>>Seriously, one point of a database is to insulate client applications
>>from the exact representation and layout of the data. Which is not
>>accomplished by making arbitrary changes to simple things like strings
>>that make them take yards and yards of code to parse.
>>
> 
> Properly parsing arrays of text values is going to require dealing with
> backslash-escapes in any case; seems to me that that's what will take
> "yards and yards" of code.  Stripping off optional quotes is trivial
> by comparison.  On the other hand, parsing arrays of integers is pretty
> trivial since you know there are no escapable characters anywhere.
> I don't favor pushing complexity out of the one case and into the other.
> 
> I'm willing to consider the output-no-quotes-at-all approach if people
> think that's a superior solution.  Comments anyone?
> 
>             regards, tom lane
> 
> 
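The parsing burden debated in the quoted message can be made concrete with a minimal client-side sketch. It assumes only the rules stated in the thread: elements are separated by commas, a backslash escapes the next character, double quotes around an element are optional, and leading whitespace is insignificant unless quoted or escaped. The function name parse_array_text is hypothetical, and the sketch handles one-dimensional arrays only; it does not reproduce the server's own array input code, and it ignores nested arrays and any other special cases.

```python
def parse_array_text(text):
    """Split a one-dimensional array literal such as {a,"b c",d} into strings.

    Illustrative sketch only: the rules implemented here are the ones
    described in the thread, not a copy of the server's parser.
    """
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("expected a value wrapped in curly braces")
    body = text[1:-1]

    items = []
    current = []        # characters collected for the item being built
    in_quotes = False   # True while inside a double-quoted section
    was_quoted = False  # True once the current item has seen a quote
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # A backslash makes the next character literal, quoted or not.
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            current.append(body[i + 1])
            i += 2
            continue
        if ch == '"':
            # Quotes are delimiters, not data; just toggle the state.
            in_quotes = not in_quotes
            was_quoted = True
        elif ch == "," and not in_quotes:
            # An unquoted, unescaped comma ends the current item.
            items.append("".join(current))
            current, was_quoted = [], False
        elif ch.isspace() and not in_quotes and not current and not was_quoted:
            # Leading whitespace is insignificant unless quoted or escaped.
            pass
        else:
            current.append(ch)
        i += 1

    if in_quotes:
        raise ValueError("unterminated double quote")
    if body:
        items.append("".join(current))
    return items


if __name__ == "__main__":
    print(parse_array_text('{1,2,3}'))
    print(parse_array_text('{"  foo",bar}'))
    print(parse_array_text(r'{a\,b,"c,d"}'))
```

Under these assumptions the quote handling amounts to one branch, while the backslash handling is what forces character-by-character scanning instead of a plain split on commas. The same loop accepts both the quoted form "  foo" and the escaped form \  foo for an element with leading spaces.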



-- 
--------------------------------------------------------
elein@nextbus.com 
(510)420-3120 
www.nextbus.com
spinning to infinity, hallelujah
--------------------------------------------------------


