Re: json api WIP patch - Mailing list pgsql-hackers
From | Daniel Farina |
---|---|
Subject | Re: json api WIP patch |
Date | |
Msg-id | CAAZKuFb6Ho7MgiBuDt4trCf_gqh6bChJO46WnUfhbegOJUn66g@mail.gmail.com Whole thread Raw |
In response to | Re: json api WIP patch (Andrew Dunstan <andrew@dunslane.net>) |
List | pgsql-hackers |
On Tue, Jan 15, 2013 at 12:17 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > On 01/15/2013 02:47 PM, Merlin Moncure wrote: >> >> On Tue, Jan 15, 2013 at 1:04 PM, David Fetter <david@fetter.org> wrote: >>> >>> On Mon, Jan 14, 2013 at 07:52:56PM -0500, Andrew Dunstan wrote: >>>> >>>> On 01/14/2013 07:36 PM, Merlin Moncure wrote: >>>>> >>>>> While testing this I noticed that integer based 'get' routines are >>>>> zero based -- was this intentional? Virtually all other aspects of >>>>> SQL are 1 based: >>>>> >>>>> postgres=# select json_get('[1,2,3]', 1); >>>>> json_get >>>>> ---------- >>>>> 2 >>>>> (1 row) >>>>> >>>>> postgres=# select json_get('[1,2,3]', 0); >>>>> json_get >>>>> ---------- >>>>> 1 >>>>> (1 row) >>>> >>>> Yes. it's intentional. SQL arrays might be 1-based by default, but >>>> JavaScript arrays are not. JsonPath and similar gadgets treat the >>>> arrays as zero-based. I suspect the Json-using community would not >>>> thank us for being overly SQL-centric on this - and I say that as >>>> someone who has always thought zero based arrays were a major design >>>> mistake, responsible for countless off-by-one errors. >>> >>> Perhaps we could compromise by making arrays 0.5-based. >> >> Well, I'm not prepared to argue with Andrew in this one. It was >> surprising behavior to me, but that's sample size one. > > I doubt I'm very representative either. People like David Wheeler, Taras > Mitran, Joe Van Dyk, and the Heroku guys would be better people to ask than > me. I'm quite prepared to change it if that's the consensus. Hello. I'm inclined to go with the same gut feeling you had (zero-based-indexing). Here is the background for my reasoning: The downside of zero-based-indexing is that people who want to use multiple sequential container types will inevitably have to deal with detailed and not easily type-checked integer coordinates that mean different things in each domain that will, no doubt, lead to a number of off-by-one errors. Nevertheless, this cost is already paid because one of the first things many people will do in programs generating SQL queries is try to zero-index a SQL array, swear a bit after figuring things out (because a NULL will be generated, not an error), and then adjust all the offsets. So, this is not a new problem. On many occasions I'm sure this has caused off-by-one bugs, or the NULLs slipped through testing and delivered funny results, yet the world moves on. On the other hand, the downside of going down the road of 1-based indexing and attempting to attain relative sameness to SQL arrays, it would also feel like one would be obliged to implement SQL array infelicities like 'out of bounds' being SQL NULL rather than an error, related to other spectres like non-rectangular nested arrays. SQL array semantics are complex and The Committee can change them or -- slightly more likely -- add interactions, so it seems like a general expectation that Postgres container types that happen to have any reasonable ordinal addressing will implement some level of same-ness with SQL arrays is a very messy one. As such, if it becomes customary to implement one-based indexing of containers, I think such customs are best carefully circumscribed so that attempts to be 'like' SQL arrays are only as superficial as that. What made me come down on the side of zero-based indexing in spite of the weaknesses are these two reasons: * The number of people who use JSON and zero-based-indexing is very large, and furthermore, within that set the number thatknow that SQL even defines array support -- much less that Postgres implements it -- is much smaller. Thus, one is targetingcohesion with a fairly alien concept that is not understood by the audience. * Maintaining PL integrated code that uses both 1-based indexing in PG functions and 0-based indexing in embedded languagesthat are likely to be combined with JSON -- doesn't sound very palatable, and the use of such PLs (e.g. plv8) seemspretty likely, too. That can probably be a rich source of bugs and frustration. If one wants SQL array semantics, it seems like the right way to get them is coercion to a SQL array value. Then one will receive SQL array semantics exactly. -- fdr
pgsql-hackers by date: