Re: jsonb and nested hstore - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: jsonb and nested hstore
Date
Msg-id 52F99417.5080306@dunslane.net
In response to Re: jsonb and nested hstore  (Tom Dunstan <pgsql@tomd.cc>)
List pgsql-hackers
On 02/10/2014 08:50 PM, Tom Dunstan wrote:
> On 10 February 2014 20:11, Hannu Krosing <hannu@krosing.net> wrote:
>> The fastest and lowest parsing cost format for "JSON" is tnetstrings
>> (http://tnetstrings.org/). Why not use it as the binary wire format?
>>
>> It would be as binary as it gets and still be generally parse-able by
>> lots of different platforms, at least by all of those we care about.
> If we do go down the binary encoding path in a future release, can I
> please suggest *not* using something like tnetstrings, which suffers
> the same problem that a few binary transport formats suffer,
> particularly when they're developed by people whose native language
> doesn't distinguish between byte arrays and strings - all strings are
> considered byte arrays and it's up to an application to decide on
> character encoding and which things are data vs strings in the
> application.
>
> This makes writing a parser in a language which does treat byte arrays
> and strings differently very difficult. See, e.g., the Java tnetstrings
> API [1], which is forced into treating strings as byte arrays until the
> programmer asks it to parse the thing again, this time treating
> everything as a string. The msgpack people after much
> wrangling have ended up issuing a new version of the protocol which
> avoids this issue and which they are strongly encouraging users to
> switch to, see [2] for the gory details.
>
> While we may not ever store types in our jsonb format other than the
> standard json data types (I can foresee people wanting to do it,
> though), I would strongly recommend picking a format which at least is
> clear that a value is a string (text, whatever), and preferably makes
> it clear what the character encoding is. Or maybe it should just
> follow whatever the client encoding is at the time - as long as that
> is completely unambiguous to a client.
>
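
To make the byte-array-versus-string problem concrete, here is a minimal sketch (not from the thread, and not any real library's API). It assumes the tnetstrings wire layout of `<length>:<payload><type tag>`, where the `,` tag marks a string whose payload is raw bytes. The helper name `parse_tnetstring_blob` is hypothetical.

```python
def parse_tnetstring_blob(data):
    """Parse one tnetstring of type ',' and return (payload, remainder).

    Hypothetical helper for illustration; only the ',' type is handled.
    """
    length_text, _, rest = data.partition(b":")
    length = int(length_text)
    payload = rest[:length]
    type_tag = rest[length:length + 1]
    if type_tag != b",":
        raise ValueError("only the ',' (string) type is handled in this sketch")
    return payload, rest[length + 1:]


# One byte of payload. The format carries no character encoding, so the
# receiver cannot tell whether this is text or opaque binary data.
wire = b"1:\xe9,"
payload, remainder = parse_tnetstring_blob(wire)

# Decoding works only if the receiver guesses the sender's encoding.
as_latin1 = payload.decode("latin-1")
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

The parser can only hand back bytes. Whether they become text, and in which encoding, is left to the application, which is the ambiguity described above.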

Its treatment of numbers is also broken from my POV (numbers are not 
just integers or floats), so no, we're not going to use tnetstrings. 
Plus, the whole idea of us moving to text for send/recv was to save 
code, not to have to write new code, so to suggest using it now is to 
ignore the discussion that went on before.
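
As a rough illustration of the numbers point (a sketch added for clarity, not code from the thread): a format that offers only integer and float types cannot carry an arbitrary-precision decimal with a fractional part without loss. Python's `Decimal` stands in here for a PostgreSQL `numeric` value, and the sample value is made up.

```python
from decimal import Decimal

# A value with more significant digits than a binary double can hold.
original = Decimal("12345678901234567890.123456789")

# Round trip through a float, as an integer-or-float-only format would
# force for any value with a fractional part.
via_float = Decimal(repr(float(original)))

# Round trip through the plain text representation.
via_text = Decimal(str(original))

print("float round trip preserved value:", via_float == original)
print("text round trip preserved value: ", via_text == original)
```

The text round trip keeps every digit, while the float round trip does not.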

cheers

andrew


