Home > mailing lists

Re: jsonb and nested hstore - Mailing list pgsql-hackers

From	Andrew Dunstan
Subject	Re: jsonb and nested hstore
Date	March 4, 2014 07:59:14
Msg-id	53155D96.30102@dunslane.net Whole thread Raw
In response to	Re: jsonb and nested hstore (Peter Geoghegan <pg@heroku.com>)
Responses	Re: jsonb and nested hstore
List	pgsql-hackers

Tree view

On 03/03/2014 10:39 PM, Peter Geoghegan wrote:
> On Mon, Mar 3, 2014 at 6:54 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> My aim for 9.4, given constraints of both the development cycle and my time
>> budget, has been to get jsonb to a point where it has equivalent
>> functionality to json, so that nobody is forced to say "well I'll have to
>> use json because it lacks function x." For the processing functions, i.e.
>> those that don't generate json from non-json, this should be true with
>> what's proposed. The jsonb processing functions should be about as fast as,
>> or in some cases significantly faster than, their json equivalents. Parsing
>> text input takes a little longer (surprisingly little, actually), and
>> reserializing takes significantly longer - I haven't had a chance to look
>> and see if we can improve that. Both of these are more or less expected
>> results.
> Okay, that's fine. I'm sure that jsonb has some value without
> hstore-style indexing. That isn't really in question. What is in
> question is why you would choose to give up on those capabilities.


Who has given up?

I did as much as I could given the time constraints I mentioned. That's 
the way Postgres works. People do what they can.


>
>> For 9.5 I would hope that we have at least the equivalent of the proposed
>> hstore classes.
> But the equivalent code to the proposed hstore operator classes is
> *exactly the same* C code. The two types are fully binary coercible in
> the patch, so why delay? Why is that additional step appreciably
> riskier than adopting jsonb? I don't see why the functions associated
> with the operators that comprise, say, the gin_hstore_ops operator
> class represent much additional risk, assuming that jsonb is itself in
> good shape. For example, the new hstore_contains() looks fairly
> innocuous compared to much of the code you are apparently intent on
> including in the first cut at jsonb. Have I missed something? Why are
> those operators riskier than the operators you are intent on
> including?


You are really jumping at conclusions as to what's in my head, 
conclusions that are not justified by anything I have said.

Who said they were riskier? I certainly didn't.

Of course the operators would be the same. We could have them today, by 
migrating the exisiting code into core and making the hstore operators 
use that code instead. I could probably do it in about a day (if I had a 
day to spare). I was actually rather expecting that they would have been 
put there for the jsonb type when Teodor moved some code so we could 
have a jsonb type. But since he didn't we find ourselves where we are today.

If that's what it will take to get agreement I will try to make it happen.


>
> If it is true that you think that's a significant additional risk, a
> risk too far, then it makes sense that you'd defer doing this. I would
> like to know why that is, though, since I don't see it.


I don't, as I said. This whole line of speculation has me quite puzzled.


> Speaking of
> missing operator classes, I'm pretty sure that it's ipso facto
> unacceptable that there is no default btree operator class for the
> type jsonb:
>
> [local]/postgres=# \d+ bar
>                           Table "public.bar"
>   Column | Type  | Modifiers | Storage  | Stats target | Description
> --------+-------+-----------+----------+--------------+-------------
>   i      | jsonb |           | extended |              |
> Has OIDs: no
>
> [local]/postgres=# select * from bar order by i;
> ERROR:  42883: could not identify an ordering operator for type jsonb
> LINE 1: select * from bar order by i;
>                                     ^
> HINT:  Use an explicit ordering operator or modify the query.
> LOCATION:  get_sort_group_operators, parse_oper.c:221
> Time: 6.424 ms
> [local]/postgres=# select distinct i from bar;
> ERROR:  42883: could not identify an equality operator for type jsonb
> LINE 1: select distinct i from bar;
>                          ^
> LOCATION:  get_sort_group_operators, parse_oper.c:226
> Time: 6.457 ms
>


Well, the trouble is that the only one that would make sense is one that 
did in effect "order by i::json", since it would be weird to have these 
different. That might make the ordering slow, but would be easy enough 
to add.


>> But that's really just a start. Frankly, I think we need to
>> think a lot harder about how we want to be able to index this sort of data.
>> The proposed hstore operators appear to me to be at best just scratching the
>> surface of that. I'd like to be able to index jsonb's #> and #>> operators,
>> for example. Unanchored subpath operators could be an area that's
>> interesting to implement and index.
> I'm sure that's true, but it's not our immediate concern. We need to
> think very hard about it to get everything we want, but we also need
> to think somewhat harder about it in order to get even a basic jsonb
> type committed.
>

I think you need to be more accepting of the fact that Postgres 
development is frequently incremental. Nothing that's been proposed 
would prevent future development of the type AFAICT. Enums took us two 
or three releases to get to where we are. Arrays took longer. Even a 
smallish feature like CSV import is still being tweaked about seven 
releases after it was introduced.


cheers

andrew

pgsql-hackers by date:

From: Fabrízio de Royes Mello
Date: 04 March 2014, 07:38:10
Subject: Re: GSoC proposal - "make an unlogged table logged"

From: Andrew Dunstan
Date: 04 March 2014, 08:05:13
Subject: Re: jsonb and nested hstore

Re: jsonb and nested hstore - Mailing list pgsql-hackers

Previous

Next