Re: jsonb and nested hstore - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: jsonb and nested hstore
Date
Msg-id 5320DD50.4070500@fuzzy.cz
Whole thread Raw
In response to Re: jsonb and nested hstore  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On 12.3.2014 22:43, Peter Geoghegan wrote:
> On Wed, Mar 12, 2014 at 2:30 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
>> I think that's unfounded assumption. Many users actually have very
>> little control over the documents or queries - a nice example may be the
>> mail archive, with headers stored in a hstore/jsonb. I have absolutely
>> no control over the headers or queries.
> 
> Maybe, but what do you want me to do to help them? Indexing a
> typical jsonb field is a bad idea, unless you really do want
> something essentially equivalent to full text search (which could be
> justified), or unless you know ahead of time that your documents are
> not going to be heavily nested. The whole basis of your complaints
> seems to be that people won't know that at all.

Well, I would be quite happy with the GIN indexing without the limit I
ran into. I don't think we need to invent something entirely new.

You're right that the index is pretty futile with a condition matching
field/value combination. But what if I'm doing a query with multiple
such conditions, and the combination matches just a small fraction of
rows? GIN index works with that (and the patches from Alexander improve
this case tremendously, IIRC).

I still don't understand how's this similar to fulltext - that seems
pretty unsuitable for a treeish structure, assuming you can't flatten
it. Which you can't, if the queries use paths to access just parts of
the json value.

>> For many usecases, expressional indexes are the right tool. But not for
>> all and I see no reason to just throw some tools away.
> 
> If the tool you're talking about throwing away is the GiST opclass, I
> do not propose to throw that away. I don't think it's important enough
> to justify inclusion in our first cut at this, especially given the
> fact that the code has bugs, and is quite a bit more complex than GIN.
> What's wrong with those reasons?

Meh, I accidentally mixed two responses :-/

I have no problem with expression indexes, but it's not a good solution
to all problems. I certainly can't use them to achieve what I'd like and
I disagree with your assumptions that it doesn't make sense to index
everything / non-interesting keys, or that the documents have
well-defined structure. I can live with larger / less efficient indexes
on all fields.

Regarding GiST - I understand your concerns about complexity, and you
may be right that not shipping it now is prefferable to shipping it with
bugs. The thing is it doesn't have issues with the value lengths, which
prevents me from using GIN, and although GiST is slower, it's at least
some indexing. But maybe jsonb_hash_ops will work, I haven't tried yet.

regards
Tomas



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: jsonb and nested hstore
Next
From: Rukh Meski
Date:
Subject: Re: 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..