Re: spgist text_ops and LIKE - Mailing list pgsql-hackers

From Robert Haas
Subject Re: spgist text_ops and LIKE
Date
Msg-id CA+TgmoZsY3z3Z9DSLCB5EZpEPUMt8eovhiaqCYidJXRyeqS+Vg@mail.gmail.com
Whole thread Raw
In response to Re: spgist text_ops and LIKE  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: spgist text_ops and LIKE
Re: spgist text_ops and LIKE
List pgsql-hackers
On Thu, Feb 2, 2012 at 12:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Feb 1, 2012 at 10:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Too lazy to look at the code right now, but I think indxpath.c contains
>>> hardwired assumptions that LIKE prefix optimizations are only possible
>>> with btree indexes.  Possibly it would be worth relaxing that.
>
>> Is it as simple as the attached?
>
> Yeah, that looks about right, with the single quibble that the other
> places where we have #defines in pg_opfamily.h put them after the
> corresponding DATA line not before.  Please fix and commit.

Done.

>> On the flip side, spg_text_inner_consistent()
>> seems to fall back to a full index-scan for the regular comparison
>> operators, which makes me wonder why anyone would bother having the
>> operator at all.
>
> MHO is that that code path is only useful when you are in C locale;
> but I think that's a useful enough case to be worth the rather small
> amount of incremental code to handle it.  I'm not sure I believe your
> speculations about making spgist handle inequality searches in other
> locales.  Non-C sort orders are too bizarre and we have too little
> visibility into the precise rules the current locale is following.

Unfortunately, I fear that you are right, unless we were to switch
over to using a locale machinery of our own devising in place of that
provided by the OS.  And no, I'm not proposing that.

> ... however ... could it be reasonable to apply strxfrm and store the
> result of that in the index?  Then searches would just need byte-by-byte
> comparisons.

Yeah, you could, but I don't see the point.  I was hoping that spgist
would give us the ability to have a single index that allows both
inequality comparisons and pattern matching.  If you're going to have
one index where the strings are stored "straight" and another where
strxfrm() is applied, then you could just build two btree indexes, one
the regular way and a second with text_pattern_ops.

I'm having trouble figuring out under what set of circumstances spgist
is expected to be the best available alternative.  It only supports a
small subset of the data types that GiST does, so I suppose the point
is that it should be faster for the cases that it does handle.  And,
considering that this is all brand-new code, the fact that it's almost
keeping pace with btree on both pattern-matching and equality
comparisons is certainly respectable -- but I so far haven't found any
cases where it's a clear win.  There's limited glory in being the
almost-fastest way of indexing for a certain class of queries.
Admittedly, I haven't tried the point-in-box stuff yet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: keywords in pg_hba.conf are case-sensitive?
Next
From: Tom Lane
Date:
Subject: Re: spgist text_ops and LIKE