Thread: ILIKE and indexes

ILIKE and indexes

From
"Guillaume Smet"
Date:
Hi all,

I'm currently facing a common problem with queries using ILIKE: it
can't use an index except if the pattern begins with non alpha
characters.

The usual trick recommended in the doc is to use lower() and LIKE but
it leads to bad row estimates (it's constant whatever the search
pattern is) and in several use cases we have, it's a real problem
because the rows are far from being equally distributed.
To take a real life example, if I look for 'c%' or 'l%' patterns in
one of my tables, it returns a lot of rows and the nested loop chosen
by the planner for every pattern is a very poor choice for these
particular patterns.

I'd like to see an opclass similar to
(text|bpchar|varchar|name)_pattern_ops to deal with ILIKE.
I found this post of Jan
http://archives.postgresql.org/pgsql-hackers/2003-10/msg01550.php but
I'd really like not to introduce a new set of non standard operators
to deal with this feature.

I have planned to write the operator class as a "contrib" module but I
couldn't find the link between LIKE operator and text_pattern_ops
opclass which uses ~=~ and all its variants. Andrew from Supernews
told me it was hardcoded in the planner so the introduction of this
new opclass requires a few changes to the planner to take it into
account for ILIKE.

What I'd like to do:
* introduce 4 new opclasses called
(text|bpchar|varchar|name)_icpattern_ops with ~=~* operator and
variants
* change the planner to make it use these operators for ILIKE in the
same way it is done for LIKE (mostly remove the non alpha limitation
and point the planner to the new operators)

Is there any fundamental problem in this approach? I mostly wonder if
there are any significant problems which prevented us from doing it
before and I've missed in my analysis. Is there anything I should
particularly take care of?

Thanks for any advice or comment.

--
Guillaume


Re: ILIKE and indexes

From
Martijn van Oosterhout
Date:
On Sun, Mar 18, 2007 at 07:30:35PM +0100, Guillaume Smet wrote:
> I have planned to write the operator class as a "contrib" module but I
> couldn't find the link between LIKE operator and text_pattern_ops
> opclass which uses ~=~ and all its variants. Andrew from Supernews
> told me it was hardcoded in the planner so the introduction of this
> new opclass requires a few changes to the planner to take it into
> account for ILIKE.

Er, it's link between LIKE and the ~=~ that's hard coded, however the
link between the operator class and the operator is nothing special,
that's why it's an operator class.

So I think it's easier that you think: just build the operator class
and make sure you use the right operator so the planner uses it. ILIKE
already maps to an operator...

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: ILIKE and indexes

From
"Guillaume Smet"
Date:
On 3/18/07, Martijn van Oosterhout <kleptog@svana.org> wrote:
> Er, it's link between LIKE and the ~=~ that's hard coded

Yes.

> So I think it's easier that you think: just build the operator class
> and make sure you use the right operator so the planner uses it. ILIKE
> already maps to an operator...

Yeah I know. The fact is that you can't use an index for any pattern
and it depends on the database encoding too. The code in the planner
checks that the pattern and the database encoding makes the index
usable and rewrites the LIKE clause (and the ILIKE clause if the
pattern begins with a non alpha character) so that it can use the
index.
So I'm pretty sure I have to change this behaviour in the planner or
did I miss something?

--
Guillaume


Re: ILIKE and indexes

From
Tom Lane
Date:
"Guillaume Smet" <guillaume.smet@gmail.com> writes:
> The usual trick recommended in the doc is to use lower() and LIKE but
> it leads to bad row estimates (it's constant whatever the search
> pattern is)

Not if you have an index on lower(col) which one supposes you'd have
anyway for such an application.  Or are you running an ancient PG
release?
        regards, tom lane


Re: ILIKE and indexes

From
"Guillaume Smet"
Date:
On 3/19/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Not if you have an index on lower(col) which one supposes you'd have
> anyway for such an application.  Or are you running an ancient PG
> release?

Yes, you're right. Looking at my history I can't find what my error
was - I analyzed the table several times to be sure. As I was playing
with the production db and a 8.2.3 db, perhaps I missed a command on
one of the db server.

Anyway, the estimates are accurate now.

Thanks for your help.

--
Guillaume