ILIKE and indexes - Mailing list pgsql-hackers

From Guillaume Smet
Subject ILIKE and indexes
Date
Msg-id 1d4e0c10703181130p661a8ff5ibce2d1fbf114d923@mail.gmail.com
Whole thread Raw
Responses Re: ILIKE and indexes  (Martijn van Oosterhout <kleptog@svana.org>)
Re: ILIKE and indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi all,

I'm currently facing a common problem with queries using ILIKE: it
can't use an index except if the pattern begins with non alpha
characters.

The usual trick recommended in the doc is to use lower() and LIKE but
it leads to bad row estimates (it's constant whatever the search
pattern is) and in several use cases we have, it's a real problem
because the rows are far from being equally distributed.
To take a real life example, if I look for 'c%' or 'l%' patterns in
one of my tables, it returns a lot of rows and the nested loop chosen
by the planner for every pattern is a very poor choice for these
particular patterns.

I'd like to see an opclass similar to
(text|bpchar|varchar|name)_pattern_ops to deal with ILIKE.
I found this post of Jan
http://archives.postgresql.org/pgsql-hackers/2003-10/msg01550.php but
I'd really like not to introduce a new set of non standard operators
to deal with this feature.

I have planned to write the operator class as a "contrib" module but I
couldn't find the link between LIKE operator and text_pattern_ops
opclass which uses ~=~ and all its variants. Andrew from Supernews
told me it was hardcoded in the planner so the introduction of this
new opclass requires a few changes to the planner to take it into
account for ILIKE.

What I'd like to do:
* introduce 4 new opclasses called
(text|bpchar|varchar|name)_icpattern_ops with ~=~* operator and
variants
* change the planner to make it use these operators for ILIKE in the
same way it is done for LIKE (mostly remove the non alpha limitation
and point the planner to the new operators)

Is there any fundamental problem in this approach? I mostly wonder if
there are any significant problems which prevented us from doing it
before and I've missed in my analysis. Is there anything I should
particularly take care of?

Thanks for any advice or comment.

--
Guillaume


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: msvc build broken for ecpg
Next
From: "Jaime Casanova"
Date:
Subject: Re: [COMMITTERS] pgsql: Add GUC temp_tablespaces to provide a default location for