Thread: ILIKE and indexes
Hi all, I'm currently facing a common problem with queries using ILIKE: it can't use an index except if the pattern begins with non alpha characters. The usual trick recommended in the doc is to use lower() and LIKE but it leads to bad row estimates (it's constant whatever the search pattern is) and in several use cases we have, it's a real problem because the rows are far from being equally distributed. To take a real life example, if I look for 'c%' or 'l%' patterns in one of my tables, it returns a lot of rows and the nested loop chosen by the planner for every pattern is a very poor choice for these particular patterns. I'd like to see an opclass similar to (text|bpchar|varchar|name)_pattern_ops to deal with ILIKE. I found this post of Jan http://archives.postgresql.org/pgsql-hackers/2003-10/msg01550.php but I'd really like not to introduce a new set of non standard operators to deal with this feature. I have planned to write the operator class as a "contrib" module but I couldn't find the link between LIKE operator and text_pattern_ops opclass which uses ~=~ and all its variants. Andrew from Supernews told me it was hardcoded in the planner so the introduction of this new opclass requires a few changes to the planner to take it into account for ILIKE. What I'd like to do: * introduce 4 new opclasses called (text|bpchar|varchar|name)_icpattern_ops with ~=~* operator and variants * change the planner to make it use these operators for ILIKE in the same way it is done for LIKE (mostly remove the non alpha limitation and point the planner to the new operators) Is there any fundamental problem in this approach? I mostly wonder if there are any significant problems which prevented us from doing it before and I've missed in my analysis. Is there anything I should particularly take care of? Thanks for any advice or comment. -- Guillaume
On Sun, Mar 18, 2007 at 07:30:35PM +0100, Guillaume Smet wrote: > I have planned to write the operator class as a "contrib" module but I > couldn't find the link between LIKE operator and text_pattern_ops > opclass which uses ~=~ and all its variants. Andrew from Supernews > told me it was hardcoded in the planner so the introduction of this > new opclass requires a few changes to the planner to take it into > account for ILIKE. Er, it's link between LIKE and the ~=~ that's hard coded, however the link between the operator class and the operator is nothing special, that's why it's an operator class. So I think it's easier that you think: just build the operator class and make sure you use the right operator so the planner uses it. ILIKE already maps to an operator... Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
On 3/18/07, Martijn van Oosterhout <kleptog@svana.org> wrote: > Er, it's link between LIKE and the ~=~ that's hard coded Yes. > So I think it's easier that you think: just build the operator class > and make sure you use the right operator so the planner uses it. ILIKE > already maps to an operator... Yeah I know. The fact is that you can't use an index for any pattern and it depends on the database encoding too. The code in the planner checks that the pattern and the database encoding makes the index usable and rewrites the LIKE clause (and the ILIKE clause if the pattern begins with a non alpha character) so that it can use the index. So I'm pretty sure I have to change this behaviour in the planner or did I miss something? -- Guillaume
"Guillaume Smet" <guillaume.smet@gmail.com> writes: > The usual trick recommended in the doc is to use lower() and LIKE but > it leads to bad row estimates (it's constant whatever the search > pattern is) Not if you have an index on lower(col) which one supposes you'd have anyway for such an application. Or are you running an ancient PG release? regards, tom lane
On 3/19/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Not if you have an index on lower(col) which one supposes you'd have > anyway for such an application. Or are you running an ancient PG > release? Yes, you're right. Looking at my history I can't find what my error was - I analyzed the table several times to be sure. As I was playing with the production db and a 8.2.3 db, perhaps I missed a command on one of the db server. Anyway, the estimates are accurate now. Thanks for your help. -- Guillaume