Re: like/ilike improvements - Mailing list pgsql-hackers

From Guillaume Smet
Subject Re: like/ilike improvements
Msg-id 1d4e0c10709210153u69111eacp25e281bdc645a986@mail.gmail.com
In response to Re: like/ilike improvements  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: like/ilike improvements
List pgsql-hackers

Andrew,

On 9/20/07, Andrew Dunstan <andrew@dunslane.net> wrote:
> Please try the attached patch, which goes back to using a special case
> for single-byte ILIKE. I want to make sure that at the very least we
> don't cause a performance regression with the code done this release. I
> can't see an obvious way around the problem for multi-byte case -
> lower() then requires converting to and from wchar, and I don't see a
> way of avoiding calling lower(). If this is a major blocker I would
> suggest you look at an alternative to using ILIKE for your UTF8 data.

I tested your patch with latin1 and C encoding.

It's better but still slower than 8.2.

C results:
cityvox_c=# SELECT e.numeve FROM evenement e WHERE e.libgeseve LIKE
'%hocus pocus%';
 numeve
--------
(0 rows)

Time: 113.655 ms

cityvox_c=# SELECT e.numeve FROM evenement e WHERE e.libgeseve ILIKE
'%hocus pocus%';
  numeve
-----------
 900024298
     87578
(2 rows)

Time: 124.829 ms

Latin1 results:
cityvox_latin1=# SELECT e.numeve FROM evenement e WHERE e.libgeseve
LIKE '%hocus pocus%';
 numeve
--------
(0 rows)

Time: 113.207 ms

cityvox_latin1=# SELECT e.numeve FROM evenement e WHERE e.libgeseve
ILIKE '%hocus pocus%';
  numeve
-----------
 900024298
     87578
(2 rows)

Time: 123.163 ms

And to answer your IRC question about switching to regexp, it's even
slower than the new UTF-8 ILIKE of 8.3 so I don't think it's the way
to go :).

Regards,

--
Guillaume

