"Andrew - Supernews" <andrew@supernews.net> wrote:
> ITAGAKI> I think all "safe ASCII-supersets" encodings are comparable
> ITAGAKI> by bytes, not only UTF-8.
>
> This is false, particularly for EUC.
Umm, I see. I updated the patch so that the optimization is used only in the UTF8 case.
I also added some inlining hints that are useful on my machine (Pentium 4).
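
To illustrate why byte-wise matching is safe for UTF8 but not for EUC, here is
a minimal sketch. It is not the attached patch: the names utf8_char_len,
utf8_like and like_utf8 are hypothetical, and escape characters and
case-insensitive matching are left out. In UTF8 the lead bytes and the
continuation bytes (10xxxxxx) use disjoint ranges, so a byte-wise match of a
valid pattern can only begin on a character boundary. In EUC_JP both bytes of a
two-byte character fall in the same range, so a byte-wise match can begin in
the middle of a character.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Length in bytes of the UTF-8 character starting at s: one lead byte
 * plus any following continuation bytes (10xxxxxx). Assumes valid input.
 */
static inline size_t
utf8_char_len(const unsigned char *s, size_t remaining)
{
    size_t n = 1;

    assert(remaining > 0);
    while (n < remaining && (s[n] & 0xC0) == 0x80)
        n++;
    return n;
}

/* Hypothetical byte-wise LIKE matcher for UTF-8; supports only % and _. */
static bool
utf8_like(const unsigned char *t, size_t tlen,
          const unsigned char *p, size_t plen)
{
    while (plen > 0)
    {
        if (*p == '%')
        {
            /* Collapse a run of % into one. */
            while (plen > 0 && *p == '%')
            {
                p++;
                plen--;
            }
            if (plen == 0)
                return true;    /* trailing % matches the rest */

            /* Try the remaining pattern at each character boundary. */
            for (;;)
            {
                size_t n;

                if (utf8_like(t, tlen, p, plen))
                    return true;
                if (tlen == 0)
                    return false;
                n = utf8_char_len(t, tlen);
                t += n;
                tlen -= n;
            }
        }

        if (tlen == 0)
            return false;

        if (*p == '_')
        {
            /* _ consumes one whole character, not one byte. */
            size_t n = utf8_char_len(t, tlen);

            t += n;
            tlen -= n;
            p++;
            plen--;
        }
        else
        {
            /* Literal bytes are compared directly, with no decoding. */
            if (*p != *t)
                return false;
            p++;
            plen--;
            t++;
            tlen--;
        }
    }
    return tlen == 0;
}

/* Convenience wrapper for NUL-terminated strings. */
bool
like_utf8(const char *text, const char *pattern)
{
    return utf8_like((const unsigned char *) text, strlen(text),
                     (const unsigned char *) pattern, strlen(pattern));
}
```

For example, like_utf8("xxfooyy", "%foo%") is true, and a single _ matches the
three-byte character "\xE3\x81\x82" as one unit. Only _ and the positions tried
after % need to know where characters begin; literals never call a per-character
length function.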
1000 executions of LIKE '%foo%' on a 10000-row table [ms]
 encoding  | HEAD  |  P1   |  P2   |  P3
-----------+-------+-------+-------+-------
 SQL_ASCII |  7094 |  7120 |  7063 |  7031
 LATIN1    |  7083 |  7130 |  7057 |  7031
 UTF8      | 17974 | 10859 | 10839 |  9682
 EUC_JP    | 17032 | 17557 | 17599 | 15240
- P1: UTF8MatchText()
- P2: P1 + __inline__ GenericMatchText()
- P3: P2 + __inline__ wchareq()
(The attached patch is P3.)
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center