UTF8MatchText - Mailing list pgsql-patches

From ITAGAKI Takahiro
Subject UTF8MatchText
Date
Msg-id 20070402133445.DDF8.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
In response to Multibyte LIKE optimization  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List pgsql-patches
"Andrew - Supernews" <andrew@supernews.net> wrote:

>  ITAGAKI> I think all "safe ASCII-supersets" encodings are comparable
>  ITAGAKI> by bytes, not only UTF-8.
>
> This is false, particularly for EUC.

Umm, I see. I updated the patch so that the optimization is used only in the
UTF8 case. I also added some inlining hints that are useful on my machine
(Pentium 4).
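
To illustrate why byte-wise matching is safe for UTF8 in particular, here is a
minimal sketch. It is not the attached patch: the names utf8_next and
utf8_like_match are hypothetical, escape characters are not handled, and the
input is assumed to be valid UTF-8. The idea is that UTF-8 lead bytes and
continuation bytes (10xxxxxx) never overlap, so literal pattern bytes can be
compared one byte at a time, and only '_' and '%' need to step over whole
characters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: byte length of the UTF-8 character starting at s.
 * Skips the lead byte plus any continuation bytes (10xxxxxx). */
static inline size_t
utf8_next(const unsigned char *s, size_t len)
{
    size_t n = 1;

    while (n < len && (s[n] & 0xC0) == 0x80)
        n++;
    return n;
}

/* Hypothetical sketch of LIKE matching on UTF-8 text (no escape handling).
 * t/tlen is the text, p/plen is the pattern. */
static bool
utf8_like_match(const unsigned char *t, size_t tlen,
                const unsigned char *p, size_t plen)
{
    while (plen > 0)
    {
        if (*p == '%')
        {
            /* Collapse a run of '%' into one. */
            while (plen > 0 && *p == '%')
            {
                p++;
                plen--;
            }
            if (plen == 0)
                return true;    /* trailing '%' matches the rest */

            /* Try the remaining pattern at each character boundary. */
            for (;;)
            {
                size_t n;

                if (utf8_like_match(t, tlen, p, plen))
                    return true;
                if (tlen == 0)
                    return false;
                n = utf8_next(t, tlen);
                t += n;
                tlen -= n;
            }
        }

        if (tlen == 0)
            return false;

        if (*p == '_')
        {
            /* '_' consumes one whole character, which may be several bytes. */
            size_t n = utf8_next(t, tlen);

            t += n;
            tlen -= n;
            p++;
            plen--;
        }
        else
        {
            /* Literal bytes can be compared directly, one byte at a time. */
            if (*p != *t)
                return false;
            p++;
            plen--;
            t++;
            tlen--;
        }
    }
    return tlen == 0;
}
```

The same shortcut would not be safe for an encoding such as EUC_JP, where a
byte value can appear in more than one position within a character, so a
byte-wise comparison could match in the middle of a character.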


Time for 1000 runs of LIKE '%foo%' on 10000-row tables [ms]
 encoding  | HEAD  |  P1   |  P2   |  P3
-----------+-------+-------+-------+-------
 SQL_ASCII |  7094 |  7120 |  7063 |  7031
 LATIN1    |  7083 |  7130 |  7057 |  7031
 UTF8      | 17974 | 10859 | 10839 |  9682
 EUC_JP    | 17032 | 17557 | 17599 | 15240

- P1: UTF8MatchText()
- P2: P1 + __inline__ GenericMatchText()
- P3: P2 + __inline__ wchareq()
      (The attached patch is P3.)
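
As a rough illustration of what the inlining hints in P2 and P3 aim at, the
sketch below marks a small character-comparison helper as inline so the
compiler can fold it into the matching loop. It is not the code from the
patch: utf8_char_len and mbchar_eq are hypothetical names, C99 "static inline"
stands in for GCC's __inline__, and both inputs are assumed to be valid,
complete UTF-8 characters.

```c
#include <stdbool.h>

/* Hypothetical helper: character length implied by a UTF-8 lead byte. */
static inline int
utf8_char_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/* Hypothetical helper: compare one multibyte character at a with one at b.
 * Assumes both point at complete, valid UTF-8 characters. */
static inline bool
mbchar_eq(const unsigned char *a, const unsigned char *b)
{
    int len;

    if (*a != *b)
        return false;           /* fast path: lead bytes differ */

    len = utf8_char_len(*a);
    while (--len > 0)
    {
        if (*++a != *++b)
            return false;
    }
    return true;
}
```

Because such a helper is called once per character in the inner loop, avoiding
the function-call overhead is one plausible reason for the P3 column being
faster than P2 for the multibyte encodings.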

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


