UTF8MatchText - Mailing list pgsql-patches

From	ITAGAKI Takahiro
Subject	UTF8MatchText
Date	April 2, 2007 01:56:10
Msg-id	20070402133445.DDF8.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
In response to	Multibyte LIKE optimization (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Responses	Re: UTF8MatchText Re: UTF8MatchText Re: UTF8MatchText
List	pgsql-patches

"Andrew - Supernews" <andrew@supernews.net> wrote:

>  ITAGAKI> I think all "safe ASCII-supersets" encodings are comparable
>  ITAGAKI> by bytes, not only UTF-8.
>
> This is false, particularly for EUC.

Umm, I see. I updated the optimization so that it is used only in the UTF8 case.
I also added some inlining hints that help on my machine (Pentium 4).


Time for 1000 runs of LIKE '%foo%' on 10000-row tables [ms]
 encoding  | HEAD  |  P1   |  P2   |  P3
-----------+-------+-------+-------+-------
 SQL_ASCII |  7094 |  7120 |  7063 |  7031
 LATIN1    |  7083 |  7130 |  7057 |  7031
 UTF8      | 17974 | 10859 | 10839 |  9682
 EUC_JP    | 17032 | 17557 | 17599 | 15240

- P1: UTF8MatchText()
- P2: P1 + __inline__ GenericMatchText()
- P3: P2 + __inline__ wchareq()
      (The attached patch is P3.)
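
For readers without the patch at hand, here is a minimal sketch of the idea behind P1. It is not the code from the attached patch. The names utf8_next(), match_bytes() and utf8_like() are hypothetical, the input is assumed to be valid NUL-terminated UTF-8, and escape characters are not handled. The point is that in UTF-8 a continuation byte (10xxxxxx) can never be mistaken for a lead byte or for the ASCII wildcards, so literals can be compared byte by byte, and only '_' and the '%' scan need to step over whole characters. EUC encodings lack this property, since trail bytes share the value range of lead bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: step to the start of the next UTF-8 character.
 * Precondition: *s is not the terminating NUL.
 * Continuation bytes look like 10xxxxxx, so skip them after the lead byte.
 * The NUL terminator is not a continuation byte, so the loop stops there.
 */
static const unsigned char *
utf8_next(const unsigned char *s)
{
    do
    {
        s++;
    } while ((*s & 0xC0) == 0x80);
    return s;
}

/* Recursive LIKE matcher working on bytes; '%' and '_' are the wildcards. */
static bool
match_bytes(const unsigned char *t, const unsigned char *p)
{
    while (*p != '\0')
    {
        if (*p == '%')
        {
            /* Consecutive '%' behave like a single one. */
            while (*p == '%')
                p++;
            if (*p == '\0')
                return true;    /* trailing '%' matches the rest */

            /* Try the remaining pattern at every character boundary. */
            for (;;)
            {
                if (match_bytes(t, p))
                    return true;
                if (*t == '\0')
                    return false;
                t = utf8_next(t);
            }
        }

        if (*t == '\0')
            return false;       /* pattern left over, text exhausted */

        if (*p == '_')
        {
            /* '_' consumes one whole character, however many bytes. */
            t = utf8_next(t);
            p++;
        }
        else
        {
            /* Literal: a plain byte comparison is enough in UTF-8. */
            if (*t != *p)
                return false;
            t++;
            p++;
        }
    }
    return *t == '\0';
}

/* Hypothetical entry point taking ordinary C strings. */
static bool
utf8_like(const char *text, const char *pat)
{
    assert(text != NULL && pat != NULL);
    return match_bytes((const unsigned char *) text,
                       (const unsigned char *) pat);
}
```

With this sketch, utf8_like("foobar", "%foo%") is true, and a three-character Japanese string matches "___" but not "____", because '_' advances by characters rather than bytes.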

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment

utf8matchtext.patch
