Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing? - Mailing list pgsql-general

From	Tom Lane
Subject	Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing?
Date	November 5, 1999 11:48:03
Msg-id	495.941820396@sss.pgh.pa.us
In response to	Re: [GENERAL] indexed regex select optimisation missing? ("Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu>)
Responses	Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing? Re: [HACKERS] Re: [GENERAL] indexed regex select optimisation missing?
List	pgsql-general

"Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu> writes:
> Reviewing my email logs from June, most of the work on this has to do with
> people who need locales, and potentially multibyte character sets. Tom
> Lane is of the opinion that this particular optimization needs to be moved
> out of the parser, and deeper into the planner or optimizer/rewriter,
> so a good fix may be some ways out.

Actually, that part is already done: addition of the index-enabling
comparisons is gone from the parser and is now done in the optimizer,
which has a whole bunch of benefits (one being that the comparison
clauses don't get added to the query unless they are actually used
with an index!).

But the underlying LOCALE problem still remains: I don't know a good
character-set-independent method for generating a "just a little bit
larger" string to use as the righthand limit.  If anyone out there is
an expert on foreign and multibyte character sets, some help would
be appreciated.  Basically, given that we know the LIKE or regex
pattern can only match values beginning with FOO, we want to generate
string comparisons that select out the range of values that begin with
FOO (or, at worst, a slightly larger range).  In USASCII locale it's not
hard: you can do
    field >= 'FOO' AND field < 'FOP'
but it's not immediately obvious how to make this idea work reliably
in the presence of odd collation orders or multibyte characters...
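
As an illustration only, the plain-ASCII case above can be sketched in Python. The function name prefix_range and the max_byte parameter are hypothetical, not anything in the PostgreSQL source, and the sketch assumes values compare byte by byte, which is exactly the assumption that fails under locale-specific collation or multibyte encodings.

```python
from typing import Optional, Tuple


def prefix_range(prefix: bytes, max_byte: int = 0x7F) -> Tuple[bytes, Optional[bytes]]:
    """Return (lower, upper) so that lower <= v < upper holds for every
    value v beginning with `prefix`, assuming plain byte-wise comparison.

    `upper` is None when no finite upper bound can be built.
    `max_byte` is the largest byte assumed to occur in the data
    (0x7F for 7-bit ASCII); this is an assumption of the sketch.
    """
    trimmed = bytearray(prefix)
    # A trailing byte that is already the maximum cannot be incremented,
    # so drop it and bump the byte before it instead. This widens the
    # range slightly but never excludes a value that has the prefix.
    while trimmed and trimmed[-1] >= max_byte:
        trimmed.pop()
    if not trimmed:
        return bytes(prefix), None
    trimmed[-1] += 1
    return bytes(prefix), bytes(trimmed)


lower, upper = prefix_range(b"FOO")
# Corresponds to: field >= 'FOO' AND field < 'FOP'
```

The half-open upper bound is the point of the design: it does not depend on guessing the largest character that could follow the prefix.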

BTW: the \377 hack is actually wrong for USASCII too, since it'll
exclude a data value like 'FOO\377x' which should be included.
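
A small Python check makes the failure concrete. It assumes the hack has the form field >= 'FOO' AND field <= 'FOO\377' (the message does not spell it out) and that comparison is byte-wise; both helper names are hypothetical.

```python
def in_hack_range(value: bytes, prefix: bytes) -> bool:
    # Assumed form of the \377 hack: prefix <= value <= prefix + '\377'.
    return prefix <= value <= prefix + b"\xff"


def in_half_open_range(value: bytes, lower: bytes, upper: bytes) -> bool:
    # The alternative from the message: lower <= value < upper.
    return lower <= value < upper


value = b"FOO\xffx"  # 'FOO\377x': begins with FOO, so it should match

# b"FOO\xffx" sorts after b"FOO\xff" because it is a longer string with
# the same leading bytes, so the closed upper bound rejects it.
hack_result = in_hack_range(value, b"FOO")

# b"FOO\xffx" sorts before b"FOP" because 'O' < 'P' at the third byte.
half_open_result = in_half_open_range(value, b"FOO", b"FOP")
```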

            regards, tom lane
