Re: LIKE optimization and locale - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: LIKE optimization and locale
Date
Msg-id 200011262043.PAA13975@candle.pha.pa.us
In response to Re: LIKE optimization and locale  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> The core problem is: given a string "foo", find a string "fop" that
> is greater than any possible extension "foobar" of "foo".  We need
> not find the least such string (else it would indeed be a hard
> problem), just a reasonably close upper bound.  The algorithm we have
> in 7.0.* increments the last byte(s) of "foo" until it finds
> something greater than "foo".  That handles collation orders that are
> different from numerical order, but it still breaks down in the cases
> Peter mentions.

This increment seems sub-optimal.
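For concreteness, here is a rough sketch of the increment approach described in the quoted text. It is not the actual 7.0.* source; the function name make_greater_string and the use of strcoll() as the collation comparison are assumptions made for illustration. It bumps the last byte until the result compares greater than the prefix, and drops that byte and moves left if it runs out of values.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (name and interface are assumptions, not the
 * backend code): return a malloc'd string that compares greater than
 * "prefix" under strcoll(), or NULL if none is found.
 */
char *
make_greater_string(const char *prefix)
{
    size_t  len = strlen(prefix);
    char   *work = malloc(len + 1);

    if (work == NULL)
        return NULL;
    memcpy(work, prefix, len + 1);

    while (len > 0)
    {
        unsigned char *last = (unsigned char *) &work[len - 1];

        /* Try each larger value of the last byte in turn. */
        while (*last < 255)
        {
            (*last)++;
            if (strcoll(work, prefix) > 0)
                return work;    /* caller must free() */
        }

        /* Last byte exhausted: drop it and retry one position left. */
        len--;
        work[len] = '\0';
    }

    free(work);
    return NULL;
}
```

In the default "C" locale this turns "foo" into "fop". Note that it only checks the candidate against "foo" itself, not against extensions such as "foobar", which is where the locale cases Peter mentions go wrong.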

> 
> One variant I've been wondering about is to test a candidate bound
> string against not only "foo", but all single-character extensions of
> "foo", ie, "foo\001" through "foo\255".  That would catch situations
> like the one most recently complained of, where the last character
> of the proposed bound string is just a noise-character in dictionary
> order.  But I'm afraid it's still not good enough to catch all cases
> ... and it doesn't generalize to MULTIBYTE very well anyway.

This was my suggestion, to test all 255 chars and find the lowest that
is greater than the target, but I see that multi-byte would be a
problem.  Oh, well.  I hoped some postmaster-generated lookup table
could fix this.
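The check being discussed might look roughly like the following. This is only a sketch under assumptions: the name bound_exceeds_extensions is made up, strcoll() stands in for whatever comparison the backend uses, and it works strictly on single bytes, so it has exactly the multi-byte problem noted above.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical check (name is an assumption): return 1 if "candidate"
 * compares greater than "prefix" and greater than every single-byte
 * extension prefix+"\001" .. prefix+"\377" under strcoll(), else 0.
 */
int
bound_exceeds_extensions(const char *candidate, const char *prefix)
{
    size_t  len = strlen(prefix);
    char   *ext = malloc(len + 2);
    int     c;

    if (ext == NULL)
        return 0;
    memcpy(ext, prefix, len);
    ext[len + 1] = '\0';

    /* The bound must at least beat the bare prefix. */
    if (strcoll(candidate, prefix) <= 0)
    {
        free(ext);
        return 0;
    }

    /* Then test it against each one-byte extension of the prefix. */
    for (c = 1; c <= 255; c++)
    {
        ext[len] = (char) c;
        if (strcoll(candidate, ext) <= 0)
        {
            free(ext);
            return 0;
        }
    }

    free(ext);
    return 1;
}
```

A candidate that passes this test can still be wrong for longer extensions in some locales, so this narrows the problem rather than solving it.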

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

