Re: [v9.2] make_greater_string() does not return a string in some cases - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: [v9.2] make_greater_string() does not return a string in some cases
Date
Msg-id 1317047432.1759.27.camel@fsopti579.F-Secure.com
In response to Re: [v9.2] make_greater_string() does not return a string in some cases  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [v9.2] make_greater_string() does not return a string in some cases
List pgsql-hackers
On Mon, 2011-09-26 at 10:08 -0400, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > On Fri, 2011-09-23 at 20:35 +0300, Marcin Mańk wrote:
> >> One idea:
> >> col like 'foo%' could be translated to col >= 'foo' and
> >> col <= 'foo' || 'zzz', where 'z' is the largest possible character.
> >> This should be good enough for calculating stats.
> >>
> >> How to find such a character, I do not know.
> 
> > That's what makes this so difficult.
> 
> > If we knew the largest character, we could probably also find the
> > largest-1, largest-2, etc. characters and determine the total order of
> > everything.
> 
> No, it's a hundred times worse than that, because in collations other
> than C there typically *is* no total order.  The collation behavior of
> many characters is context-sensitive, thanks to the multi-pass behavior
> of typical "dictionary" algorithms.

Well, there is a total order of all strings, but it's not consistent
under string concatenation.
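
To illustrate what I mean, here is a minimal sketch with a made-up
two-level sort key (base letters first, accents only as a tie-breaker).
toy_key is purely hypothetical and much simpler than any real collation;
it only shows that "a" sorting before "á" tells you nothing about how
strings that start with them sort once more characters follow.

```python
import unicodedata

def toy_key(s):
    """Hypothetical two-level key: base letters first, accents as tie-breaker."""
    decomposed = unicodedata.normalize("NFD", s)
    # Level 1: the string with all combining marks (accents) stripped.
    primary = tuple(ch for ch in decomposed if not unicodedata.combining(ch))
    # Level 2: the full decomposed string, consulted only if level 1 ties.
    return (primary, tuple(decomposed))

a, a_acute = "a", "\u00e1"

# Alone, the two strings tie on level 1, so the accent decides: a < á.
print(toy_key(a) < toy_key(a_acute))

# With different suffixes, level 1 decides first and the order flips:
# "az" sorts after "áb" because z > b at the primary level.
print(toy_key(a + "z") > toy_key(a_acute + "b"))
```

Under plain code point comparison (the C collation) the order would not
flip, because the first differing character always decides.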

But there is a "largest character".  If the collation implementation
uses four weights (the typical case), the largest character is the one
that maps to <FFFF> <FFFF> <FFFF> <FFFF>.  If you appended that
character to a string, you would get a larger string.  (Unless there are
French backwards levels or other funny things in place, perhaps.)  But
we don't know which character that is, and likely there isn't one, so
we'd need the largest character that maps to an actually assigned weight,
and finding that isn't possible without an exhaustive search of all
collating elements.
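
For what it's worth, the exhaustive search would look roughly like the
sketch below. largest_char is a hypothetical helper, and the key function
is a stand-in for whatever the collation provides (strxfrm, say). This
only covers single code points, not multi-character collating elements,
so it is a lower bound on the work involved.

```python
def largest_char(key, codepoints):
    """Return the character with the greatest sort key among the code points.

    Ties keep the first character found, which also shows that several
    characters can share the same weight.
    """
    best, best_key = None, None
    for cp in codepoints:
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not characters
        ch = chr(cp)
        k = key(ch)
        if best is None or k > best_key:
            best, best_key = ch, k
    return best

# Code point order over printable ASCII: the largest character is "~".
print(largest_char(ord, range(0x20, 0x7F)))

# A case-insensitive key: "Z" and "z" tie, and the first one seen wins.
print(largest_char(str.casefold, range(0x41, 0x7B)))
```

Scanning all of Unicode this way means one key computation per code
point, and the answer has to be recomputed for every collation.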

We could possibly try to make this whole thing work differently by
storing the strxfrm results in the histograms.
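
A rough sketch of that idea, in Python rather than C for brevity:
histogram_bounds and fraction_below are hypothetical names, not anything
in the planner, and the collation is pinned to "C" here only so the
example behaves the same everywhere. The point is that once the bounds
are stored as transformed keys, locating a value is a plain comparison
of keys and no "next greater string" has to be constructed.

```python
import bisect
import locale

# Assumption: pin the collation so the example is reproducible.
locale.setlocale(locale.LC_COLLATE, "C")

def histogram_bounds(sample, buckets):
    """Pick buckets + 1 evenly spaced bounds from the sorted strxfrm keys."""
    keys = sorted(locale.strxfrm(s) for s in sample)
    return [keys[(i * (len(keys) - 1)) // buckets] for i in range(buckets + 1)]

def fraction_below(value, bounds):
    """Estimate the fraction of rows sorting below value, at bucket granularity."""
    pos = bisect.bisect_left(bounds, locale.strxfrm(value))
    return min(pos, len(bounds) - 1) / (len(bounds) - 1)

sample = ["apple", "banana", "cherry", "date", "fig",
          "grape", "kiwi", "lemon", "mango"]
bounds = histogram_bounds(sample, 4)
print(fraction_below("dog", bounds))
```

The stored keys are only meaningful for the collation they were computed
under, so the statistics would have to record which one that was.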



