Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Sep 22, 2011 at 11:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, the metric that we were indirectly using earlier was the
>> number of characters in a given locale for which the algorithm
>> fails to find a greater one (excluding whichever character is "last",
>> I guess, or you could just recognize there's always at least one).
> What about characters that sort differently in sequence than individually?
Yeah, there's a whole 'nother set of issues there, but the character
incrementer is unlikely to affect that very much either way, I think.
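
To make the metric quoted above concrete, here is a rough Python sketch. It is not the backend code. The names increment_char and count_failures are made up for illustration, the incrementer only bumps the last byte of the UTF-8 encoding, and the collation is stood in for by a caller-supplied key function.

```python
from typing import Callable, Iterable, Optional


def increment_char(ch: str, key: Callable[[str], object]) -> Optional[str]:
    """Hypothetical incrementer: bump the last byte of the UTF-8 encoding
    until the result is valid UTF-8 and sorts greater than ch under key."""
    raw = bytearray(ch.encode("utf-8"))
    while raw[-1] < 0xFF:
        raw[-1] += 1
        try:
            candidate = raw.decode("utf-8")
        except UnicodeDecodeError:
            continue  # not a valid encoding, try the next byte value
        if key(candidate) > key(ch):
            return candidate
    return None  # ran out of byte values without finding a greater character


def count_failures(chars: Iterable[str], key: Callable[[str], object]) -> int:
    """Count characters for which no greater character is found,
    excluding whichever character sorts last."""
    chars = list(chars)
    last = max(chars, key=key)
    return sum(
        1 for ch in chars
        if ch != last and increment_char(ch, key) is None
    )


if __name__ == "__main__":
    # Assumption: plain code point order stands in for a real collation.
    print(count_failures(map(chr, range(0x20, 0x100)), ord))
```

With code point order over U+0020..U+00FF, the failures in this sketch are U+007F and U+00BF, where the last byte cannot be bumped any further within the same encoded length. A real collation would be plugged in through the key argument.
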
> But now that I think about it, what about using some
> slightly-less-stupid version of that approach as a fallback strategy?
> For example, we could pick, oh, say, 20 characters out of the space of
> code points, about evenly distributed under whatever collations we
> think are likely to be in use.
Sure, if the "increment the top byte" strategy proves not to accomplish
that effectively. But I'd prefer not to design a complex strategy until
it's been proven that a simpler one doesn't work.
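
For reference, the fallback being discussed could look roughly like the Python sketch below. This is only an illustration under assumptions: pick_probes and fallback_greater are hypothetical names, the collation is represented by a key function, and "evenly distributed" is taken to mean evenly spaced positions in the sorted character list.

```python
from typing import Callable, Iterable, List, Optional


def pick_probes(chars: Iterable[str], key: Callable[[str], object],
                n: int = 20) -> List[str]:
    """Pick about n characters spread evenly across the collation order."""
    if n < 2:
        raise ValueError("n must be at least 2")
    ordered = sorted(set(chars), key=key)
    if len(ordered) <= n:
        return ordered
    # n positions spaced evenly from the first to the last element
    return [ordered[i * (len(ordered) - 1) // (n - 1)] for i in range(n)]


def fallback_greater(ch: str, probes: List[str],
                     key: Callable[[str], object]) -> Optional[str]:
    """Return the smallest probe that sorts after ch, or None if there is none.
    probes must already be sorted ascending under key."""
    for probe in probes:
        if key(probe) > key(ch):
            return probe
    return None


if __name__ == "__main__":
    # Assumption: printable ASCII in code point order stands in for a locale.
    probes = pick_probes(map(chr, range(0x20, 0x7F)), ord)
    print(probes)
    print(fallback_greater("a", probes, ord))
```

The bound found this way is looser than one obtained by incrementing the character itself, which is the cost of using a fixed probe set.
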
regards, tom lane