Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: B-Tree support function number 3 (strxfrm() optimization)
Date
Msg-id CAM3SWZQEcwL+DTS+6mHpfoo3ST7rqwFo3q8q=3FCKVXV_09EZw@mail.gmail.com
In response to Re: B-Tree support function number 3 (strxfrm() optimization)  (Greg Stark <stark@mit.edu>)
Responses Re: B-Tree support function number 3 (strxfrm() optimization)  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Sun, Jul 27, 2014 at 8:23 AM, Greg Stark <stark@mit.edu> wrote:
> I haven't looked yet. Can you describe what exactly the AC_TRY_RUN is
> testing for?

It's more or less testing for a primary weight level (i.e. the first
part of the blob) that is no larger than the original characters of
the string, and has no "header bytes" or other redundancies.  It also
compares secondary and subsequent weight levels to ensure that they
match, since the two strings tested have identical case, use of
diacritics, etc (they're both lowercase ASCII-safe strings). I don't
set a locale, but that shouldn't matter. I have good reason to believe
that many strxfrm() implementations behave this way, based on the
Unicode standard, and some investigation. Still, that is something
that can be verified more formally, as long as we're distrusting
strxfrm() generally rather than just discriminating against Mac OS X
specifically. I think that the Mac OS X implementation is an anomaly
(I haven't really looked into why), and the FreeBSD one just isn't
very good. But even the FreeBSD one appears to append primary weights
(only) to the blob it returns, and so is essentially the same for my
purposes [1].

[1] http://lists.freebsd.org/pipermail/freebsd-current/2003-April/001273.html
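
To make the description above concrete, here is a rough sketch of the
kind of check such a run-time probe could perform. It is not the test
from the patch: the function names, the choice of test strings, and the
assumption that the primary level occupies exactly one byte per source
character are all illustrative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: given two strxfrm() blobs produced from two
 * source strings of src_len bytes each (same case, no diacritics,
 * different first character), report 1 if the blobs look "compact".
 * Assumption: the first src_len bytes are the primary weights.
 */
static int
blobs_look_compact(const unsigned char *blob_a, size_t len_a,
                   const unsigned char *blob_b, size_t len_b,
                   size_t src_len)
{
    /* Blobs must be the same size and cover the primary region */
    if (src_len == 0 || len_a != len_b || len_a < src_len)
        return 0;

    /* No "header bytes": differing first characters must show at byte 0 */
    if (blob_a[0] == blob_b[0])
        return 0;

    /* Secondary and subsequent levels (the tail) must be identical */
    return memcmp(blob_a + src_len, blob_b + src_len,
                  len_a - src_len) == 0;
}

/*
 * Hypothetical wrapper: transform both strings with strxfrm() under
 * whatever locale is currently in effect, then apply the check above.
 */
static int
strxfrm_looks_compact(const char *a, const char *b)
{
    size_t          src_len, len_a, len_b;
    unsigned char  *blob_a, *blob_b;
    int             ok = 0;

    assert(a != NULL && b != NULL);

    src_len = strlen(a);
    if (strlen(b) != src_len)
        return 0;

    /* With a zero-size buffer, strxfrm() just reports the length needed */
    len_a = strxfrm(NULL, a, 0);
    len_b = strxfrm(NULL, b, 0);

    blob_a = malloc(len_a + 1);
    blob_b = malloc(len_b + 1);
    if (blob_a != NULL && blob_b != NULL)
    {
        strxfrm((char *) blob_a, a, len_a + 1);
        strxfrm((char *) blob_b, b, len_b + 1);
        ok = blobs_look_compact(blob_a, len_a, blob_b, len_b, src_len);
    }

    free(blob_a);
    free(blob_b);
    return ok;
}
```

A configure-time program built around this would call something like
strxfrm_looks_compact("abcdefgh", "ijklmnop") from main() and exit with
status 0 only when it returns 1. An implementation that prepends a
header, or that pads the primary level, would fail the check.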
-- 
Peter Geoghegan


