Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: B-Tree support function number 3 (strxfrm() optimization)
Msg-id CAM3SWZTy3MxgXbG6373cWh9rex=RhmAj6-kuXx03yRX5-vYKgQ@mail.gmail.com
In response to Re: B-Tree support function number 3 (strxfrm() optimization)  (Wim Lewis <wiml@omnigroup.com>)
Responses Re: B-Tree support function number 3 (strxfrm() optimization)  (Wim Lewis <wiml@omnigroup.com>)
List pgsql-hackers
On Mon, Jul 28, 2014 at 5:14 PM, Wim Lewis <wiml@omnigroup.com> wrote:
> A quick glance at OSX's strxfrm() suggests they're using an implementation of strxfrm() from FreeBSD. You can find the source here:
>
>     http://www.opensource.apple.com/source/Libc/Libc-997.90.3/string/FreeBSD/strxfrm.c
>
> (and a really quick glance at the contents of libc on OSX 10.9 reinforces this--- I don't see any calls into their CoreFoundation unicode string APIs.)

Something isn't quite accounted for, then. The FreeBSD behavior is to
append the primary weights only. That makes their returned blobs
smaller than those you'll see on Linux, but it also appears to imply
that their implementation is substandard (the PostgreSQL port uses ICU
on FreeBSD for a reason, I suppose). However, FreeBSD did not add
extra, redundant "header bytes" right in the primary level when I
tested it, whereas I'm told that Mac OS X does. I guess it could be
that the collations shipped differ, but I can't think why that would
be. It does seem peculiar that the Mac OS X blobs are always
printable. That isn't the case with glibc, where the only restriction
of that kind is that there are no NUL bytes, and the Unicode Collation
Algorithm standard specifically says that that's okay.
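
For anyone who wants to compare blobs across platforms, something
along these lines should be enough. This is only a rough sketch, not
the test I actually ran: the helper names (xfrm_blob,
blob_is_printable, dump_blob) are made up for illustration, and
"printable" is assumed here to mean the ASCII range 0x20-0x7e.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'd strxfrm() blob for s under the
 * current LC_COLLATE setting, and store its length (excluding the
 * terminating NUL) in *len.  Returns NULL if allocation fails.
 */
char *
xfrm_blob(const char *s, size_t *len)
{
    size_t  need;
    char   *buf;

    assert(s != NULL && len != NULL);

    /* With n == 0, strxfrm() only reports the size it would need. */
    need = strxfrm(NULL, s, 0);
    buf = malloc(need + 1);
    if (buf == NULL)
        return NULL;

    *len = strxfrm(buf, s, need + 1);
    return buf;
}

/* Assumption: "printable" means every byte is in ASCII 0x20..0x7e. */
int
blob_is_printable(const char *blob, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        unsigned char c = (unsigned char) blob[i];

        if (c < 0x20 || c > 0x7e)
            return 0;
    }
    return 1;
}

/* Print the blob's length, printable flag, and bytes in hex. */
void
dump_blob(const char *s)
{
    size_t  len;
    char   *blob = xfrm_blob(s, &len);

    if (blob == NULL)
        return;

    printf("%-10s len=%zu printable=%d :",
           s, len, blob_is_printable(blob, len));
    for (size_t i = 0; i < len; i++)
        printf(" %02x", (unsigned char) blob[i]);
    putchar('\n');
    free(blob);
}
```

Calling setlocale(LC_COLLATE, "") (or naming a specific locale) from
<locale.h> and then dump_blob() on a handful of strings on each
platform should show whether the extra leading bytes and the
always-printable property are really there. Which locale names are
installed will vary from system to system.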

-- 
Peter Geoghegan


