On Mon, Mar 31, 2014 at 7:35 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Okay. Attached revision only trusts strxfrm() blobs (as far as that
> goes) when the buffer passed to strxfrm() was sufficiently large that
> the blob could fully fit.
Attached revision has been further polished. I've added two additional
optimizations:
* Squeeze the last byte out of each Datum, so that on a 64-bit system,
the full 8 bytes are available to store strxfrm() blobs.
* Figure out when the strcoll() bttextfastcmp_locale() comparator is
called, if it was called because a poor man's comparison required it
(and *not* because it's the non-leading key in the traditional sense,
which implies there are no poorman's normalized keys in respect of
this attribute at all). This allows us to try and get away with a
straight memcmp if and when the lengths of the original text strings
match, on the assumption that when the initial poorman's comparison
didn't work out, and when the string lengths match, there is a very
good chance that both are equal, and on average it's a win to avoid
doing a strcoll() (along with the attendant copying around of buffers
for NULL-termination) entirely. Given that memcmp() is so much cheaper
than strcoll() anyway, this seems like a good trade-off.
--
Peter Geoghegan