Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: B-Tree support function number 3 (strxfrm() optimization)
Msg-id CAM3SWZTM9zsA7Qnf+-pYf_oLdAMEj+rVsE=SOqPrqFePNmSEpw@mail.gmail.com
In response to Re: B-Tree support function number 3 (strxfrm() optimization)  (Peter Geoghegan <pg@heroku.com>)
Responses Re: B-Tree support function number 3 (strxfrm() optimization)  (Thom Brown <thom@linux.com>)
List pgsql-hackers
On Mon, Mar 31, 2014 at 7:35 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Okay. Attached revision only trusts strxfrm() blobs (as far as that
> goes) when the buffer passed to strxfrm() was sufficiently large that
> the blob could fully fit.

Attached revision has been further polished. I've added two additional
optimizations:

* Squeeze the last byte out of each Datum, so that on a 64-bit system,
the full 8 bytes are available to store strxfrm() blobs.

* When the strcoll()-based bttextfastcmp_locale() comparator is
called, figure out whether it was called because a poor man's
comparison required it (and *not* because it's the non-leading key in
the traditional sense, which implies there are no poor man's
normalized keys in respect of this attribute at all). This allows us
to try to get away with a straight memcmp() if and when the lengths of
the original text strings match, on the assumption that when the
initial poor man's comparison was inconclusive and the string lengths
match, there is a very good chance that both are equal. On average
it's a win to avoid doing a strcoll() (along with the attendant
copying around of buffers for NUL-termination) entirely in that case.
Given that memcmp() is so much cheaper than strcoll() anyway, this
seems like a good trade-off.
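
To make the two optimizations concrete, here is a minimal standalone
sketch in plain C. It is not code from the attached patch: SketchKey,
sketch_alloc(), sketch_pack_key() and sketch_full_compare() are
hypothetical names, and the strings are passed as pointer plus length
to mimic text values that are not NUL-terminated. The first function
packs the leading bytes of a strxfrm() blob into all 8 bytes of a
pass-by-value key; the second tries memcmp() before paying for
strcoll() when the keys tied and the lengths match.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an 8-byte, pass-by-value Datum. */
typedef uint64_t SketchKey;

/* Allocate or abort: keeps error handling out of the way of the sketch. */
void *
sketch_alloc(size_t n)
{
    void *p = malloc(n);

    if (p == NULL)
        abort();
    return p;
}

/*
 * Pack the first 8 bytes of the strxfrm() blob for 's' into a key.
 * The blob buffer is sized so that the whole blob fits. Shorter blobs
 * are zero-padded. Bytes are stored most significant first, so comparing
 * two keys as unsigned integers matches memcmp() order on those bytes.
 * Equal keys are inconclusive: the full comparator must then be used.
 */
SketchKey
sketch_pack_key(const char *s)
{
    size_t      need;
    size_t      n;
    size_t      i;
    char       *blob;
    SketchKey   key = 0;

    assert(s != NULL);
    need = strxfrm(NULL, s, 0);     /* blob length, excluding the NUL */
    blob = sketch_alloc(need + 1);
    strxfrm(blob, s, need + 1);

    n = need < sizeof(key) ? need : sizeof(key);
    for (i = 0; i < n; i++)
        key |= (SketchKey) (unsigned char) blob[i]
            << (8 * (sizeof(key) - 1 - i));

    free(blob);
    return key;
}

/*
 * Full comparison of two strings that are not NUL-terminated.
 * 'keys_tied' is nonzero only when we got here because the packed keys
 * compared equal. In that case, equal lengths make bytewise equality
 * likely, so try memcmp() first; bytewise-equal strings are equal under
 * any collation. Otherwise fall back to NUL-terminated copies and
 * strcoll().
 */
int
sketch_full_compare(const char *a, size_t alen,
                    const char *b, size_t blen,
                    int keys_tied)
{
    char       *abuf;
    char       *bbuf;
    int         result;

    if (keys_tied && alen == blen && memcmp(a, b, alen) == 0)
        return 0;

    abuf = sketch_alloc(alen + 1);
    bbuf = sketch_alloc(blen + 1);
    memcpy(abuf, a, alen);
    abuf[alen] = '\0';
    memcpy(bbuf, b, blen);
    bbuf[blen] = '\0';

    result = strcoll(abuf, bbuf);

    free(abuf);
    free(bbuf);
    return result;
}
```

A caller would compare two packed keys first and only call
sketch_full_compare() with keys_tied set when they are equal. When the
memcmp() check fails, the cost is one extra pass over the bytes before
the usual strcoll() path, which is the trade-off described above.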

--
Peter Geoghegan
