Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: B-Tree support function number 3 (strxfrm() optimization)
Date
Msg-id CAM3SWZQ5rj4GZw7M+Jb+LswYfN7ffRVZ+m0Q=TZUiCS+jH=EZQ@mail.gmail.com
Whole thread Raw
In response to Re: B-Tree support function number 3 (strxfrm() optimization)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: B-Tree support function number 3 (strxfrm() optimization)
Re: B-Tree support function number 3 (strxfrm() optimization)
List pgsql-hackers
On Tue, Jan 20, 2015 at 5:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I was assuming we were going to fix this by undoing the abbreviation
> (as in the abort case) when we spill to disk, and not bothering with
> it thereafter.

The spill-to-disk case is at least as compelling at the internal sort
case. The overhead of comparisons is much higher for tapesort.

Attached patch serializes keys. On reflection, I'm inclined to go with
this approach. Even if the CPU overhead of reconstructing strxfrm()
blobs is acceptable for text, it might be much more expensive for
other types. I'm loathe to throw away those abbreviated keys
unnecessarily.

We don't have to worry about having aborted abbreviation, since once
we spill to disk we've effectively committed to abbreviation. This
patch formalizes the idea that there is strictly a pass-by-value
representation required for such cases (but not that the original
Datums must be of a pass-by-reference, which is another thing
entirely). I've tested it some, obviously with Andrew's testcase and
the regression tests, but also with my B-Tree verification tool.
Please review it.

Sorry about this.
--
Peter Geoghegan

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: pgaudit - an auditing extension for PostgreSQL
Next
From: Robert Haas
Date:
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)