Prefix compression of B-Tree keys - Mailing list pgsql-hackers

From Claudio Freire
Subject Prefix compression of B-Tree keys
Date
Msg-id CAGTBQpYbhy-ahV6qzq3Yqp43uS2UrukNmf-GZxHckJVrSPj8VA@mail.gmail.com
Whole thread Raw
Responses Re: Prefix compression of B-Tree keys  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Thu, Jan 30, 2014 at 9:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> On more occasions than I care to recall, someone has suggested that it
>> would be valuable to do something with strxfrm() blobs in order to
>> have cheaper locale-aware text comparisons. One obvious place to do so
>> would be in indexes, but in the past that has been dismissed on the
>> following grounds:
>
>> * Index-only scans need fully formed datums to work, and strxfrm() is
>> a one-way function (or so I believe). There is no reason to think that
>> the original string can be reassembled from the blob, so clearly that
>> won't fly.
>
>> * There is a high cost to be paid in storage overhead. Even for
>> collations like "en_US.UTF-8", that can mean that the blob is as much
>> as 3-4 times larger than the original text string. Who is to say that
>> we'll come out ahead even with the savings of just doing a strcmp()
>> rather than a strcoll()?
>
> Quite aside from the index bloat risk, this effect means a 3-4x reduction
> in the maximum string length that can be indexed before getting the
> dreaded "Values larger than 1/3 of a buffer page cannot be indexed" error.
> Worse, a value insertion might well succeed, with the failure happening
> only (much?) later when that entry is chosen as a page split boundary.
>
> It's possible that TOAST compression of the strings would save you, but
> I'm not convinced of that; it certainly doesn't seem like we could
> guarantee no post-insertion failures that way.

Maybe not TOAST compression, but prefix compression.

I've been wondering lately, whether a format change in the B-Tree
could be worth the effort: store values with prefix compression. It
would certainly help indexes of many kinds (text-valued, multi-column,
etc...).

Now, I haven't checked if it's already done. Sorry if it is. I did
mock around btree code a lot and don't remember any of this, but I do
remember stuff that could be used to achieve it (specifically, all the
index-only-scan machinery, which allows for implicit info).

Granted, this isn't strxfrm, but it may provide some of the benefits
(faster comparisons because less bytes are compared?).

On Fri, Jan 31, 2014 at 1:51 AM, Peter Geoghegan <pg@heroku.com> wrote:
> We're already suffering from the fact
> that strcoll() mandates a NULL-terminated string, since we have to
> copy text datums to a buffer to deal with the impedance mismatch.
> Furthermore, multi-column comparisons probably have a lot of overhead
> at present from all the indirection alone.

Which makes the above even more likely to help.



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Making strxfrm() blobs in indexes work
Next
From: Fujii Masao
Date:
Subject: Re: Backup throttling