Home > mailing lists

Re: Abbreviated keys for Numeric - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: Abbreviated keys for Numeric
Date	April 2, 2015 16:14:50
Msg-id	CAM3SWZR2qdEDqZqrCV7zH_O_vaDOw9LavM227OE8GhW1CVVWxw@mail.gmail.com Whole thread Raw
In response to	Re: Abbreviated keys for Numeric (Peter Geoghegan <pg@heroku.com>)
List	pgsql-hackers

Tree view

On Mon, Mar 23, 2015 at 9:46 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Mar 23, 2015 at 2:41 PM, Andrew Gierth
> <andrew@tao11.riddles.org.uk> wrote:
>>  Peter> As I said, I don't really consider that my patch is a rewrite,
>>  Peter> especially V4, which changes nothing substantive except removing
>>  Peter> 32-bit support.
>>
>> Well, that's a hell of an "except".
>
>
> I guess you're right. I'm willing to go with whatever the consensus is
> on that question.

So I changed my mind - I now think we should support 32-bit abbreviation.

You might wonder why I've changed my mind so suddenly. It's not
because I started caring about 32-bit systems overnight. For the
record, I still think that they are almost irrelevant. But I've seen a
new angle.

One of the likely future uses for abbreviated keys is to guide B-Tree
index scans. One technique that looks really interesting is storing an
abbreviated key in internal pages. I always knew that abbreviation is
as useful for index scans as it is for sorting - maybe more so.
Reading "Modern B-Tree techniques" [1] again today, I realized that we
can store fixed sized abbreviated keys in line items directly. If we
stored a 4 byte abbreviated key, we'd pay no storage overhead on
64-bit systems that already lose that to ItemId alignment. We'd only
generate abbreviated keys in internal B-Tree pages, during relatively
infrequent page splits (internal pages naturally have a very wide
spread of values anyway), so that would be a very low cost. Even when
abbreviation doesn't work out, we have to visit the line item anyway,
and an integer comparison on the same cacheline is effectively free.
It would probably work out all the time anyway, owing to the natural
spread of items within internal pages. Best of all, most index scans
don't even need to look past the itemId array (for internal pages,
which are the large majority visited by any given index scan, but a
tiny minority of those actually stored), hugely cutting down on the
number of cache misses. I could see this making index scans on numeric
IndexTuples noticeably faster than even those on int8 IndexTuples.

Of course, text is the type that this is really compelling for
(perhaps int4 too - perhaps we could automatically get this for types
that fit in 4 bytes naturally on 64-bit platforms). I'm not sure that
we could do this with text without adopting ICU, which makes firm
guarantees about binary sort key stability, so I thought that numeric
could be considered a proof of concept for that.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.219.7269&rep=rep1&type=pdf
-- 
Peter Geoghegan

pgsql-hackers by date:

From: Andres Freund
Date: 02 April 2015, 16:11:16
Subject: Re: deparsing utility commands

From: Alvaro Herrera
Date: 02 April 2015, 16:39:54
Subject: Re: vac truncation scan problems

Re: Abbreviated keys for Numeric - Mailing list pgsql-hackers

Previous

Next