Re: itemptr_encode/itemptr_decode - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: itemptr_encode/itemptr_decode
Msg-id CAH2-Wznju51z_-tQhm6tkGkJAcW5VgVBGD_xpHoLtV1PEF=SxA@mail.gmail.com
In response to Re: itemptr_encode/itemptr_decode  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Apr 17, 2019 at 4:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> As for the general usability argument, I'm not sure --- as we start
> to look at alternate AMs, we might have more use for them.  When I first
> saw the functions, I thought maybe they were part of sort acceleration
> for TIDs; evidently they're not (yet), but that seems like another
> possible use-case.

There is also your join-or-to-union patch, which I thought might make
use of this for its TID sort.

Maybe it would make sense to put this infrastructure in tuplesort.c,
but probably not. TIDs are 6 bytes, which, as you once pointed out, is
not something that we have appropriate infrastructure for (there isn't
a DatumGet*() macro, and so on). The encoding scheme (which you
originally suggested as an alternative to my first idea, sort support
for item pointers) works particularly well as these things go -- it
was about 3x faster when everything fit in memory, and faster still
with external sorts. It allowed us to resolve comparisons at the
SortTuple level within tuplesort.c, but also allowed tuplesort.c to
use the pass-by-value datum qsort specialization. It even allowed
sorted array entries (TIDs/int8s) to be fetched without extra pointer
chasing -- that can be a big bottleneck these days.
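
To make the encoding idea concrete, here is a minimal standalone sketch
of packing a 6-byte TID into an int8 so that plain integer comparison
gives (block, offset) order. It is an illustration under stated
assumptions, not the code in the tree: DemoTid and all of the demo_*
names are hypothetical stand-ins for ItemPointerData and the real
itemptr_encode/itemptr_decode functions, and the layout assumed is a
32-bit block number in the high bits with a 16-bit offset below it.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a TID: a 32-bit block number plus a 16-bit
 * offset number (6 bytes of payload in total).
 */
typedef struct DemoTid
{
    uint32_t block;
    uint16_t offset;
} DemoTid;

/*
 * Pack the TID into a 64-bit integer: block number in bits 16..47,
 * offset in bits 0..15. The result is never negative because only the
 * low 48 bits are used, so signed comparison matches TID order.
 */
static int64_t
demo_tid_encode(DemoTid tid)
{
    return (int64_t) (((uint64_t) tid.block << 16) | (uint64_t) tid.offset);
}

/* Reverse of demo_tid_encode(): split the integer back into its parts. */
static DemoTid
demo_tid_decode(int64_t encoded)
{
    DemoTid tid;

    tid.block = (uint32_t) ((uint64_t) encoded >> 16);
    tid.offset = (uint16_t) ((uint64_t) encoded & 0xFFFF);
    return tid;
}

/* Plain integer comparator; no TID-specific comparison logic is needed. */
static int
demo_cmp_int64(const void *a, const void *b)
{
    int64_t x = *(const int64_t *) a;
    int64_t y = *(const int64_t *) b;

    return (x > y) - (x < y);
}

/* Sort an array of encoded TIDs in place using the C library's qsort(). */
static void
demo_sort_encoded(int64_t *values, size_t nvalues)
{
    qsort(values, nvalues, sizeof(int64_t), demo_cmp_int64);
}
```

Because the encoded value is an ordinary pass-by-value integer, the
sort compares array entries directly and never has to follow a pointer
to a separate 6-byte struct, which is the property described above.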

The encoding scheme is a bit ugly, but I suspect it would be simpler
to stick to the same approach elsewhere than to try to hide all the
details within tuplesort.c, or something like that. Unless we're
willing to treat TIDs as a whole new type of tuple with its own set of
specialized functions in tuplesort.c, which has problems of its own,
then it's kind of awkward to do it some other way.


--
Peter Geoghegan


