On Wed, Apr 17, 2019 at 4:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> As for the general usability argument, I'm not sure --- as we start
> to look at alternate AMs, we might have more use for them. When I first
> saw the functions, I thought maybe they were part of sort acceleration
> for TIDs; evidently they're not (yet), but that seems like another
> possible use-case.

There is also your join-or-to-union patch, which I thought might make
use of this for its TID sort.

Maybe it would make sense to put this infrastructure in tuplesort.c,
but probably not. TIDs are 6 bytes, which, as you once pointed out, is
not something that we have appropriate infrastructure for (there isn't
a DatumGet*() macro, and so on). The encoding scheme (which you
originally suggested as an alternative to my first idea, sort support
for item pointers) works particularly well as these things go -- it
was about 3x faster when everything fit in memory, and faster still
with external sorts. It allowed us to resolve comparisons at the
SortTuple level within tuplesort.c, but also allowed tuplesort.c to
use the pass-by-value datum qsort specialization. It even allowed
sorted array entries (TIDs/int8s) to be fetched without extra pointer
chasing -- that can be a big bottleneck these days.
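
To make the idea concrete, here is a minimal standalone sketch of that kind of encoding. It assumes a TID is a 32-bit block number plus a 16-bit offset number; the DemoTid struct and the demo_tid_* names are made up for illustration and are not the actual functions in the tree. The point is only that the packed value fits in an int8, is never negative, and compares the same way as (block, offset).

```c
#include <stdint.h>

/* Hypothetical stand-in for a 6-byte TID: block number plus offset number. */
typedef struct DemoTid
{
    uint32_t block;     /* assumed 32-bit block number */
    uint16_t offset;    /* assumed 16-bit offset number */
} DemoTid;

/*
 * Pack a TID into a 64-bit integer.  The offset goes in the 16 least
 * significant bits and the block number in the 32 bits above it.  The
 * top 16 bits stay zero, so the result is never negative and plain
 * integer comparison gives (block, offset) order.
 */
static inline int64_t
demo_tid_encode(DemoTid tid)
{
    return (int64_t) (((uint64_t) tid.block << 16) | (uint64_t) tid.offset);
}

/* Reverse of demo_tid_encode(). */
static inline DemoTid
demo_tid_decode(int64_t encoded)
{
    DemoTid tid;

    tid.block = (uint32_t) ((uint64_t) encoded >> 16);
    tid.offset = (uint16_t) ((uint64_t) encoded & 0xFFFF);
    return tid;
}
```

With something like this, the sort only ever sees pass-by-value 8-byte integers, and the caller decodes each value back into a TID as it reads the sorted output.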

The encoding scheme is a bit ugly, but I suspect it would be simpler
to stick to the same approach elsewhere than to try to hide all the
details within tuplesort.c. Unless we're willing to treat TIDs as a
whole new type of tuple with its own set of specialized functions in
tuplesort.c, which has problems of its own, it's kind of awkward to
do it some other way.

--
Peter Geoghegan