Re: Yet another fast GiST build - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Yet another fast GiST build
Msg-id fdee7ef5-1ea3-e381-0611-19352c4f9d1d@iki.fi
In response to Re: Yet another fast GiST build  (Darafei "Komяpa" Praliaskouski <me@komzpa.net>)
Responses Re: Yet another fast GiST build  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
List pgsql-hackers
On 09/09/2020 15:20, Darafei "Komяpa" Praliaskouski wrote:
> On Wed, Sep 9, 2020 at 3:09 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> 
>> Come to think of it, the point z-order comparator could benefit a lot
>> from key abbreviation, too. You could do the point -> zorder conversion
>> in the abbreviation routine.
> 
> That's how it works in PostGIS, except that we moved to the more
> efficient Hilbert curve:
> https://github.com/postgis/postgis/blob/54399b9f6b0f02e8db9444f9f042b8d4ca6d4fa4/postgis/lwgeom_btree.c#L171

Thanks, that's interesting.

I implemented abbreviated keys for the point opclass, too, and noticed
that the patch as it stood never actually used them. I reworked the patch
so that tuplesort_begin_index_gist() is responsible for looking up the
sortsupport function, like tuplesort_begin_index_btree() does, and uses
abbreviation when possible.
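
To illustrate the idea of doing the point -> z-order conversion once, in the abbreviation step, here is a minimal standalone sketch. It is not the code from the patch: the function names (spread_bits, float_to_ordered_uint32, point_zorder_key, zorder_key_cmp) are hypothetical, it assumes an 8-byte Datum so that a 64-bit key fits, it narrows each coordinate to a 32-bit float, and it ignores NaN handling.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Spread the 32 bits of v over the even bit positions of a 64-bit word. */
static uint64_t
spread_bits(uint32_t v)
{
    uint64_t x = v;

    x = (x | (x << 16)) & UINT64_C(0x0000FFFF0000FFFF);
    x = (x | (x << 8)) & UINT64_C(0x00FF00FF00FF00FF);
    x = (x | (x << 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    x = (x | (x << 2)) & UINT64_C(0x3333333333333333);
    x = (x | (x << 1)) & UINT64_C(0x5555555555555555);
    return x;
}

/*
 * Map a float to a uint32 whose unsigned order matches the float order.
 * Negative values have all bits flipped; non-negative values get the
 * sign bit set. NaNs are not treated specially in this sketch.
 */
static uint32_t
float_to_ordered_uint32(float f)
{
    uint32_t bits;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof(bits));
    return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}

/* Hypothetical abbreviation step: point -> 64-bit z-order key. */
static uint64_t
point_zorder_key(double x, double y)
{
    uint32_t ix = float_to_ordered_uint32((float) x);
    uint32_t iy = float_to_ordered_uint32((float) y);

    /* x bits land on odd positions, y bits on even positions */
    return (spread_bits(ix) << 1) | spread_bits(iy);
}

/* Comparing abbreviated keys is then a plain unsigned comparison. */
static int
zorder_key_cmp(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}
```

Because the coordinates are narrowed to 32 bits each, two distinct points can end up with the same key. As with any abbreviated key, a tie on the key would still have to be resolved by the full comparator.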

I think this is pretty much ready for commit now. I'll do a bit more
testing (do we have regression test coverage for this?), including on a
SIZEOF_DATUM==4 system since abbreviation works differently there, and
push if nothing new comes up. I'll also clarify in the documentation
and/or comments that the sortsupport function sees "compressed" values.

I wonder if we could use sorting to also speed up building tsvector
indexes? The values stored there are bit signatures; what would be a
good sort order for those?

- Heikki

