Re: Yet another fast GiST build - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Yet another fast GiST build
Date
Msg-id CAPpHfdsH04kPKRVTtb8aesfqDMMTS5cNHHk5fjPqQZc6p_HpjA@mail.gmail.com
Whole thread Raw
In response to Yet another fast GiST build  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: Yet another fast GiST build  (Peter Geoghegan <pg@bowt.ie>)
Re: Yet another fast GiST build  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
On Mon, Aug 26, 2019 at 10:59 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> In many cases GiST index can be build fast using z-order sorting.
>
> I've looked into proof of concept by Nikita Glukhov [0] and it looks very interesting.
> So, I've implemented yet another version of B-tree-like GiST build.
> It's main use case and benefits can be summarized with small example:
>
> postgres=# create table x as select point (random(),random()) from generate_series(1,3000000,1);
> SELECT 3000000
> Time: 5061,967 ms (00:05,062)
> postgres=# create index ON x using gist (point ) with (fast_build_sort_function=gist_point_sortsupport);
> CREATE INDEX
> Time: 6140,227 ms (00:06,140)
> postgres=# create index ON x using gist (point );
> CREATE INDEX
> Time: 32061,200 ms (00:32,061)
>
> As you can see, Z-order build is on order of magnitude faster. Select performance is roughly the same. Also, index is
significantlysmaller.
 

Cool!  These experiments bring me to following thoughts.  Can we not
only build, but also maintain GiST indexes in B-tree-like manner?  If
we put Z-value together with MBR to the non-leaf keys, that should be
possible.  Maintaining it in B-tree-like manner would have a lot of
advantages.
1) Binary search in non-leaf pages instead of probing each key is much faster.
2) Split algorithm is also simpler and faster.
3) We can refind existing index tuple in predictable time of O(log N).
And this doesn't depend on MBR overlapping degree.  This is valuable
for future table AMs, which would have notion of deleting individual
index tuples (for instance, zheap promises to eventually support
delete-marking indexes).

Eventually, we may come to the idea of B-tree indexes with
user-defined additional keys in non-leaf tuples, which could be used
for scanning.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: pg_upgrade: Error out on too many command-line arguments
Next
From: Peter Geoghegan
Date:
Subject: Re: Yet another fast GiST build