Re: Yet another fast GiST build - Mailing list pgsql-hackers

From Darafei "Komяpa" Praliaskouski
Subject Re: Yet another fast GiST build
Msg-id CAC8Q8tKYpyum_VTPU+qyOTRTYbPoER7sLvy4WdGPxnDeS7Hurw@mail.gmail.com
In response to Re: Yet another fast GiST build  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
Responses Re: Yet another fast GiST build  (Yuri Astrakhan <yuriastrakhan@gmail.com>)
List pgsql-hackers
Hello,

Thanks for the patch and for working on the GiST infrastructure. It's
really valuable for PostGIS use cases, and I look forward to seeing
this improvement in PG13.

On Sat, Feb 29, 2020 at 3:13 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:

> Thomas, I've used your wording almost exactly with explanation how
> point_zorder_internal() works. It has more explanation power than my attempts
> to compose good comment.

PostGIS uses this trick to ensure locality. In PostGIS 3 we enhanced
that trick to use a Hilbert curve instead of a Z-order curve.

For a visual representation, have a look at these links:
 - http://blog.cleverelephant.ca/2019/08/postgis-3-sorting.html - as
it's implemented in the PostGIS btree sorting opclass
 - https://observablehq.com/@mourner/hilbert-curve-packing - to
explore the general approach.
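
To make the locality point concrete, here is a minimal Python sketch of
both orderings. It is not the PostGIS code and not the code from the
patch: the function names (zorder_key, hilbert_key, max_step) and the
integer grid are assumptions made for illustration only. The Hilbert
part follows the commonly published iterative xy-to-distance algorithm.

```python
# Illustrative only: names and the integer grid are assumptions,
# not taken from PostGIS or from the patch under discussion.

def zorder_key(x, y, bits=16):
    """Interleave the low `bits` bits of x and y (Z-order / Morton key)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return key


def hilbert_key(x, y, order=16):
    """Distance of cell (x, y) along a Hilbert curve on a 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has a standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def max_step(key_func, order):
    """Largest Manhattan distance between consecutive cells in key order."""
    n = 1 << order
    cells = sorted(((x, y) for x in range(n) for y in range(n)),
                   key=lambda c: key_func(c[0], c[1], order))
    return max(abs(a[0] - b[0]) + abs(a[1] - b[1])
               for a, b in zip(cells, cells[1:]))
```

On a small grid, max_step(hilbert_key, 3) shows that consecutive cells
along the Hilbert curve are always direct neighbours, while
max_step(zorder_key, 3) shows that the Z-order curve occasionally jumps
across the grid. That is the locality difference the links above
visualize.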

Indeed, if it feels unsafe to carry that bit magic in core, the
implementation can be left to extensions.

> There is one design decision that worries me most:
> should we use opclass function or index option to provide this sorting information?
> It is needed only during index creation, actually. And having extra i-class only for fast build
> seems excessive.
> I think we can provide both ways and let opclass developers decide?

The reloption variant looks dirty. It won't cover an index on (id uuid,
geom geometry), where id has many duplicates (say, a tracked car
identifier) but is present in every query: there is no way to pass such
a thing as a reloption. I'm also concerned about the security of
passing a sortsupport function manually during index creation (what if
it's not from the same extension, or is something a user defined
incorrectly?).

We know for sure it's a good idea for all btree_gist types and for
geometry, and I don't see a case where a user would want to disable it.
Just make it part of the operator class; that would also allow fast
creation of multi-column indexes.

Thanks.

-- 
Darafei Praliaskouski
Support me: http://patreon.com/komzpa


