Re: GiST limits on contrib/cube with dimension > 100? - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: GiST limits on contrib/cube with dimension > 100?
Date
Msg-id B360AFBA-A8A9-46AB-9DA6-A94BF9FDB989@yandex-team.ru
Whole thread Raw
In response to Re: GiST limits on contrib/cube with dimension > 100?  (Siarhei Siniak <siarheisiniak@yahoo.com>)
Responses Re: GiST limits on contrib/cube with dimension > 100?
List pgsql-hackers

> 12 июня 2019 г., в 15:11, Siarhei Siniak <siarheisiniak@yahoo.com> написал(а):
>
> A uniform set of points with a dimension 128 and type cube. That has a size of 50 ** 3. Can be queried for a nearest
neighborat a speed of 10 queries per second with limit varying from 1 to 25. 
> It works better than when no index used at all. So gist gives here a speed up.

Then, I think, data is correlated. I believe it is possible to implement something better for high dimensional KNN in
GiSTthan cube. 


>
> The documentation of postgresql doesn't mention complexity of such an index. I've got confused as to its speed.
>
> Does postgresql allow for an approximate nearest neighbor search?
> https://github.com/erikbern/ann-benchmarks has a lot of efficient implementations.

ANN is beyond concepts of SQL standard: database with index must return same results as without index.
I can add ANN to github.com/x4m/ags which is a fork of GiST.

But PostgreSQL adds a lot of OLTP overhead:
1. it is crash-safe
2. it supports concurrent operations
2a. in a very unexpected way, for example in serializable mode it guaranties you that if you were looking for nearest
neighborthere will not appear any new closer neighbor until the end of your transaction. 
3. it allows extensibility and has abstraction layers
4. it has declarative querying language
etcetc

All this comes at a cost of database that can do anything and be anything. It its very hard to compete with specialized
indexesthat only do ANN. 

Yet, as far as I know, no one really pursued the idea of fast high dimensional ANN in PG.

Best regards, Andrey Borodin.


pgsql-hackers by date:

Previous
From: Oleksii Kliukin
Date:
Subject: Re: upgrades in row-level locks can deadlock
Next
From: "Daniel Verite"
Date:
Subject: RE: proposal: pg_restore --convert-to-text