Re: KNNGiST for knn-search - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Re: KNNGiST for knn-search
Date
Msg-id 4B0BC854.30705@sigaev.ru
Whole thread Raw
In response to Re: KNNGiST for knn-search  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: KNNGiST for knn-search
List pgsql-hackers
> I think you'll need to work on that. A WHERE qual shouldn't imply a sort
> order. You'll have to teach the planner how to use the index to speed up
> a query in the first form.
Of course, right now it is a working prototype.

>> 1. KNNGiST is about 5% slower than GiST on non-knn search queries, like
>>   contains or contained by, because of some overhead of new algorithm of
>>   tree traversal
> 
> Is it possible to use the regular GiST traversal algorithm on a
> KNNGiST-tree, when performing regular GiST searches that don't require a
> particular order?
New algorithm works much more with memory for allocation/free to manage lists 
and it's a single reason of performance loss. Choosing of algorithm could not be 
done by consistent function, it should be done at least in amrescan method or 
even earlier - in planner.


> 
>> 2.  KNNGiST can't be used in  bitmap index scan, which destroys order of
>> results,
>>   We don't know the way to forbid bitmap index scan only for knn queries.
>>   Current version of KNNGiST doesn't distinguish knn-search and usual
>> search
>>   and postgres doesn't know about ordered output from KNNGiST.
> 
> Yeah, you really need to modify the planner to understand the ordering
> and plan accordingly.
Hmm, I thought about it, but still have no a good idea.
One idea:
SELECT p FROM pt WHERE p << '5.0,5.0'::point ORDER BY (p <-> '5.0,5.0'::point) 
DESC LIMIT 10;
And add <-> to opclass (but for now any indexable operation should return 
boolean type). Of course, KNNGiST should be modified to support not only 
k-nearest search but k-"farest" search and NULLS LAST/FIRST.

Not very convenient, because it's needed to look into expression of ORDER BY. 
And now you can specify p >< 'one point' AND p >< 'another point', but it's 
impossible to do that by ORDER BY clause.

Second idea with non-standard syntax.
SELECT ... ORDER BY PROXIMITY OF expression[, expression [..]] TO expression[, 
expression [..]] USING [operator [, operator [..]]
and operator is distance operator, i.e. it's not a member of btree opclass, but 
returns non-negative float8 value.

Without index it will be essentially the same as
ORDER BY expression operator expression[ + ..] DESC NULLS LAST


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


pgsql-hackers by date:

Previous
From: Daniel Farina
Date:
Subject: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Next
From: Magnus Hagander
Date:
Subject: Re: enable-thread-safety defaults?