Thread: [HACKERS] [PATCH] kNN for btree

[HACKERS] [PATCH] kNN for btree

From
Nikita Glukhov
Date:
Hi hackers,

I'd like to present a series of patches that implements k-Nearest 
Neighbors (kNN)
search forbtree, which can be usedto speed up ORDER BY distance queries 
like this:
SELECT * FROM eventsORDER BY date <-> '2000-01-01'::date ASC LIMIT 100;
Now only GiST supports kNN, but kNN on btree can be emulated using 
contrib/btree_gist.


Scanning algorithm
==================

Algorithm is very simple: we use bidirectional B-tree index scan 
starting at the point
from which we measure the distance(target point).At each step, we advance
this scan in the directionthat has the nearest point.  Butwhen the 
target point
does not fall into the scannedrange, we don't even need to 
useabidirectional
scan here --- wecan use ordinaryunidirectional scaninthe right direction.


Performance results
===================

Test database istaken from original kNN-GiST presentation (PGCon2010).

Test query

SELECT * FROM events ORDER BY date <-> '1957-10-04'::date ASC LIMIT k;

can be optimizedto the next rather complicated UNION form, whichnolonger 
requireskNN:

WITH
   t1 AS (SELECT * FROM events WHERE date >= '1957-10-04'::date ORDER BY 
date ASC  LIMIT k),
   t2 AS (SELECT * FROM events WHERE date < '1957-10-04'::date ORDER BY 
date DESC LIMIT k),
t  AS (SELECT * FROM t1 UNION SELECT * FROM t2)
SELECT * FROM t ORDER BY date <-> '1957-10-04'::date ASC LIMIT k;


In each cell of this table shown query execution time in milliseconds and
the number of accessed blocks:


    k   | kNN-btree   | kNN-GiST|  Opt. query   | Seq.scan
        |              | (btree_gist) |  withUNION |  with sort
-------|--------------|--------------|---------------|------------
      1 | 0.0414| 0.0794|   0.0608 |41.11824
     10 |  0.0487 |  0.0919 |   0.09717 |41.81824
    100 |  0.10747 |  0.192    52|   0.342104 |42.31824
   1000 |  0.735573 |  0.913   650 | 2.9701160 |43.51824
  10000 |  5.070 5622|  6.240 6760|  36.300 11031 |  54.1 1824
100000 | 49.600 51608| 61.900 64194 | 295.100 94980| 115.0 1824


As you can see, kNN-btree can be two times faster than kNN-GiST(btree_gist)
when k < 1000,but the number of blocks readis roughly the same.


Implementation details
======================

A brief description is given below for each of the patches:

1. Introduce amcanorderbyop() function

This patch transformsexisting boolean AMpropertyamcanorderbyop into a 
method
(function pointer).This is necessarybecause, unlike GiST,kNN for 
btreesupports
only a one ordering operator onthe first index column and we need a 
different pathkey
matching logicfor btree (there was acorresponding comment 
inmatch_pathkeys_to_index()).
GiST-specific logic has been moved from match_pathkeys_to_index() to 
gistcanorderbyop().

2. Extract substructure BTScanState from BTScanOpaque

This refactoringis necessary for bidirectional kNN-scanimplementation. Now,
BTScanOpaque'ssubstructure BTScanState containing only thefields related
toscanpositionis passed to some functions where thewhole BTScanOpaque
waspassedpreviously.

3. Extract get_index_column_opclass(), 
get_opclass_opfamily_and_input_type().

Extracted two simple common functions usedingistproperty() and
btproperty() (see the next patch).

4. Add kNN supportto btree

   * Added additional optional BTScanState to BTScanOpaque for 
bidirectional kNN scan.
   * Implemented bidirectional kNN scan.
   * Implemented logic for selecting kNN strategy
* Implemented btcanorderbyop(), updated btproperty() and btvalidate()

B-tree user interface functions have not been altered because ordering 
operators
are used directly.

5. Add distance operators for sometypes

These operators for integer, float, date, time, timestamp, interval, 
cash and oidtypes
havebeencopied fromcontrib/btree_gistand added to the existing btree 
opclasses
as ordering operators.  Their btree_gist duplicates areremoved in the 
next patch.

6. Remove duplicate distance operators from contrib/btree_gist.

References to their own distance operators in btree_gist opclassesare 
replaced
with references to the built-in operatorsand thanduplicate operators are 
dropped.
But if the user is using somewhere these operators, upgrade of btree_gist
from 1.3 to 1.4 should fail.

7. Add regression tests for btree kNN.

Tests were added only after the built-in distance operators were added.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] [PATCH] kNN for btree

From
Nikita Glukhov
Date:
Sorry for the broken formatting in my previous message.
Below is a corrected version of this message.


I'd like to present a series of patches that implements k-Nearest Neighbors
(kNN) search for btree, which can be used to speed up ORDER BY distance
queries like this:
SELECT * FROM events ORDER BY date <-> '2000-01-01'::date ASC LIMIT 100;

Now only GiST supports kNN, but kNN on btree can be emulated using
contrib/btree_gist.


Scanning algorithm
==================

Algorithm is very simple: we use bidirectional B-tree index scan starting at
the point from which we measure the distance (target point).  At each step,
we advance this scan in the direction that has the  nearest point.  But when
the target point does not fall into the scanned range, we don't even need to
use a bidirectional scan here --- we can use ordinary unidirectional scan
in the right direction.


Performance results
===================

Test database is taken from original kNN-GiST presentation (PGCon 2010).

Test query

SELECT * FROM events ORDER BY date <-> '1957-10-04'::date ASC LIMIT k;

can be optimized to the next rather complicated UNION form,
which no longer requires kNN:

WITH    t1 AS (SELECT * FROM events WHERE date >= '1957-10-04'::date           ORDER BY date ASC  LIMIT k),    t2 AS
(SELECT* FROM events WHERE date <  '1957-10-04'::date           ORDER BY date DESC LIMIT k),    t  AS (SELECT * FROM t1
UNIONSELECT * FROM t2)
 
SELECT * FROM t ORDER BY date <-> '1957-10-04'::date ASC LIMIT k;


In each cell of this table shown query execution time in milliseconds and
the number of accessed blocks:

     k  |  kNN-btree   |  kNN-GiST    |  Opt. query   |  Seq. scan        |              | (btree_gist) |  with UNION
| with sort
 
--------|--------------|--------------|---------------|------------      1 |  0.041     4 |  0.079     4 |   0.060
8|  41.1 1824     10 |  0.048     7 |  0.091     9 |   0.097    17 |  41.8 1824    100 |  0.107    47 |  0.192    52 |
0.342   104 |  42.3 1824   1000 |  0.735   573 |  0.913   650 |   2.970  1160 |  43.5 1824  10000 |  5.070  5622 |
6.240 6760 |  36.300 11031 |  54.1 1824 100000 | 49.600 51608 | 61.900 64194 | 295.100 94980 | 115.0 1824
 


As you can see, kNN-btree can be two times faster than kNN-GiST (btree_gist)
when k < 1000, but the number of blocks read is roughly the same.


Implementation details
======================

A brief description is given below for each of the patches:

1. Introduce amcanorderbyop() function

This patch transforms existing boolean AM property amcanorderbyop into a method
(function pointer).  This is necessary because, unlike GiST, kNN for btree
supports only a one ordering operator on the first index column and we need a
different pathkey matching logic for btree (there was a corresponding comment
in match_pathkeys_to_index()).  GiST-specific logic has been moved from
match_pathkeys_to_index() to gistcanorderbyop().

2. Extract substructure BTScanState from BTScanOpaque

This refactoring is necessary for bidirectional kNN-scan implementation.
Now, BTScanOpaque's substructure BTScanState containing only the fields
related to scan position is passed to some functions where the whole
BTScanOpaque was passed previously.

3. Extract get_index_column_opclass(), get_opclass_opfamily_and_input_type().

Extracted two simple common functions used in gistproperty() and
btproperty() (see the next patch).

4. Add kNN support to btree
  * Added additional optional BTScanState to BTScanOpaque for    bidirectional kNN scan.  * Implemented bidirectional
kNNscan.  * Implemented logic for selecting kNN strategy  * Implemented btcanorderbyop(), updated btproperty() and
btvalidate()

B-tree user interface functions have not been altered because ordering
operators are used directly.

5. Add distance operators for some types

These operators for integer, float, date, time, timestamp, interval, cash and
oid types have been copied from contrib/btree_gist and added to the existing
btree opclasses as ordering operators.  Their btree_gist duplicates are removed
in the next patch.

6. Remove duplicate distance operators from contrib/btree_gist.

References to their own distance operators in btree_gist opclasses are
replaced with references to the built-in operators and than duplicate
operators are dropped.  But if the user is using somewhere these operators,
upgrade of btree_gist from 1.3 to 1.4 would fail.

7. Add regression tests for btree kNN.

Tests were added only after the built-in distance operators were added.


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] [PATCH] kNN for btree

From
Alexander Korotkov
Date:
Hi, Nikita!

I have assigned as a reviewer for this patchset.  I took a fist look to these patches.
At first, I'd like to notice that it's very cool that you picked up this work.  I frequently hear people complains about lack of this feature.

     k  |  kNN-btree   |  kNN-GiST    |  Opt. query   |  Seq. scan
        |              | (btree_gist) |  with UNION   |  with sort
--------|--------------|--------------|---------------|------------
      1 |  0.041     4 |  0.079     4 |   0.060     8 |  41.1 1824
     10 |  0.048     7 |  0.091     9 |   0.097    17 |  41.8 1824
    100 |  0.107    47 |  0.192    52 |   0.342   104 |  42.3 1824
   1000 |  0.735   573 |  0.913   650 |   2.970  1160 |  43.5 1824
  10000 |  5.070  5622 |  6.240  6760 |  36.300 11031 |  54.1 1824
 100000 | 49.600 51608 | 61.900 64194 | 295.100 94980 | 115.0 1824

These results looks quite expected.  KNN-btree uses about half of blocks in comparison with UNION query, and it's more than twice faster.  In comparison with kNN-GiST there is still some win. 

1. Introduce amcanorderbyop() function

This patch transforms existing boolean AM property amcanorderbyop into a method
(function pointer).  This is necessary because, unlike GiST, kNN for btree
supports only a one ordering operator on the first index column and we need a
different pathkey matching logic for btree (there was a corresponding comment
in match_pathkeys_to_index()).  GiST-specific logic has been moved from
match_pathkeys_to_index() to gistcanorderbyop().

I'm not very excited about this design of amcanorderbyop callback.  Introducing new callback from index access method to the planner should imply quite good flexibility to the future.  In this particular signature of callback I see no potential future use-cases than your implementation for btree.  We could just add amcanorderbyonlyfisrtop property and that would give us same level of flexibility I think.
With existing index types, we could cover much more orderings that we currently do.  Some of possible cases:
1) "ORDER BY col" for btree_gist, SP-GiST text_ops
2) "ORDER BY col1, col2 <-> const" for btree_gist
3) "ORDER BY col1, col2 <-> const" for btree

I understand that #3 is quite hard task and I don't ask you to implement it now.  But it would be nice if some day we decide to add #3, we wouldn't have to change IndexAmRoutine definition.

My idea is that we need more general redesign of specifying ordering which index can produce.  Ideally, we should replace amcanorder, amcanbackward and amcanorderbyop with single callback.  Such callback should take a list of pathkeys and return number of leading pathkeys index could satisfy (with corresponding information for index scan).  I'm not sure that other hackers would agree with such design, but I'm very convinced that we need something of this level of extendability.  Otherwise we would have to hack our planner <-> index_access_method interface each time we decide to cover another index produced ordering.

6. Remove duplicate distance operators from contrib/btree_gist.

References to their own distance operators in btree_gist opclasses are
replaced with references to the built-in operators and than duplicate
operators are dropped.  But if the user is using somewhere these operators,
upgrade of btree_gist from 1.3 to 1.4 would fail.

The query in "btree_gist--1.3--1.4.sql" which directly touches system catalogue to update opfamilies looks too hackery.  I think we shouldn't use such queries until we have no other choice.
In this particular case we can update opfamilies using legal mechanism "ALTER OPERATOR FAMILY name USING index_method ADD/DROP ... " (note that operator name could be schema-qualified if needed).  This way wouldn't be that brief, but it is much more correct. 

Also this like catch my eyes.
info->amcanorderbyop = (void (*)()) amroutine->amcanorderbyop;
It's not necessary to use cast here.  For instance, we don't use cast for amcostestimate.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] [PATCH] kNN for btree

From
Robert Haas
Date:
On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> My idea is that we need more general redesign of specifying ordering which
> index can produce.  Ideally, we should replace amcanorder, amcanbackward and
> amcanorderbyop with single callback.  Such callback should take a list of
> pathkeys and return number of leading pathkeys index could satisfy (with
> corresponding information for index scan).  I'm not sure that other hackers
> would agree with such design, but I'm very convinced that we need something
> of this level of extendability.  Otherwise we would have to hack our planner
> <-> index_access_method interface each time we decide to cover another index
> produced ordering.

Yeah.  I'm not sure if that's exactly the right idea.  But it seems
like we need something.

>> info->amcanorderbyop = (void (*)()) amroutine->amcanorderbyop;
>
> It's not necessary to use cast here.  For instance, we don't use cast for
> amcostestimate.

In fact, it's bad to use the cast here, because if in future the
signature of one of amroutine->amcanorderbyop is changed and
info->amcanorderbyop is not changed to match, then the cast will
prevent a compiler warning, but at runtime you may crash.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [PATCH] kNN for btree

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
>> My idea is that we need more general redesign of specifying ordering which
>> index can produce.  Ideally, we should replace amcanorder, amcanbackward and
>> amcanorderbyop with single callback.  Such callback should take a list of
>> pathkeys and return number of leading pathkeys index could satisfy (with
>> corresponding information for index scan).  I'm not sure that other hackers
>> would agree with such design, but I'm very convinced that we need something
>> of this level of extendability.  Otherwise we would have to hack our planner
>> <-> index_access_method interface each time we decide to cover another index
>> produced ordering.

> Yeah.  I'm not sure if that's exactly the right idea.  But it seems
> like we need something.

That's definitely not exactly the right idea, because using it would
require the core planner to play twenty-questions trying to guess which
pathkeys the index can satisfy.  ("Can you satisfy some prefix of this
pathkey list?  How about that one?")  It could be sensible to have a
callback that's called once per index and hands back a list of pathkey
lists that represent interesting orders the index could produce, which
could be informed by looking aside at the PlannerInfo contents to see
what is likely to be relevant to the query.

But even so, I'm not convinced that that is a better design or more
maintainable than the current approach.  I fear that it will lead to
duplicating substantial amounts of code and knowledge into each index AM,
which is not an improvement; and if anything, that increases the risk of
breaking every index AM anytime you want to introduce some fundamentally
new capability in the area.  Now that it's actually practical to have
out-of-core index AMs, that's a bigger concern than it might once have
been.

Also see the discussion that led up to commit ed0097e4f.  Users objected
the last time we tried to make index capabilities opaque at the SQL level,
so they're not going to like a design that tries to hide that information
even from the core C code.
        regards, tom lane



Re: [HACKERS] [PATCH] kNN for btree

From
Robert Haas
Date:
On Thu, Feb 16, 2017 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
>> <a.korotkov@postgrespro.ru> wrote:
>>> My idea is that we need more general redesign of specifying ordering which
>>> index can produce.  Ideally, we should replace amcanorder, amcanbackward and
>>> amcanorderbyop with single callback.  Such callback should take a list of
>>> pathkeys and return number of leading pathkeys index could satisfy (with
>>> corresponding information for index scan).  I'm not sure that other hackers
>>> would agree with such design, but I'm very convinced that we need something
>>> of this level of extendability.  Otherwise we would have to hack our planner
>>> <-> index_access_method interface each time we decide to cover another index
>>> produced ordering.
>
>> Yeah.  I'm not sure if that's exactly the right idea.  But it seems
>> like we need something.
>
> That's definitely not exactly the right idea, because using it would
> require the core planner to play twenty-questions trying to guess which
> pathkeys the index can satisfy.  ("Can you satisfy some prefix of this
> pathkey list?  How about that one?")  It could be sensible to have a
> callback that's called once per index and hands back a list of pathkey
> lists that represent interesting orders the index could produce, which
> could be informed by looking aside at the PlannerInfo contents to see
> what is likely to be relevant to the query.
>
> But even so, I'm not convinced that that is a better design or more
> maintainable than the current approach.  I fear that it will lead to
> duplicating substantial amounts of code and knowledge into each index AM,
> which is not an improvement; and if anything, that increases the risk of
> breaking every index AM anytime you want to introduce some fundamentally
> new capability in the area.  Now that it's actually practical to have
> out-of-core index AMs, that's a bigger concern than it might once have
> been.

Yeah, that's all true.  But I think Alexander is right that just
adding amcandoblah flags ad infinitum doesn't feel good either.  The
interface isn't really arm's-length if every new thing somebody wants
to do something new requires another flag.

> Also see the discussion that led up to commit ed0097e4f.  Users objected
> the last time we tried to make index capabilities opaque at the SQL level,
> so they're not going to like a design that tries to hide that information
> even from the core C code.

Discoverability is definitely important, but first we have to figure
out how we're going to make it work, and then we can work out how to
let users see how it works.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [PATCH] kNN for btree

From
David Steele
Date:
Hi Alexander,

On 2/16/17 11:20 AM, Robert Haas wrote:
> On Thu, Feb 16, 2017 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
>>> <a.korotkov@postgrespro.ru> wrote:
>>>> My idea is that we need more general redesign of specifying ordering which
>>>> index can produce.  Ideally, we should replace amcanorder, amcanbackward and
>>>> amcanorderbyop with single callback.  Such callback should take a list of
>>>> pathkeys and return number of leading pathkeys index could satisfy (with
>>>> corresponding information for index scan).  I'm not sure that other hackers
>>>> would agree with such design, but I'm very convinced that we need something
>>>> of this level of extendability.  Otherwise we would have to hack our planner
>>>> <-> index_access_method interface each time we decide to cover another index
>>>> produced ordering.
>>
>>> Yeah.  I'm not sure if that's exactly the right idea.  But it seems
>>> like we need something.
>>
>> That's definitely not exactly the right idea, because using it would
>> require the core planner to play twenty-questions trying to guess which
>> pathkeys the index can satisfy.  ("Can you satisfy some prefix of this
>> pathkey list?  How about that one?")  It could be sensible to have a
>> callback that's called once per index and hands back a list of pathkey
>> lists that represent interesting orders the index could produce, which
>> could be informed by looking aside at the PlannerInfo contents to see
>> what is likely to be relevant to the query.
>>
>> But even so, I'm not convinced that that is a better design or more
>> maintainable than the current approach.  I fear that it will lead to
>> duplicating substantial amounts of code and knowledge into each index AM,
>> which is not an improvement; and if anything, that increases the risk of
>> breaking every index AM anytime you want to introduce some fundamentally
>> new capability in the area.  Now that it's actually practical to have
>> out-of-core index AMs, that's a bigger concern than it might once have
>> been.
> 
> Yeah, that's all true.  But I think Alexander is right that just
> adding amcandoblah flags ad infinitum doesn't feel good either.  The
> interface isn't really arm's-length if every new thing somebody wants
> to do something new requires another flag.
> 
>> Also see the discussion that led up to commit ed0097e4f.  Users objected
>> the last time we tried to make index capabilities opaque at the SQL level,
>> so they're not going to like a design that tries to hide that information
>> even from the core C code.
> 
> Discoverability is definitely important, but first we have to figure
> out how we're going to make it work, and then we can work out how to
> let users see how it works.

Reading through this thread I'm concerned that this appears to be a big
change making its first appearance in the last CF.  There is also the
need for a new patch and a general consensus of how to proceed.

I recommend moving this patch to 2017-07 or marking it RWF.

Thanks,
-- 
-David
david@pgmasters.net



Re: [HACKERS] [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Thu, Mar 2, 2017 at 5:57 PM, David Steele <david@pgmasters.net> wrote:
Hi Alexander,

On 2/16/17 11:20 AM, Robert Haas wrote:
> On Thu, Feb 16, 2017 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
>>> <a.korotkov@postgrespro.ru> wrote:
>>>> My idea is that we need more general redesign of specifying ordering which
>>>> index can produce.  Ideally, we should replace amcanorder, amcanbackward and
>>>> amcanorderbyop with single callback.  Such callback should take a list of
>>>> pathkeys and return number of leading pathkeys index could satisfy (with
>>>> corresponding information for index scan).  I'm not sure that other hackers
>>>> would agree with such design, but I'm very convinced that we need something
>>>> of this level of extendability.  Otherwise we would have to hack our planner
>>>> <-> index_access_method interface each time we decide to cover another index
>>>> produced ordering.
>>
>>> Yeah.  I'm not sure if that's exactly the right idea.  But it seems
>>> like we need something.
>>
>> That's definitely not exactly the right idea, because using it would
>> require the core planner to play twenty-questions trying to guess which
>> pathkeys the index can satisfy.  ("Can you satisfy some prefix of this
>> pathkey list?  How about that one?")  It could be sensible to have a
>> callback that's called once per index and hands back a list of pathkey
>> lists that represent interesting orders the index could produce, which
>> could be informed by looking aside at the PlannerInfo contents to see
>> what is likely to be relevant to the query.
>>
>> But even so, I'm not convinced that that is a better design or more
>> maintainable than the current approach.  I fear that it will lead to
>> duplicating substantial amounts of code and knowledge into each index AM,
>> which is not an improvement; and if anything, that increases the risk of
>> breaking every index AM anytime you want to introduce some fundamentally
>> new capability in the area.  Now that it's actually practical to have
>> out-of-core index AMs, that's a bigger concern than it might once have
>> been.
>
> Yeah, that's all true.  But I think Alexander is right that just
> adding amcandoblah flags ad infinitum doesn't feel good either.  The
> interface isn't really arm's-length if every new thing somebody wants
> to do something new requires another flag.
>
>> Also see the discussion that led up to commit ed0097e4f.  Users objected
>> the last time we tried to make index capabilities opaque at the SQL level,
>> so they're not going to like a design that tries to hide that information
>> even from the core C code.
>
> Discoverability is definitely important, but first we have to figure
> out how we're going to make it work, and then we can work out how to
> let users see how it works.

Reading through this thread I'm concerned that this appears to be a big
change making its first appearance in the last CF.  There is also the
need for a new patch and a general consensus of how to proceed.
 
Yes, refactoring of amcanorder/amcanorderbyop should be very thoughtful.

I recommend moving this patch to 2017-07 or marking it RWF.

I agree. Done.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:
Attached 3rd version of the patches rebased onto the current master.

Changes from the previous version:
- Added support of INCLUDE columns to get_index_column_opclass() (1st patch).
- Added parallel kNN scan support.
- amcanorderbyop() was transformed into ammatchorderby() which takes a List of PathKeys and checks each of them with new function match_orderbyop_pathkey() extracted from match_pathkeys_to_index().  I think that this design can be used in the future to support a mix of ordinary and order-by-op PathKeys, but I am not sure.

On 09.03.2017 15:00, Alexander Korotkov wrote:
On Thu, Mar 2, 2017 at 5:57 PM, David Steele <david@pgmasters.net> wrote:
Hi Alexander,

On 2/16/17 11:20 AM, Robert Haas wrote:
> On Thu, Feb 16, 2017 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
>>> <a.korotkov@postgrespro.ru> wrote:
>>>> My idea is that we need more general redesign of specifying ordering which
>>>> index can produce.  Ideally, we should replace amcanorder, amcanbackward and
>>>> amcanorderbyop with single callback.  Such callback should take a list of
>>>> pathkeys and return number of leading pathkeys index could satisfy (with
>>>> corresponding information for index scan).  I'm not sure that other hackers
>>>> would agree with such design, but I'm very convinced that we need something
>>>> of this level of extendability.  Otherwise we would have to hack our planner
>>>> <-> index_access_method interface each time we decide to cover another index
>>>> produced ordering.
>>
>>> Yeah.  I'm not sure if that's exactly the right idea.  But it seems
>>> like we need something.
>>
>> That's definitely not exactly the right idea, because using it would
>> require the core planner to play twenty-questions trying to guess which
>> pathkeys the index can satisfy.  ("Can you satisfy some prefix of this
>> pathkey list?  How about that one?")  It could be sensible to have a
>> callback that's called once per index and hands back a list of pathkey
>> lists that represent interesting orders the index could produce, which
>> could be informed by looking aside at the PlannerInfo contents to see
>> what is likely to be relevant to the query.
>>
>> But even so, I'm not convinced that that is a better design or more
>> maintainable than the current approach.  I fear that it will lead to
>> duplicating substantial amounts of code and knowledge into each index AM,
>> which is not an improvement; and if anything, that increases the risk of
>> breaking every index AM anytime you want to introduce some fundamentally
>> new capability in the area.  Now that it's actually practical to have
>> out-of-core index AMs, that's a bigger concern than it might once have
>> been.
>
> Yeah, that's all true.  But I think Alexander is right that just
> adding amcandoblah flags ad infinitum doesn't feel good either.  The
> interface isn't really arm's-length if every new thing somebody wants
> to do something new requires another flag.
>
>> Also see the discussion that led up to commit ed0097e4f.  Users objected
>> the last time we tried to make index capabilities opaque at the SQL level,
>> so they're not going to like a design that tries to hide that information
>> even from the core C code.
>
> Discoverability is definitely important, but first we have to figure
> out how we're going to make it work, and then we can work out how to
> let users see how it works.

Reading through this thread I'm concerned that this appears to be a big
change making its first appearance in the last CF.  There is also the
need for a new patch and a general consensus of how to proceed.
 
Yes, refactoring of amcanorder/amcanorderbyop should be very thoughtful.

I recommend moving this patch to 2017-07 or marking it RWF.

I agree. Done.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: [PATCH] kNN for btree

From
Dmitry Dolgov
Date:
> On Wed, Sep 26, 2018 at 5:41 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>
> Attached 3rd version of the patches rebased onto the current master.
>
> Changes from the previous version:
> - Added support of INCLUDE columns to get_index_column_opclass() (1st patch).
> - Added parallel kNN scan support.
> - amcanorderbyop() was transformed into ammatchorderby() which takes a List of
>   PathKeys and checks each of them with new function match_orderbyop_pathkey()
>   extracted from match_pathkeys_to_index().  I think that this design can be
>   used in the future to support a mix of ordinary and order-by-op PathKeys,
>   but I am not sure.

Hi,

Unfortunately, the patch has some conflicts, could you rebase it? In the
meantime I'll move it to the next CF, hoping to have more reviewers for this
item.


Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:
On 29.11.2018 18:24, Dmitry Dolgov wrote:
>> On Wed, Sep 26, 2018 at 5:41 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>>
>> Attached 3rd version of the patches rebased onto the current master.
>>
>> Changes from the previous version:
>> - Added support of INCLUDE columns to get_index_column_opclass() (1st patch).
>> - Added parallel kNN scan support.
>> - amcanorderbyop() was transformed into ammatchorderby() which takes a List of
>>    PathKeys and checks each of them with new function match_orderbyop_pathkey()
>>    extracted from match_pathkeys_to_index().  I think that this design can be
>>    used in the future to support a mix of ordinary and order-by-op PathKeys,
>>    but I am not sure.
> Hi,
>
> Unfortunately, the patch has some conflicts, could you rebase it? In the
> meantime I'll move it to the next CF, hoping to have more reviewers for this
> item.

Attached 4th version of the patches rebased onto the current master.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
Hi!

On Fri, Nov 30, 2018 at 3:02 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> On 29.11.2018 18:24, Dmitry Dolgov wrote:
> >> On Wed, Sep 26, 2018 at 5:41 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> >>
> >> Attached 3rd version of the patches rebased onto the current master.
> >>
> >> Changes from the previous version:
> >> - Added support of INCLUDE columns to get_index_column_opclass() (1st patch).
> >> - Added parallel kNN scan support.
> >> - amcanorderbyop() was transformed into ammatchorderby() which takes a List of
> >>    PathKeys and checks each of them with new function match_orderbyop_pathkey()
> >>    extracted from match_pathkeys_to_index().  I think that this design can be
> >>    used in the future to support a mix of ordinary and order-by-op PathKeys,
> >>    but I am not sure.
> > Hi,
> >
> > Unfortunately, the patch has some conflicts, could you rebase it? In the
> > meantime I'll move it to the next CF, hoping to have more reviewers for this
> > item.
>
> Attached 4th version of the patches rebased onto the current master.

I think this patchset in general has a good shape.  After some rounds
of review, it might be committed during January commitfest.

For now, I have following notes.

* 0002-Introduce-ammatchorderby-function-v04.patch

I think match_orderbyop_pathkey() and match_orderbyop_pathkeys()
deserve some high-level commends describing what these functions are
expected to do.

* 0004-Add-kNN-support-to-btree-v04.patch

+ <para>
+  FIXME!!!
+  To implement the distance ordered (nearest-neighbor) search, we only need
+  to define a distance operator (usually it called <->) with a
correpsonding
+  operator family for distance comparison in the index's operator class.
+  These operators must satisfy the following assumptions for all non-null
+  values A,B,C of the datatype:
+
+  A <-> B = B <-> A symmetric law
+  if A = B, then A <-> C = B <-> C distance equivalence
+  if (A <= B and B <= C) or (A >= B and B >= C),
+  then A <-> B <= A <-> C monotonicity
+ </para>

What exactly you're going to fix here?  I think you at least should
provide a proper formatting to this paragraph....

* 0006-Remove-distance-operators-from-btree_gist-v04.patch

I see you provide btree_gist--1.6.sql and remove btree_gist--1.2.sql.
Note, that in order to better checking of extension migrations, we're
now providing just migration script to new version.  So, everybody
installing new version will go through the migration.  However, in
this particular case we've mass deletion of former extension objects.
So, I think this case should be an exception to the rules.  And it's
good to provide new version of extension script in this case.  Other
opinions?

A see  btree_gist--1.5--1.6.sql contains a sophisticated query
updating extension operators to builtin operators.  However, what do
you think about just long sequence of ALTER OPERATOR FAMILY commands
removing old operators and adding new operators?  It would be longer,
but more predictable and easier for understanding.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Thu, Dec 27, 2018 at 5:46 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> * 0006-Remove-distance-operators-from-btree_gist-v04.patch
>
> I see you provide btree_gist--1.6.sql and remove btree_gist--1.2.sql.
> Note, that in order to better checking of extension migrations, we're
> now providing just migration script to new version.  So, everybody
> installing new version will go through the migration.  However, in
> this particular case we've mass deletion of former extension objects.
> So, I think this case should be an exception to the rules.  And it's
> good to provide new version of extension script in this case.  Other
> opinions?

I also note that you've removed implementation of distance functions
from btree_gist.  But during pg_upgrade extensions are moved "as is".
Not just CREATE EXTENSION command is dumped, but the whole extension
content.  pg_upgrade'd instances would have old version of extension
metadata with new .so until ALTER EXTENSION UPDATE. So, user would get
errors about missed function in .so until updates the extension.

We're typically evade this by inclusion of old functions into new .so.
Then user can work normally before extension update.  In this
particular case, we can leave the distance functions in the .so, but
make them just wrappers over core functions.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Sun, Dec 30, 2018 at 1:19 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Thu, Dec 27, 2018 at 5:46 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > * 0006-Remove-distance-operators-from-btree_gist-v04.patch
> >
> > I see you provide btree_gist--1.6.sql and remove btree_gist--1.2.sql.
> > Note, that in order to better checking of extension migrations, we're
> > now providing just migration script to new version.  So, everybody
> > installing new version will go through the migration.  However, in
> > this particular case we've mass deletion of former extension objects.
> > So, I think this case should be an exception to the rules.  And it's
> > good to provide new version of extension script in this case.  Other
> > opinions?
>
> I also note that you've removed implementation of distance functions
> from btree_gist.  But during pg_upgrade extensions are moved "as is".
> Not just CREATE EXTENSION command is dumped, but the whole extension
> content.  pg_upgrade'd instances would have old version of extension
> metadata with new .so until ALTER EXTENSION UPDATE. So, user would get
> errors about missed function in .so until updates the extension.
>
> We're typically evade this by inclusion of old functions into new .so.
> Then user can work normally before extension update.  In this
> particular case, we can leave the distance functions in the .so, but
> make them just wrappers over core functions.

I've run regression tests with patch applied and opr_sanity showed some errors:

1) date_dist_timestamptz(), timestamp_dist_timestamptz(),
timestamptz_dist_date(), timestamptz_dist_timestamp() should be
stable, not immutable.  These functions use timezone during
conversion.

2) date_dist_timestamp(), date_dist_timestamptz(),
timestamp_dist_date(), timestamp_dist_timestamptz(),
timestamptz_dist_date(), timestamptz_dist_timestamp() should be not
leafproof.  These functions perform conversion, which might fail in
corner case.  So, this error should be considered as a leak.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
Hi!

I've couple more notes regarding this patch.
1) There are two loops over scan key determining scan strategy:
existing loop in _bt_first(), and in new function
_bt_select_knn_search_strategy().  It's kind of redundant that we've
to process scan keys twice for knn searches.  I think scan keys
processing should be merged into one loop.
2) We're not supporting knn ordering only using the first key.
Supporting multiple knn keys would require significant reword of
B-tree traversal algorithm making it closer to GiST and SP-GiST.  So,
I think supporting only one knn key is reasonable restriction, at
least for now.  But we could support single-key knn ordering in more
cases.  For instance, knn search in "SELECT * FROM tbl WHERE a =
const1 ORDER BY b <-> const2" could benefit from (a, b) B-tree index.
So, we can support knn search on n-th key if there are equality scan
keys for [1, n-1] index keys.

I've already discussed these issues with Nikita personally.
Hopefully, new version will be published soon.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:

Attached 5th version of the patches.

On 11.01.2019 2:21, Alexander Korotkov wrote:

Hi!

I've couple more notes regarding this patch.
1) There are two loops over scan key determining scan strategy:
existing loop in _bt_first(), and in new function
_bt_select_knn_search_strategy().  It's kind of redundant that we've
to process scan keys twice for knn searches.  I think scan keys
processing should be merged into one loop.
This redundant loop was removed and code from _bt_select_knn_search_strategy()
was moved into the new function _bt_emit_scan_key() extracted from
_bt_preprocess_keys().
2) We're not supporting knn ordering only using the first key.
Supporting multiple knn keys would require significant reword of
B-tree traversal algorithm making it closer to GiST and SP-GiST.  So,
I think supporting only one knn key is reasonable restriction, at
least for now.  But we could support single-key knn ordering in more
cases.  For instance, knn search in "SELECT * FROM tbl WHERE a =
const1 ORDER BY b <-> const2" could benefit from (a, b) B-tree index.
So, we can support knn search on n-th key if there are equality scan
keys for [1, n-1] index keys.
I will try to implement this in the next version of the patch.

I also note that you've removed implementation of distance functions
from btree_gist.  But during pg_upgrade extensions are moved "as is".
Not just CREATE EXTENSION command is dumped, but the whole extension
content.  pg_upgrade'd instances would have old version of extension
metadata with new .so until ALTER EXTENSION UPDATE. So, user would get
errors about missed function in .so until updates the extension.

We're typically evade this by inclusion of old functions into new .so.
Then user can work normally before extension update.  In this
particular case, we can leave the distance functions in the .so, but
make them just wrappers over core functions.

Wrappers over core functions were left in btree_gist.

I've run regression tests with patch applied and opr_sanity showed some errors:

1) date_dist_timestamptz(), timestamp_dist_timestamptz(),
timestamptz_dist_date(), timestamptz_dist_timestamp() should be
stable, not immutable.  These functions use timezone during
conversion.

Fixed.
2) date_dist_timestamp(), date_dist_timestamptz(),
timestamp_dist_date(), timestamp_dist_timestamptz(),
timestamptz_dist_date(), timestamptz_dist_timestamp() should be not
leafproof.  These functions perform conversion, which might fail in
corner case.  So, this error should be considered as a leak.
All new distance functions except oiddist() are not leakproof,
so I had to relax condition in opr_sanity.sql test:

- pp.proleakproof != po.proleakproof
+ (NOT pp.proleakproof AND po.proleakproof))


--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Michael Paquier
Date:
On Fri, Jan 11, 2019 at 04:01:51PM +0300, Nikita Glukhov wrote:
> All new distance functions except oiddist() are not leakproof,
> so I had to relax condition in opr_sanity.sql test:

This patch set needs a rebase because of conflicts caused by the
recent patches for pluggable storage.
--
Michael

Attachment

Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:
On 04.02.2019 8:35, Michael Paquier wrote:
This patch set needs a rebase because of conflicts caused by the
recent patches for pluggable storage.
Attached 6th version of the patches rebased onto current master:
* index_clauses now also passed into ammatchorderby()
* added support for queries like  SELECT * FROM tab WHERE col1 = val1 AND col2 = val2 ORDER BY col3 <-> val3
* (experimental patch #9)  added support for queries like  SELECT * FROM tab WHERE col1 IN (v1, v2, v3) ORDER BY col1, col2 <-> val


Patch #9 is experimental.  In order to distinguish order-by-operator and 
simple order-by-column clauses (index column can be operator expression)
in orderbyclauses lists I am trying to pass negative column numbers in 
orderbyclausecols, but it looks ugly, so I think orderbyclauses passing needs
some refactoring like recent IndexClause refactoring.  Also I doubt that I 
correctly implemented match_pathkey_to_indexcol().

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Thomas Munro
Date:
On Wed, Feb 20, 2019 at 2:18 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> On 04.02.2019 8:35, Michael Paquier wrote:
> This patch set needs a rebase because of conflicts caused by the
> recent patches for pluggable storage.

Hi Nikita,
From the department of trivialities: according to cfbot the
documentation doesn't build.  Looks like you have some cases of </>,
but these days you have to write out </quote> (or whatever) in full.

-- 
Thomas Munro
https://enterprisedb.com


Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:
On 20.02.2019 7:35, Thomas Munro wrote:
On Wed, Feb 20, 2019 at 2:18 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
On 04.02.2019 8:35, Michael Paquier wrote:
This patch set needs a rebase because of conflicts caused by the
recent patches for pluggable storage.
Hi Nikita,
From the department of trivialities: according to cfbot the
documentation doesn't build.  Looks like you have some cases of </>,
but these days you have to write out </quote> (or whatever) in full.
Sorry, tags in docs were fixed.  Also I fixed list of data types with built-in
distance operators and list of assumptions for btree distance operators.

Attached 7th version the patches (only documentation was changed).

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:

On 20.02.2019 15:44, Nikita Glukhov wrote:

On 20.02.2019 7:35, Thomas Munro wrote:
On Wed, Feb 20, 2019 at 2:18 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
On 04.02.2019 8:35, Michael Paquier wrote:
This patch set needs a rebase because of conflicts caused by the
recent patches for pluggable storage.
Hi Nikita,
From the department of trivialities: according to cfbot the
documentation doesn't build.  Looks like you have some cases of </>,
but these days you have to write out </quote> (or whatever) in full.
Sorry, tags in docs were fixed.  Also I fixed list of data types with built-in
distance operators and list of assumptions for btree distance operators.

Attached 7th version the patches (only documentation was changed).
Sorry again. Experimental patch #9 contained a bug leading to random failures
in the 'brin' test.
Attached fixed 7th version the patches.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Anastasia Lubennikova
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hi,
thank you for your work on this patch.

Patch #1 is ready for commit.
It only fixes lack of refactoring after INCLUDE index patch.

Patches 2-7 are ready for committer's review.
I found no significant problems in algorithm or implementation.
Here are few suggestions to improve readability:

1) patch 0002.
* Returned matched index clause exression.
* Number of matched index column is returned in *indexcol_p.

Typos in comment:
exPression
index columnS 

2) patch 0002. 
+            /*
+             * We allow any column of this index to match each pathkey; they
+             * don't have to match left-to-right as you might expect.
+             */

Before refactoring this comment had a line about gist and sp-gist specific:

-             * We allow any column of the index to match each pathkey; they
-             * don't have to match left-to-right as you might expect.  This is
-             * correct for GiST, and it doesn't matter for SP-GiST because
-             * that doesn't handle multiple columns anyway, and no other
-             * existing AMs support amcanorderbyop.  We might need different
-             * logic in future for other implementations.

Now, when the code was moved to a separate function, it is not clear why the same logic is ok for b-tree and other
indexmethods.
 
If I got it right, it doesn't affect the correctness of the algorithm, because b-tree specific checks are performed in
anotherfunction.
 
Still it would be better to explain it in the comment for future readers.

3) patch 0004
 if (!so->distanceTypeByVal)
{
so->state.currDistance = PointerGetDatum(NULL);
so->state.markDistance = PointerGetDatum(NULL);
}

Why do we reset these fields only for !distanceTypeByVal?

4) patch 0004
+ * _bt_next_item() -- Advance to next tuple on current page;
+ * or if there's no more, try to step to the next page with data.
+ *
+ * If there are no more matching records in the given direction
*/

Looks like the last sentence of the comment is unfinished.

5) patch 0004
_bt_start_knn_scan()

so->currRightIsNearest = _bt_compare_current_dist(so, rstate, lstate);
/* Reset right flag if the left item is nearer. */
right = so->currRightIsNearest;

If this comment relates to the string above it?

6) patch 0004
_bt_parallel_seize()

+ scanPage = state == &so->state
+ ? &btscan->btps_scanPage
+ : &btscan->btps_knnScanPage;

This code looks a bit tricke to me. Why do we need to pass BTScanState state to _bt_parallel_seize() at all?
Won't it be enough to choose the page before function call and pass it?

7) patch 0006
+  <title>Upgrade notes for version 1.6</title>
+
+  <para>
+   In version 1.6 <literal>btree_gist</literal> switched to using in-core
+   distance operators, and its own implementations were removed.  References to
+   these operators in <literal>btree_gist</literal> opclasses will be updated
+   automatically during the extension upgrade, but if the user has created
+   objects referencing these operators or functions, then these objects must be
+   dropped manually before updating the extension.

Is this comment still relevant after the latest changes?
Functions are not removed, they're still in contrib.

Patches #8 and #9 need more review and tests.
I'll try to give a feedback on them in the week.

P.S. many thanks for splitting the code into separate patches. It made review a lot easier.

The new status of this patch is: Waiting on Author

Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:

Attached 9th version of the patches.

On 03.03.2019 12:46, Anastasia Lubennikova wrote:

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hi,
thank you for your work on this patch.
Thank you for you review.

Patch #1 is ready for commit.
It only fixes lack of refactoring after INCLUDE index patch.

Patches 2-7 are ready for committer's review.
I found no significant problems in algorithm or implementation.
Here are few suggestions to improve readability:

1) patch 0002.
* Returned matched index clause exression.
* Number of matched index column is returned in *indexcol_p.

Typos in comment:
exPression
index columnS 
"exPression" is fixed.
But there should be "column" because only single index column is matched.
2) patch 0002. 
+			/*
+			 * We allow any column of this index to match each pathkey; they
+			 * don't have to match left-to-right as you might expect.
+			 */

Before refactoring this comment had a line about gist and sp-gist specific:

-			 * We allow any column of the index to match each pathkey; they
-			 * don't have to match left-to-right as you might expect.  This is
-			 * correct for GiST, and it doesn't matter for SP-GiST because
-			 * that doesn't handle multiple columns anyway, and no other
-			 * existing AMs support amcanorderbyop.  We might need different
-			 * logic in future for other implementations.

Now, when the code was moved to a separate function, it is not clear why the
same logic is ok for b-tree and other index methods.  If I got it right, it
doesn't affect the correctness of the algorithm, because b-tree specific 
checks are performed in another function.  Still it would be better to 
explain it in the comment for future readers.
It seems that match_orderbyop_pathkey() is simply the wrong place for this
comment.  I moved it into match_orderbyop_pathkeys() which is implementation of
ammatchorderby() for GiST an SP-GiST.  Also I added old sentence about its
correctness for GiST and SP-GiST.
3) patch 0004if (!so->distanceTypeByVal){  so->state.currDistance = PointerGetDatum(NULL);  so->state.markDistance = PointerGetDatum(NULL); }

Why do we reset these fields only for !distanceTypeByVal?
These fields should be initialized (it is initialization, not reset) only for
by-ref types because before writing a new distance values to these fields, 
the previous by-ref values are pfreed.  The corresponding comment was added.

4) patch 0004
+ * _bt_next_item() -- Advance to next tuple on current page;
+ * or if there's no more, try to step to the next page with data.
+ *
+ * If there are no more matching records in the given direction
*/

Looks like the last sentence of the comment is unfinished.
Yes, "false is returned" is missing. Fixed.
5) patch 0004
_bt_start_knn_scan()

so->currRightIsNearest = _bt_compare_current_dist(so, rstate, lstate);
/* Reset right flag if the left item is nearer. */
right = so->currRightIsNearest;

If this comment relates to the string above it?
No, it relates only to string below. 'right' flag determines later the selected 
scan direction, so 'currRightIsNearest' should be assigned to it. This comment 
was rewritten.
6) patch 0004
_bt_parallel_seize()

+ scanPage = state == &so->state
+ ? &btscan->btps_scanPage
+ : &btscan->btps_knnScanPage;

This code looks a bit tricke to me. Why do we need to pass BTScanState state
to _bt_parallel_seize() at all? Won't it be enough to choose the page before
function call and pass it?
If we will pass page, then we will have to pass it through the whole function 
tree:
_bt_parallel_seize() _bt_steppage()   _bt_next_item()     _bt_next_nearest()   _bt_load_first_page()     _bt_init_knn_scan() _bt_readnextpage()   _bt_parallel_readpage()     _bt_first()

I decided simply to add flag 'isKnn' to BtScanState, so the code now looks like 
this: scanPage = state->isKnn   ? &btscan->btps_scanPage   : &btscan->btps_knnScanPage;

I also can offer to add 'id' (0/1) to BtScanState instead, then the code will 
look like this: scanPage = &btscan->btps_scanPages[state->id];

7) patch 0006
+  <title>Upgrade notes for version 1.6</title>
+
+  <para>
+   In version 1.6 <literal>btree_gist</literal> switched to using in-core
+   distance operators, and its own implementations were removed.  References to
+   these operators in <literal>btree_gist</literal> opclasses will be updated
+   automatically during the extension upgrade, but if the user has created
+   objects referencing these operators or functions, then these objects must be
+   dropped manually before updating the extension.

Is this comment still relevant after the latest changes?
Functions are not removed, they're still in contrib.
Yes, comment is still relevant. SQL functions and operators are dropped, 
but C functions remain (see [1]).
Patches #8 and #9 need more review and tests.
I'll try to give a feedback on them in the week.

P.S. many thanks for splitting the code into separate patches. It made review a lot easier.

The new status of this patch is: Waiting on Author
Attachment

Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
Hi!

I've some questions regarding this patchset.

1) This comment needs to be revised.  Now, B-tree supports both
ammatchorderby and amcanbackward.  How do we guarantee that kNN is not
backwards scan?

/*
 * Only forward scan is supported with reordering.  Note: we can get away
 * with just Asserting here because the system will not try to run the
 * plan backwards if ExecSupportsBackwardScan() says it won't work.
 * Currently, that is guaranteed because no index AMs support both
 * ammatchorderby and amcanbackward; if any ever do,
 * ExecSupportsBackwardScan() will need to consider indexorderbys
 * explicitly.
 */

2) Why this should be so?

EXPLAIN (COSTS OFF)
SELECT thousand, tenthous FROM tenk1 WHERE thousand IN (5, 120, 3456, 23)
ORDER BY thousand DESC, tenthous <-> 3500;
                             QUERY PLAN
---------------------------------------------------------------------
 Sort
   Sort Key: thousand DESC, ((tenthous <-> 3500))
   ->  Index Only Scan using tenk1_thous_tenthous on tenk1
         Index Cond: (thousand = ANY ('{5,120,3456,23}'::integer[]))
(4 rows)

EXPLAIN (COSTS OFF)
SELECT thousand, tenthous FROM tenk1 WHERE thousand IN (5, 120, 3456, 23)
ORDER BY thousand, tenthous <-> 3500;
                          QUERY PLAN
---------------------------------------------------------------
 Index Only Scan using tenk1_thous_tenthous on tenk1
   Index Cond: (thousand = ANY ('{5,120,3456,23}'::integer[]))
   Order By: (thousand AND (tenthous <-> 3500))
(3 rows)

It seems that we restart bidirectional scan for each value specified
in IN clause.  Then why does it matter whether it is forward or
backward scan?

3) What is idea of 8th patch?  It doesn't seem to be really needed for
subsequent 9th patch, because we anyway ignore partial match pathkeys.
Then why bother producing them?  Is it stub for further incremental
sort?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:
Attached 10th versions of the patches.

Fixed two bugs in patches 3 and 10 (see below).
Patch 3 was extracted from the main patch 5 (patch 4 in v9).

On 11.03.2019 20:42, Alexander Korotkov wrote:
Hi!

I've some questions regarding this patchset.

1) This comment needs to be revised.  Now, B-tree supports both
ammatchorderby and amcanbackward.  How do we guarantee that kNN is not
backwards scan?

/** Only forward scan is supported with reordering.  Note: we can get away* with just Asserting here because the system will not try to run the* plan backwards if ExecSupportsBackwardScan() says it won't work.* Currently, that is guaranteed because no index AMs support both* ammatchorderby and amcanbackward; if any ever do,* ExecSupportsBackwardScan() will need to consider indexorderbys* explicitly.*/
Yes, there was problem with backward kNN scans: they were not disabled in 
ExecSupportsBackwardScan().  This can lead to incorrect results for backward 
fetches from cursors.  Corresponding regression test is included into patch #8.
And the comment was also fixed.
2) Why this should be so?

EXPLAIN (COSTS OFF)
SELECT thousand, tenthous FROM tenk1 WHERE thousand IN (5, 120, 3456, 23)
ORDER BY thousand DESC, tenthous <-> 3500;                            QUERY PLAN
---------------------------------------------------------------------Sort  Sort Key: thousand DESC, ((tenthous <-> 3500))  ->  Index Only Scan using tenk1_thous_tenthous on tenk1        Index Cond: (thousand = ANY ('{5,120,3456,23}'::integer[]))
(4 rows)

EXPLAIN (COSTS OFF)
SELECT thousand, tenthous FROM tenk1 WHERE thousand IN (5, 120, 3456, 23)
ORDER BY thousand, tenthous <-> 3500;                         QUERY PLAN
---------------------------------------------------------------Index Only Scan using tenk1_thous_tenthous on tenk1  Index Cond: (thousand = ANY ('{5,120,3456,23}'::integer[]))  Order By: (thousand AND (tenthous <-> 3500))
(3 rows)

It seems that we restart bidirectional scan for each value specified
in IN clause.  Then why does it matter whether it is forward or
backward scan?
kNN scans now can only be forward, and in forward btree scans array iteration 
order matches the index sort order.  We could determine array iteration order
by ScanKey strategy, but ASC/DESC info flag is not passed now to the place of 
ScanKeys initialization (see ExecIndexBuildScanKeys()).  ASC/DESC passing needs
refactoring of whole passing of orderbyclauses/orderbyclausecols.

There also was a problem in btmmatchorderby()/match_patchkey_to_indexcol():
array keys were incorrectly matched to DESC index columns.

3) What is idea of 8th patch?  It doesn't seem to be really needed for
subsequent 9th patch, because we anyway ignore partial match pathkeys.
Then why bother producing them?  Is it stub for further incremental
sort?
Yes, this is a kind of stub for incremental sort.  But also this simplifies 
a bit ammatchorderby() functions, because they should not care about the length 
of  returned pathkey list, they simply return after the first unsupported 
pathkey.  I event think that ammacthorderby() should not depend on whether we 
support incremental sorting or not.


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: Re: [PATCH] kNN for btree

From
David Steele
Date:
On 3/15/19 2:11 AM, Nikita Glukhov wrote:
> Attached 10th versions of the patches.
> 
> Fixed two bugs in patches 3 and 10 (see below).
> Patch 3 was extracted from the main patch 5 (patch 4 in v9).

This patch no longer applies so marking Waiting on Author.

Regards,
-- 
-David
david@pgmasters.net


Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:
On 25.03.2019 11:17, David Steele wrote:

> On 3/15/19 2:11 AM, Nikita Glukhov wrote:
>> Attached 10th versions of the patches.
>>
>> Fixed two bugs in patches 3 and 10 (see below).
>> Patch 3 was extracted from the main patch 5 (patch 4 in v9).
>
> This patch no longer applies so marking Waiting on Author.
>
Attached 11th version of the patches rebased onto current master.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Thomas Munro
Date:
On Tue, Mar 26, 2019 at 4:30 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> >> Fixed two bugs in patches 3 and 10 (see below).
> >> Patch 3 was extracted from the main patch 5 (patch 4 in v9).
> >
> > This patch no longer applies so marking Waiting on Author.
> >
> Attached 11th version of the patches rebased onto current master.

Hi Nikita,

Since a new Commitfest is here and this doesn't apply, could we please
have a fresh rebase?

Thanks,

-- 
Thomas Munro
https://enterprisedb.com



Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:
On 01.07.2019 13:41, Thomas Munro wrote:

> On Tue, Mar 26, 2019 at 4:30 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>>>> Fixed two bugs in patches 3 and 10 (see below).
>>>> Patch 3 was extracted from the main patch 5 (patch 4 in v9).
>>> This patch no longer applies so marking Waiting on Author.
>> Attached 11th version of the patches rebased onto current master.
> Hi Nikita,
>
> Since a new Commitfest is here and this doesn't apply, could we please
> have a fresh rebase?

Attached 12th version of the patches rebased onto the current master.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Thomas Munro
Date:
On Tue, Jul 2, 2019 at 5:47 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> Attached 12th version of the patches rebased onto the current master.

Hi Nikita,

make check-world fails for me, and in tmp_install/log/install.log I see:

btree_int2.c:97:9: error: implicit declaration of function 'int2dist'
is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        return int2dist(fcinfo);
               ^
btree_int2.c:97:9: note: did you mean 'int2_dist'?
btree_int2.c:95:1: note: 'int2_dist' declared here
int2_dist(PG_FUNCTION_ARGS)
^
1 error generated.

-- 
Thomas Munro
https://enterprisedb.com



Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Mon, Jul 1, 2019 at 8:47 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> On 01.07.2019 13:41, Thomas Munro wrote:
> > On Tue, Mar 26, 2019 at 4:30 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> >>>> Fixed two bugs in patches 3 and 10 (see below).
> >>>> Patch 3 was extracted from the main patch 5 (patch 4 in v9).
> >>> This patch no longer applies so marking Waiting on Author.
> >> Attached 11th version of the patches rebased onto current master.
> > Hi Nikita,
> >
> > Since a new Commitfest is here and this doesn't apply, could we please
> > have a fresh rebase?
>
> Attached 12th version of the patches rebased onto the current master.

I have more thoughts about planner/executor infrastructure.  It
appears that supporting "ORDER BY col1, col2 <-> val" is too complex
for initial version of patch.  Handling of "ORDER BY col" and "ORDER
BY col <-> val" cases uses different code paths in optimizer.

So, I'd like to limit first version of this patch to support just most
simple "ORDER BY col <-> val" case.  We would probably be able to
enhance it even in v13 release cycle, but let's start with just simple
case.  I also think we can evade replacing amcanorderbyop flag with
method, but introduce just new boolean flag indicating knn supports
only first column.

------
Alexander Korotkov

Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [PATCH] kNN for btree

From
Nikita Glukhov
Date:
Attached 13th version of the patches.
On 08.07.2019 21:09, Alexander Korotkov wrote:
I have more thoughts about planner/executor infrastructure.  It
appears that supporting "ORDER BY col1, col2 <-> val" is too complex
for initial version of patch.  Handling of "ORDER BY col" and "ORDER
BY col <-> val" cases uses different code paths in optimizer.

So, I'd like to limit first version of this patch to support just most
simple "ORDER BY col <-> val" case.  We would probably be able to
enhance it even in v13 release cycle, but let's start with just simple
case.  I also think we can evade replacing amcanorderbyop flag with
method, but introduce just new boolean flag indicating knn supports
only first column.
Now patches 1-8 implement only "ORDER BY col <-> const" case.
ammatchorderby() is replaced with amorderbyopfirstcol flag.
 
Patches 9-12 contain ammatchorderby() and other features, so they may 
not be reviewed right now.

On 08.07.2019 11:07, Thomas Munro wrote:
make check-world fails for me, and in tmp_install/log/install.log I see:
btree_int2.c:97:9: error: implicit declaration of function 'int2dist'
is invalid in C99 [-Werror,-Wimplicit-function-declaration]       return int2dist(fcinfo);              ^
btree_int2.c:97:9: note: did you mean 'int2_dist'?
btree_int2.c:95:1: note: 'int2_dist' declared here
int2_dist(PG_FUNCTION_ARGS)
^
1 error generated.
Fixed.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: [PATCH] kNN for btree

From
Thomas Munro
Date:
On Sat, Jul 13, 2019 at 5:32 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> Attached 13th version of the patches.

While moving this to the next CF, I noticed that this needs to be
adjusted for the new pg_list.h API.

-- 
Thomas Munro
https://enterprisedb.com



Re: [PATCH] kNN for btree

From
Alvaro Herrera
Date:
On 2019-Jul-12, Nikita Glukhov wrote:

> Attached 13th version of the patches.

Hello Nikita,

Can you please rebase this again?

Thanks,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Mon, Sep 2, 2019 at 9:11 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-Jul-12, Nikita Glukhov wrote:
>
> > Attached 13th version of the patches.
>
> Hello Nikita,
>
> Can you please rebase this again?

Nikita is on vacation now.  Rebased patchset is attached.

I think patches 0001-0008 are very clear and extends our index-AM
infrastructure in query straightforward way.  I'm going to propose
them for commit after some further polishing.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Alvaro Herrera
Date:
On 2019-Sep-03, Alexander Korotkov wrote:

> I think patches 0001-0008 are very clear and extends our index-AM
> infrastructure in query straightforward way.  I'm going to propose
> them for commit after some further polishing.

Hmm.  Why is 0001 needed?  I see that 0005 introduces a call to that
function, but if attnum == 0 then it doesn't call it.  Maybe it was
necessary in an older version of the patch?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Tue, Sep 3, 2019 at 2:19 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-Sep-03, Alexander Korotkov wrote:
>
> > I think patches 0001-0008 are very clear and extends our index-AM
> > infrastructure in query straightforward way.  I'm going to propose
> > them for commit after some further polishing.
>
> Hmm.  Why is 0001 needed?  I see that 0005 introduces a call to that
> function, but if attnum == 0 then it doesn't call it.  Maybe it was
> necessary in an older version of the patch?

Regarding "attno >= 1" check I agree with you.  It should be changed
to assert.  But "attno <= rd_index->indnkeyatts" check appears to be
needed for current code already.  It appears that gistproperty() can
ask get_index_column_opclass() for non-key attribute.  Then
get_index_column_opclass()  returns garbage past oidvector value.
Typically get_opclass_opfamily_and_input_type() doesn't find this
garbage opclass oid and gistproperty() returns null as expected.  But
this is bug and needs to be fixed.

I'm going to push 0001 changing "attno >= 1" to assert.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Sun, Sep 8, 2019 at 11:35 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> I'm going to push 0001 changing "attno >= 1" to assert.

Pushed.  Rebased patchset is attached.  I propose to limit
consideration during this commitfest to this set of 7 remaining
patches.  The rest of patches could be considered later.  I made some
minor editing in preparation to commit.  But I decided I've couple
more notes to Nikita.

 * 0003 extracts part of fields from BTScanOpaqueData struct into new
BTScanStateData struct.  However, there is a big comment regarding
BTScanOpaqueData just before definition of BTScanPosItem.  It needs to
be revised.
 * 0004 adds "knnState" field to BTScanOpaqueData in addition to
"state" field.  Wherein "knnState" might unused during knn scan if it
could be done in one direction.  This seems counter-intuitive.  Could
we rework this to have "forwardState" and "backwardState" fields
instead of "state" and "knnState"?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Mon, Sep 9, 2019 at 11:28 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Sun, Sep 8, 2019 at 11:35 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > I'm going to push 0001 changing "attno >= 1" to assert.
>
> Pushed.  Rebased patchset is attached.  I propose to limit
> consideration during this commitfest to this set of 7 remaining
> patches.  The rest of patches could be considered later.  I made some
> minor editing in preparation to commit.  But I decided I've couple
> more notes to Nikita.
>
>  * 0003 extracts part of fields from BTScanOpaqueData struct into new
> BTScanStateData struct.  However, there is a big comment regarding
> BTScanOpaqueData just before definition of BTScanPosItem.  It needs to
> be revised.
>  * 0004 adds "knnState" field to BTScanOpaqueData in addition to
> "state" field.  Wherein "knnState" might unused during knn scan if it
> could be done in one direction.  This seems counter-intuitive.  Could
> we rework this to have "forwardState" and "backwardState" fields
> instead of "state" and "knnState"?

I have reordered patchset into fewer more self-consistent patches.

Besides comments, code beautification and other improvements, now
there are dedicated fields for forward and backward scan states.  The
forward scan state is the pointer to data structure, which is used in
ordinal unidirectional scan.  So, it's mostly cosmetic change, but it
improves the code readability.

One thing bothers me.  We have to move scalar distance operators from
btree_gist to core.  btree_gist extension upgrade script have to
qualify operators with schema in order to distinguish core and
extension implementations.  So, I have to use @extschema@.  But it's
forbidden for relocatable extensions.  For now, I marken btree_gist as
non-relocatable.  Our comment in extension.c says "For a relocatable
extension, we needn't do this.  There cannot be any need for
@extschema@, else it wouldn't be relocatable.".  Is it really true?  I
think now we have pretty valid example for relocatable extension,
which needs @extschema@ in upgrade script.  Any thoughts?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Tue, Dec 3, 2019 at 3:00 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Mon, Sep 9, 2019 at 11:28 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > On Sun, Sep 8, 2019 at 11:35 PM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > > I'm going to push 0001 changing "attno >= 1" to assert.
> >
> > Pushed.  Rebased patchset is attached.  I propose to limit
> > consideration during this commitfest to this set of 7 remaining
> > patches.  The rest of patches could be considered later.  I made some
> > minor editing in preparation to commit.  But I decided I've couple
> > more notes to Nikita.
> >
> >  * 0003 extracts part of fields from BTScanOpaqueData struct into new
> > BTScanStateData struct.  However, there is a big comment regarding
> > BTScanOpaqueData just before definition of BTScanPosItem.  It needs to
> > be revised.
> >  * 0004 adds "knnState" field to BTScanOpaqueData in addition to
> > "state" field.  Wherein "knnState" might unused during knn scan if it
> > could be done in one direction.  This seems counter-intuitive.  Could
> > we rework this to have "forwardState" and "backwardState" fields
> > instead of "state" and "knnState"?
>
> I have reordered patchset into fewer more self-consistent patches.
>
> Besides comments, code beautification and other improvements, now
> there are dedicated fields for forward and backward scan states.  The
> forward scan state is the pointer to data structure, which is used in
> ordinal unidirectional scan.  So, it's mostly cosmetic change, but it
> improves the code readability.
>
> One thing bothers me.  We have to move scalar distance operators from
> btree_gist to core.  btree_gist extension upgrade script have to
> qualify operators with schema in order to distinguish core and
> extension implementations.  So, I have to use @extschema@.  But it's
> forbidden for relocatable extensions.  For now, I marken btree_gist as
> non-relocatable.  Our comment in extension.c says "For a relocatable
> extension, we needn't do this.  There cannot be any need for
> @extschema@, else it wouldn't be relocatable.".  Is it really true?  I
> think now we have pretty valid example for relocatable extension,
> which needs @extschema@ in upgrade script.  Any thoughts?

I've rebased the patchset to the current master and made some
refactoring.  I hope it would be possible to bring it to committable
shape during this CF.  This need more refactoring though.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

Re: [PATCH] kNN for btree

From
Peter Geoghegan
Date:
On Mon, Mar 2, 2020 at 1:27 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> I've rebased the patchset to the current master and made some
> refactoring.  I hope it would be possible to bring it to committable
> shape during this CF.  This need more refactoring though.

This patch doesn't change anything about the B-Tree on-disk format -- right?

-- 
Peter Geoghegan



Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Wed, Mar 4, 2020 at 4:58 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, Mar 2, 2020 at 1:27 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > I've rebased the patchset to the current master and made some
> > refactoring.  I hope it would be possible to bring it to committable
> > shape during this CF.  This need more refactoring though.
>
> This patch doesn't change anything about the B-Tree on-disk format -- right?

Yes, this is correct.  No on-disk format changes, just new scanning strategy.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [PATCH] kNN for btree

From
Alexander Korotkov
Date:
On Wed, Mar 4, 2020 at 2:39 PM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Mar 4, 2020 at 4:58 AM Peter Geoghegan <pg@bowt.ie> wrote:
> > On Mon, Mar 2, 2020 at 1:27 PM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > > I've rebased the patchset to the current master and made some
> > > refactoring.  I hope it would be possible to bring it to committable
> > > shape during this CF.  This need more refactoring though.
> >
> > This patch doesn't change anything about the B-Tree on-disk format -- right?
>
> Yes, this is correct.  No on-disk format changes, just new scanning strategy.

After another try to polish this patch I figured out that the way it's
implemented is unnatural.  I see the two reasonable ways to implement
knn for B-tree, but current implementation matches none of them.

1) Implement knn as two simultaneous scans over B-tree: forward and
backward.  It's similar to what current patchset does.  But putting
this logic to B-tree seems artificial.  What B-tree does here is still
unidirectional scan.  On top of that we merge results of two
unidirectional scans.  The appropriate place to do this high-level
work is IndexScan node or even Optimizer/Executor (Merge node over to
IndexScan nodes), but not B-tree itself.
2) Implement arbitrary scans in B-tree using priority queue like GiST
and SP-GiST do.  That would lead to much better support for KNN.  We
can end up in supporting interesting cases like "ORDER BY col1 DESC,
col2 <> val1, col2 ASC" or something.  But that's requires way more
hacking in B-tree core.

So, I'm marking this patch RWF.  We should try re-implement this for v14.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [PATCH] kNN for btree

From
"Anton A. Melnikov"
Date:
Hi!

On 16.03.2020 16:17, Alexander Korotkov wrote:
> After another try to polish this patch I figured out that the way it's
> implemented is unnatural.  I see the two reasonable ways to implement
> knn for B-tree, but current implementation matches none of them.
> 
> 1) Implement knn as two simultaneous scans over B-tree: forward and
> backward.  It's similar to what current patchset does.  But putting
> this logic to B-tree seems artificial.  What B-tree does here is still
> unidirectional scan.  On top of that we merge results of two
> unidirectional scans.  The appropriate place to do this high-level
> work is IndexScan node or even Optimizer/Executor (Merge node over to
> IndexScan nodes), but not B-tree itself.
> 2) Implement arbitrary scans in B-tree using priority queue like GiST
> and SP-GiST do.  That would lead to much better support for KNN.  We
> can end up in supporting interesting cases like "ORDER BY col1 DESC,
> col2 <> val1, col2 ASC" or something.  But that's requires way more
> hacking in B-tree core.

I've rebased and fixed the 17th version of this patch to work
with current master as a starting point for further development.

At first i'm going to implement p.1). I think it's preferable for now
because it seems easier and faster to get a working version.

If there are any ideas pro and contra would be glad to discuss them.

With the best wishes!

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
  




Attachment

Re: [PATCH] kNN for btree

From
"Andrey M. Borodin"
Date:

> On 15 Jan 2024, at 13:11, Anton A. Melnikov <a.melnikov@postgrespro.ru> wrote:
>
> If there are any ideas pro and contra would be glad to discuss them.

Hi, Anton!

This is kind of ancient thread. I've marked CF entry [0] as "Needs review" and you as an author (seems like you are
goingto be a point of correspondence on this feature). 

At this point it's obvious that the feature won't make it to 17, so let's move to July CF. Given 7 years of history in
thisthread, you might want to start a new one with a summarisation of this thread. This will attract more reviewers,
eitherway they have to dig on their own. 

Thanks!


Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/47/4871/


Re: [PATCH] kNN for btree

From
"Anton A. Melnikov"
Date:
Hi, Andrey!

On 31.03.2024 12:22, Andrey M. Borodin wrote:
> 
> 
>> On 15 Jan 2024, at 13:11, Anton A. Melnikov <a.melnikov@postgrespro.ru> wrote:
>>
>> If there are any ideas pro and contra would be glad to discuss them.
> 
> Hi, Anton!
> 
> This is kind of ancient thread. I've marked CF entry [0] as "Needs review" and you as an author (seems like you are
goingto be a point of correspondence on this feature).
 
> 

That's right, i would like to bring the work on this feature to a positive result.

First of all, let me share a simple test that allows one to estimate the effect of applying this patch and,
i hope, can be considered as a criterion for future versions.

For all the tests below, one should set the following settings:
set enable_seqscan to false;    
set enable_indexscan to true;
set enable_bitmapscan to false;
set enable_indexonlyscan to true;
set max_parallel_workers_per_gather = 0;

To carry out the test, one can use the table "events" mentioned in the first message of this thread, linked here [1].
psql -f events.dump
Then perform a query like that:
explain (costs off, analyze on) SELECT date FROM events ORDER BY date <-> '1957-10-04'::date ASC LIMIT 100000;

When using the existing btree_gist extension and preliminary commands executing:
create extension btree_gist;

CREATE INDEX event_date_idx ON events USING gist (date);


the test query gives:

postgres=# explain (costs off, analyze on) SELECT date FROM events ORDER BY date <-> '1957-10-04'::date ASC LIMIT
100000;

                                                QUERY PLAN

------------------------------------------------------------------------------------------------------

   Limit (actual time=0.759..102.992 rows=100000 loops=1)

     ->  Index Only Scan using event_date_idx on events (actual time=0.757..97.021 rows=100000 loops=1)

           Order By: (date <-> '1957-10-04'::date)

           Heap Fetches: 0

   Planning Time: 0.504 ms

   Execution Time: 108.311 ms


Average value on my PC was 107+-1 ms.

When using an existing patch from [1] and creating a btree index:

CREATE INDEX event_date_idx ON events USING btree (date);


instead of btree_gist one, the test query gives:

postgres=# explain (costs off, analyze on) SELECT date FROM events ORDER BY date <-> '1957-10-04'::date ASC LIMIT
100000;

                                                QUERY PLAN

------------------------------------------------------------------------------------------------------

   Limit (actual time=0.120..48.817 rows=100000 loops=1)

     ->  Index Only Scan using event_date_idx on events (actual time=0.117..42.610 rows=100000 loops=1)

           Order By: (date <-> '1957-10-04'::date)

           Heap Fetches: 0

   Planning Time: 0.487 ms

   Execution Time: 54.463 ms


55+-1 ms on average.
The execution time is reduced by ~2 times. So the effect is obvious
but the implementation problems are reasonable too.


On 15.01.2024 11:11, Anton A. Melnikov wrote:
> On 16.03.2020 16:17, Alexander Korotkov wrote:
>> After another try to polish this patch I figured out that the way it's
>> implemented is unnatural.  I see the two reasonable ways to implement
>> knn for B-tree, but current implementation matches none of them.
>>
>> 1) Implement knn as two simultaneous scans over B-tree: forward and
>> backward.  It's similar to what current patchset does.  But putting
>> this logic to B-tree seems artificial.  What B-tree does here is still
>> unidirectional scan.  On top of that we merge results of two
>> unidirectional scans.  The appropriate place to do this high-level
>> work is IndexScan node or even Optimizer/Executor (Merge node over to
>> IndexScan nodes), but not B-tree itself.
>> 2) Implement arbitrary scans in B-tree using priority queue like GiST
>> and SP-GiST do.  That would lead to much better support for KNN.  We
>> can end up in supporting interesting cases like "ORDER BY col1 DESC,
>> col2 <> val1, col2 ASC" or something.  But that's requires way more
>> hacking in B-tree core.
> 

> At first i'm going to implement p.1). I think it's preferable for now
> because it seems easier and faster to get a working version.
>

I was wrong here. Firstly, this variant turned out to be not so easy and fast,
and secondly, when i received the desired query plan, i was not happy with the results:

In the case of btree_gist, splitting the query into two scans at the optimizer level
and adding MergeAppend on the top of it resulted in a ~13% slowdown in query execution.
The average time became ~121 ms.

   Limit (actual time=1.205..117.689 rows=100000 loops=1)    

     ->  Merge Append (actual time=1.202..112.260 rows=100000 loops=1)    

           Sort Key: ((events.date <-> '1957-10-04'::date))    

           ->  Index Only Scan using event_date_idx on events (actual time=0.713..43.372 rows=42585 loops=1)    

                 Index Cond: (date < '1957-10-04'::date)    

                 Order By: (date <-> '1957-10-04'::date)    

                 Heap Fetches: 0    

           ->  Index Only Scan using event_date_idx on events events_1 (actual time=0.486..58.015 rows=57416 loops=1)

 

                 Index Cond: (date >= '1957-10-04'::date)    

                 Order By: (date <-> '1957-10-04'::date)    

                 Heap Fetches: 0    

   Planning Time: 1.212 ms    

   Execution Time: 120.517 ms    


When using the btree index and the existing v18 patch, the slowdown from dividing the request
into two scans was less, ~3-4%, but i'm not sure about the correctness of the comparison in this case,
since the btree low level in the first variant proposed by Alexander
should work differently, like unpatched one.

Overall in terms of efficiency, the implementation of the first variant
turns out to be worse than the existing version of the patch.
IMO there is an additional argument pro the second variant proposed by Alexander.
The existing version of the patch does not support sorting in descending order.
Adding DESC to the test query gives:

postgres=# explain (costs off, analyze on) SELECT date FROM events ORDER BY date <-> '1957-10-04'::date DESC LIMIT
100000;       
 

                                     QUERY PLAN                                           

--------------------------------------------------------------------------------        

   Limit (actual time=113.455..133.790 rows=100000 loops=1)        

     ->  Sort (actual time=113.453..128.446 rows=100000 loops=1)        

           Sort Key: ((date <-> '1957-10-04'::date)) DESC        

           Sort Method: external merge  Disk: 2680kB        

           ->  Seq Scan on events (actual time=0.032..43.613 rows=151643 loops=1)        

   Planning Time: 0.514 ms        

   Execution Time: 142.278 ms    
    

IndexOnlyScan disappears from the plan, and the query execution time increases by ~2.5 times.

For that regard, the existing implementation in btree_gist behaves more adequately:

postgres=# explain (costs off, analyze on) SELECT date FROM events ORDER BY date <-> '1957-10-04'::date DESC LIMIT
100000;   
 

                                                   QUERY PLAN                                                     

------------------------------------------------------------------------------------------------------------    

   Limit (actual time=144.409..163.660 rows=100000 loops=1)    

     ->  Sort (actual time=144.406..158.267 rows=100000 loops=1)    

           Sort Key: ((date <-> '1957-10-04'::date)) DESC    

           Sort Method: external merge  Disk: 2680kB    

           ->  Index Only Scan using event_date_idx on events (actual time=0.553..81.035 rows=151643 loops=1)    

                 Heap Fetches: 0    

   Planning Time: 0.525 ms    

   Execution Time: 172.201 ms    


IndexOnlyScan remains in the request, the query execution time increases by ~1.5 times in comparison with ASC order.

It seems that it would be better if the both sorting directions won't give essentially different plans
and not differ greatly from each other in execution time.

On 31.03.2024 12:22, Andrey M. Borodin wrote:
> 
> At this point it's obvious that the feature won't make it to 17, so let's move to July CF.
> 

Of course. IMHO, this is the most suitable solution at the moment.
Thank you!

With the best wishes!

-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

[1] https://github.com/antamel/postgres/raw/test-base-events/test-rel/events.dump.gz




Re: [PATCH] kNN for btree

From
"Anton A. Melnikov"
Date:
Hi!

Rebased existing patch on current master to have an actual working version.
There is an inconsistency with commit 5bf748b86.

Reproduction:
CREATE TABLE test (a int4);
INSERT INTO test VALUES (2), (3);
CREATE INDEX test_idx ON test USING btree(a);
SET enable_seqscan = OFF;
SELECT * FROM test WHERE a IN (2, 3) ORDER BY a <-> 5;

Actual result:
postgres=# SELECT * FROM test WHERE a IN (2, 3) ORDER BY a <-> 5;
  a
---
  3
(1 row)

Correct expected result:
postgres=# SELECT * FROM test WHERE a IN (2, 3) ORDER BY a <-> 5;
  a
---
  3
  2
(2 rows)

Reported an issue in the thread corresponding to commit 5bf748b86.

With the best regards,


-- 
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

Re: [PATCH] kNN for btree

From
Kirill Reshke
Date:
On Wed, 31 Jul 2024 at 09:46, Anton A. Melnikov
<a.melnikov@postgrespro.ru> wrote:
>
> Hi!
>
> Rebased existing patch on current master to have an actual working version.
> There is an inconsistency with commit 5bf748b86.
>
> Reproduction:
> CREATE TABLE test (a int4);
> INSERT INTO test VALUES (2), (3);
> CREATE INDEX test_idx ON test USING btree(a);
> SET enable_seqscan = OFF;
> SELECT * FROM test WHERE a IN (2, 3) ORDER BY a <-> 5;
>
> Actual result:
> postgres=# SELECT * FROM test WHERE a IN (2, 3) ORDER BY a <-> 5;
>   a
> ---
>   3
> (1 row)
>
> Correct expected result:
> postgres=# SELECT * FROM test WHERE a IN (2, 3) ORDER BY a <-> 5;
>   a
> ---
>   3
>   2
> (2 rows)
>
> Reported an issue in the thread corresponding to commit 5bf748b86.
>
> With the best regards,
>
>
> --
> Anton A. Melnikov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
Hi!
Given little activity here, there is a little chance of being
committed in the current commitfest, so I moved to the next.

I will try to take a look at v19 soon.


-- 
Best regards,
Kirill Reshke