Re: [PATCH] kNN for btree - Mailing list pgsql-hackers

From Nikita Glukhov
Subject Re: [PATCH] kNN for btree
Date
Msg-id 6a301bdc-1d95-a2ac-6e4e-23b1d5ceab51@postgrespro.ru
Whole thread Raw
In response to Re: [HACKERS] [PATCH] kNN for btree  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: [PATCH] kNN for btree
List pgsql-hackers
Attached 3rd version of the patches rebased onto the current master.

Changes from the previous version:
- Added support of INCLUDE columns to get_index_column_opclass() (1st patch).
- Added parallel kNN scan support.
- amcanorderbyop() was transformed into ammatchorderby() which takes a List of PathKeys and checks each of them with new function match_orderbyop_pathkey() extracted from match_pathkeys_to_index().  I think that this design can be used in the future to support a mix of ordinary and order-by-op PathKeys, but I am not sure.

On 09.03.2017 15:00, Alexander Korotkov wrote:
On Thu, Mar 2, 2017 at 5:57 PM, David Steele <david@pgmasters.net> wrote:
Hi Alexander,

On 2/16/17 11:20 AM, Robert Haas wrote:
> On Thu, Feb 16, 2017 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
>>> <a.korotkov@postgrespro.ru> wrote:
>>>> My idea is that we need more general redesign of specifying ordering which
>>>> index can produce.  Ideally, we should replace amcanorder, amcanbackward and
>>>> amcanorderbyop with single callback.  Such callback should take a list of
>>>> pathkeys and return number of leading pathkeys index could satisfy (with
>>>> corresponding information for index scan).  I'm not sure that other hackers
>>>> would agree with such design, but I'm very convinced that we need something
>>>> of this level of extendability.  Otherwise we would have to hack our planner
>>>> <-> index_access_method interface each time we decide to cover another index
>>>> produced ordering.
>>
>>> Yeah.  I'm not sure if that's exactly the right idea.  But it seems
>>> like we need something.
>>
>> That's definitely not exactly the right idea, because using it would
>> require the core planner to play twenty-questions trying to guess which
>> pathkeys the index can satisfy.  ("Can you satisfy some prefix of this
>> pathkey list?  How about that one?")  It could be sensible to have a
>> callback that's called once per index and hands back a list of pathkey
>> lists that represent interesting orders the index could produce, which
>> could be informed by looking aside at the PlannerInfo contents to see
>> what is likely to be relevant to the query.
>>
>> But even so, I'm not convinced that that is a better design or more
>> maintainable than the current approach.  I fear that it will lead to
>> duplicating substantial amounts of code and knowledge into each index AM,
>> which is not an improvement; and if anything, that increases the risk of
>> breaking every index AM anytime you want to introduce some fundamentally
>> new capability in the area.  Now that it's actually practical to have
>> out-of-core index AMs, that's a bigger concern than it might once have
>> been.
>
> Yeah, that's all true.  But I think Alexander is right that just
> adding amcandoblah flags ad infinitum doesn't feel good either.  The
> interface isn't really arm's-length if every new thing somebody wants
> to do something new requires another flag.
>
>> Also see the discussion that led up to commit ed0097e4f.  Users objected
>> the last time we tried to make index capabilities opaque at the SQL level,
>> so they're not going to like a design that tries to hide that information
>> even from the core C code.
>
> Discoverability is definitely important, but first we have to figure
> out how we're going to make it work, and then we can work out how to
> let users see how it works.

Reading through this thread I'm concerned that this appears to be a big
change making its first appearance in the last CF.  There is also the
need for a new patch and a general consensus of how to proceed.
 
Yes, refactoring of amcanorder/amcanorderbyop should be very thoughtful.

I recommend moving this patch to 2017-07 or marking it RWF.

I agree. Done.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Online verification of checksums
Next
From: Alvaro Herrera
Date:
Subject: Re: transction_timestamp() inside of procedures