Re: [HACKERS] [PATCH] kNN for btree - Mailing list pgsql-hackers

From David Steele
Subject Re: [HACKERS] [PATCH] kNN for btree
Date
Msg-id e3287c0c-9fa6-7f9c-f0d4-51cc94bd30e5@pgmasters.net
Whole thread Raw
In response to Re: [HACKERS] [PATCH] kNN for btree  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] [PATCH] kNN for btree  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
Hi Alexander,

On 2/16/17 11:20 AM, Robert Haas wrote:
> On Thu, Feb 16, 2017 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Thu, Feb 16, 2017 at 8:05 AM, Alexander Korotkov
>>> <a.korotkov@postgrespro.ru> wrote:
>>>> My idea is that we need more general redesign of specifying ordering which
>>>> index can produce.  Ideally, we should replace amcanorder, amcanbackward and
>>>> amcanorderbyop with single callback.  Such callback should take a list of
>>>> pathkeys and return number of leading pathkeys index could satisfy (with
>>>> corresponding information for index scan).  I'm not sure that other hackers
>>>> would agree with such design, but I'm very convinced that we need something
>>>> of this level of extendability.  Otherwise we would have to hack our planner
>>>> <-> index_access_method interface each time we decide to cover another index
>>>> produced ordering.
>>
>>> Yeah.  I'm not sure if that's exactly the right idea.  But it seems
>>> like we need something.
>>
>> That's definitely not exactly the right idea, because using it would
>> require the core planner to play twenty-questions trying to guess which
>> pathkeys the index can satisfy.  ("Can you satisfy some prefix of this
>> pathkey list?  How about that one?")  It could be sensible to have a
>> callback that's called once per index and hands back a list of pathkey
>> lists that represent interesting orders the index could produce, which
>> could be informed by looking aside at the PlannerInfo contents to see
>> what is likely to be relevant to the query.
>>
>> But even so, I'm not convinced that that is a better design or more
>> maintainable than the current approach.  I fear that it will lead to
>> duplicating substantial amounts of code and knowledge into each index AM,
>> which is not an improvement; and if anything, that increases the risk of
>> breaking every index AM anytime you want to introduce some fundamentally
>> new capability in the area.  Now that it's actually practical to have
>> out-of-core index AMs, that's a bigger concern than it might once have
>> been.
> 
> Yeah, that's all true.  But I think Alexander is right that just
> adding amcandoblah flags ad infinitum doesn't feel good either.  The
> interface isn't really arm's-length if every new thing somebody wants
> to do something new requires another flag.
> 
>> Also see the discussion that led up to commit ed0097e4f.  Users objected
>> the last time we tried to make index capabilities opaque at the SQL level,
>> so they're not going to like a design that tries to hide that information
>> even from the core C code.
> 
> Discoverability is definitely important, but first we have to figure
> out how we're going to make it work, and then we can work out how to
> let users see how it works.

Reading through this thread I'm concerned that this appears to be a big
change making its first appearance in the last CF.  There is also the
need for a new patch and a general consensus of how to proceed.

I recommend moving this patch to 2017-07 or marking it RWF.

Thanks,
-- 
-David
david@pgmasters.net



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [HACKERS] multivariate statistics (v25)
Next
From: David Steele
Date:
Subject: Re: [HACKERS] delta relations in AFTER triggers