Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date
Msg-id CAH2-WzmWuo5J3+KGh_QpZLhVQAdJzb7VDwJxFgqROhg4vp-dfg@mail.gmail.com
Whole thread Raw
In response to Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan  ("Anton A. Melnikov" <a.melnikov@postgrespro.ru>)
List pgsql-hackers
On Wed, Jul 31, 2024 at 12:47 AM Anton A. Melnikov
<a.melnikov@postgrespro.ru> wrote:
>  From the 5bf748b86bc commit message:
>
> > There is a theoretical risk that removing restrictions on SAOP index
> > paths from the planner will break compatibility with amcanorder-based
> > index AMs maintained as extensions.  Such an index AM could have the
> > same limitations around ordered SAOP scans as nbtree had up until now.
> > Adding a pro forma incompatibility item about the issue to the Postgres
> > 17 release notes seems like a good idea.

Here you've quoted the commit message's description of the risks
around breaking third party index AMs maintained as extensions. FWIW
that seems like a rather different problem to the one that you ran
into with your KNN nbtree patch.

> Seems, this commit broke our posted knn_btree patch. [1]
> If the point from which ORDER BY goes by distance is greater than the elements of ScalarArrayOp,
> then knn_btree algorithm will give only the first tuple. It sorts the elements of ScalarArrayOp
> in descending order and starts searching from smaller to larger
> and always expects that for each element of ScalarArrayOp there will be a separate scan.
> And now it does not work. Reproduction is described in [2].
>
> Seems it is impossible to solve this problem only from the knn-btree patch side.
> Could you advise any ways how to deal with this. Would be very grateful.

Well, you're actually patching nbtree. Your patch isn't upstream of
the nbtree code. You're going to have to reconcile your design with
the design for advancing arrays that is primarily implemented by
_bt_advance_array_keys.

I'm not entirely sure what the best way to go about doing that is --
that would require a lot of context about the KNN patch. It wouldn't
be particularly hard to selectively reenable the old Postgres 16 SAOP
behavior, though. You could conditionally force "goto new_prim_scan"
whenever the KNN patch's mechanism was in use.

That kind of approach might have unintended consequences elsewhere,
though. For example it could break things like mark/restore processing
by merge joins, which assumes that the scan's current array keys can
always be reconstructed after restoring a marked position by resetting
the array's to their first elements (or final elements if it's a
backwards scan). On the other hand, I think that you already shouldn't
expect anything like that to work with your patch. After all, it
changes the order of the tuples returned by the scan, which is already
bound to break certain assumptions made by "regular" index scans.

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: "Vitaly Davydov"
Date:
Subject: Re: Fsync (flush) all inserted WAL records
Next
From: Heikki Linnakangas
Date:
Subject: Re: Interrupts vs signals