Re: Questions regarding Index AMs and natural ordering - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Questions regarding Index AMs and natural ordering
Date
Msg-id CAH2-WzkDrpWGRMdJo2MKbt4LjOTn5HOvm4h2riNw9_hRpVTrOg@mail.gmail.com
Whole thread Raw
In response to Re: Questions regarding Index AMs and natural ordering  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Nov 23, 2023 at 11:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Agreed on that, but I don't have that hard a time imagining cases
> where it might be useful for btree not to guarantee ordered output.
> IIRC, it currently has to do extra pushups to ensure that behavior
> in ScalarArrayOp cases.  We've not bothered to expand the planner
> infrastructure to distinguish "could, but doesn't" paths for btree
> scans, but that's more about it not being a priority than because it
> wouldn't make sense.

I'm glad that you brought that up, because it's an important issue for
my ScalarArrayOp patch (which Matthias referenced). My patch simply
removes this restriction from the planner -- now ScalarArrayOp clauses
aren't treated as a special case when generating path keys. This works
in all cases because the patch generalizes nbtree's approach to native
ScalarArrayOp execution, allowing index scans to work as one
continuous index scan in all cases.

As you might recall, the test case that exercises the issue is:

SELECT thousand, tenthous FROM tenk1
WHERE thousand < 2 AND tenthous IN (1001,3000)
ORDER BY thousand;

It doesn't actually make much sense to execute this as two primitive
index scans, though. The most efficient approach is to perform a
single index descent, while still being able to use a true index qual
for "tenthous" (since using a filter qual is so much slower due to the
overhead of accessing the heap just to filter out non-matching
tuples). That's what the patch does.

This would be true even without the "ORDER BY" -- accessing the leaf
page once is faster than accessing it twice (same goes for the root).
So I see no principled reason why we'd ever really want to start a
primitive index scan that wasn't "anchored to an equality constraint".
This is reliably faster, while also preserving index sort order,
almost as a side-effect.

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[]
Next
From: David Rowley
Date:
Subject: Re: Properly pathify the union planner