Questions regarding Index AMs and natural ordering - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Questions regarding Index AMs and natural ordering
Date
Msg-id CAEze2WgOAiKHoQFXujDd8kopq=bJLFVHh9mKfA502Rxf1Xh3dQ@mail.gmail.com
Whole thread Raw
Responses Re: Questions regarding Index AMs and natural ordering
List pgsql-hackers
Hi,

Over in [0] and [1] there are patches that touch on the topic of
'natual ordering' index retrieval, and [2] also touches on the topic.
For those patches, I've been looking at how the planner and executor
indicate to index AMs that they expects the output to be ordered, and
how this ordering should work.
I've mostly found how it works for index_key opr constant, but I've
yet to find a good mental model for how the planner handles indexes
that can expose the 'intrinsic order' of data, i.e. indexes with
`amcanorder=true`, because there is very little (if any) real
documentation on what is expected from indexes when it advertises
certain features, and how the executor signals to the AM that it wants
to make use of those features.

For example, btree ignores any ordering scan keys that it is given in
btrescan, which seems fine for btree because the ordering of a btree
is static (and no other order than that is expected apart from it's
reverse order), but this becomes problematic for other indexes that
could return ordered data but would prefer not to have to go through
the motions of making sure the return order is 100% correct, rather
than a k-sorted sequence, or just the matches to the quals (like
GIST). Is returning index scan results in (reverse) natural order not
optional but required with amcanorder? If it is required, why is the
am indicator called 'canorder' instead of 'willorder', 'doesorder' or
'isordered'?

Alternatively, if an am should be using the order scan keys from
.amrescan and natural order scans also get scan keys, is there some
place I find the selection process for ordered index AMs, and how this
ordering should be interepreted? There are no good examples available
in core code because btree is special-cased, and there are no other
in-tree AMs that have facilities where both `amcanorderbyop` and
`amcanorder` are set.

I did read through indexam.sgml, but that does not give a clear answer
on this distinction of 'amcanorder' having required ordered results or
not, nor on how to interpret amrescan's orderbys argument. I also
looked at planner code where it interacts with amcanorder /
amcanorderbyop, but I couldn't wrap my head around its interactions
with indexes, either (more specifically, the ordering part of those
indexes) due to the complexity of the planner and the many layers that
the various concepts are passed through. The README in
backend/optimizer didn't answer this question for me, either.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[0] https://www.postgresql.org/message-id/flat/EB2AF704-70FC-4D73-A97A-A7884A0381B5%40kleczek.org
[1]
https://www.postgresql.org/message-id/flat/CAH2-Wz%3DksvN_sjcnD1%2BBt-WtifRA5ok48aDYnq3pkKhxgMQpcw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/e70fa091-e338-1598-9de4-6d0ef6b693e2%40enterprisedb.com



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: autovectorize page checksum code included elsewhere
Next
From: Tom Lane
Date:
Subject: Re: Use index to estimate expression selectivity