Re: Questions regarding Index AMs and natural ordering - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Questions regarding Index AMs and natural ordering |
Date | |
Msg-id | CAH2-WzkrvBZNzA3+-Nx+r8SrUNbnLuff81HxL3Zk79AZ2PaafA@mail.gmail.com Whole thread Raw |
In response to | Re: Questions regarding Index AMs and natural ordering (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Fri, Nov 24, 2023 at 10:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Geoghegan <pg@bowt.ie> writes: > > I suppose that amcanorder=true cannot mean that, since we have the > > SAOP path key thing (at least for now). > > As things stand, amcanorder definitely means that the index always > returns ordered data, since the planner will unconditionally apply > pathkeys to the indexscan Paths generated for it (see plancat.c's > get_relation_info which sets up info->sortopfamily, and > build_index_pathkeys which uses that). We could reconsider that > definition if there were a reason to, but so far it hasn't seemed > interesting. The hack in 807a40c5 is a hack, without a doubt, but > that doesn't necessarily mean we should spend time on generalizing it, > and even less that we should have done so in 2012. That is certainly my preferred interpretation. Not least because I am in the process of removing the hack completely, which has shown very significant benefits for queries with SAOPs that now get to depend on the sort order in some crucial way. > Maybe by now there > are non-core index AMs that have cases where it's worth being pickier. > We'd have to invent some API that allows the index AM to have a say in > what pathkeys get applied. Matthias and I recently discussed a design sketched by James Coleman some years back, which Matthias seemed particularly interested in: https://www.postgresql.org/message-id/flat/CAAaqYe-SsHgXKXPpjn7WCTUnB_RQSxXjpSaJd32aA%3DRquv0AgQ%40mail.gmail.com James' test case benefits from my own patch in the obvious way: it can use SAOP index quals, while still being able to get an ordered scan that terminates early via a LIMIT. But the design he sketched proposes to go much further than that -- it's far more targeted. His design reconstructs a useful sort order by "multiplexing" different parts of the index (for different SAOP constants), merging together multiple streams of ordered tuples into one stream. This means that the index produces tuples in a useful order, sufficient to terminate the scan early -- but it's a sort order that doesn't match the index at all. Obviously, that's a totally novel idea. It's possible that something like that might just make sense -- it's theoretically optimal, at least. My guess is that if it really did happen then the planner would treat it like the special case that it is. It very much reminds me of loose index scan, where the index AM API has to be revised so that high level semantic information can be pushed down into the index AM. If something like that can offer stellar performance, that just isn't achievable in any other way, then I guess it's worth accepting the cost of an uglified index AM API. Whether or not such a feature really can be that compelling likely depends on a myriad of factors that we cannot possibly anticipate ahead of time. There are just too many things in this general area that *might* make sense someday. As we discussed back in 2022, I think that MDAM style "skip scan" (meaning support for skipping around an index given a query with "missing key predicates") shouldn't be a special case in the planner -- only costing needs to know anything about skipping. In general I find that it's most useful to classify all of these techniques according to whether or not they are compatible with the orthodox definition of amcanorder that you described. In other words, to classify them based on whether they involve pushing down high level semantic information to the index AM. -- Peter Geoghegan
pgsql-hackers by date: