Re: Questions regarding Index AMs and natural ordering - Mailing list pgsql-hackers
From | Matthias van de Meent |
---|---|
Subject | Re: Questions regarding Index AMs and natural ordering |
Date | |
Msg-id | CAEze2WhY+aPtpLJZfXN+BC4FjojqBoy3JsxwKrkgT_vn5vv1kA@mail.gmail.com Whole thread Raw |
In response to | Re: Questions regarding Index AMs and natural ordering (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Questions regarding Index AMs and natural ordering
|
List | pgsql-hackers |
On Thu, 23 Nov 2023 at 19:52, Peter Geoghegan <pg@bowt.ie> wrote: > > On Thu, Nov 23, 2023 at 9:16 AM Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: > > For example, btree ignores any ordering scan keys that it is given in > > btrescan, which seems fine for btree because the ordering of a btree > > is static (and no other order than that is expected apart from it's > > reverse order), but this becomes problematic for other indexes that > > could return ordered data but would prefer not to have to go through > > the motions of making sure the return order is 100% correct, rather > > than a k-sorted sequence, or just the matches to the quals (like > > GIST). Is returning index scan results in (reverse) natural order not > > optional but required with amcanorder? If it is required, why is the > > am indicator called 'canorder' instead of 'willorder', 'doesorder' or > > 'isordered'? > > I don't know. I have a hard time imagining an index AM that is > amcanorder=true that isn't either nbtree, or something very similar > (so similar that it seems unlikely that anybody would actually go to > the trouble of implementing it from scratch). Well, BRIN (with minmax opclasses) could return ordered results if it needs to (see [0]; though that implements it as a distinct plan node). Ordering the tuples correctly takes quite some effort, but is quite likely to use less effort and/or scratch space than a table/bitmap scan + sort, because we won't have to manage all tuples of the table at the same time. However, it woould be extremely expensive if the planner expects this to always return the data in btree order. For GIST with the btree_gist opclasses it is even easier to return ordered results (patch over at [1]), but then still it prefers not to have to make a strict ordering as it adds overhead vs 'normal' index scans. Also, was that a confirmation that amcanorder is a requirement for the AM to return data in index order (unless amrescan's orderbys is not null), or just a comment on the reason for the name of 'amcanorder' being unclear? > You didn't mention support for merge joins. That's one of the defining > characteristics of an amcanorder=true index AM, since an > "ammarkpos/amrestrpos function need only be provided if the access > method supports ordered scans". It's hard to imagine how that could > work with a loosely ordered index. It seems to imply that the scan > must always work with a simple linear order. I probably didn't think of merge join support because 'merge join' is not mentioned as such in the index AM api - I knew of ammarkpos/amrestrpos, but hadn't yet gone into detail what they're used for. > Cases where the planner uses a merge join often involve an index path > with an "interesting sort order" that "enables" the merge join. > Perhaps most of the alternative plans (that were almost as cheap as > the merge join plan) would have had to scan the same index in the same > way anyway, so it ends up making sense to use a merge join. The merge > join can get ordered results from the index "at no extra cost". (All > of this is implicit, of course -- the actual reason why the planner > chose the merge join plan is because it worked out to be the cheapest > plan.) Couldn't the merge join (or scan node) use a tuple store to return to some earlier point in the index scan when a native version of markpos is not easily supported? It would add (potentially very significant) IO overhead, but it would also allow merge joins on ordered paths that currently don't have a natural way of marking their position. > > Alternatively, if an am should be using the order scan keys from > > .amrescan and natural order scans also get scan keys, is there some > > place I find the selection process for ordered index AMs, and how this > > ordering should be interepreted? There are no good examples available > > in core code because btree is special-cased, and there are no other > > in-tree AMs that have facilities where both `amcanorderbyop` and > > `amcanorder` are set. > > The general notion of a data type's sort order comes from its default > btree operator class, so the whole idea of a generic sort order is > deeply tied to the nbtree AM. That's why we sometimes have btree > operator classes for types that you'd never actually want to index (at > least not using a btree index). Yes, the part where btree opclasses determine a type's ordering is clear. But what I'm looking for is "how do I, as an index AM implementation, get the signal that I need to return column-ordered data?" If that signal is "index AM marked amcanorder == index must return ordered data", then that's suboptimal for the index AM writer, but understandable. If amcanorder does not imply always ordered retrieval, then I'd like to know how it is signaled to the AM. But as of yet I've not found a clear and conclusive answer either way. Kind regards, Mattthias van de Meent. [0] https://www.postgresql.org/message-id/flat/e70fa091-e338-1598-9de4-6d0ef6b693e2%40enterprisedb.com [1] https://www.postgresql.org/message-id/flat/EB2AF704-70FC-4D73-A97A-A7884A0381B5%40kleczek.org
pgsql-hackers by date: