Re: index prefetching - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | CAH2-WzkpNN1+sovB8G=5dVwYW25=J6Qj4V9L7DzD26NTVQWM2w@mail.gmail.com Whole thread Raw |
In response to | Re: index prefetching (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: index prefetching
|
List | pgsql-hackers |
On Thu, Feb 15, 2024 at 12:26 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > I may be missing something, but it seems fairly self-evident to me an > entry at the beginning of an index page won't get prefetched (assuming > the page-at-a-time thing). Sure, if the first item on the page is also the first item that we need the scan to return (having just descended the tree), then it won't get prefetched under a scheme that sticks with the current page-at-a-time behavior (at least in v1). Just like when the first item that we need the scan to return is from the middle of the page, or more towards the end of the page. It is of course also true that we can't prefetch the next page's first item until we actually visit the next page -- clearly that's suboptimal. Just like we can't prefetch any other, later tuples from the next page (until such time as we have determined for sure that there really will be a next page, and have called _bt_readpage for that next page.) This is why I don't think that the tuples with lower page offset numbers are in any way significant here. The significant part is whether or not you'll actually need to visit more than one leaf page in the first place (plus the penalty from not being able to reorder the work across page boundaries in your initial v1 of prefetching). > If I understand your point about boundary cases / suffix truncation, > that helps us by (a) picking the split in a way to minimize a single key > spanning multiple pages, if possible and (b) increasing the number of > entries that fit onto a single index page. More like it makes the boundaries between leaf pages (i.e. high keys) align with the "natural boundaries of the key space". Simple point queries should practically never require more than a single leaf page access as a result. Even somewhat complicated index scans that are reasonably selective (think tens to low hundreds of matches) don't tend to need to read more than a single leaf page match, at least with equality type scan keys for the index qual. > That's certainly true / helpful, and it makes the "first entry" issue > much less common. But the issue is still there. Of course, this says > nothing about the importance of the issue - the impact may easily be so > small it's not worth worrying about. Right. And I want to be clear: I'm really *not* sure how much it matters. I just doubt that it's worth worrying about in v1 -- time grows short. Although I agree that we should commit a v1 that leaves the door open to improving matters in this area in v2. > One case I've been thinking about is sorting using index, where we often > read large part of the index. That definitely seems like a case where reordering work/desynchronization of the heap and index scans might be relatively important. > > I think that there is no question that this will need to not > > completely disable kill_prior_tuple -- I'd be surprised if one single > > person disagreed with me on this point. There is also a more nuanced > > way of describing this same restriction, but we don't necessarily need > > to agree on what exactly that is right now. > > > > Even for the page-at-a-time approach? Or are you talking about the v2? I meant that the current kill_prior_tuple behavior isn't sacred, and can be revised in v2, for the benefit of lifting the restriction on prefetching. But that's going to involve a trade-off of some kind. And not a particularly simple one. > Yeah. The basic idea was that by moving this above index AM it will work > for all indexes automatically - but given the current discussion about > kill_prior_tuple, locking etc. I'm not sure that's really feasible. > > The index AM clearly needs to have more control over this. Cool. I think that that makes the layering question a lot clearer, then. -- Peter Geoghegan
pgsql-hackers by date: