Re: index prefetching - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | c333af4f-6448-4494-aaf7-9dea3f0afb2e@vondra.me Whole thread Raw |
In response to | Re: index prefetching (Nazir Bilal Yavuz <byavuz81@gmail.com>) |
Responses |
Re: index prefetching
|
List | pgsql-hackers |
On 7/21/25 08:53, Nazir Bilal Yavuz wrote: > Hi, > > On Mon, 21 Jul 2025 at 03:59, Thomas Munro <thomas.munro@gmail.com> wrote: >> >> On Sun, Jul 20, 2025 at 1:07 AM Thomas Munro <thomas.munro@gmail.com> wrote: >>> On Sat, Jul 19, 2025 at 11:23 PM Tomas Vondra <tomas@vondra.me> wrote: >>>> The thing that however concerns me is that what I observed was not the >>>> distance getting reset to 1, and then ramping up. Which should happen >>>> pretty quickly, thanks to the doubling. In my experiments it *never* >>>> ramped up again, it stayed at 1. I still don't quite understand why. >>> >>> Huh. Will look into that on Monday. >> >> I suspect that it might be working as designed, but suffering from a >> bit of a weakness in the distance control algorithm, which I described >> in another thread[1]. In short, the simple minded algorithm that >> doubles on miss and subtracts one on hit can get stuck alternating >> between 1 and 2 if you hit certain patterns. Bilal pinged me off-list >> to say that he'd repro'd something like your test case and that's what >> seemed to be happening, anyway? I will dig out my experimental >> patches that tried different adjustments to escape from that state.... > > I used Tomas Vondra's test [1]. I tracked how many times > StartReadBuffersImpl() functions return true (IO is needed) and false > (IO is not needed, cache hit). It returns true ~%6 times on both > simple and complex patches (~116000 times true, ~1900000 times false > on both patches). > > A complex patch ramps up to ~250 distance at the start of the stream > and %6 is enough to stay at distance. Actually, it is enough to ramp > up more but it seems the max distance is about ~270 so it stays there. > On the other hand, a simple patch doesn't ramp up at the start of the > stream and %6 is not enough to ramp up. It is always like distance is > 1 and IO needed, so multiplying the distance by 2 -> distance = 2 but > then the next block is cached, so decreasing the distance by 1 and > distance is 1 again. > > [1] https://www.postgresql.org/message-id/aa46af80-5219-47e6-a7d0-7628106965a6%40vondra.me > Yes, this is the behavior I observed too. I was wondering if the 5% miss ratio hit some special "threshold" in the distance heuristics, and maybe it'd work fine with a couple more misses. But I don't think so, I think pretty workloads with up to 50% misses may hit this problem. We reset the distance to 1, and then with 50% misses we'll do about 1 hit + 1 miss, which doubles the distance to 2 and then reduces the distance to 1, infinitely. Of course, that's only for even distribution hits/misses (and the synthetic workloads are fairly even). Real workloads are likely to have multiple misses in a row, which indeed ramps up the distance quickly. So maybe it's not that bad. Could we track a longer history of hits/misses, and consider that when adjusting the distance? Not just the most recent hit/miss? FWIW I re-ran the index-prefetch-test benchmarks with restoring the distance for the "simple" patch. The results are in the same github repository, in a separate branch: https://github.com/tvondra/indexscan-prefetch-tests/tree/with-distance-restore-after-reset I'm attaching two PDFs with results for eic=16 (linear and log-scale, to compare timings for quick queries). This shows that with restoring distance after reset, the simple patch is pretty much the same as the complex patch. The only data set where that's not the case is the "linear" data set, when everything is perfectly sequential. In this case the simple patch performs like "master" (i.e. no prefetching). I'm not sure why is that. Anyway, it seems to confirm most of the differences between the two patches is due to the "distance collapse". The impact of the resets in the first benchmarks surprised me quite a bit, but if we don't ramp up the distance that makes perfect sense. The issue probably affects other queries that do a lot of resets. Index scan prefetching just makes it very obvious. regards -- Tomas Vondra
Attachment
pgsql-hackers by date: