Re: index prefetching - Mailing list pgsql-hackers
From | Andres Freund
Subject | Re: index prefetching
Date |
Msg-id | y32ow7vfdrrcchxrwgfr5tbwg3ncm4f3rlp65svcbqorbm3tmn@cwvipbplnqyt
In response to | Re: index prefetching (Tomas Vondra <tomas@vondra.me>)
List | pgsql-hackers
Hi,

On 2025-08-25 15:00:39 +0200, Tomas Vondra wrote:
> Thanks. Based on the testing so far, the patch seems to be a substantial
> improvement. What's needed to make this prototype committable?

Mainly some testing infrastructure that can trigger this kind of stream. The
logic is too finicky for me to commit it without that.

> I assume this is PG19+ improvement, right? It probably affects PG18 too,
> but it's harder to hit / the impact is not as bad as on PG19.

Yea. It does apply to 18 too, but I can't come up with realistic scenarios
where it's a real issue. I can repro a slowdown when using many parallel
seqscans with debug_io_direct=data - but that's even slower in 17...

> On a related note, my test that generates random datasets / queries, and
> compares index prefetching with different io_method values found a
> pretty massive difference between worker and io_uring. I wonder if this
> might be some issue in io_method=worker.

> while with index prefetching (with the aio prototype patch), it looks
> like this:
>
>                               QUERY PLAN
> ----------------------------------------------------------------------
>  Index Scan using idx on t (actual rows=9048576.00 loops=1)
>    Index Cond: ((a >= 16150) AND (a <= 4540437))
>    Index Searches: 1
>    Prefetch Distance: 2.032
>    Prefetch Count: 868165
>    Prefetch Stalls: 2140228
>    Prefetch Skips: 6039906
>    Prefetch Resets: 0
>    Stream Ungets: 0
>    Stream Forwarded: 4
>    Prefetch Histogram: [2,4) => 855753, [4,8) => 12412
>    Buffers: shared hit=2577599 read=455610
>  Planning:
>    Buffers: shared hit=78 read=26 dirtied=1
>  Planning Time: 1.032 ms
>  Execution Time: 3150.578 ms
> (16 rows)
>
> So it's about 2x slower. The prefetch distance collapses, because
> there's a lot of cache hits (about 50% of requests seem to be hits of
> already visited blocks). I think that's a problem with how we adjust the
> distance, but I'll post about that separately.
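[Editor's note: the distance collapse described above can be illustrated with a toy model. This is NOT PostgreSQL's actual read_stream.c heuristic — just a hypothetical +1-on-miss / -1-on-hit adjustment scheme, sketched to show how a stream where cache hits outnumber misses keeps the lookahead distance pinned near its floor instead of ramping up.]

```python
# Toy model of a prefetch-distance heuristic (an illustrative
# assumption, not PostgreSQL's real logic): the distance grows by one
# on each cache miss (an I/O was needed, so read further ahead) and
# shrinks by one on each cache hit (the data was already resident).
import random

def simulate(hit_ratio, n=100_000, max_distance=256, seed=1):
    rng = random.Random(seed)
    distance = 1
    total = 0
    for _ in range(n):
        if rng.random() < hit_ratio:
            distance = max(1, distance - 1)             # hit: back off
        else:
            distance = min(max_distance, distance + 1)  # miss: ramp up
        total += distance
    return total / n  # average lookahead distance over the scan

# When misses dominate, the distance climbs toward the cap; once hits
# outnumber misses, it hovers near its minimum and never ramps up.
print(f"miss-heavy (30% hits): avg distance {simulate(0.3):.1f}")
print(f"hit-heavy  (60% hits): avg distance {simulate(0.6):.1f}")
```

Under this toy scheme, a scan where roughly half the blocks are already cached behaves like the EXPLAIN output above: the average distance stays in the low single digits, so almost no I/O is actually issued ahead of the consumer.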
> Let's try to simply set io_method=io_uring:
>
>                               QUERY PLAN
> ----------------------------------------------------------------------
>  Index Scan using idx on t (actual rows=9048576.00 loops=1)
>    Index Cond: ((a >= 16150) AND (a <= 4540437))
>    Index Searches: 1
>    Prefetch Distance: 2.032
>    Prefetch Count: 868165
>    Prefetch Stalls: 2140228
>    Prefetch Skips: 6039906
>    Prefetch Resets: 0
>    Stream Ungets: 0
>    Stream Forwarded: 4
>    Prefetch Histogram: [2,4) => 855753, [4,8) => 12412
>    Buffers: shared hit=2577599 read=455610
>  Planning:
>    Buffers: shared hit=78 read=26
>  Planning Time: 2.212 ms
>  Execution Time: 1837.615 ms
> (16 rows)
>
> That's much closer to master (and the difference could be mostly noise).
>
> I'm not sure what's causing this, but almost all regressions my script
> is finding look like this - always io_method=worker, with distance close
> to 2.0. Is this some inherent io_method=worker overhead?

I think what you are observing might be the inherent IPC / latency overhead
of the worker-based approach. This is particularly pronounced if the workers
are idle (and the CPU they get scheduled on is clocked down). The latency
impact of that is small, but if you never actually get to do much readahead
it can be visible.

Greetings,

Andres Freund
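[Editor's note: the IPC cost described above can be demonstrated with a hedged sketch — a Python toy, not PostgreSQL's I/O worker code. It compares a direct call against a round trip through a queue serviced by a worker thread; the names `direct` and `via_worker` are illustrative inventions. With a prefetch distance near 2 there is almost no overlap between requests, so each read pays something like this per-handoff round-trip cost.]

```python
# Toy comparison of doing work inline vs. handing each request to a
# worker thread and waiting for its reply, one request in flight at a
# time (mimicking a collapsed prefetch distance with no overlap).
import queue
import threading
import time

def direct(n):
    """Issue n no-op 'requests' inline; measures only dispatch cost."""
    t0 = time.perf_counter()
    for _ in range(n):
        pass  # the "I/O" itself is free; we measure overhead only
    return time.perf_counter() - t0

def via_worker(n):
    """Issue n requests through a worker thread, waiting for each."""
    requests, replies = queue.Queue(), queue.Queue()

    def worker():
        while True:
            item = requests.get()
            if item is None:  # sentinel: shut down
                return
            replies.put(item)  # pretend the I/O completed

    t = threading.Thread(target=worker)
    t.start()
    t0 = time.perf_counter()
    for i in range(n):
        requests.put(i)
        replies.get()  # block until the worker answers: no readahead
    elapsed = time.perf_counter() - t0
    requests.put(None)
    t.join()
    return elapsed

n = 10_000
print(f"inline:  {direct(n):.4f}s")
print(f"worker:  {via_worker(n):.4f}s")
```

The worker path is orders of magnitude slower per request purely from the queue handoff and wakeup, even though the "I/O" does nothing — the same shape of fixed cost that io_uring avoids by submitting from the issuing backend.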