Re: index prefetching - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | 51b5f71b-5f19-4453-91ff-2b9f2a840c58@vondra.me Whole thread Raw |
In response to | Re: index prefetching ("Peter Geoghegan" <pg@bowt.ie>) |
Responses |
Re: index prefetching
|
List | pgsql-hackers |
On 8/12/25 23:22, Peter Geoghegan wrote: > ... > > It looks like the patch does significantly better with the forwards scan, > compared to the backwards scan (though both are improved by a lot). But that's > not the main thing about these results that I find interesting. > > The really odd thing is that we get "shared hit=6619 read=49933" for the > forwards scan, and "shared hit=10350 read=49933" for the backwards scan. The > latter matches master (regardless of the scan direction used on master), while > the former just looks wrong. What explains the "missing buffer hits" seen with > the forwards scan? > > Discrepancies > ------------- > > All 4 query executions agree that "rows=1048576.00", so the patch doesn't appear > to simply be broken/giving wrong answers. Might it be that the "Buffers" > instrumentation is broken? > I think a bug in the prefetch patch is more likely. I tried with a patch that adds various prefetch-related counters to explain, and I see this: test=# EXPLAIN (ANALYZE, VERBOSE, COSTS OFF) SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a; QUERY PLAN ------------------------------------------------------------------------ Index Scan using idx on public.t (actual time=0.682..527.055 rows=1048576.00 loops=1) Output: a, b Index Cond: ((t.a >= 16336) AND (t.a <= 49103)) Index Searches: 1 Prefetch Distance: 271.263 Prefetch Count: 60888 Prefetch Stalls: 1 Prefetch Skips: 991211 Prefetch Resets: 3 Prefetch Histogram: [2,4) => 2, [4,8) => 8, [8,16) => 17, [16,32) => 24, [32,64) => 34, [64,128) => 52, [128,256) => 82, [256,512) => 60669 Buffers: shared hit=5027 read=50872 I/O Timings: shared read=33.528 Planning: Buffers: shared hit=78 read=23 I/O Timings: shared read=2.349 Planning Time: 3.686 ms Execution Time: 559.659 ms (17 rows) test=# EXPLAIN (ANALYZE, VERBOSE, COSTS OFF) SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a DESC; QUERY PLAN ------------------------------------------------------------------------ Index Scan Backward using idx on public.t (actual time=1.110..4116.201 rows=1048576.00 loops=1) Output: a, b Index Cond: ((t.a >= 16336) AND (t.a <= 49103)) Index Searches: 1 Prefetch Distance: 271.061 Prefetch Count: 118806 Prefetch Stalls: 1 Prefetch Skips: 962515 Prefetch Resets: 3 Prefetch Histogram: [2,4) => 2, [4,8) => 7, [8,16) => 12, [16,32) => 17, [32,64) => 24, [64,128) => 3, [128,256) => 4, [256,512) => 118737 Buffers: shared hit=30024 read=50872 I/O Timings: shared read=581.353 Planning: Buffers: shared hit=82 read=23 I/O Timings: shared read=3.168 Planning Time: 4.289 ms Execution Time: 4185.407 ms (17 rows) These two parts are interesting: Prefetch Count: 60888 Prefetch Skips: 991211 Prefetch Count: 118806 Prefetch Skips: 962515 It looks like the backwards scan skips fewer blocks. This is based on the lastBlock optimization, i.e. looking for runs of the same block number. I don't quite see why would it affect just the backwards scan, though. Seems weird. > The premise of my original complaint was that big inconsistencies in performance > shouldn't happen between similar forwards and backwards scans (at least not with > direct I/O). I now have serious doubts about that premise, since it looks like > OS readahead remains a big factor with direct I/O. Did I just miss something > obvious? > I don't think you missed anything. It does seem the assumption relies on the OS handling the underlying I/O patterns equally, and unfortunately that does not seem to be the case. Maybe we could "invert" the data set, i.e. make it "descending" instead of "ascending"? That would make the heap access direction "forward" again ... regards -- Tomas Vondra
pgsql-hackers by date: