Re: index prefetching - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | CA+hUKGKMaZLmNQHaa_DZMw9MJJKGegjrqnTY3KOZB-_nvFa3wQ@mail.gmail.com Whole thread Raw |
In response to | Re: index prefetching (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: index prefetching
Re: index prefetching |
List | pgsql-hackers |
On Tue, Aug 12, 2025 at 11:42 AM Peter Geoghegan <pg@bowt.ie> wrote: > On Mon, Aug 11, 2025 at 5:07 PM Tomas Vondra <tomas@vondra.me> wrote: > > I can do some tests with forward vs. backwards scans. Of course, the > > trouble with finding these weird cases is that they may be fairly rare. > > So hitting them is a matter or luck or just happening to generate the > > right data / query. But I'll give it a try and we'll see. > > I was talking more about finding "performance bugs" through a > semi-directed process of trying random things while looking out for > discrepancies. Something like that shouldn't require the usual > "benchmarking rigor", since suspicious inconsistencies should be > fairly obvious once encountered. I expect similar queries to have > similar performance, regardless of superficial differences such as > scan direction, DESC vs ASC column order, etc. I'd be interested to hear more about reverse scans. Bilal was speculating about backwards I/O combining in read_stream.c a while back, but we didn't have anything interesting to use it yet. You'll probably see a flood of uncombined 8KB IOs in the pg_aios view while travelling up the heap with cache misses today. I suspect Linux does reverse sequential prefetching with buffered I/O (less sure about other OSes) which should help but we'd still have more overheads than we could if we combined them, not to mention direct I/O. Not tested, but something like this might do it: /* Can we merge it with the pending read? */ - if (stream->pending_read_nblocks > 0 && - stream->pending_read_blocknum + stream->pending_read_nblocks == blocknum) + if (stream->pending_read_nblocks > 0) { - stream->pending_read_nblocks++; - continue; + if (stream->pending_read_blocknum + stream->pending_read_nblocks == + blocknum) + { + stream->pending_read_nblocks++; + continue; + } + else if (stream->pending_read_blocknum == blocknum + 1 && + stream->forwarded_buffers == 0) + { + stream->pending_read_blocknum--; + stream->pending_read_nblocks++; + continue; + } } > I tested this issue again (using my original pgbench_account query), > having rebased on top of HEAD as of today. I found that the > inconsistency seems to be much smaller now -- so much so that I don't > think that the remaining inconsistency is particularly suspicious. > > I also think that performance might have improved across the board. I > see that the same TPC-C query that took 768.454 ms a few weeks back > now takes only 617.408 ms. Also, while I originally saw "I/O Timings: > shared read=138.856" with this query, I now see "I/O Timings: shared > read=46.745". That feels like a performance bug fix to me. > > I wonder if today's commit b4212231 from Thomas ("Fix rare bug in > read_stream.c's split IO handling") fixed the issue, without anyone > realizing that the bug in question could manifest like this. I can't explain that. If you can consistently reproduce the change at the two base commits, maybe bisect? If it's a real phenomenon I'm definitely curious to know what you're seeing.
pgsql-hackers by date: