On Wed, Oct 16, 2024 at 5:48 PM Matthias van de Meent
<boekewurm@gmail.com> wrote:
> In v17 and the master branch you'll note 16 buffer hits for the test
> query. However, when we use more expensive btree compare operations
> (e.g. by adding pg_usleep(1) to both btint8cmp and btint4cmp), the
> buffer access count starts to vary a lot and skyrockets to 30+ on my
> machine, in some cases reaching >100 buffer hits. After applying my
> patch, the buffer access count is capped to a much more agreeable
> 16-18 hits - it still shows signs of overshooting the serial bounds,
> but the number of buffers we overshoot our target is capped and thus
> significantly lower.
It's not exactly capped, though, since in any case you're always prone
to getting extra leaf page reads at the end of each primitive index
scan. That's not something that's new to Postgres 17.
Anyway, I'm still not convinced. Your test case requires adding a
one microsecond delay (pg_usleep(1)) to each ORDER proc comparison,
and so has an unrealistic adversarial character. It uses an
index-only scan that is drastically faster if we don't use a parallel
scan at all. The serial case is 0.046 ms for me, whereas the parallel
case is 3.094 ms (obviously that's without the added delay). You've
thrown everything but the kitchen sink at the issue, and yet the
impact on buffer hits really isn't too bad.
Does anybody else have an opinion on this?
--
Peter Geoghegan