On 7/11/25 23:03, Tomas Vondra wrote:
> ...
>
> e) indexscan regression (ryzen-indexscan-uniform-pg17-checksums.png)
>
> There's an interesting difference I noticed in the run with
> checksums on PG17. The full PDF is available here:
>
> https://github.com/tvondra/iomethod-tests/blob/run2-17-checksums-on/ryzen-rows-cold-32GB-16-unscaled.pdf
>
> The interesting thing is that PG17 indexscans on uniform dataset got a
> little bit faster. In the attached PDF it's exactly on par with PG18,
> but here it got a bit faster. Which makes no sense, if it has to also
> verify checksums. I haven't had time to investigate this yet.

I was intrigued by this, so I looked into it today.

TL;DR I believe it was caused by something in the filesystem or even the
storage devices, making the "PG17" data directory (or maybe even just
the "uniform" table) a bit faster.

I started by reproducing the behavior with an indexscan matching 10% of
the rows, and it was very easy to reproduce the difference shown on the
chart (all timings in milliseconds):

PG17: 14112.800 ms
PG18: 21612.090 ms
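
To put the gap in relative terms, here is a minimal sketch of the
arithmetic on the two timings above. The helper name relative_gap is
hypothetical, used only for illustration:

```python
def relative_gap(fast_ms: float, slow_ms: float) -> tuple[float, float]:
    """Return (slowdown factor, percent extra time) of the slow run vs the fast one."""
    factor = slow_ms / fast_ms
    return factor, (factor - 1.0) * 100.0


# Timings reported above, in milliseconds.
pg17_ms = 14112.800
pg18_ms = 21612.090

factor, extra_pct = relative_gap(pg17_ms, pg18_ms)
print(f"PG18 took {factor:.2f}x as long as PG17 ({extra_pct:.0f}% more time)")
```

So the slow instance needed roughly 1.5x the time of the fast one for
the same query.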

This was perfectly reproducible, affecting the whole table (not just one
part of it), etc. At some point I recalled that I might have initialized
the databases in slightly different ways - one by running the SQL, the
other one by pg_dump/pg_restore (likely with multiple jobs).

I couldn't think of any other difference between the data directories,
so I simply reloaded them by pg_restore (from the same dump). Which
however made them both slow :O

And it didn't matter how many jobs were used, or anything else I tried.
But every now and then an instance (17 or 18) happened to be fast
(~14000 ms). Consistently, for all queries on the table, not randomly.

In the end I recreated the (ext4) filesystem, loaded the databases and
now both instances are fast. I have no idea what the root cause was, and
I assume recreating the filesystem destroyed all the evidence.

I'll rerun the tests - will take a couple days. I don't think it's
likely to change the conclusions, though. It should only affect how PG17
compares to PG18, not how the io_methods compare to each other. Also, I
don't think the "xeon" results are affected.

regards

--
Tomas Vondra