Thread: Re: strange perf regression with data checksums
Hi Tomas,

> While running some benchmarks comparing 17 and 18, I ran into a simple
> workload where 18 throughput drops by ~80%. After pulling my hair for a
> couple hours I realized the change that triggered this is 04bec894a04c,
> which set checksums on by default. Which is very bizarre, because the
> workload is read-only and fits into shared buffers.
>
> [...]
>
> But why would it depend on checksums at all? This read-only test should
> be entirely in-memory, so how come it's affected?

These are interesting results.

Just wanted to clarify: did you make sure that all the hint bits were set before executing the benchmark?

I'm not claiming that hint bits are necessarily the reason for the observed behavior, but when something is off with presumably read-only queries this is the first reason that comes to mind. At least we should make sure hint bits are excluded from the equation. If memory serves, VACUUM FULL and CHECKPOINT after filling the table and creating the index should do the trick.

--
Best regards,
Aleksander Alekseev
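[For concreteness, a minimal sketch of that kind of preparation, assuming a pgbench-style setup; the database name, table, and scale below are placeholders, not the actual benchmark from the report:]

```
# Hypothetical prep (names and scale are placeholders): fill the table,
# create the index, then run VACUUM FULL and CHECKPOINT as suggested
# above, so hint bits can be ruled out before the read-only run.
pgbench -i -s 10 bench
psql bench -c "VACUUM FULL pgbench_accounts;"
psql bench -c "CHECKPOINT;"
```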
On 5/9/25 14:53, Aleksander Alekseev wrote:
> Hi Tomas,
>
>> While running some benchmarks comparing 17 and 18, I ran into a simple
>> workload where 18 throughput drops by ~80%. After pulling my hair for a
>> couple hours I realized the change that triggered this is 04bec894a04c,
>> which set checksums on by default. Which is very bizarre, because the
>> workload is read-only and fits into shared buffers.
>>
>> [...]
>>
>> But why would it depend on checksums at all? This read-only test should
>> be entirely in-memory, so how come it's affected?
>
> These are interesting results.
>
> Just wanted to clarify: did you make sure that all the hint bits were
> set before executing the benchmark?
>
> I'm not claiming that hint bits are necessarily the reason for the
> observed behavior but when something is off with presumably read-only
> queries this is the first reason that comes to mind. At least we
> should make sure hint bits are excluded from the equation. If memory
> serves, VACUUM FULL and CHECKPOINT after filling the table and
> creating the index should do the trick.
>

Good question. I haven't checked that explicitly, but it's a tiny data set (15MB) and I observed this even on long benchmarks with tens of millions of queries. So the hint bits should have been set.

Also, I should have mentioned the query does an index-only scan, and the pin/unpin calls are on index pages, not on the heap.

regards

--
Tomas Vondra
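[One way to double-check that the heap is not being touched at all is the Heap Fetches counter of the index-only scan node; a sketch using the placeholder names from above, since the real query and table are not shown in the thread:]

```
# Placeholder query; "Heap Fetches: 0" in the Index Only Scan node means
# no heap pages were visited, so heap hint bits cannot be a factor.
psql bench -c "EXPLAIN (ANALYZE, BUFFERS) SELECT aid FROM pgbench_accounts WHERE aid = 1;"
```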
Hi,

> > I'm not claiming that hint bits are necessarily the reason for the
> > observed behavior but when something is off with presumably read-only
> > queries this is the first reason that comes to mind. At least we
> > should make sure hint bits are excluded from the equation. If memory
> > serves, VACUUM FULL and CHECKPOINT after filling the table and
> > creating the index should do the trick.
>
> Good question. I haven't checked that explicitly, but it's a tiny data
> set (15MB) and I observed this even on long benchmarks with tens of
> millions of queries. So the hint bits should have been set.
>
> Also, I should have mentioned the query does an index-only scan, and the
> pin/unpin calls are on index pages, not on the heap.

There is one more thing I would check. As I recall, perf shows only on-CPU time, while the backends may actually be sleeping on locks most of the time. If this is the case, perf will not show you an accurate picture.

To check this, I personally create gdb.script with a single GDB command:

```
bt
```

And execute:

```
gdb --batch --command=gdb.script -p (backend_pid_here)
```

... 10+ times or so. If what you are observing is actually lock contention and the backend sleeps on a lock most of the time, roughly 8 out of 10 stacktraces will show it.

I assume, of course, that the benchmark is done on a release build with asserts disabled, etc.

BTW, do you believe this is a problem related exclusively to NUMA CPUs with 90+ cores, or can I reproduce it on SMT as well?

--
Best regards,
Aleksander Alekseev
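[A small loop along these lines makes the sampling less tedious; the backend PID and the one-second interval are placeholders:]

```
# Collect ~10 backtraces from a single backend. If most samples show the
# process waiting on a lock rather than running on-CPU, the regression is
# likely lock contention that perf's default (on-CPU) profile understates.
echo bt > gdb.script
for i in $(seq 1 10); do
    gdb --batch --command=gdb.script -p "$BACKEND_PID"
    sleep 1
done
```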