Thread: Re: strange perf regression with data checksums

Re: strange perf regression with data checksums

From: Aleksander Alekseev
Hi Tomas,

> While running some benchmarks comparing 17 and 18, I ran into a simple
> workload where 18 throughput drops by ~80%. After pulling my hair out
> for a couple of hours I realized the change that triggered this is
> 04bec894a04c, which set checksums on by default. That's very bizarre,
> because the workload is read-only and fits into shared buffers.
>
> [...]
>
> But why would it depend on checksums at all? This read-only test should
> be entirely in-memory, so how come it's affected?

These are interesting results.

Just wanted to clarify: did you make sure that all the hint bits were
set before executing the benchmark?

I'm not claiming that hint bits are necessarily the reason for the
observed behavior, but when something is off with presumably read-only
queries, this is the first explanation that comes to mind. At the very
least, we should make sure hint bits are excluded from the equation. If
memory serves, a VACUUM FULL and CHECKPOINT after filling the table and
creating the index should do the trick.
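
For reference, a minimal sketch of that preparation step (the database
name "bench" and table name "t" are placeholders for whatever the
benchmark actually uses):

```
# set the hint bits and flush everything to disk before the
# read-only benchmark runs
psql -d bench -c 'VACUUM (FULL, ANALYZE) t' -c 'CHECKPOINT'
```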

-- 
Best regards,
Aleksander Alekseev



Re: strange perf regression with data checksums

From: Tomas Vondra
On 5/9/25 14:53, Aleksander Alekseev wrote:
> Hi Tomas,
> 
>> While running some benchmarks comparing 17 and 18, I ran into a simple
>> workload where 18 throughput drops by ~80%. After pulling my hair out
>> for a couple of hours I realized the change that triggered this is
>> 04bec894a04c, which set checksums on by default. That's very bizarre,
>> because the workload is read-only and fits into shared buffers.
>>
>> [...]
>>
>> But why would it depend on checksums at all? This read-only test should
>> be entirely in-memory, so how come it's affected?
> 
> These are interesting results.
> 
> Just wanted to clarify: did you make sure that all the hint bits were
> set before executing the benchmark?
> 
> I'm not claiming that hint bits are necessarily the reason for the
> observed behavior, but when something is off with presumably read-only
> queries, this is the first explanation that comes to mind. At the very
> least, we should make sure hint bits are excluded from the equation. If
> memory serves, a VACUUM FULL and CHECKPOINT after filling the table and
> creating the index should do the trick.
> 

Good question. I haven't checked that explicitly, but it's a tiny data
set (15MB) and I observed this even on long benchmarks with tens of
millions of queries. So the hint bits should have been set.

Also, I should have mentioned the query does an index-only scan, and the
pin/unpin calls are on index pages, not on the heap.
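
FWIW, one way to double-check that the heap really is out of the
picture is the pg_visibility contrib module; if all pages are marked
all-visible, the index-only scan should not touch the heap at all (the
table name "t" below is a placeholder):

```
# all_visible should match the relation's page count
psql -c "CREATE EXTENSION IF NOT EXISTS pg_visibility" \
     -c "SELECT * FROM pg_visibility_map_summary('t')"
```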


regards

-- 
Tomas Vondra




Re: strange perf regression with data checksums

From: Aleksander Alekseev
Hi,

> > I'm not claiming that hint bits are necessarily the reason for the
> > observed behavior, but when something is off with presumably read-only
> > queries, this is the first explanation that comes to mind. At the very
> > least, we should make sure hint bits are excluded from the equation. If
> > memory serves, a VACUUM FULL and CHECKPOINT after filling the table and
> > creating the index should do the trick.
>
> Good question. I haven't checked that explicitly, but it's a tiny data
> set (15MB) and I observed this even on long benchmarks with tens of
> millions of queries. So the hint bits should have been set.
>
> Also, I should have mentioned the query does an index-only scan, and the
> pin/unpin calls are on index pages, not on the heap.

There is one more thing I would check. If I recall correctly, perf
shows only on-CPU time, while the backends may actually be sleeping on
locks most of the time. If that's the case, perf will not give you an
accurate picture.
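
A cheap first check from within Postgres is to sample the wait events
of the active backends while the benchmark runs (just a sketch; adjust
the connection options to your setup):

```
# if wait_event is mostly non-NULL, the backends are waiting,
# not burning CPU
psql -c "SELECT wait_event_type, wait_event, count(*)
         FROM pg_stat_activity
         WHERE state = 'active'
         GROUP BY 1, 2
         ORDER BY 3 DESC"
```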

To dig deeper, I create a gdb.script file with a single GDB command:

```
bt
```

And execute:

```
gdb --batch --command=gdb.script -p (backend_pid_here)
```

... 10 times or so. If what you are observing is actually lock
contention and the backend sleeps on a lock most of the time, roughly
8 out of 10 stack traces will show it.
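
A small shell loop makes the sampling less tedious; here PID stands
for the backend process id, and the function names in the grep are
just examples of what a lock wait might look like in the stacks:

```
# take ~10 stack samples, one second apart, then eyeball the summary
for i in $(seq 1 10); do
    gdb --batch -ex bt -p "$PID"
    sleep 1
done > stacks.txt
grep -E 'LWLockAcquire|PGSemaphoreLock' stacks.txt | sort | uniq -c
```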

I assume, of course, that the benchmark was run on a release build
with asserts disabled, etc.

BTW, do you believe this problem is exclusive to NUMA machines with
90+ cores, or can I reproduce it on an SMT system as well?

-- 
Best regards,
Aleksander Alekseev