Re: Adding basic NUMA awareness - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Adding basic NUMA awareness
Date
Msg-id 06c7e398-3336-49f2-a011-cc89ee28b9c0@vondra.me
Whole thread Raw
In response to Re: Adding basic NUMA awareness  (Andres Freund <andres@anarazel.de>)
Responses Re: Adding basic NUMA awareness
List pgsql-hackers

On 1/13/26 01:10, Andres Freund wrote:
> Hi,
> 
> On 2026-01-13 00:58:49 +0100, Tomas Vondra wrote:
>> On 1/10/26 02:42, Andres Freund wrote:
>>> psql -Xq -c 'SELECT pg_buffercache_evict_all();' -c 'SELECT numa_node, sum(size) FROM pg_shmem_allocations_numa
GROUPBY 1;' && perf stat --per-socket  -M memory_bandwidth_read,memory_bandwidth_write -a psql -c 'SELECT sum(abalance)
FROMpgbench_accounts;'
 
> 
>> And then I initialized pgbench with scale that is much larger than
>> shared buffers, but fits into RAM. So cached, but definitely > NB/4. And
>> then I ran
>>
>>   select * from pgbench_accounts offset 1000000000;
>>
>> which does a sequential scan with the circular buffer you mention abobe
> 
> Did you try it with the query I suggested? One plausible reason why you did
> not see an effect with your query is that with a huge offset you actually
> never deform the tuple, which is an important and rather latency sensitive
> path.
> 

I did try with the agg query too, and there's still no difference on
either machine.


I can't do the perf on the Azure VM, because the Ubuntu is image is
borked and does not allow installing the package. But on my xeon I can
do the perf, and that gives me this:

numactl --membind=0 --cpunodebind=0 ~/builds/master-test/bin/pg_ctl
-----------------------------------------------------------------------
S0 1     24,677,226    UNC_M_CAS_COUNT.WR #    79.0 MB/s ... idth_write
S0 1 20,001,829,522 ns duration_time                     ...
S0 1    972,631,426    UNC_M_CAS_COUNT.RD #  3112.2 MB/s ... idth_read
S0 1 20,001,822,807 ns duration_time                     ...
S1 1     15,602,233    UNC_M_CAS_COUNT.WR #    49.9 MB/s ... idth_write
S1 1    712,431,146    UNC_M_CAS_COUNT.RD #  2279.6 MB/s ... idth_read


numactl --membind=0 --cpunodebind=1 ~/builds/master-test/bin/pg_ctl
-----------------------------------------------------------------------
S0 1     47,931,019    UNC_M_CAS_COUNT.WR #    153.4 MB/s ... idth_write
S0 1 20,002,933,380 ns duration_time                      ...
S0 1  1,007,386,994    UNC_M_CAS_COUNT.RD #   3223.2 MB/s ... idth_read
S0 1 20,002,927,341 ns duration_time                      ...
S1 1     10,310,201    UNC_M_CAS_COUNT.WR #     33.0 MB/s ... idth_write
S1 1    714,826,668    UNC_M_CAS_COUNT.RD #   2287.2 MB/s ... idth_read

so there is a little bit of a difference for some stats, but not much.


FWIW this is from

perf stat --per-socket  -M memory_bandwidth_read,memory_bandwidth_write
-a -- sleep 20

while the agg query runs in a loop.


cheers
-- 
Tomas Vondra




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Buffer locking is special (hints, checksums, AIO writes)
Next
From: Andres Freund
Date:
Subject: Re: Adding basic NUMA awareness