Home > mailing lists

Re: Prefetch the next tuple's memory during seqscans - Mailing list pgsql-hackers

From	sirisha chamarthi
Subject	Re: Prefetch the next tuple's memory during seqscans
Date	November 23, 2022 10:29:47
Msg-id	CAKrAKeXEEf8o9Y8ZXg3-f62++y1zW3KpjHVym=HiGLvum5NT1A@mail.gmail.com Whole thread Raw
In response to	Re: Prefetch the next tuple's memory during seqscans (David Rowley <dgrowleyml@gmail.com>)
Responses	Re: Prefetch the next tuple's memory during seqscans
List	pgsql-hackers

Tree view

On Tue, Nov 22, 2022 at 1:58 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 3 Nov 2022 at 06:25, Andres Freund <andres@anarazel.de> wrote:
> Attached is an experimental patch/hack for that. It ended up being more
> beneficial to make the access ordering more optimal than prefetching the tuple
> contents, but I'm not at all sure that's the be-all-end-all.

Thanks for writing that patch. I've been experimenting with it.

I tried unrolling the loop (patch 0003) as you mentioned in:

+ * FIXME: Worth unrolling so that we don't fetch the same cacheline
+ * over and over, due to line items being smaller than a cacheline?

but didn't see any gains from doing that.

I also adjusted your patch a little so that instead of doing:

- OffsetNumber rs_vistuples[MaxHeapTuplesPerPage]; /* their offsets */
+ OffsetNumber *rs_vistuples;
+ OffsetNumber rs_vistuples_d[MaxHeapTuplesPerPage]; /* their offsets */

to work around the issue of having to populate rs_vistuples_d in
reverse, I added a new field called rs_startindex to mark where the
first element in the rs_vistuples array is. The way you wrote it seems
to require fewer code changes, but per the FIXME comment you left, I
get the idea you just did it the way you did to make it work enough
for testing.

I'm quite keen to move forward in committing the 0001 patch to add the
pg_prefetch_mem macro. What I'm a little undecided about is what the
best patch is to commit first to make use of the new macro.

I did some tests on the attached set of patches:

alter system set max_parallel_workers_per_gather = 0;
select pg_reload_conf();

create table t as select a from generate_series(1,10000000)a;
alter table t set (autovacuum_enabled=false);

$ cat bench.sql
select * from t where a = 0;

psql -c "select pg_prewarm('t');" postgres

-- Test 1 no frozen tuples in "t"

Master (@9c6ad5eaa):
$ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 383.332 ms
latency average = 375.747 ms
latency average = 376.090 ms

Master + 0001 + 0002:
$ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 370.133 ms
latency average = 370.149 ms
latency average = 370.157 ms

Master + 0001 + 0005:
$ pgbench -n -f bench.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 372.662 ms
latency average = 371.034 ms
latency average = 372.709 ms

-- Test 2 "select count(*) from t" with all tuples frozen

$ cat bench1.sql
select count(*) from t;

psql -c "vacuum freeze t;" postgres
psql -c "select pg_prewarm('t');" postgres

Master (@9c6ad5eaa):
$ pgbench -n -f bench1.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 406.238 ms
latency average = 407.029 ms
latency average = 406.962 ms

Master + 0001 + 0005:
$ pgbench -n -f bench1.sql -M prepared -T 10 postgres | grep -E "^latency"
latency average = 345.470 ms
latency average = 345.775 ms
latency average = 345.354 ms

My current thoughts are that it might be best to go with 0005 to start
with. I know Melanie is working on making some changes in this area,
so perhaps it's best to leave 0002 until that work is complete.

I ran your test1 exactly like your setup except the row count is 3000000 (with 13275 blocks). Shared_buffers is 128MB and the hardware configuration details at the bottom of the mail. It appears Master + 0001 + 0005 regressed compared to master slightly .

Master (@56d0ed3b756b2e3799a7bbc0ac89bc7657ca2c33)

Before vacuum:
/usr/local/pgsql/bin/pgbench -n -f bench.sql -M prepared -T 30 -P 10 postgres | grep -E "^latency"
latency average = 430.287 ms

After Vacuum:
/usr/local/pgsql/bin/pgbench -n -f bench.sql -M prepared -T 30 -P 10 postgres | grep -E "^latency"
latency average = 369.046 ms

Master + 0001 + 0002:

Before vacuum:
/usr/local/pgsql/bin/pgbench -n -f bench.sql -M prepared -T 30 -P 10 postgres | grep -E "^latency"
latency average = 427.983 ms

After Vacuum:
/usr/local/pgsql/bin/pgbench -n -f bench.sql -M prepared -T 30 -P 10 postgres | grep -E "^latency"
latency average = 367.185 ms

Master + 0001 + 0005:

Before vacuum:
/usr/local/pgsql/bin/pgbench -n -f bench.sql -M prepared -T 30 -P 10 postgres | grep -E "^latency"
latency average = 447.045 ms

After Vacuum:
/usr/local/pgsql/bin/pgbench -n -f bench.sql -M prepared -T 30 -P 10 postgres | grep -E "^latency"
latency average = 374.484 ms

lscpu output

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Stepping: 2
CPU MHz: 2397.224
BogoMIPS: 4794.44
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 32 KiB
L1i cache: 32 KiB
L2 cache: 256 KiB
L3 cache: 30 MiB
NUMA node0 CPU(s): 0
Vulnerability Itlb multihit: KVM: Mitigation: VMX unsupported
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni
pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt m
d_clear

David

pgsql-hackers by date:

From: Justin Pryzby
Date: 23 November 2022, 09:36:54
Subject: Re: Documentation for building with meson

From: David Rowley
Date: 23 November 2022, 10:44:04
Subject: Re: Prefetch the next tuple's memory during seqscans

Re: Prefetch the next tuple's memory during seqscans - Mailing list pgsql-hackers

Previous

Next