Re: WIP: WAL prefetch (another approach) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: WIP: WAL prefetch (another approach)
Msg-id c5d52837-6256-0556-ac8c-d6d3d558820a@enterprisedb.com
In response to Re: WIP: WAL prefetch (another approach)  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi,

I did a bunch of tests on v15, mostly to assess how much the
prefetching could help. The most interesting test I did was this:

1) primary instance on a box with 16/32 cores, 64GB RAM, NVMe SSD

2) replica on small box with 4 cores, 8GB RAM, SSD RAID

3) pause replication on the replica (pg_wal_replay_pause)

4) initialize pgbench scale 2000 (fits into RAM on primary, while on
replica it's about 4x RAM)

5) run 1h pgbench: pgbench -N -c 16 -j 4 -T 3600 test

6) resume replication (pg_wal_replay_resume)

7) measure how long it takes to catch up, monitor lag
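
A minimal sketch of how the lag in step 7 could be tracked, assuming the
replica is polled for pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn()
and the difference is computed client-side (the server-side equivalent is
pg_wal_lsn_diff()). This is an assumption about tooling, not necessarily how
the numbers in this test were collected; the helper names and the sample LSNs
are hypothetical.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers: the high 32 bits and
    the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(receive_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL received by the replica but not yet replayed."""
    return lsn_to_bytes(receive_lsn) - lsn_to_bytes(replay_lsn)


# Hypothetical sample values, not taken from the test described here.
lag = replay_lag_bytes("23/00000000", "0/00000000")
print(f"{lag / 2**30:.1f} GiB behind")
```

Sampling this periodically after pg_wal_replay_resume() gives the kind of
lag-over-time curve discussed below.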

This is a nicely reproducible test case, and it eliminates the influence
of network speed and so on.

Attached is a chart showing the lag with and without the prefetching. In
both cases we start with ~140GB of redo lag, and the chart shows how
quickly the replica applies that. The "waves" are checkpoints, where
right after a checkpoint the redo gets much faster thanks to FPIs and
then slows down as it gets to parts without them (having to do
synchronous random reads).

With master, it'd take ~16000 seconds to catch up. I don't have the
exact number, because I got tired of waiting, but the estimate is likely
accurate (judging by other tests and how regular the progress is).

With WAL prefetching enabled (I bumped up the buffer to 2MB and the
prefetch limit to 500, but that was a mostly arbitrary choice), it
finishes in ~3200 seconds. This includes replication of the pgbench
initialization, which took ~200 seconds and where prefetching is mostly
useless. That's a pretty damn good improvement, I guess!

In a way, this means the tiny replica would be able to keep up with a
much larger machine, where everything is in memory.
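
To put the figures above side by side, here is a rough back-of-the-envelope
calculation. It uses only the approximate numbers quoted in this message
(~140GB of lag, ~16000 s without prefetching, ~3200 s with it) and assumes
1 GB = 1024 MB; the function name is hypothetical. The results are averages
over the whole run, so they hide the checkpoint "waves".

```python
def catchup_summary(lag_gb: float, baseline_s: float, prefetch_s: float) -> dict:
    """Average replay rates and overall speedup for a catch-up run."""
    lag_mb = lag_gb * 1024  # assumption: GB here means 1024 MB
    return {
        "speedup": baseline_s / prefetch_s,
        "baseline_mb_per_s": lag_mb / baseline_s,
        "prefetch_mb_per_s": lag_mb / prefetch_s,
    }


# Approximate figures from the test; the baseline time is an estimate.
summary = catchup_summary(lag_gb=140, baseline_s=16000, prefetch_s=3200)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the speedup is about 5x, or roughly 9 MB/s versus 45 MB/s
of WAL replayed on average.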


One comment about the patch - the postgresql.conf.sample change says:

#recovery_prefetch = on      # whether to prefetch pages logged with FPW
#recovery_prefetch_fpw = off # whether to prefetch pages logged with FPW

but clearly that comment applies only to recovery_prefetch_fpw; the first
GUC enables prefetching in general.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
