There was very interesting presentation at pgconf about pg_prefaulter:
http://www.pgcon.org/2018/schedule/events/1204.en.html
But it is implemented in GO and using pg_waldump.
I tried to do the same but using built-on Postgres WAL traverse functions.
I have implemented it as extension for simplicity of integration.
In principle it can be started as BG worker.
First of all I tried to estimate effect of preloading data.
I have implemented prefetch utility with is also attached to this mail.
It performs random reads of blocks of some large file and spawns some
number of prefetch threads:
Just normal read without prefetch:
./prefetch -n 0 SOME_BIG_FILE
One prefetch thread which uses pread:
./prefetch SOME_BIG_FILE
One prefetch thread which uses posix_fadvise:
./prefetch -f SOME_BIG_FILE
4 prefetch thread which uses posix_fadvise:
./prefetch -f -n 4 SOME_BIG_FILE
Based on this experiments (on my desktop), I made the following conclusions:
1. Prefetch at HDD doesn't give any positive effect.
2. Using posix_fadvise allows to speed-up random read speed at SSD up to
2 times.
3. posix_fadvise(WILLNEED) is more efficient than performing normal reads.
4. Calling posix_fadvise in more than one thread has no sense.
I have tested wal_prefetch at two powerful servers with 24 cores, 3Tb
NVME RAID 10 storage device and 256Gb of RAM connected using InfiniBand.
The speed of synchronous replication between two nodes is increased from
56k TPS to 60k TPS (on pgbench with scale 1000).
Usage:
1. At master: create extension wal_prefetch
2. At replica: Call pg_wal_prefetch() function: it will not return until
you interrupt it.
pg_wal_prefetch function will infinitely traverse WAL and prefetch block
references in WAL records
using posix_fadvise(WILLNEED) system call.
It is possible to explicitly specify start LSN for pg_wal_prefetch()
function. Otherwise, WAL redo position will be used as start LSN.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company