Hello,
As a very simple exploration of the possible gains from batching redo
records during replay, I tried to avoid acquiring and releasing
buffer pins and locks while replaying records that touch the same
page as the previous record. The attached experiment-grade patch
works by trying to give a locked buffer to the next redo handler,
which then releases it if it wants a different buffer. Crash recovery
on my dev machine went from 62s to 34s (1.8x speedup) for:
  create table t (i int, foo text);
  insert into t select generate_series(1, 50000000),
    'the quick brown fox jumped over the lazy dog';
  delete from t;
Of course that workload was contrived to produce a suitable WAL
history for this demo. The patch doesn't help with the more common
real-world histories involving (non-HOT) UPDATEs, indexes, etc.,
because then you get various kinds of interleaving that defeat this
simple-minded optimisation. To get a more general improvement, it
seems that we'd need a smarter redo loop that could figure out what
can safely be reordered to maximise the page-level batching and
locality effects. I haven't studied the complications of reordering
yet, and I'm not working on that for PostgreSQL 14, but I wanted to
see if others have thoughts about it. The WAL prefetching patch that
I am planning to get into 14 opens up these possibilities by decoding
many records into a circular WAL decode buffer, so you can see a whole
chain of them at once.
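
To make the hand-off idea and its weakness concrete, here is a toy model
in C. It is not taken from the attached patch: PageId, NO_PAGE and
replay_with_handoff are hypothetical names, a page is reduced to a
non-negative integer, and "acquire" stands in for the whole pin + lock
sequence. It only counts how many acquisitions a given history of page
references would need when the held page is passed to the next record.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; real redo code works with Buffer. */
typedef int PageId;
#define NO_PAGE (-1)    /* assumes real page ids are non-negative */

/*
 * Walk a history of page references, one per WAL record, keeping the
 * most recently used page "held" and offering it to the next record.
 * Returns the number of pin + lock acquisitions that were needed.
 */
static size_t
replay_with_handoff(const PageId *pages, size_t nrecords)
{
    PageId  held = NO_PAGE;
    size_t  acquisitions = 0;

    assert(pages != NULL || nrecords == 0);

    for (size_t i = 0; i < nrecords; i++)
    {
        if (held != pages[i])
        {
            /* Next record wants a different page: release and re-acquire. */
            held = pages[i];
            acquisitions++;
        }
        /* ... the record would be applied to the held page here ... */
    }

    return acquisitions;
}
```

In this model a history such as {1, 1, 1, 1, 2, 2, 2, 2} needs 2
acquisitions instead of 8, which is the kind of case the contrived
workload produces. A history that alternates between a heap page and
an index page, such as {1, 100, 1, 100, 2, 101, 2, 101}, still needs
all 8, which is why some form of reordering would be needed to get a
more general win.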