Re: WIP: WAL prefetch (another approach) - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: WIP: WAL prefetch (another approach)
Msg-id CA+hUKGLr6nU7yb8CO7_DEppBmPCOkCUvE6HBza8himYV_s5nuQ@mail.gmail.com
In response to Re: WIP: WAL prefetch (another approach)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: WIP: WAL prefetch (another approach)
List pgsql-hackers
On Wed, Sep 2, 2020 at 2:18 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On Wed, Sep 02, 2020 at 02:05:10AM +1200, Thomas Munro wrote:
> >On Wed, Sep 2, 2020 at 1:14 AM Tomas Vondra
> ><tomas.vondra@2ndquadrant.com> wrote:
> >> from the archive
> >
> >Ahh, so perhaps that's the key.
>
> Maybe. For the record, the commands look like this:
>
> archive_command = 'gzip -1 -c %p > /mnt/raid/wal-archive/%f.gz'
>
> restore_command = 'gunzip -c /mnt/raid/wal-archive/%f.gz > %p.tmp && mv %p.tmp %p'

Yeah, sorry, I goofed here by not considering archive recovery
properly.  I have special handling for crash recovery from files in
pg_wal (XLRO_END, means read until you run out of files) and streaming
replication (XLRO_WALRCV_WRITTEN, means read only as far as the WAL
receiver has advertised as written in shared memory), as a way to
control the ultimate limit on how far ahead to read when
maintenance_io_concurrency and max_recovery_prefetch_distance don't
limit you first.  But if you recover from a base backup with a WAL
archive, it uses the XLRO_END policy which can run out of files just
because a new file hasn't been restored yet, so it gives up
prefetching too soon, as you're seeing.  That doesn't cause any
damage, but it stops doing anything useful because the prefetcher
thinks its job is finished.
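
To make the failure mode concrete, here is a toy model of the two limit
policies. It is only a sketch and not the patch's code: the names XLRO_END
and XLRO_WALRCV_WRITTEN come from the text above, while ToyPrefetcher,
advance() and the use of whole segment numbers instead of LSNs are
assumptions made for illustration.

```python
from enum import Enum, auto


class ReadLimit(Enum):
    # Policy names as used in the mail; the rest of this model is hypothetical.
    XLRO_END = auto()             # read until we run out of files
    XLRO_WALRCV_WRITTEN = auto()  # read only as far as the WAL receiver has written


class ToyPrefetcher:
    """Hypothetical read-ahead model that counts whole WAL segments."""

    def __init__(self, policy):
        self.policy = policy
        self.next_segment = 0
        self.finished = False

    def advance(self, segments_present=frozenset(), walrcv_written=0):
        """Read ahead as far as the policy allows; return the segments read."""
        if self.finished:
            return []
        read = []
        if self.policy is ReadLimit.XLRO_END:
            while self.next_segment in segments_present:
                read.append(self.next_segment)
                self.next_segment += 1
            # A missing file is taken to be the end of WAL, so stop for good.
            # In archive recovery the file may simply not be restored yet.
            self.finished = True
        else:
            # The limit moves as the WAL receiver advertises more data,
            # so the prefetcher can pick up again later.
            while self.next_segment < walrcv_written:
                read.append(self.next_segment)
                self.next_segment += 1
        return read
```

With XLRO_END, a first call that sees segments 0 and 1 reads both and then
marks itself finished, so a later call made after segment 2 has been restored
reads nothing. With XLRO_WALRCV_WRITTEN, the same later call picks up
segment 2.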

It'd be possible to fix this somehow in the two-XLogReader design, but
since I'm testing a new version that has a unified
XLogReader-with-read-ahead I'm not going to try to do that.  I've
added a basebackup-with-archive recovery to my arsenal of test
workloads to make sure I don't forget about archive recovery mode
again, but I think it's actually harder to get this wrong in the new
design.  In the meantime, if you are still interested in studying the
potential speed-up from WAL prefetching using the most recently shared
two-XLogReader patch, you'll need to unpack all your archived WAL
files into pg_wal manually beforehand.
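
A minimal sketch of that manual step, assuming the archive holds gzip'd
segments named %f.gz as in the archive_command quoted above. The function
name and both directory arguments are hypothetical, and it should only be
run while the server is stopped.

```python
import gzip
import shutil
from pathlib import Path


def unpack_wal_archive(archive_dir, pg_wal_dir):
    """Decompress every *.gz file in archive_dir into pg_wal_dir.

    Mirrors the quoted restore_command: write to a .tmp file first, then
    rename it into place. Returns the names of the files it created.
    """
    archive_dir = Path(archive_dir)
    pg_wal_dir = Path(pg_wal_dir)
    restored = []
    for gz_path in sorted(archive_dir.glob("*.gz")):
        target = pg_wal_dir / gz_path.name[: -len(".gz")]
        if target.exists():
            continue  # leave files that are already in pg_wal alone
        tmp = target.with_name(target.name + ".tmp")
        with gzip.open(gz_path, "rb") as src, open(tmp, "wb") as dst:
            shutil.copyfileobj(src, dst)
        tmp.replace(target)  # rename, like "mv %p.tmp %p"
        restored.append(target.name)
    return restored
```

For the setup quoted above, the call would look something like
unpack_wal_archive("/mnt/raid/wal-archive", data_dir + "/pg_wal"), where
data_dir stands in for your data directory.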


