Re: Is there a way to speed up WAL replay? - Mailing list pgsql-general

From Thomas Munro
Subject Re: Is there a way to speed up WAL replay?
Date
Msg-id CAEepm=0_QgthfHxmCuCtDtLVorn86sxH5=UZBfSz_Mwr73mcZw@mail.gmail.com
Whole thread Raw
In response to Re: Is there a way to speed up WAL replay?  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-general
On Thu, Nov 1, 2018 at 4:25 AM Jeff Janes <jeff.janes@gmail.com> wrote:
> On Wed, Oct 31, 2018 at 1:38 AM Torsten Förtsch <tfoertsch123@gmail.com> wrote:
>> I am working on restoring a database from a base backup + WAL. With the default settings the database replays about
3-4WAL files per second. The startup process takes about 65% of a CPU and writes data with something between 50 and 100
MB/sec.
>>
>> Is there a way to speed that up? The disk can easily sustain 400-500 MB/sec.
>
>
> WAL replay is single-threaded, so the most you would be able to speed it up is 50%, to where it would be taking 100%
CPU.
>
> Is the time spent not on the CPU being spent waiting for WAL files to arrive from the restore_command, or waiting for
theblocks it needs to replay into to get read from disk, or waiting for dirty pages to get written to disk so they can
beevicted to make way for new ones? 
>
> One way I found to speed up restore_command is to have another program run a few WAL files ahead of it, copying the
WALfrom the real archive into a scratch space which is on the same filesystem as pg_xlog/pg_wal.  Then have
restore_commandsimply move (not copy) the requested files from the scratch space onto %p.  The intra-filesystem move
completesmuch faster than a copy. 
>
> If it spends time waiting for blocks that need to be recovered into to get read from disk, and you have enough RAM,
youcould speed it up by pre-warming the file system cache.  Something like: 
>
> tar -cf - $PGDATA | wc -c

For more targeted prewarming of large systems that don't fit in RAM
and to get all the way into PostgreSQL's buffer pool, I suppose you
could write a small Python/whatever script that extracts the
relfilenode + block references from the output of pg_waldump (one file
ahead, or whatever), sorts and uniques them, merges them into block
ranges, converts the relfilenode reference to relation OID, and then
calls pg_prewarm() for each range.

--
Thomas Munro
http://www.enterprisedb.com


pgsql-general by date:

Previous
From: Rich Shepard
Date:
Subject: Re: Broken postgres links need to find callers
Next
From: "James A. Robinson"
Date:
Subject: Working around, or with, bitmap heap scan?