Re: Add a log message on recovery startup before syncing datadir - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Add a log message on recovery startup before syncing datadir
Msg-id CA+hUKGLRY+C5vxPQJ45Vxww-68jF0w9hXm8xhbVdiZPjZC1KqA@mail.gmail.com
In response to Add a log message on recovery startup before syncing datadir  (Michael Banck <michael.banck@credativ.de>)
Responses Re: Add a log message on recovery startup before syncing datadir
List pgsql-hackers
On Wed, Oct 7, 2020 at 8:58 PM Michael Banck <michael.banck@credativ.de> wrote:
> we had a customer incident recently where they needed to do a PITR.
> Their data directory is on a NetApp NFS and they have several hundred
> databases in their instance. The startup sync (i.e. before the message
> "starting archive recovery" appears) took 20 minutes and during the

Nice data point.

> first try[1] they were wondering what's going on because there is just
> one log message ("database system was interrupted; last known up at
> ...") and the postmaster process is in state 'D'. Attaching strace
> revealed that it was syncing files and due to the NFS performance that
> took a long time.

No objection to adding a message, but see also this other thread,
about potential ways to get rid of that sync completely, or at least
the phase where you have to open all the files one by one:

https://www.postgresql.org/message-id/flat/CAEET0ZHGnbXmi8yF3ywsDZvb3m9CbdsGZgfTXscQ6agcbzcZAw%40mail.gmail.com
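
For illustration only, here is a minimal Python sketch (not PostgreSQL's actual C code) of the kind of startup sync being discussed: it emits a log message first, then opens and fsyncs every file one by one. The function name, logger name, and message wording are hypothetical, and it assumes a POSIX system where fsync on a read-only descriptor is permitted.

```python
import logging
import os

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("startup_sync_sketch")


def sync_data_directory(datadir):
    """Log a message, then open and fsync each regular file under datadir.

    Returns the number of files synced. Every file costs an open, an
    fsync, and a close, which is why this phase can be slow on
    high-latency storage when there are many files.
    """
    # Announce the sync up front so the log is not silent while it runs.
    logger.info('syncing data directory "%s" before recovery', datadir)

    synced = 0
    for root, _dirs, files in os.walk(datadir):
        for name in files:
            path = os.path.join(root, name)
            fd = os.open(path, os.O_RDONLY)
            try:
                os.fsync(fd)  # flush this file's data to stable storage
            finally:
                os.close(fd)
            synced += 1

    logger.info("synced %d files", synced)
    return synced
```

The per-file open/fsync/close round trip is the part the thread below proposes to avoid; the leading log line is the part this thread proposes to add.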

Also, maybe of interest for PITR use cases, see this other thread
about relaxing the end-of-recovery checkpoint (well, the patch doesn't
do that yet, but it'd be a small step to not wait for it, based on a
GUC, once the checkpointer is running):

https://commitfest.postgresql.org/30/2706/


