Re: Run end-of-recovery checkpoint in non-wait mode or skip it entirely for faster server availability? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Run end-of-recovery checkpoint in non-wait mode or skip it entirely for faster server availability?
Msg-id FF923252-1631-4B16-A3A1-36909D6F644F@anarazel.de
In response to Re: Run end-of-recovery checkpoint in non-wait mode or skip it entirely for faster server availability?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On March 25, 2022 9:56:38 AM PDT, Robert Haas <robertmhaas@gmail.com> wrote:
>On Fri, Mar 25, 2022 at 3:40 AM Bharath Rupireddy
><bharath.rupireddyforpostgres@gmail.com> wrote:
>> Since the server spins up checkpointer process [1] while the startup
>> process performs recovery, isn't it a good idea to make
>> end-of-recovery completely optional for the users or at least run it
>> in non-wait mode so that the server will be available faster. The next
>> checkpointer cycle will take care of performing the EOR checkpoint
>> work, if user chooses to skip the EOR or the checkpointer will run EOR
>> checkpoint  in background, if user chooses to run it in the non-wait
>> mode (without CHECKPOINT_WAIT flag). Of course by choosing this
>> option, users must be aware of the fact that the extra amount of
>> recovery work that needs to be done if a crash happens from the point
>> EOR gets skipped or runs in non-wait mode until the next checkpoint.
>> But the advantage that users get is the faster server availability.
>
>I think that we should remove end-of-recovery checkpoints completely
>and instead use the end-of-recovery WAL record (cf.
>CreateEndOfRecoveryRecord). However, when I tried to do that, I ran
>into some problems:
>
>http://postgr.es/m/CA+TgmobrM2jvkiccCS9NgFcdjNSgAvk1qcAPx5S6F+oJT3D2mQ@mail.gmail.com
>
>The second problem described in that email has subsequently been
>fixed, I believe, but the first one remains.

Seems we could deal with that by making latestCompleted a 64-bit xid? Then there would never be cases where we have to
retreat back into such early xids.
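
To illustrate the general idea (a toy sketch only, not the actual PostgreSQL code): 32-bit xids have to be compared modulo 2^32, which is only meaningful while the two values are less than 2^31 apart, whereas an epoch-extended 64-bit xid can be compared directly. The type and function names below (xid32, xid64, xid32_precedes, xid64_precedes, xid64_from_epoch) are hypothetical and chosen just for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL definitions. */
typedef uint32_t xid32;
typedef uint64_t xid64;

/*
 * Modulo-2^32 comparison. The answer is only meaningful when the two
 * xids are less than 2^31 apart; beyond that the result flips.
 */
static bool
xid32_precedes(xid32 a, xid32 b)
{
    return (int32_t) (a - b) < 0;
}

/* Plain comparison: a 64-bit counter does not wrap in practice. */
static bool
xid64_precedes(xid64 a, xid64 b)
{
    return a < b;
}

/* Widen a 32-bit xid by putting an epoch (wraparound count) in the high bits. */
static xid64
xid64_from_epoch(uint32_t epoch, xid32 xid)
{
    return ((xid64) epoch << 32) | xid;
}
```

With these definitions, xid32_precedes(3, 2147483652) returns false even though 3 is numerically smaller, because the two values are more than 2^31 apart. The 64-bit comparison of the same two values widened with epoch 0 returns true, and values from different epochs order correctly as well.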

A random note from a conversation with Thomas a few days ago: We still perform timeline increases with checkpoints in
some cases. Might be worth fixing as a step towards just using EOR.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


