Re: A few new options for CHECKPOINT - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: A few new options for CHECKPOINT
Date
Msg-id 20201128174946.GR16415@tamriel.snowman.net
Whole thread Raw
In response to Re: A few new options for CHECKPOINT  ("Bossart, Nathan" <bossartn@amazon.com>)
Responses Re: A few new options for CHECKPOINT  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
Greetings,

* Bossart, Nathan (bossartn@amazon.com) wrote:
> On 11/27/20, 10:58 AM, "Stephen Frost" <sfrost@snowman.net> wrote:
> > If you'd like to show that I'm wrong, and it's entirely possible that I
> > am, then retry the above with actual load on the system, and also
> > actually look at how much outstanding WAL you end up with given the
> > different scenarios which has to be replayed during crash recovery.
>
> I did a little experiment to show the behavior I'm referring to.  I

I'm rather confused why you didn't use your patch to show the actual
behavior that you'll get as a result of this change..?  That seems like
what would be meaningful here.  I appreciate that the analysis you did
might correlate but I don't really get why we'd try to use a proxy for
this.

> used these settings:
>
>         checkpoint_completion_target = 0.9
>         checkpoint_timeout = 30s
>         max_wal_size = 20GB
>         WAL segment size is 64MB

That's an exceedingly short and very uncommon checkpoint timeout in my
experience..  If anything, folks increase checkpoint timeout from the
default (in order to reduce WAL traffic and because they have a replica
they can flip to in the event of a crash, avoiding having to go through
WAL replay on recovery) and so I'm not really sure that it's a sensible
thing to look at?  Even so though...

>         Recovery took 14 seconds and replayed ~3.2 GB of WAL.

[ ... ]

>         Recovery took 7 seconds and replayed ~1.5 GB of WAL.

This is showing more-or-less what I expected: there's still a large
amount of outstanding WAL, even if you use a very low and unusual
timeout and attempt to time it perfectly.  A question that is still not
clear is what happens when you actually do an immediate checkpoint-
there would likely still be some outstanding WAL even in that case but
I'd expect it to be a whole lot less, which is why that comment exists
in the documentation in the first place.

> Granted, I used a rather aggressive checkpoint_timeout, but I think
> this demonstrates that waiting for a non-immediate checkpoint to
> complete can lower the amount of WAL needed for recovery, even though
> it might not lower it as much as waiting for an immediate checkpoint
> would.

The difference here feels like order of magnitudes to me, between an
immediate checkpoint and a non-immediate one, vs. a much smaller
difference as you've shown here (though, still, kill'ing the postmaster
isn't exactly the same as what your patch would be doing, so I don't
really like using this particular analysis to answer this question...).

If the use-case here was just that you wanted to add more options to the
CHECKPOINT command because we have them internally and maybe they'd be
useful to expose then these things probably wouldn't matter so much, but
to imply that this is really going to cut down on the amount of WAL
replay required a lot isn't really coming through even with these
results.

Thanks,

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Add Information during standby recovery conflicts
Next
From: Tom Lane
Date:
Subject: Re: What to do about the broken btree_gist for inet/cidr?