Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date
Msg-id 20220210075228.nvv6jswl2r5tvq6k@jrouhaud
Whole thread Raw
In response to Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Feb 09, 2022 at 02:30:08PM -0500, Robert Haas wrote:
> On Wed, Feb 9, 2022 at 1:34 PM Bruce Momjian <bruce@momjian.us> wrote:
> > Well, I think the worst case is that the checkpoint happens exactly
> > between two checkpoints, so you are checkpointing twice as often, but if
> > it happens just before or after a checkpoint, I assume the effect would
> > be minimal.
> 
> I agree for the most part. I think that if checkpoints happen every 8
> minutes normally and the extra checkpoint happens 2 minutes after the
> previous checkpoint, the impact may be almost as bad as if it had
> happened right in the middle. If it happens 5 seconds after the
> previous checkpoint, it should be low impact.

But the extra checkpoints will be immediate, while on a properly configured
system it should be spread checkpoint.  That will add some more overhead.

> > So, it seems we are weighing having a checkpoint happen in the middle of
> > a checkpoint interval vs writing more WAL.  If the WAL traffic, without
> > CREATE DATABASE, is high, and the template database is small, writing
> > more WAL and skipping the checkpoint will be win, but if the WAL traffic
> > is small and the template database is big, the extra WAL will be a loss.
> > Is this accurate?
> 
> I think that's basically correct. I would expect that the worry about
> big template database is mostly about template databases that are
> REALLY big. I think if your template database is 10GB you probably
> shouldn't be worried about this feature. 10GB of extra WAL isn't
> nothing, but if you've got reasonably capable hardware, it's not
> overloaded, and max_wal_size is big enough, it's probably not going to
> have a huge impact. Also, most of the impact will probably be on the
> CREATE DATABASE command itself, and other things running on the system
> at the same time will be impacted to a lesser degree. I think it's
> even possible that you will be happier with this feature than without,
> because you may like the idea that CREATE DATABASE itself is slow more
> than you like the idea of it making everything else on the system
> slow. On the other hand, if your template database is 1TB, the extra
> WAL is probably going to be a fairly big problem.
> 
> Basically I think for most people this should be neutral or a win. For
> people with really large template databases, it's a loss. Hence the
> discussion about having a way for people who prefer the current
> behavior to keep it.

Those extra WALs will also impact backups and replication.  You could have
fancy hardware, a read-mostly workload and the need to replicate over a slow
WAN, and in that case the 10GB could be much more problematic.



pgsql-hackers by date:

Previous
From: Nitin Jadhav
Date:
Subject: Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Next
From: Dilip Kumar
Date:
Subject: Re: decoupling table and index vacuum