Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date
Msg-id 20210616221316.jrveut7d5kimpl6j@alap3.anarazel.de
Whole thread Raw
In response to Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
List pgsql-hackers
Hi,

On 2021-06-15 18:11:23 +0530, Dilip Kumar wrote:
> On Tue, Jun 15, 2021 at 5:34 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> >
> > On 15/06/2021 14:20, Dilip Kumar wrote:
> > > Design Idea:
> . Then
> > > we can get the relfilenode of every file we need to copy, and prepare
> > > a list of all such relfilenode.
> >
> > I guess that would work, but you could also walk the database directory
> > like copydir() does. How you find the relations to copy is orthogonal to
> > whether you WAL-log them or use checkpoints. And whether you use the
> > buffer cache is also orthogonal to the rest of the proposal; you could
> > issue FlushDatabaseBuffers() instead of a checkpoint.
> 
> Yeah, that would also work, but I thought since we are already
> avoiding the checkpoint so let's avoid FlushDatabaseBuffers() also and
> directly use the lower level buffer manager API which doesn't need
> recache.  And I am using pg_class to identify the useful relfilenode
> so that we can avoid processing some unwanted relfilenode but yeah I
> agree that this is orthogonal to whether we use checkpoint or not.

It's not entirely obvious to me that it's important to avoid
FlushDatabaseBuffers() on its own. Forcing a checkpoint is problematic because
it unnecessarily writes out dirty buffers in other databases, triggers FPWs
etc. Normally a database used as a template won't have a meaningful amount of
dirty buffers itself, so the FlushDatabaseBuffers() shouldn't trigger a lot of
writes. Of course, there is the matter of FlushDatabaseBuffers() not being
cheap with a large shared_buffers - but I suspect that's not a huge factor
compared to the rest of the database creation cost.

I think the better argument for going through shared buffers is that it might
be worth doing so for the *target* database. A common use of frequently
creating databases, in particular with a non-default template database, is to
run regression tests with pre-created schema / data - writing out all that data
just to have it then dropped a few seconds later after the regression test
completed is wasteful.



> > In principle, we could have both mechanisms, and use the new WAL-logged
> > system if the database is small, and the old system with checkpoints if
> > it's large. But I don't like idea of having to maintain both.
> 
> Yeah, I agree in some cases, where we don't have many dirty buffers,
> checkpointing can be faster.

I don't think the main issue is the speed of checkpointing itself? The reaoson
to maintain the old paths is that the "new approach" is bloating WAL volume,
no? Right now cloning a 1TB database costs a few hundred bytes of WAL and about
1TB of write IO. With the proposed approach, the write volume approximately
doubles, because there'll also be about 1TB in WAL.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Zhihong Yu
Date:
Subject: Re: A qsort template
Next
From: Tomas Vondra
Date:
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints