Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints |
Date | |
Msg-id | 20210616221316.jrveut7d5kimpl6j@alap3.anarazel.de Whole thread Raw |
In response to | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints |
List | pgsql-hackers |
Hi, On 2021-06-15 18:11:23 +0530, Dilip Kumar wrote: > On Tue, Jun 15, 2021 at 5:34 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > > > On 15/06/2021 14:20, Dilip Kumar wrote: > > > Design Idea: > . Then > > > we can get the relfilenode of every file we need to copy, and prepare > > > a list of all such relfilenode. > > > > I guess that would work, but you could also walk the database directory > > like copydir() does. How you find the relations to copy is orthogonal to > > whether you WAL-log them or use checkpoints. And whether you use the > > buffer cache is also orthogonal to the rest of the proposal; you could > > issue FlushDatabaseBuffers() instead of a checkpoint. > > Yeah, that would also work, but I thought since we are already > avoiding the checkpoint so let's avoid FlushDatabaseBuffers() also and > directly use the lower level buffer manager API which doesn't need > recache. And I am using pg_class to identify the useful relfilenode > so that we can avoid processing some unwanted relfilenode but yeah I > agree that this is orthogonal to whether we use checkpoint or not. It's not entirely obvious to me that it's important to avoid FlushDatabaseBuffers() on its own. Forcing a checkpoint is problematic because it unnecessarily writes out dirty buffers in other databases, triggers FPWs etc. Normally a database used as a template won't have a meaningful amount of dirty buffers itself, so the FlushDatabaseBuffers() shouldn't trigger a lot of writes. Of course, there is the matter of FlushDatabaseBuffers() not being cheap with a large shared_buffers - but I suspect that's not a huge factor compared to the rest of the database creation cost. I think the better argument for going through shared buffers is that it might be worth doing so for the *target* database. A common use of frequently creating databases, in particular with a non-default template database, is to run regression tests with pre-created schema / data - writing out all that data just to have it then dropped a few seconds later after the regression test completed is wasteful. > > In principle, we could have both mechanisms, and use the new WAL-logged > > system if the database is small, and the old system with checkpoints if > > it's large. But I don't like idea of having to maintain both. > > Yeah, I agree in some cases, where we don't have many dirty buffers, > checkpointing can be faster. I don't think the main issue is the speed of checkpointing itself? The reaoson to maintain the old paths is that the "new approach" is bloating WAL volume, no? Right now cloning a 1TB database costs a few hundred bytes of WAL and about 1TB of write IO. With the proposed approach, the write volume approximately doubles, because there'll also be about 1TB in WAL. Greetings, Andres Freund
pgsql-hackers by date: