On 2014-10-27 09:46:41 -0400, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> > On 10/27/2014 03:21 PM, Tomas Vondra wrote:
> >> Thinking about this a bit more, do we really need a full checkpoint? That
> >> is, a checkpoint of all the databases in the cluster? Why is checkpointing
> >> the source database not enough?
>
> > A full checkpoint ensures that you always begin recovery *after* the
> > DBASE_CREATE record. I.e. you never replay a DBASE_CREATE record during
> > crash recovery (except when you crash before the transaction commits, in
> > which case it doesn't matter if the new database's directory is borked).
>
> Yeah. After re-reading the 2005 thread, I wonder if we shouldn't just
> bite the bullet and redesign CREATE DATABASE as you suggest, i.e., WAL-log
> all the copied files instead of doing a "cp -r"-equivalent directory copy.
> That would fix a number of existing replay hazards as well as making it
> safe to do what Tomas wants. In the small scale this would cause more I/O
> (2 copies of the template database's data) but in production situations
> we might well come out ahead by avoiding a forced checkpoint of the rest
> of the cluster. Also I guess we could skip WAL-logging if WAL archiving
> is off, similarly to the existing optimization for CREATE INDEX etc.
+1.
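
To make the proposal concrete, here is a toy sketch in Python (not PostgreSQL
code; ToyWal, copy_database, replay and the wal_needed flag are all names made
up for illustration). It assumes the copy emits one WAL record per copied
page, so replay can rebuild the new database from WAL alone, without reading
the template's files and without depending on a checkpoint of the rest of the
cluster. The wal_needed flag stands in for the "skip WAL-logging if archiving
is off" optimization; in that case durability would have to come from syncing
the copied files before commit, which the sketch does not model.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 8192  # PostgreSQL's default block size; any value works here


@dataclass
class ToyWal:
    """Hypothetical stand-in for the WAL: an append-only list of page images."""
    records: list = field(default_factory=list)

    def append(self, relfile, blkno, page):
        self.records.append((relfile, blkno, bytes(page)))


def copy_database(template, wal, wal_needed):
    """Copy every page of every relation file in `template`.

    `template` maps a relation file name to a list of page images.
    When `wal_needed` is true, each copied page is also logged, so the
    copy can be reconstructed from the WAL alone.
    """
    new_db = {}
    for relfile, pages in template.items():
        new_db[relfile] = []
        for blkno, page in enumerate(pages):
            if wal_needed:
                wal.append(relfile, blkno, page)
            new_db[relfile].append(bytes(page))
    return new_db


def replay(wal):
    """Rebuild the copied database purely from logged page images.

    Note: an empty relation file produces no page records, so a real
    design would also need a "create file" record; omitted here.
    """
    db = {}
    for relfile, blkno, page in wal.records:
        pages = db.setdefault(relfile, [])
        while len(pages) <= blkno:
            pages.append(bytes(PAGE_SIZE))  # zero-fill any gap
        pages[blkno] = page
    return db
```

The point of the sketch is only that replay() never looks at the template, so
it does not matter what state the source files are in when recovery starts.
The cost is the one Tom mentions: every page is written twice, once to the
new file and once to the WAL.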
Greetings,
Andres Freund
-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services