On 10/28/2014 02:56 AM, David G Johnston wrote:
> Tom Lane-2 wrote
>>> So maybe we shouldn't cling to the WAL-logging approach too much. Maybe
>>> Heikki's idea from to abandon the full checkpoint and instead assume
>>> that once the transaction commits, all the files were fsynced OK. Of
>>> couse, this will do nothing about the replay hazards.
>>
>> Well, I'm not insisting on any particular method of getting there, but
>> if we're going to touch this area at all then I think "fix the replay
>> hazards" should be a non-negotiable requirement. We'd never have accepted
>> such hazards if CREATE DATABASE were being introduced for the first time;
>> it's only like this because nobody felt like rewriting a Berkeley-era
>> kluge.
>
> Maybe adding a ToDo:
>
> "Fix replay hazards in CREATE DATABASE"
>
> and listing them explicitly would be a good start.
>
> Not sure if WAL or CREATE would be more appropriate but WAL seems like a
> better fit.
My opinion is that we should do the WAL-logging, unless
wal_level=minimal, in which case we just fsync everything before commit.
That would hurt people who are currently using CREATE DATABASE with a
large template database, with WAL archiving enabled, but for everyone
else it would be a win or not noticeable.
If that's not acceptable, perhaps we could assume that a database with
!datallowconn=false can be copied without WAL-logging, i.e. assume that
a database with !datallowconn won't be modified. Of course, that's a
shaky assumption because someone might just switch datallowconn back on
after the CREATE DATABASE. Maybe we could force a checkpoint when you do
that, instead..
> To the topic at hand would "CREATE DATABASE name WITH LOGGED = true" work?
> As with UNLOGGED tables giving the user the choice of WAL/fsync/checkpoint
> behavior seems reasonable. As Thomas said there a couple of diametrically
> opposed use-cases here and it seems like we've already implemented the more
> difficult one.
Yeah, that would be another way to provide an escape hatch if someone
absolutely needs to avoid the extra I/O. I wouldn't like to maintain two
different ways of doing the same thing, though. It would be a feature
that would be used rarely, so even if we get it right, subtle bugs could
easily creep into it over the years.
- Heikki