On Wed, 2020-10-28 at 09:55 +0000, osumi.takamichi@fujitsu.com wrote:
> > > I wrote and attached the first patch to disable WAL logging.
> > > This patch passes the regression test of check-world already and is
> > > formatted by pgindent.
> >
> > Without reading the code, I have my doubts about that feature.
> > While it clearly will improve performance, it opens the door to data loss.
>
> Therefore, this feature must prevent that kind of
> inconsistent server from starting up again.
> This has been discussed in this thread already.
>
> > People will use it to speed up their data loads and then be unhappy if they
> > cannot use their backups to recover from a problem.
> > What happens if you try to do archive recovery across a time where wal_level
> > was "none"? Will the recovery process fail, as it should, or will you end up
> > with data corruption?
> > We already have a performance-related footgun in the shape of fsync = off.
> > Do we want to add another one?
>
> Further, in this thread we discussed that this feature is intended
> for some specific opportunities, such as when a DBA wants to load data
> as quickly as possible and/or the operation itself is easily *repeatable*.
> So, before and after changing wal_level, the DBA needs to take a full
> backup to prepare for an unexpected crash.
>
> But anyway, I'll fix and enrich the documents. Thanks.

I read through the thread and the patch now.

The only safety I see is that startup after a crash is prevented.

But what if someone sets wal_level=none, performs some data modifications,
sets wal_level=archive and after some more processing decides to restore from
a backup that was taken before the cluster was set to wal_level=none?
Then they would end up with a corrupted database, right?

I think the least this patch needs is that starting with wal_level=none emits
a WAL record that will make recovery fail.
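
To illustrate the idea, here is a minimal sketch in C. It is not taken from
the patch: the record kinds and the function replay_wal() are hypothetical and
do not correspond to PostgreSQL's actual WAL record types or recovery code.
It only shows replay stopping with an error at a marker record instead of
silently skipping over the unlogged period.

```c
#include <stddef.h>

/* Hypothetical record kinds; NOT PostgreSQL's real WAL record types. */
typedef enum {
    REC_DATA_CHANGE,     /* ordinary logged modification */
    REC_CHECKPOINT,      /* checkpoint record */
    REC_WAL_LEVEL_NONE   /* assumed marker: WAL logging was disabled here */
} RecordKind;

typedef enum {
    REPLAY_OK,
    REPLAY_FAILED_UNLOGGED_GAP
} ReplayResult;

/*
 * Replay records in order. Stop with an error at the first marker record,
 * because changes made after it were never logged and cannot be rebuilt.
 * *applied receives the number of records replayed before stopping.
 */
ReplayResult
replay_wal(const RecordKind *records, size_t n, size_t *applied)
{
    *applied = 0;
    for (size_t i = 0; i < n; i++) {
        if (records[i] == REC_WAL_LEVEL_NONE)
            return REPLAY_FAILED_UNLOGGED_GAP;
        (*applied)++;
    }
    return REPLAY_OK;
}
```

With such a marker in the stream, a restore from a backup taken before the
switch to wal_level=none would stop at the marker rather than continue into
WAL that assumes the unlogged changes exist.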

I am aware that this is intended for "specific opportunities", but we still
should make it as hard as possible for the user to cause harm. It may be that
MySQL, which inspired this feature, does not care about that, but I think we
should do better.

Another point that makes me worry is that this feature will unconditionally
break all replication, and there is not the least mention of that in the
documentation.

Yours,
Laurenz Albe