Re: Disable WAL logging to speed up data loading - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Disable WAL logging to speed up data loading
Msg-id CABUevEwwzJM_Ke2rGtxRNz66RbrvVHuTMJULOoBbJjVWpXyq8Q@mail.gmail.com
In response to Re: Disable WAL logging to speed up data loading  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Disable WAL logging to speed up data loading  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Mon, Nov 2, 2020 at 4:28 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Oct 29, 2020 at 4:00 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > Yes. What I meant was such a safe guard needs to be implemented.
> >
> > This may mean that if we want to recover the database from that backup,
> > we need to specify the recovery target so that the archive recovery stops
> > just before the WAL record indicating wal_level change.
>
> Yeah, I think we need these kinds of safeguards, for sure.
>
> I'm also concerned about the way that this proposed feature interacts
> with incremental backup capabilities that already exist in tools like
> pgBackRest, EDB's BART, pg_probackup, and future things we might want
> to introduce into core, along the lines of what I have previously
> proposed. Now, I think pgBackRest uses only timestamps and checksums,
> so it probably doesn't care, but some of the other solutions rely on
> WAL-scanning to gather a list of changed blocks. I guess there's no
> reason that they can't notice the wal_level being changed and do the
> right thing; they should probably have that kind of capability
> already. Still, it strikes me that it might be useful if we had a
> stronger mechanism.
>
> I'm not exactly sure what that would look like, but suppose we had a
> feature where every time wal_level drops below replica, a counter gets
> incremented by 1, and that counter is saved in the control file. Or
> maybe when wal_level drops below minimal to none. Or maybe there are
> two counters. Anyway, the idea is that if you have a snapshot of the
> cluster at one time and a snapshot at another time, you can see
> whether anything scary has happened in the middle without needing all
> of the WAL in between.
>
> Maybe this is off-topic for this thread or not really needed, but I'm
> not sure. I don't think wal_level=none is a bad idea intrinsically,
> but I think it would be easy to implement it poorly and end up harming
> a lot of users. I have no problem with giving people a way to do
> dangerous things, but we should do our best to let people know how
> much danger they've incurred.

I think this needs to be thought out and included in a patch like
this, so it's definitely on-topic for this thread.

Being able to turn things off can certainly be very useful. But being
able to do so without realizing the damage caused is a *big* foot-gun,
and we need to do our best to protect against it.

This is not entirely unlike the idea we've discussed before of having
a "tainted" flag in pg_control that gets set if the system has ever
been started with, say, fsync=off, just to make sure that we have a
record of it. This wouldn't be the same flag of course, but it's a
similar problem: even temporarily starting the cluster with certain
settings can do permanent damage, which is not necessarily fixed by
changing them back and restarting.
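
To make the "sticky" part concrete, here is a rough sketch in Python
rather than backend C. ControlFile, tainted_by, UNSAFE and
record_startup are all made-up names for illustration; nothing like
them exists in pg_control today. The only point is that the record
survives a later restart with safe settings.

```python
from dataclasses import dataclass, field


@dataclass
class ControlFile:
    # Hypothetical stand-in for pg_control; this field does not exist
    # in PostgreSQL.
    tainted_by: set = field(default_factory=set)


# Settings treated as unsafe in this sketch (an assumption, not a list
# taken from PostgreSQL).
UNSAFE = {("fsync", "off"), ("wal_level", "none")}


def record_startup(control, settings):
    """Remember every unsafe setting the cluster was ever started with."""
    for name, value in settings.items():
        if (name, value) in UNSAFE:
            control.tainted_by.add(f"{name}={value}")
    return control


control = ControlFile()
# One start with fsync=off ...
record_startup(control, {"fsync": "off", "wal_level": "replica"})
# ... followed by a restart with safe settings.
record_startup(control, {"fsync": "on", "wal_level": "replica"})
# The earlier unsafe start is still on record.
print(sorted(control.tainted_by))
```

Nothing in the sketch ever clears the record, which is the property
that matters here: changing the setting back does not erase the fact
that it was once off.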

This should also be exposed as a monitoring point (which it could be
if it's in pg_control). That is, I can imagine a *lot* of
installations that would definitely want an alert to fire if the
cluster has ever been started with wal_level=none or
wal_level=minimal, at least up until the point where somebody has run
a new full backup.
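
As a rough sketch of what such a check could look like, assuming the
counter Robert describes above is readable from pg_control and that
the backup tool records its value at each full backup (both are
assumptions; needs_alert and its parameters are hypothetical names):

```python
def needs_alert(current_counter, counter_at_last_full_backup):
    """Return True if wal_level has dropped since the last full backup.

    current_counter: hypothetical pg_control counter, incremented each
        time wal_level drops below replica.
    counter_at_last_full_backup: the same counter as recorded when the
        last full backup was taken, or None if there is no such backup.
    """
    if counter_at_last_full_backup is None:
        # No full backup on record, so any drop at all is unprotected.
        return current_counter > 0
    # A higher counter means a drop happened after the backup was taken.
    return current_counter > counter_at_last_full_backup


# The backup saw counter 1 and the cluster now reports 2, so alert.
print(needs_alert(2, 1))
# A new full backup taken at counter 2 silences the alert again.
print(needs_alert(2, 2))
```

Comparing two counter values is enough to answer the question; the
monitoring side does not need any of the WAL in between.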

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/


