Re: Enhance traceability of wal_level changes for backup management - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Enhance traceability of wal_level changes for backup management
Date
Msg-id 20210106173949.GJ27507@tamriel.snowman.net
In response to RE: Enhance traceability of wal_level changes for backup management  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses RE: Enhance traceability of wal_level changes for backup management
List pgsql-hackers
Greetings,

* osumi.takamichi@fujitsu.com (osumi.takamichi@fujitsu.com) wrote:
> You said
> > The use case I imagined is that the user temporarily
> > changes wal_level to 'none' from 'replica' or 'logical' to speed up loading and
> > changes back to the normal. In this case, the backups taken before the
> > wal_level change cannot be used to restore the database to the point after the
> > wal_level change. So I think backup management tools would want to
> > recognize the time or LSN when/where wal_level is changed to ‘none’ in order
> > to do some actions such as invalidating older backups, re-calculating backup
> > redundancy etc.
> > Actually the same is true when the user changes to ‘minimal’. The tools would
> > need to recognize the time or LSN in this case too. Since temporarily changing
> > wal_level has been an uncommon use case some tools perhaps are not aware
> > of that yet. But since introducing wal_level = ’none’ could make the change
> > common, I think we need to provide a way for the tools.

I continue to be against the idea of introducing another wal_level.  If
there are additional things we can do to reduce WAL traffic while we
continue to use it to keep the system in a consistent state, then we
should implement those for the 'minimal' case and be done with it.
Adding another wal_level is just going to be confusing and will increase
both complexity and the chances that someone ends up in a bad state.

> I wondered, couldn't backup management tools utilize the information
> in the backup that is needed to be made when wal_level is changed to "none" for example ?

What information is that, exactly?  If there's a way to detect that the
wal_level has been flipped to 'minimal' and then back to 'replica',
other than scanning the WAL, we'd certainly like to hear of it, so we
can implement logic in pgbackrest to detect that happening.  I'll admit
that I've not gone hunting for such of late, but I don't recall seeing
anything that would change that either.

The idea proposed above about having the LSN and time recorded when a
wal_level change to minimal happens, presumably in pg_control, seems
like it could work as a way to allow external tools to more easily
figure out if the wal_level's been changed to minimal.  Perhaps there's
a reason to track changes between replica and logical, but I can't think
of any offhand, and backup tools would still need to know if the
wal_level was set to *minimal* independently of changes between replica
and logical.
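
To illustrate how a backup tool might use such a recorded LSN, here is a
minimal sketch in Python. It assumes the LSN of the last change to
minimal is already available as text in the usual "hi/lo" hexadecimal
form; where that value would come from is not settled in this thread, and
the names parse_lsn and backups_cut_off are hypothetical.

```python
from typing import Dict, List


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' in hexadecimal to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def backups_cut_off(backup_stop_lsns: Dict[str, str], minimal_lsn: str) -> List[str]:
    """Return labels of backups that ended before wal_level dropped to minimal.

    Assumption: such backups cannot be rolled forward past minimal_lsn,
    so a tool would exclude them when counting redundancy for recovery
    targets after that point.
    """
    cutoff = parse_lsn(minimal_lsn)
    return sorted(
        label
        for label, stop_lsn in backup_stop_lsns.items()
        if parse_lsn(stop_lsn) < cutoff
    )
```

The backups returned here would still be usable for recovery targets
before the change; only targets beyond it are out of reach.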

Then again, once we add support for scanning the WAL to pgbackrest,
we'll almost certainly track it that way since that'll work for older
and released versions of PG, so I'm not really sure it's worth it to add
this to pg_control unless there are other reasons to.
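
As a rough sketch of the WAL-scanning approach, the Python below picks
wal_level transitions out of pg_waldump's text output. It assumes that
XLOG PARAMETER_CHANGE records are printed with an "lsn: X/Y" field and a
"wal_level=..." setting in the description; that layout should be checked
against the PostgreSQL version in use. This is not how pgbackrest does
it, and the names WalLevelChange and wal_level_changes are hypothetical.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional


class WalLevelChange(NamedTuple):
    lsn: str                  # LSN of the PARAMETER_CHANGE record
    old_level: Optional[str]  # None for the first record seen
    new_level: str


# Assumed line layout: "... lsn: 0/01000028, ... desc: PARAMETER_CHANGE ... wal_level=replica ..."
_PARAM_CHANGE = re.compile(
    r"lsn:\s*(?P<lsn>[0-9A-Fa-f]+/[0-9A-Fa-f]+).*"
    r"PARAMETER_CHANGE\b.*\bwal_level=(?P<level>\w+)"
)


def wal_level_changes(lines: Iterable[str]) -> Iterator[WalLevelChange]:
    """Yield a record each time the logged wal_level differs from the previous one."""
    previous: Optional[str] = None
    for line in lines:
        match = _PARAM_CHANGE.search(line)
        if match is None:
            continue  # not a PARAMETER_CHANGE record
        level = match.group("level")
        if level != previous:
            yield WalLevelChange(match.group("lsn"), previous, level)
        previous = level
```

A tool would feed this the output of pg_waldump over the archived
segments and treat any change whose new_level is "minimal" as the point
past which older backups cannot be replayed.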

> As I said before, existing backup management tools support
> only wal_level=replica or logical at present. And, if they would wish to alter the
> status quo and want to cover the changes over wal_levels, I felt it's natural that
> they support feature like taking a full backup, triggered by the wal_level changes (or server stop).

Sure, but there needs to be a way to actually do that.

> This is because taking a backup is a must for wal_level=none,
> as I described in the patch of wal_level=none.
> For example, they could prepare an easy way to
> take an offline physical backup when the server stops for changing the wal_level.
> (Here, they can support the change to minimal, too.)

pgbackrest does support offline physical backups and it's pretty easy
(just pass in --no-online).  That doesn't really help with the issue of
detecting a wal_level change though.
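
For reference, a minimal sketch of driving such an offline backup from a
script, in Python. The stanza name "main" and the choice of a full backup
are assumptions for illustration, and the helper names are hypothetical;
the server has to be shut down before this runs.

```python
import subprocess
from typing import List


def offline_backup_command(stanza: str) -> List[str]:
    """Build the argument list for an offline full backup with pgbackrest."""
    return [
        "pgbackrest",
        f"--stanza={stanza}",  # stanza name is site-specific
        "--type=full",
        "--no-online",         # back up a stopped cluster
        "backup",
    ]


def run_offline_backup(stanza: str) -> None:
    """Run the backup; raises CalledProcessError if pgbackrest exits non-zero."""
    subprocess.run(offline_backup_command(stanza), check=True)
```

Calling run_offline_backup("main") after stopping the server would take
the backup, but something still has to decide when to call it.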

Thanks,

Stephen

