RE: Enhance traceability of wal_level changes for backup management - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Enhance traceability of wal_level changes for backup management
Date
Msg-id OSBPR01MB4888C0DB6C2DE261629C8D3BEDA10@OSBPR01MB4888.jpnprd01.prod.outlook.com
In response to RE: Enhance traceability of wal_level changes for backup management  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
Responses RE: Enhance traceability of wal_level changes for backup management
List pgsql-hackers
Hi


Apologies for my delay.
On Wednesday, January 6, 2021 7:03 PM I wrote:
> I'll continue the discussion of [2].
> We talked about how to recognize the time or LSN when/where wal_level is
> changed to 'none' there.
>
> You said
> > The use case I imagined is that the user temporarily changes wal_level
> > to 'none' from 'replica' or 'logical' to speed up loading and changes
> > back to the normal. In this case, the backups taken before the
> > wal_level change cannot be used to restore the database to the point
> > after the wal_level change. So I think backup management tools would
> > want to recognize the time or LSN when/where wal_level is changed to
> > ‘none’ in order to do some actions such as invalidating older backups,
> > re-calculating backup redundancy etc.
> > Actually the same is true when the user changes to ‘minimal’. The
> > tools would need to recognize the time or LSN in this case too. Since
> > temporarily changing wal_level has been an uncommon use case some
> > tools perhaps are not aware of that yet. But since introducing
> > wal_level = ‘none’ could make the change common, I think we need to
> > provide a way for the tools.
Before I start implementing, I'd like to confirm something.

As of now, I think two major ideas have been proposed.
I think implementing the first one suffices.
If no one disagrees, I'll proceed with (1) below.

(1) Writing the time or LSN in the control file
to indicate when/where wal_level was changed to 'minimal'
from a higher level, so that old backups can be invalidated or users alerted.
I think we would reset this when the user runs pg_basebackup successfully.

(2) Implementing incrementing counters that indicate
a drop of wal_level from replica to minimal (or between other levels).
The purpose was to compare the wal_level changes between snapshots.
When a monitoring tool detects any difference in the counter,
it can tell immediately that something happened, without checking the WAL in between.

The former could give accurate information for backup management,
while the latter gives an easier way to compare snapshots, I think.
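
To make the difference between the two ideas concrete, here is a rough
sketch in Python. It is not PostgreSQL code and not a proposed patch: the
class ControlFileModel, its fields drop_lsn and drop_count, and the helper
functions are all hypothetical names invented for illustration, and LSNs
are modeled as plain integers. It also assumes that only a drop from
'replica' or higher to a lower level is recorded.

```python
from dataclasses import dataclass
from typing import Optional

# wal_level values ordered from lowest to highest. 'none' is the level
# proposed in this thread, not an existing PostgreSQL setting.
WAL_LEVELS = ["none", "minimal", "replica", "logical"]


@dataclass
class ControlFileModel:
    """Toy stand-in for the control file; every field here is hypothetical."""
    wal_level: str = "replica"
    drop_lsn: Optional[int] = None  # idea (1): LSN of the recorded drop
    drop_count: int = 0             # idea (2): number of drops seen so far

    def change_wal_level(self, new_level: str, current_lsn: int) -> None:
        old_rank = WAL_LEVELS.index(self.wal_level)
        new_rank = WAL_LEVELS.index(new_level)
        replica_rank = WAL_LEVELS.index("replica")
        # Record only a transition from 'replica' or higher to a lower level.
        if old_rank >= replica_rank and new_rank < replica_rank:
            self.drop_lsn = current_lsn
            self.drop_count += 1
        self.wal_level = new_level

    def base_backup_succeeded(self) -> None:
        # Idea (1): a successful base backup clears the marker, so a tool
        # has to act on the marker before this happens.
        self.drop_lsn = None


def backup_is_usable(backup_start_lsn: int, ctl: ControlFileModel) -> bool:
    """Idea (1): a backup older than the recorded drop cannot be used to
    restore to a point after the drop."""
    return ctl.drop_lsn is None or backup_start_lsn >= ctl.drop_lsn


def level_dropped_between(count_before: int, count_after: int) -> bool:
    """Idea (2): a counter difference between two snapshots means a drop
    happened in between, but it does not say where."""
    return count_after != count_before


ctl = ControlFileModel()
snapshot = ctl.drop_count
ctl.change_wal_level("minimal", current_lsn=5000)
print(backup_is_usable(4000, ctl))                      # False
print(level_dropped_between(snapshot, ctl.drop_count))  # True
```

In this model, idea (1) tells a tool which backups are affected, while
idea (2) only tells it that some drop occurred between two observations.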

By the way, Tsunakawa-San kindly advised me to refer to
Oracle features such as Oracle Server Alert and its
backup catalog management capability.
However, because developing something like those would be a huge effort,
I'd like to choose either the first or the second idea,
and since it gives better information, I prefer the first one.

Any comments?

Best Regards,
    Takamichi Osumi



