Re: Enhance traceability of wal_level changes for backup management - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Enhance traceability of wal_level changes for backup management
Date
Msg-id 20210108221610.GZ27507@tamriel.snowman.net
Whole thread Raw
In response to RE: Enhance traceability of wal_level changes for backup management  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
Greetings,

* osumi.takamichi@fujitsu.com (osumi.takamichi@fujitsu.com) wrote:
> On Thursday, January 7, 2021 2:40 AM Stephen Frost <sfrost@snowman.net> wrote:
> > * osumi.takamichi@fujitsu.com (osumi.takamichi@fujitsu.com) wrote:
> > > You said
> > > > The use case I imagined is that the user temporarily changes
> > > > wal_level to 'none' from 'replica' or 'logical' to speed up loading
> > > > and changes back to the normal. In this case, the backups taken
> > > > before the wal_level change cannot be used to restore the database
> > > > to the point after the wal_level change. So I think backup
> > > > management tools would want to recognize the time or LSN when/where
> > > > wal_level is changed to ‘none’ in order to do some actions such as
> > > > invalidating older backups, re-calculating backup redundancy etc.
> > > > Actually the same is true when the user changes to ‘minimal’. The
> > > > tools would need to recognize the time or LSN in this case too.
> > > > Since temporarily changing wal_level has been an uncommon use case
> > > > some tools perhaps are not aware of that yet. But since introducing
> > > > wal_level = ’none’ could make the change common, I think we need to
> > provide a way for the tools.
> >
> > I continue to be against the idea of introducing another wal_level.  If there's
> > additional things we can do to reduce WAL traffic while we continue to use it to
> > keep the system in a consistent state then we should implement those for the
> > 'minimal' case and be done with it.

> The new wal_level=none patch achieves something that cannot be done
> or cannot be implemented together with wal_level='minimal' clearly.

I disagree.

> Did you have a look at the peformance evaluation result that I conducted in [1] ?
> It proved that data loading of 'none' is much faster than that of 'minimal'.

That test claims to have generated 10G worth of WAL, which makes it seem
pretty likely that you didn't use UNLOGGED tables, and didn't create the
table in the same transaction as you performed the data loading for that
table, so, naturally, you got the full amount of WAL.

> > Adding another wal_level is just going to be confusing and increase complexity,
> > and the chances that someone will end up in a bad state.
> Even if when we committed another idea,
> that is "ALTER TABLE tbl SET UNLOGGED/LOGGED without copying relation data",
> the complexity like taking a full backup before bulk data loading didn't change
> and when any accidents happened during no wal logging for specific table with the improvement,
> user would need to start from the backup again. This looks same to me.

Yes, there is still the issue that running with wal_level = minimal for
a period of time means that you can't perform PITR from before the
change to minimal through to the time after it's been changed back to
something higher than minimal.  That's no different, I agree.  That
isn't an argument to introduce another WAL level though.

> Additionally, the patch itself in that thread is big and more complicated.

This isn't a reason to avoid introducing another wal_level.

> The complexity you meant is the wal_level's impact to backup management tools or anything else ?

The complexity of having another wal_level and having to document it and
explain how it behaves to users, particularly when, effectively, it
shouldn't actually be any different from wal_level = minimal, assuming
we've found and implemented the interesting optimizations regarding
wal_level = minimal.

> > > I wondered, couldn't backup management tools utilize the information
> > > in the backup that is needed to be made when wal_level is changed to "none"
> > for example ?
> >
> > What information is that, exactly?  If there's a way to detect that the wal_level
> > has been flipped to 'minimal' and then back to 'replica', other than scanning the
> > WAL, we'd certainly like to hear of it, so we can implement logic in pgbackrest
> > to detect that happening.  I'll admit that I've not gone hunting for such of late,
> > but I don't recall seeing anything that would change that either.
> Excuse me for making you confused.
> I was thinking about control file in the backup as information.
>
> I'm not familiar with the internals of those backup management tools
> but do they monitor the control file and its values of the runnning server at short intervals ?
> And, if they don't do so and if we want accurate time or LSN that indicates wal_level changes,
> I thought we could pick up exact information from control file of cluster directory or its backup
> during server stop (because changing wal_level requires server stop).
> That's all. Sorry for the noise.

If it's in the control file then pg_controldata should be able to pull
it ount and all of the backup tools should be able to manage that, even
if they can't read it directly like tools which are also written in C,
so having it in the control file seems alright to me.

> > The idea proposed above about having the LSN and time recorded when a
> > wal_level change to minimal happens, presumably in pg_control, seems like it
> > could work as a way to allow external tools to more easily figure out if the
> > wal_level's been changed to minimal.  Perhaps there's a reason to track
> > changes between replica and logical but I can't think of any offhand and backup
> > tools would still need to know if the wal level was set to *minimal*
> > independently of changes between replica and logical.

> Here, probably we use different assumptions.
> What you say makes sense if we commit neither the patch of
> ALTER TABLE SET UNLOGGED/LOGGED without copy (during wal_level=minimal)
> nor the patch of new wal_level 'none' ?

Yes, it does.

> We were talking about a case that fast bulk data loading is needed
> and the user changes wal_level to 'none' or 'minimal' temporarily
> from higher level in order to speed up data loading and make it back to the higher level again.
> After this operation, we need to invalidate the old backups taken before the data loading.
> In this case, we have to track the change to 'minimal'. Didn't it make sense ?

The old backups don't actually need to be invaliated, but we want to
strongly encourage a new backup be taken as soon as possible since WAL
replay from an older backup through this period would end up having the
wrong result.  We really shouldn't need to track anything more than the
"LSN when wal_level was last seen to be less than replica" or something
along those lines.  I was questioning if there was any need to track the
changes to wal_level at a more granular level- none come to mind for me,
but perhaps someone else sees a reason to do so.

Thanks,

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Key management with tests
Next
From: Tomas Vondra
Date:
Subject: Re: list of extended statistics on psql