RE: Disable WAL logging to speed up data loading - Mailing list pgsql-hackers

From osumi.takamichi@fujitsu.com
Subject RE: Disable WAL logging to speed up data loading
Date
Msg-id OSBPR01MB48887EF72DE013EDFB987BC5EDEA0@OSBPR01MB4888.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Disable WAL logging to speed up data loading  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Disable WAL logging to speed up data loading
List pgsql-hackers
Hello, Stephen


On Tuesday, Nov 3, 2020 3:02 AM Stephen Frost <sfrost@snowman.net> wrote:
> * Magnus Hagander (magnus@hagander.net) wrote:
> > On Mon, Nov 2, 2020 at 4:28 PM Robert Haas <robertmhaas@gmail.com>
> wrote:
> > > On Thu, Oct 29, 2020 at 4:00 PM Fujii Masao
> <masao.fujii@oss.nttdata.com> wrote:
> > > > Yes. What I meant was such a safe guard needs to be implemented.
> > > >
> > > > This may mean that if we want to recover the database from that
> > > > backup, we need to specify the recovery target so that the archive
> > > > recovery stops just before the WAL record indicating wal_level change.
> > >
> > > Yeah, I think we need these kinds of safeguards, for sure.
> > >
> > > I'm also concerned about the way that this proposed feature
> > > interacts with incremental backup capabilities that already exist in
> > > tools like pgBackRest, EDB's BART, pg_probackup, and future things
> > > we might want to introduce into core, along the lines of what I have
> > > previously proposed. Now, I think pgBackRest uses only timestamps
> > > and checksums, so it probably doesn't care, but some of the other
> > > solutions rely on WAL-scanning to gather a list of changed blocks. I
> > > guess there's no reason that they can't notice the wal_level being
> > > changed and do the right thing; they should probably have that kind
> > > of capability already. Still, it strikes me that it might be useful
> > > if we had a stronger mechanism.
>
> Checking the WAL level certainly seems critical to anything that's reading the
> WAL.  We certainly do this already when running as a
> replica:
>
> ereport(WARNING,
>         (errmsg("WAL was generated with wal_level=minimal, data may be
> missing"),
>         errhint("This happens if you temporarily set wal_level=minimal
> without taking a new base backup.")));
>
> There's definitely a question about if a WARNING there is really sufficient or
> not, considering that you could end up with 'logged'
> tables on the replica that are missing data, but I'm not sure that inventing a
> new, independent, mechanism for checking WAL level changes makes sense.
>
> While pgbackrest backups of a primary wouldn't be impacted, it does support
> backing up from a replica (as do other backup systems, of
> course) and if data that's supposed to be on the replica isn't there because
> someone restarted PG with wal_level=minimal and then flipped it back to
> replica and got the replica to move past that part of the WAL by turning off hot
> standby, replaying, and then turning it back on, then the backup is going to
> also be missing that data.
>
> Perhaps that's where we need to have a stronger mechanism though- that is,
> if we hit the above WARNING, ever, set a flag somewhere that tools can check
> and which we could also check and throw further warnings about.  In other
> words, every time this replica is started, we could check for this flag and
> throw the same warning above about how data may be missing, and we could
> have pg_basebackup flat out refuse to back up from a replica where this flag
> has been set (maybe with some kind of override mechanism...  maybe not).
> After all, that's where the real issue here is, isn't it?
>
> > > I'm not exactly sure what that would look like, but suppose we had a
> > > feature where every time wal_level drops below replica, a counter
> > > gets incremented by 1, and that counter is saved in the control
> > > file. Or maybe when wal_level drops below minimal to none. Or maybe
> > > there are two counters. Anyway, the idea is that if you have a
> > > snapshot of the cluster at one time and a snapshot at another time,
> > > you can see whether anything scary has happened in the middle
> > > without needing all of the WAL in between.
> > >
> > > Maybe this is off-topic for this thread or not really needed, but
> > > I'm not sure. I don't think wal_level=none is a bad idea
> > > intrinsically, but I think it would be easy to implement it poorly
> > > and end up harming a lot of users. I have no problem with giving
> > > people a way to do dangerous things, but we should do our best to
> > > let people know how much danger they've incurred.
>
> I'm not sure that wal_level=none is really the right way to address this
> use-case.  We already have unlogged tables and that's pretty clean and
> meets the "we want to load data fast without having to pay for WAL" use case.
> The argument here seems to be that to take advantage of unlogged tables
> requires the tools using PG to know how to issue a 'CREATE UNLOGGED
> TABLE' command instead of a 'CREATE TABLE' command.  That doesn't
> seem like a huge leap, but we could make it easier by just adding a
> 'table_log_default' or such GUC that could be set on the data loading role to
> have all tables created by it be unlogged.
I'm afraid to say that in the case to setup all tables as unlogged,
the user are forced to be under tension to
back up *all* commands from application, in preparation for unexpected crash.
This is because whenever the server crashes,
the unlogged tables are truncated and the DBA needs to
input the processings after the last backup again without exception.
I didn't think that this was easy and satisfied the user.

In addition, as long as the tables are unlogged, the user cannot be released from
this condition or (requirement ?) to back up all commands or
to guarantee that all commands are repeatable for the DBA.

When I consider the use case is the system of data warehouse
as described upthread, the size of each table can be large.
Thus, changing the status from unlogged to logged (recoverable)
takes much time under the current circumstances, which was discussed before.

By having the limited window of time,
during wal_level=none, I'd like to make wal_level=none work to
localize and minimize the burden to guarantee all commands are
repeatable. To achieve this, after switching wal_level from none to higher ones,
the patch must ensure crash recovery, though.

Sorry that my current patch doesn't complete this aspect fully at present
but, may I have your opinion about this ?

Best,
    Takamichi Osumi



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: Seino Yuki
Date:
Subject: Re: [PATCH] Add features to pg_stat_statements