Re: Disable WAL logging to speed up data loading - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Disable WAL logging to speed up data loading
Date
Msg-id 20201110.095941.1584254872856462674.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Disable WAL logging to speed up data loading  (Stephen Frost <sfrost@snowman.net>)
Responses RE: Disable WAL logging to speed up data loading  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
At Mon, 9 Nov 2020 10:18:08 -0500, Stephen Frost <sfrost@snowman.net> wrote in 
> Greetings,
> 
> * osumi.takamichi@fujitsu.com (osumi.takamichi@fujitsu.com) wrote:
> > When I consider the use case is the system of data warehouse
> > as described upthread, the size of each table can be large.
> > Thus, changing the status from unlogged to logged (recoverable)
> > takes much time under the current circumstances, which was discussed before.
> 
> Ok- so the issue is that, today, we dump all of the table into the WAL
> when we go from unlogged to logged, but as I outlined previously,
> perhaps that's because we're missing a trick there when
> wal_level=minimal.  If wal_level=minimal, then it would seem like we
> could lock the table, then sync it and then mark is as logged, which is

FWIW, the following is that, I think it works not only when
wal_level=minimal for SET UNLOGGED, and only works when minimal for
SET LOGGED.

https://www.postgresql.org/message-id/20201002.100621.1668918756520136893.horikyota.ntt@gmail.com

| For fuel(?) of the discussion, I tried a very-quick PoC for in-place
| ALTER TABLE SET LOGGED/UNLOGGED and resulted as attached. After some
| trials of several ways, I drifted to the following way after poking
| several ways.
| 
| 1. Flip BM_PERMANENT of active buffers
| 2. adding/removing init fork
| 3. sync files,
| 4. Flip pg_class.relpersistence.
| 
| It always skips table copy in the SET UNLOGGED case, and only when
| wal_level=minimal in the SET LOGGED case.  Crash recovery seems
| working by some brief testing by hand.

> more-or-less what you're asking to have be effectively done with the
> proposed wal_level=none, but this would be an optimization for all
> existing users of wal_level=minimal who have unlogged tables that they
> want to change to logged, and this works on a per-table basis instead,
> which seems like a better approach than a cluster-wide setting.
> 
> > By having the limited window of time,
> > during wal_level=none, I'd like to make wal_level=none work to
> > localize and minimize the burden to guarantee all commands are
> > repeatable. To achieve this, after switching wal_level from none to higher ones,
> > the patch must ensure crash recovery, though.
> 
> Perhaps a helper command could be added to ALTER TABLE ALL IN TABLESPACE
> to marked a bunch of unlogged tables over to being logged would be good
> to add too.

I agree to this aspect of the in-place flipping of UNLOGGED.

> > Sorry that my current patch doesn't complete this aspect fully at present
> > but, may I have your opinion about this ?
> 
> Presently, my feeling is that we could address this use-case without
> having to introduce a new cluster-wide WAL level, and that's the
> direction I'd want to see this going.  Perhaps I'm missing something
> about why the approach I've set forth above wouldn't work, and
> wal_level=none would, but I've not seen it yet.

Couldn't we have something like the following?

 ALTER TABLE table1, table2, table3 SET UNLOGGED;

That is, multiple target object specification in ALTER TABLE sttatement.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: "tsunakawa.takay@fujitsu.com"
Date:
Subject: RE: POC: postgres_fdw insert batching
Next
From: Peter Geoghegan
Date:
Subject: Adding an aminsert() hint that triggers bottom-up index deletion for UPDATEs that can't use HOT