Re: Disable WAL logging to speed up data loading - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Disable WAL logging to speed up data loading
Date
Msg-id CAExHW5vz2pTHyoE4uetasuA6G5f03B4twmxgpK2mcPGu-L5VSw@mail.gmail.com
Whole thread Raw
In response to Disable WAL logging to speed up data loading  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses RE: Disable WAL logging to speed up data loading
List pgsql-hackers
Can they use a database with all unlogged tables?

On Tue, Sep 29, 2020 at 1:58 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
> Hello,
>
>
> We'd like to propose a feature to disable WAL to speed up data loading.  This was inspired by a feature added in the
latestMySQL.  I wish you won't fear this feature... 
>
>
> BACKGROUND
> ========================================
>
> This branches off from [1] as mentioned therein.  Briefly speaking, a customer wants to shorten the time for nightly
loadingof data into their data warehouse as much as possible to be prepared for using the data warehouse for new
things.
>
> Currently, they are using Oracle's SQL*Loader with its no-logging feature.  They want a similar feature to migrate to
Postgres. Other than the data loading performance, they don't want to be concerned about the storage for large volumes
ofWAL. 
>
> In [1], we thought about something like Oracle's per-table no-logging feature, but it seems difficult (or at least
noteasy.)  Meanwhile, I found another feature added in the latest MySQL 8.0.21 [2].  This proposal follows it almost
directly. That satisfies the customer request. 
>
> As an aside, it's also conceivable that in the near future, users could see the WAL bottleneck (WAL buffer or disk)
whenthey utilize the parallel COPY that is being developed in the community. 
>
>
> FUNCTIONAL SPECIFICATION
> ========================================
>
> Add a new value 'none' to the server configuration parameter wal_level.  With this setting:
>
> * No WAL is emitted.
>
> * The server refuses to start (pg_ctl start fails) after an abnormal shutdown due to power outage, pg_ctl's immediate
shutdown,etc, showing a straightforward message like MySQL. 
>
> * Features like continuous archiving, pg_basebackup, and streaming/logical replication that requires wal_level >=
replicaare not available. 
>
> * The user can use all features again if you shut down the server successfully after data loading and reset wal_level
toa value other than none.  He needs to take a base backup or rebuild the replication standby after restarting the
server.
>
>
> In addition to the cosmetic modifications to the manual articles that refer to wal_level, add a clause or paragraphs
tothe following sections to let users know the availability of this feature. 
>
> 14.4. Populating a Database
> 18.6.1. Upgrading Data via pg_dumpall
>
>
> PROGRAM DESIGN (main point only)
> ========================================
>
> As in the bootstrap mode (during initdb), when wal_level = none, XLogInsert() does nothing and just returns a fixed
value,which is the tail of the last shutdown checkpoint WAL record.  As a result, the value is set to the relation page
header'sLSN field. 
>
> In addition, it might be worth having XLogBeginInsert() and XLogRec...() to check wal_level and just return.  I don't
expectmuch from this, but it may be interesting to give it a try to see the squeezed performance. 
>
> StartupXLOG() checks the wal_level setting in pg_control and quits the startup with ereport(FATAL) accordingly.
>
>
> [1]
> Implement UNLOGGED clause for COPY FROM
> https://www.postgresql.org/message-id/OSBPR01MB488887C0BDC5129C65DFC5E5ED640@OSBPR01MB4888.jpnprd01.prod.outlook.com
>
> [2]
> Disabling Redo Logging
> https://dev.mysql.com/doc/refman/8.0/en/innodb-redo-log.html#innodb-disable-redo-logging
>
>
> Regards
> Takayuki Tsunakawa
>
>
>


--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Corner-case bug in pg_rewind
Next
From: Amit Kapila
Date:
Subject: Re: Parallel copy