Disable WAL logging to speed up data loading - Mailing list pgsql-hackers
From | tsunakawa.takay@fujitsu.com |
---|---|
Subject | Disable WAL logging to speed up data loading |
Date | |
Msg-id | TYAPR01MB29901EBE5A3ACCE55BA99186FE320@TYAPR01MB2990.jpnprd01.prod.outlook.com Whole thread Raw |
Responses |
Re: Disable WAL logging to speed up data loading
Re: Disable WAL logging to speed up data loading |
List | pgsql-hackers |
Hello, We'd like to propose a feature to disable WAL to speed up data loading. This was inspired by a feature added in the latestMySQL. I wish you won't fear this feature... BACKGROUND ======================================== This branches off from [1] as mentioned therein. Briefly speaking, a customer wants to shorten the time for nightly loadingof data into their data warehouse as much as possible to be prepared for using the data warehouse for new things. Currently, they are using Oracle's SQL*Loader with its no-logging feature. They want a similar feature to migrate to Postgres. Other than the data loading performance, they don't want to be concerned about the storage for large volumes ofWAL. In [1], we thought about something like Oracle's per-table no-logging feature, but it seems difficult (or at least not easy.) Meanwhile, I found another feature added in the latest MySQL 8.0.21 [2]. This proposal follows it almost directly. That satisfies the customer request. As an aside, it's also conceivable that in the near future, users could see the WAL bottleneck (WAL buffer or disk) whenthey utilize the parallel COPY that is being developed in the community. FUNCTIONAL SPECIFICATION ======================================== Add a new value 'none' to the server configuration parameter wal_level. With this setting: * No WAL is emitted. * The server refuses to start (pg_ctl start fails) after an abnormal shutdown due to power outage, pg_ctl's immediate shutdown,etc, showing a straightforward message like MySQL. * Features like continuous archiving, pg_basebackup, and streaming/logical replication that requires wal_level >= replicaare not available. * The user can use all features again if you shut down the server successfully after data loading and reset wal_level toa value other than none. He needs to take a base backup or rebuild the replication standby after restarting the server. In addition to the cosmetic modifications to the manual articles that refer to wal_level, add a clause or paragraphs to thefollowing sections to let users know the availability of this feature. 14.4. Populating a Database 18.6.1. Upgrading Data via pg_dumpall PROGRAM DESIGN (main point only) ======================================== As in the bootstrap mode (during initdb), when wal_level = none, XLogInsert() does nothing and just returns a fixed value,which is the tail of the last shutdown checkpoint WAL record. As a result, the value is set to the relation page header'sLSN field. In addition, it might be worth having XLogBeginInsert() and XLogRec...() to check wal_level and just return. I don't expectmuch from this, but it may be interesting to give it a try to see the squeezed performance. StartupXLOG() checks the wal_level setting in pg_control and quits the startup with ereport(FATAL) accordingly. [1] Implement UNLOGGED clause for COPY FROM https://www.postgresql.org/message-id/OSBPR01MB488887C0BDC5129C65DFC5E5ED640@OSBPR01MB4888.jpnprd01.prod.outlook.com [2] Disabling Redo Logging https://dev.mysql.com/doc/refman/8.0/en/innodb-redo-log.html#innodb-disable-redo-logging Regards Takayuki Tsunakawa
pgsql-hackers by date: