Home > mailing lists

RE: Disable WAL logging to speed up data loading - Mailing list pgsql-hackers

From	osumi.takamichi@fujitsu.com
Subject	RE: Disable WAL logging to speed up data loading
Date	October 28, 2020 07:11:28
Msg-id	OSBPR01MB48882B8D091CE688C4DE2731ED170@OSBPR01MB4888.jpnprd01.prod.outlook.com Whole thread Raw
In response to	RE: Disable WAL logging to speed up data loading ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses	Re: Disable WAL logging to speed up data loading RE: Disable WAL logging to speed up data loading
List	pgsql-hackers

Tree view

Hi,


I wrote and attached the first patch to disable WAL logging.
This patch passes the regression test of check-world already
and is formatted by pgindent.

Also, I conducted 3 types of performance tests to
clarify the rough benefits of the patch.

I compared two wal_levels both 'minimal' and 'none'.
For both levels, I measured
(1) cluster's restore from pg_dumpall output,
(2) COPY initial data to a defined table as initial data loading, and
(3) COPY data to a defined table with tuples, 3 times each for all cases.
After that, calculated the average and the ratio of the loading speed.
The conclusion is that wal_level=none cuts about 20% of the loading speed
compared to 'minimal' in the three cases above. For sure, we could tune the configuration of
postgresql.conf further but roughly it's meaningful to share the result, I think.
'shared_buffers' was set to 40% of RAM while 'maintenance_work_mem'
was set to 20%. I set max_wal_senders = 0 this time.

The input data was generated from pgbench with 1000 scale factor.
It's about 9.3GB. For the table definition or
the initial data for appended data loading test case,
I used pgbench to set up the schema as well.
Sharing other scenario to measure is welcome.

I need to say that the current patch doesn't take the change of wal_level='none'
from/to other ones into account fully or other commands like ones for two phase commit yet.
Sorry for that.

Also, to return the value of last shut down LSN in XLogInsert(),
it acquires lock of control file every time, which was not good clearly.
I tried to replace that code like having a LSN cache as one variable
but I was not sure where I should put the function to set the initial value of the LSN.
After LocalProcessControlFile() in PostmasterMain() was not the right place.
I'd be happy if someone gives me an advice about it.


Best,
    Takamichi Osumi

Attachment

disable_WAL_logging_v01.patch

pgsql-hackers by date:

From: Craig Ringer
Date: 28 October 2020, 07:02:46
Subject: Re: Internal key management system

From: Bharath Rupireddy
Date: 28 October 2020, 07:18:06
Subject: Re: Log message for GSS connection is missing once connection authorization is successful.

RE: Disable WAL logging to speed up data loading - Mailing list pgsql-hackers

Attachment

Previous

Next