RE: [PoC] Non-volatile WAL buffer - Mailing list pgsql-hackers

From	tsunakawa.takay@fujitsu.com
Subject	RE: [PoC] Non-volatile WAL buffer
Date	February 15, 2021 01:19:34
Msg-id	TYAPR01MB29905D0578CEC4CA0CCC1EC8FE889@TYAPR01MB2990.jpnprd01.prod.outlook.com
In response to	Re: [PoC] Non-volatile WAL buffer (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses	Re: [PoC] Non-volatile WAL buffer
List	pgsql-hackers

From: Masahiko Sawada <sawada.mshk@gmail.com>
> I've done some performance benchmarks with the master and NTT v4
> patch. Let me share the results.
> 
...
>         master  NTT     master-unlogged
> 32      113209  67107   154298
> 64      144880  54289   178883
> 96      151405  50562   180018
> 
> "master-unlogged" is the same setup as "master" except for using
> unlogged tables (using --unlogged-tables pgbench option). The TPS
> increased by about 20% compared to "master" case (i.e., logged table
> case). The reason why I experimented unlogged table case as well is
> that we can think these results as an ideal performance if we were
> able to write WAL records in 0 sec. IOW, even if the PMEM patch would
> significantly improve WAL logging performance, I think it could not
> exceed this performance. But hope is that if we currently have a
> performance bottle-neck in WAL logging (e.g., locking and writing
> WAL), removing or minimizing WAL logging would bring a chance to
> further improve performance by eliminating the new-coming bottle-neck.
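
For reference, the relative differences in the quoted table can be worked out directly. This is only a rough sketch using the TPS figures quoted above; the helper name relative_change is made up for illustration, and the first column is assumed to be the number of pgbench clients.

```python
# TPS figures copied from the quoted table.
# Assumption: the first column is the number of pgbench clients.
tps = {
    32: {"master": 113209, "NTT": 67107, "master-unlogged": 154298},
    64: {"master": 144880, "NTT": 54289, "master-unlogged": 178883},
    96: {"master": 151405, "NTT": 50562, "master-unlogged": 180018},
}


def relative_change(value, baseline):
    """Return the change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


for clients, row in tps.items():
    unlogged = relative_change(row["master-unlogged"], row["master"])
    ntt = relative_change(row["NTT"], row["master"])
    print(f"{clients:>3} clients: unlogged {unlogged:+.1f}%, NTT {ntt:+.1f}%")
```

The unlogged gain shrinks as the client count grows, while the NTT patch falls further behind master, which is why the storage configuration behind these numbers matters for the questions below.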

Could you tell us the specifics of the storage used for WAL, e.g., SSD or HDD, the interface (NVMe/SAS/SATA), the
read/write throughput and latency (as listed in the product catalog), and the product model?

Was the WAL stored on a storage device separate from the other files?  I want to know if the comparison is as fair as
possible. I guess that in the NTT (PMEM) case, the WAL traffic is not affected by the I/O on the other files.

What would the comparison look like between master and master-unlogged if you place WAL on a DAX-aware filesystem like
xfs or ext4 on PMEM, which Oracle recommends as REDO log storage?  That is, if we place the WAL on the fastest storage
configuration possible, what would be the difference between the logged and unlogged cases?

I'm asking these questions to judge whether it is worthwhile to make further efforts on special code for WAL on PMEM.


> Besides, I've checked the main wait events on each experiment using
> pg_wait_sampling. Here are the top 5 wait events on "master" case
> excluding wait events on the main function of auxiliary processes:
> 
>  event_type |        event         |  sum
> ------------+----------------------+-------
>  Client     | ClientRead           | 46902
>  LWLock     | WALWrite             | 33405
>  IPC        | ProcArrayGroupUpdate |  8855
>  LWLock     | WALInsert            |  3215
>  LWLock     | ProcArray            |  3022
> 
> We can see the wait event on WALWrite lwlock acquisition happened many
> times and it was the primary wait event.
> 
> The result of "ntt" case is:
> 
>  event_type |        event         |  sum
> ------------+----------------------+--------
>  LWLock     | WALInsert            | 126487
>  Client     | ClientRead           |  12173
>  LWLock     | BufferContent        |   4480
>  Lock       | transactionid        |   2017
>  IPC        | ProcArrayGroupUpdate |    924
> 
> The wait event on WALWrite lwlock disappeared. Instead, there were
> many wait events on WALInsert lwlock. I've not investigated this
> result yet. This could be because the v4 patch acquires WALInsert lock
> more than necessary or writing WAL records to PMEM took more time than
> writing to DRAM as Tomas mentioned before.

Increasing NUM_XLOGINSERT_LOCKS might improve the result, but I don't have much hope because PMEM appears to have
limited concurrency...


Regards
Takayuki Tsunakawa
