Re: [PoC] Non-volatile WAL buffer - Mailing list pgsql-hackers

From Takashi Menjo
Subject Re: [PoC] Non-volatile WAL buffer
Date
Msg-id CAOwnP3NHAbVFOfAawZPs5ezn57_7fcX=KaaQ5YMgirc9rNrijQ@mail.gmail.com
In response to RE: [PoC] Non-volatile WAL buffer  (Takashi Menjo <takashi.menjou.vg@hco.ntt.co.jp>)
List pgsql-hackers
Hi Gang,

I appreciate your patience. I reproduced the results you reported to me in my environment.

First of all, the conditions you gave me were a little unstable in my environment, so I increased the values of {max_,min_,nv}wal_size and lengthened the pre-warm duration to get stable performance. I did not modify your table, your query, or the benchmark duration.

Under the stable conditions, Original (PMEM) still performed better than Non-volatile WAL Buffer. To sum up, the reason was that Non-volatile WAL Buffer on Optane PMem spent much more time in XLogInsert than Original (PMEM) did when using your table and query. That offset the improvement in XLogFlush and degraded overall performance. VTune told me that Non-volatile WAL Buffer took more CPU time than Original (PMEM) in memcpy (called via XLogInsert => XLogInsertRecord => CopyXLogRecordToWAL), while it took less time in XLogFlush. This profile was very similar to the one you reported.

In general, when WAL buffers are on Optane PMem rather than DRAM, memcpy of WAL records into the buffers naturally takes more time, because Optane PMem is somewhat slower than DRAM. In return, Non-volatile WAL Buffer reduces the time needed to make the records durable, because it does not need to write them out of the buffers to somewhere else; it only needs to flush them out of the CPU caches to the underlying memory-mapped file.
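To make the trade-off concrete, here is a minimal sketch of the two paths in plain C. It is not code from the patch: the helper names wal_insert_conventional and wal_insert_mapped are hypothetical, and msync stands in as a portable substitute for the CPU cache flush that would be used on real PMem (for example pmem_persist from libpmem).

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: conventional path.
 * The record is copied into a DRAM buffer (cheap memcpy), then written
 * out to the WAL file and fsync'ed (expensive flush).
 */
static int
wal_insert_conventional(int fd, char *dram_buf, const void *rec, size_t len)
{
    assert(rec != NULL);
    memcpy(dram_buf, rec, len);     /* insert: DRAM-to-DRAM copy */
    if (write(fd, dram_buf, len) != (ssize_t) len)
        return -1;                  /* flush step 1: copy out of the buffer */
    return fsync(fd);               /* flush step 2: force to the device */
}

/*
 * Hypothetical helper: memory-mapped path.
 * The record is copied straight into the mapped file (a slower memcpy
 * when the mapping is on PMem), and the flush only has to push the data
 * out to the mapping; nothing is written a second time.
 * Assumption: msync is used here only so that the sketch runs on any
 * POSIX system; on real PMem a cache-line flush would replace it.
 */
static int
wal_insert_mapped(char *mapped, size_t map_len, size_t off,
                  const void *rec, size_t len)
{
    assert(rec != NULL);
    if (off > map_len || len > map_len - off)
        return -1;                  /* record does not fit in the mapping */
    memcpy(mapped + off, rec, len); /* insert: copy into the mapping */
    return msync(mapped, map_len, MS_SYNC);     /* flush: no second copy */
}
```

In this sketch the mapped path saves the write and fsync but pays for it in memcpy, which is the same shift of cost from XLogFlush to XLogInsert that the profile shows.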

Your report shows that Non-volatile WAL Buffer on Optane PMem is not good for certain kinds of transactions but is good for others. I have tried changing how WAL records are inserted and flushed, as well as configurations and constants that could affect performance, such as NUM_XLOGINSERT_LOCKS, but Non-volatile WAL Buffer has not yet achieved better performance than Original (PMEM) when using your table and query. I will continue to work on this issue and will report any updates.

By the way, did the progress reported by pgbench with the -P option drop to zero when you ran Non-volatile WAL Buffer? If so, your {max_,min_,nv}wal_size might be too small, or your checkpoint configuration might not be appropriate. Could you check your results again?

Best regards,
Takashi

--
Takashi Menjo <takashi.menjo@gmail.com>
