Hi,
On 2020-02-17 13:12:37 +0900, Takashi Menjo wrote:
> I applied my patchset that mmap()-s WAL segments as WAL buffers to
> refs/tags/REL_12_0, and measured and analyzed its performance with
> pgbench. Roughly speaking, when I used *SSD and ext4* to store WAL,
> it was "obviously worse" than the original REL_12_0. VTune told me
> that the CPU time of memcpy() called by CopyXLogRecordToWAL() got
> larger than before.
FWIW, this might largely be because of page faults. In contrast to
before, we wouldn't be reusing the same pages (because they've been
munmap()/mmap()ed), so the first time they're touched, we'll incur page
faults. Did you try mmap()ing with MAP_POPULATE? It's probably also
worthwhile to try MAP_HUGETLB.
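To be concrete about the MAP_POPULATE part, here is a minimal sketch of
what I mean. It is not taken from the patchset: the helper name
map_segment_prefaulted() and its arguments are made up for illustration,
and it assumes Linux and an already-created file of at least len bytes.

```c
#define _GNU_SOURCE             /* for MAP_POPULATE on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patchset): map the first len bytes of
 * an existing file read/write and ask the kernel to prefault the mapping.
 * Returns MAP_FAILED if the file cannot be opened or mapped.
 */
static void *
map_segment_prefaulted(const char *path, size_t len)
{
    int         fd;
    void       *p;

    assert(len > 0);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;

    /*
     * MAP_POPULATE (Linux-specific) asks the kernel to populate the page
     * tables at mmap() time, rather than on first touch by a later
     * memcpy() into the mapping.
     */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_POPULATE, fd, 0);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);

    return p;
}
```

The caller would munmap() the returned pointer when done with the
segment. Whether prefaulting actually helps here is exactly what the
benchmark would need to show.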
Still doubtful it's the right direction, but I'd rather have good
numbers to back me up :)
Greetings,
Andres Freund