[COMMITTERS] pgsql: Increase distance between flush requests during bulk filecopies - Mailing list pgsql-committers

From Tom Lane
Subject [COMMITTERS] pgsql: Increase distance between flush requests during bulk filecopies
Date
Msg-id E1e1HCr-0008ED-7l@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Increase distance between flush requests during bulk file copies.

copy_file() reads and writes data 64KB at a time (with default BLCKSZ),
and historically has issued a pg_flush_data request after each write.
This turns out to interact really badly with macOS's new APFS file
system: a large file copy takes over 100X longer than it ought to on
APFS, as reported by Brent Dearth.  While that's arguably a macOS bug,
it's not clear whether Apple will do anything about it in the near
future, and in any case experimentation suggests that issuing flushes
a bit less often can be helpful on other platforms too.

Hence, rearrange the logic in copy_file() so that flush requests are
issued once per N writes rather than every time through the loop.
I set the FLUSH_DISTANCE to 32MB on macOS (any less than that still
results in a noticeable speed degradation on APFS), but 1MB elsewhere.
In limited testing on Linux and FreeBSD, this seems slightly faster
than the previous code, and certainly no worse.  It helps noticeably
on macOS even with the older HFS filesystem.

A simpler change would have been to just increase the size of the
copy buffer without changing the loop logic, but that seems likely
to trash the processor cache without really helping much.

Back-patch to 9.6 where we introduced msync() as an implementation
option for pg_flush_data().  The problem seems specific to APFS's
mmap/msync support, so I don't think we need to go further back.

Discussion: https://postgr.es/m/CADkxhTNv-j2jw2g8H57deMeAbfRgYBoLmVuXkC=YCFBXRuCOww@mail.gmail.com

Branch
------
REL_10_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/c3723317d08ce6df6c81234f7daebe7fd86a6ecb

Modified Files
--------------
src/backend/storage/file/copydir.c | 38 ++++++++++++++++++++++++++++++--------
1 file changed, 30 insertions(+), 8 deletions(-)


--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

pgsql-committers by date:

Previous
From: Tom Lane
Date:
Subject: [COMMITTERS] pgsql: Reduce "X = X" to "X IS NOT NULL", if it's easy to do so.
Next
From: Andres Freund
Date:
Subject: [COMMITTERS] pgsql: Reduce memory usage of targetlist SRFs.