Re: [HACKERS] Horrible CREATE DATABASE Performance in High Sierra - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Horrible CREATE DATABASE Performance in High Sierra
Date
Msg-id 10269.1506968615@sss.pgh.pa.us
In response to [HACKERS] Horrible CREATE DATABASE Performance in High Sierra  (Brent Dearth <brent.dearth@gmail.com>)
List pgsql-hackers

Brent Dearth <brent.dearth@gmail.com> writes:
> I just recently "upgraded" to High Sierra and experiencing horrendous CREATE
> DATABASE performance. Creating a database from a 3G template DB used to
> take ~1m but post-upgrade is taking ~22m at a sustained write of around
> 4MB/s. Occasionally, attempting to create an empty database hangs
> indefinitely as well. When this happens, restarting the Postgres server
> allows empty database initialization in ~1s.

What PG version are you running?

I tried to reproduce this, using HEAD and what I had handy:
(a) Late 2016 MacBook Pro, 2.7GHz i7, still on Sierra
(b) Late 2013 MacBook Pro, 2.3GHz i7, High Sierra, drive is converted to APFS

I made a ~7.5GB test database using "pgbench -i -s 500 bench" and
then cloned it with "create database b2 with template bench".

Case 1: fsync off.
Machine A did the clone in 5.6 seconds, machine B in 12.9 seconds.

Considering the CPU speed difference and the fact that Apple put
significantly faster SSDs into the 2016 models, I'm not sure this
difference is due to anything but better hardware.

Case 2: fsync on.
Machine A did the clone in 7.5 seconds, machine B in 2523.5 sec (42 min!).

So something is badly busted in APFS' handling of fsync, and/or
we're doing it in a bad way.

Interestingly, pg_test_fsync shows only about a factor-of-2 difference
in the timings for regular file fsyncs.  So I poked into non-fsync
logic that we'd added recently, and after a while found that diking out
the msync code path in pg_flush_data reduces machine B's time to an
entirely reasonable 11.5 seconds.
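
For readers who haven't looked at that code path, here is a minimal sketch
of the general mmap-plus-msync(MS_ASYNC) pattern for asking the kernel to
start writeback on a file range.  It is not the actual pg_flush_data code:
the function name flush_range_msync, the rounding, and the error handling
are assumptions made for illustration.

```c
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL's pg_flush_data): hint the kernel to
 * begin writing back [offset, offset + nbytes) of fd without waiting for it.
 * Assumes offset is a multiple of the page size, as mmap() requires.
 * Returns 0 on success (or when there is nothing to do), -1 on failure.
 */
int
flush_range_msync(int fd, off_t offset, off_t nbytes)
{
    long        pagesize = sysconf(_SC_PAGESIZE);
    void       *p;
    int         rc;

    if (pagesize <= 0)
        return -1;

    /* Round the length down to whole pages; a partial tail page is skipped. */
    nbytes = (nbytes / pagesize) * pagesize;
    if (nbytes <= 0)
        return 0;

    /* Map the range read-only and shared so it refers to the file's pages. */
    p = mmap(NULL, (size_t) nbytes, PROT_READ, MAP_SHARED, fd, offset);
    if (p == MAP_FAILED)
        return -1;

    /* MS_ASYNC schedules writeback and returns without waiting for it. */
    rc = msync(p, (size_t) nbytes, MS_ASYNC);

    /* Always unmap; report failure if either call failed. */
    if (munmap(p, (size_t) nbytes) != 0)
        rc = -1;
    return rc;
}
```

A file copy that calls something like this after every chunk it writes
issues one mmap/msync/munmap cycle per chunk, which is the kind of usage
that appears to be slow here.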

In short, therefore, APFS cannot cope with the way we're using msync().
I observe that the copy gets slower and slower as it runs (watching the
transfer rate with "iostat 1"), which makes me wonder if there's some
sort of O(N^2) issue in the kernel logic for this.  But anyway, as
a short-term workaround you might try

diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index b0c174284b..af35de5f7d 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -451,7 +451,7 @@ pg_flush_data(int fd, off_t offset, off_t nbytes)
         return;
     }
 #endif
-#if !defined(WIN32) && defined(MS_ASYNC)
+#if 0
     {
         void       *p;
         static int  pagesize = 0;

        regards, tom lane

