Re: use CREATE DATABASE STRATEGY = FILE_COPY in pg_upgrade - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: use CREATE DATABASE STRATEGY = FILE_COPY in pg_upgrade
Date
Msg-id CAEze2WgjAoU30Kr5y8U4nTNi1F_RMW4HGo1L9ALaVTJYJbDjww@mail.gmail.com
Whole thread Raw
In response to Re: use CREATE DATABASE STRATEGY = FILE_COPY in pg_upgrade  (Ranier Vilela <ranier.vf@gmail.com>)
Responses Re: use CREATE DATABASE STRATEGY = FILE_COPY in pg_upgrade
Re: use CREATE DATABASE STRATEGY = FILE_COPY in pg_upgrade
List pgsql-hackers
On Wed, 5 Jun 2024 at 18:47, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Em ter., 4 de jun. de 2024 às 16:39, Nathan Bossart <nathandbossart@gmail.com> escreveu:
>>
>> I noticed that the "Restoring database schemas in the new cluster" part of
>> pg_upgrade can take a while if you have many databases, so I experimented
>> with a couple different settings to see if there are any easy ways to speed
>> it up.  The FILE_COPY strategy for CREATE DATABASE helped quite
>> significantly on my laptop.  For ~3k empty databases, this step went from
>> ~100 seconds to ~30 seconds with the attached patch.  I see commit ad43a41
>> made a similar change for initdb, so there might even be an argument for
>> back-patching this to v15 (where STRATEGY was introduced).  One thing I
>> still need to verify is that this doesn't harm anything when there are lots
>> of objects in the databases, i.e., more WAL generated during many
>> concurrent CREATE-DATABASE-induced checkpoints.
>>
>> Thoughts?
>
> Why not use it too, if not binary_upgrade?

Because in the normal case (not during binary_upgrade) you don't want
to have to generate 2 checkpoints for every created database,
especially not when your shared buffers are large. Checkpoints' costs
scale approximately linearly with the size of shared buffers, so being
able to skip those checkpoints (with strategy=WAL_LOG) will save a lot
of performance in the systems where this performance impact matters
most.

>> I noticed that the "Restoring database schemas in the new cluster" part of
>> pg_upgrade can take a while if you have many databases, so I experimented
>> with a couple different settings to see if there are any easy ways to speed
>> it up.  The FILE_COPY strategy for CREATE DATABASE helped quite
>> significantly on my laptop.  For ~3k empty databases, this step went from
>> ~100 seconds to ~30 seconds with the attached patch.

As for "on my laptop", that sounds very reasonable, but could you
check the performance on systems with larger shared buffer
configurations? I'd imagine (but haven't checked) that binary upgrade
target systems may already be using the shared_buffers from their
source system, which would cause a severe regression when the
to-be-upgraded system has large shared buffers. For initdb the
database size is known in advance and shared_buffers is approximately
empty, but the same is not (always) true when we're doing binary
upgrades.

Kind regards,

Matthias van de Meent



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: use CREATE DATABASE STRATEGY = FILE_COPY in pg_upgrade
Next
From: Tom Lane
Date:
Subject: Improving PL/Tcl's error context reports