On Tue, Jun 11, 2024 at 10:39:51AM +0200, Matthias van de Meent wrote:
> On Tue, 11 Jun 2024 at 04:01, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> I did a handful of benchmarks on an r5.24xlarge that seem to prove your
>> point. The following are the durations of the pg_restore step of
>> pg_upgrade:
>>
>> * 10k empty databases, 128MB shared_buffers
>> WAL_LOG: 1m 01s
>> FILE_COPY: 0m 22s
>>
>> * 10k empty databases, 100GB shared_buffers
>> WAL_LOG: 2m 03s
>> FILE_COPY: 5m 08s
>>
>> * 2.5k databases with 10k tables each, 128MB shared_buffers
>> WAL_LOG: 17m 20s
>> FILE_COPY: 16m 44s
>>
>> * 2.5k databases with 10k tables each, 100GB shared_buffers
>> WAL_LOG: 16m 39s
>> FILE_COPY: 15m 21s
>>
>> I was surprised with the last result, but there's enough other stuff
>> happening during such a test that I hesitate to conclude much.
>
> If you still have the test data set up, could you test the attached
> patch (which does skip the checkpoints in FILE_COPY mode during binary
> upgrades)?

With your patch, I see the following:

* 10k empty databases, 128MB shared_buffers: 0m 27s
* 10k empty databases, 100GB shared_buffers: 1m 44s

I believe the large buffer cache test is still quite a bit slower because
of the truncation of pg_largeobject (specifically its call to
DropRelationsAllBuffers()).  This TRUNCATE command was added in commit
bbe08b8.
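
For anyone who wants to compare these numbers with the earlier runs, here
is a rough sketch that converts the durations to seconds.  The timings are
copied from this thread; the helper names (to_seconds, seconds_table) are
made up for illustration and are not part of the patch or of pg_upgrade.

```python
import re


def to_seconds(duration: str) -> int:
    """Convert a duration written like '5m 08s' into whole seconds."""
    match = re.fullmatch(r"(\d+)m\s+(\d+)s", duration.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


# Timings from this thread, 10k empty databases only.
# Keys are the shared_buffers setting; "FILE_COPY patched" is the run
# with the checkpoint-skipping patch applied.
RESULTS = {
    "128MB": {
        "WAL_LOG": "1m 01s",
        "FILE_COPY": "0m 22s",
        "FILE_COPY patched": "0m 27s",
    },
    "100GB": {
        "WAL_LOG": "2m 03s",
        "FILE_COPY": "5m 08s",
        "FILE_COPY patched": "1m 44s",
    },
}


def seconds_table(results):
    """Return the same nested mapping with every duration in seconds."""
    return {
        shared_buffers: {name: to_seconds(d) for name, d in runs.items()}
        for shared_buffers, runs in results.items()
    }


if __name__ == "__main__":
    for shared_buffers, runs in seconds_table(RESULTS).items():
        for name, secs in runs.items():
            print(f"{shared_buffers:>6}  {name:<18} {secs:>4}s")
```

The larger test cases are left out because I only reran the empty-database
ones with the patch.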
--
nathan