Re: Performance degradation on concurrent COPY into a single relation in PG16. - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Performance degradation on concurrent COPY into a single relation in PG16.
Date
Msg-id CAD21AoBb9Sbddh+nQk1BxUFaRHaa+fL8fCzO-Lvxqb6ZcpAHqw@mail.gmail.com
Whole thread Raw
In response to Re: Performance degradation on concurrent COPY into a single relation in PG16.  (Andres Freund <andres@anarazel.de>)
Responses Re: Performance degradation on concurrent COPY into a single relation in PG16.
List pgsql-hackers
On Wed, Jul 12, 2023 at 3:52 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-07-03 11:55:13 +0900, Masahiko Sawada wrote:
> > While testing PG16, I observed that in PG16 there is a big performance
> > degradation in concurrent COPY into a single relation with 2 - 16
> > clients in my environment. I've attached a test script that measures
> > the execution time of COPYing 5GB data in total to the single relation
> > while changing the number of concurrent insertions, in PG16 and PG15.
> > Here are the results on my environment (EC2 instance, RHEL 8.6, 128
> > vCPUs, 512GB RAM):
> >
> > * PG15 (4b15868b69)
> > PG15: nclients = 1, execution time = 14.181
> >
> > * PG16 (c24e9ef330)
> > PG16: nclients = 1, execution time = 17.112
>
> > The relevant commit is 00d1e02be2 "hio: Use ExtendBufferedRelBy() to
> > extend tables more efficiently". With commit 1cbbee0338 (the previous
> > commit of 00d1e02be2), I got a better numbers, it didn't have a better
> > scalability, though:
> >
> > PG16: nclients = 1, execution time = 17.444
>
> I think the single client case is indicative of an independent regression, or
> rather regressions - it can't have anything to do with the fallocate() issue
> and reproduces before that too in your numbers.

Right.

>
> 1) COPY got slower, due to:
> 9f8377f7a27 Add a DEFAULT option to COPY  FROM
>
> This added a new palloc()/free() to every call to NextCopyFrom(). It's not at
> all clear to me why this needs to happen in NextCopyFrom(), particularly
> because it's already stored in CopyFromState?

Yeah, it seems to me that we can palloc the bool array once and use it
for the entire COPY FROM. With the attached small patch, the
performance becomes much better:

15:
14.70500 sec

16:
17.42900 sec

16 w/ patch:
14.85600 sec

>
>
> 2) pg_strtoint32_safe() got substantially slower, mainly due
>    to
> faff8f8e47f Allow underscores in integer and numeric constants.
> 6fcda9aba83 Non-decimal integer literals

Agreed.

>
> pinned to one cpu, turbo mode disabled, I get the following best-of-three times for
>   copy test from '/tmp/tmp_4.data'
> (too impatient to use the larger file every time)
>
> 15:
> 6281.107 ms
>
> HEAD:
> 7000.469 ms
>
> backing out 9f8377f7a27:
> 6433.516 ms
>
> also backing out faff8f8e47f, 6fcda9aba83:
> 6235.453 ms
>
>
> I suspect 1) can relatively easily be fixed properly. But 2) seems much
> harder. The changes increased the number of branches substantially, that's
> gonna cost in something as (previously) tight as pg_strtoint32().

I'll look at how to fix 2).

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Masahiro Ikeda
Date:
Subject: Re: Support to define custom wait events for extensions
Next
From: Michael Paquier
Date:
Subject: Re: Support to define custom wait events for extensions