Re: Performance degradation on concurrent COPY into a single relation in PG16. - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Performance degradation on concurrent COPY into a single relation in PG16. |
Date | |
Msg-id | 20230711185159.v2j5vnyrtodnwhgz@awork3.anarazel.de |
In response to | Performance degradation on concurrent COPY into a single relation in PG16. (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: Performance degradation on concurrent COPY into a single relation in PG16. |
List | pgsql-hackers |
Hi,

On 2023-07-03 11:55:13 +0900, Masahiko Sawada wrote:
> While testing PG16, I observed that in PG16 there is a big performance
> degradation in concurrent COPY into a single relation with 2 - 16
> clients in my environment. I've attached a test script that measures
> the execution time of COPYing 5GB data in total to the single relation
> while changing the number of concurrent insertions, in PG16 and PG15.
> Here are the results on my environment (EC2 instance, RHEL 8.6, 128
> vCPUs, 512GB RAM):
>
> * PG15 (4b15868b69)
> PG15: nclients = 1, execution time = 14.181
>
> * PG16 (c24e9ef330)
> PG16: nclients = 1, execution time = 17.112

> The relevant commit is 00d1e02be2 "hio: Use ExtendBufferedRelBy() to
> extend tables more efficiently". With commit 1cbbee0338 (the previous
> commit of 00d1e02be2), I got a better numbers, it didn't have a better
> scalability, though:
>
> PG16: nclients = 1, execution time = 17.444

I think the single client case is indicative of an independent regression, or
rather regressions - it can't have anything to do with the fallocate() issue
and reproduces before that too in your numbers.

1) COPY got slower, due to:
9f8377f7a27 Add a DEFAULT option to COPY FROM

This added a new palloc()/free() to every call to NextCopyFrom(). It's not at
all clear to me why this needs to happen in NextCopyFrom(), particularly
because it's already stored in CopyFromState?

2) pg_strtoint32_safe() got substantially slower, mainly due to
faff8f8e47f Allow underscores in integer and numeric constants.
6fcda9aba83 Non-decimal integer literals

pinned to one cpu, turbo mode disabled, I get the following best-of-three
times for
  copy test from '/tmp/tmp_4.data'
(too impatient to use the larger file every time)

15:
6281.107 ms

HEAD:
7000.469 ms

backing out 9f8377f7a27:
6433.516 ms

also backing out faff8f8e47f, 6fcda9aba83:
6235.453 ms

I suspect 1) can relatively easily be fixed properly. But 2) seems much
harder. The changes increased the number of branches substantially, that's
gonna cost in something as (previously) tight as pg_strtoint32().

For higher concurrency numbers, I now was able to reproduce the regression, to
a smaller degree. Much smaller after fixing the above. The reason we run into
the issue here is basically that the rows in the test are very narrow and
reach
  #define MAX_BUFFERED_TUPLES 1000
at a small number of pages, so we go back and forth between extending with
fallocate() and not.

I'm *not* saying that that is the solution, but after changing that to 5000,
the numbers look a lot better (with the other regressions "worked around"):

(this is again with turboboost disabled, to get more reproducible numbers)

clients                         1      2     4     8    16    32

15,buffered=1000            25725  13211  9232  5639  4862  4700
15,buffered=5000            26107  14550  8644  6050  4943  4766
HEAD+fixes,buffered=1000    25875  14505  8200  4900  3565  3433
HEAD+fixes,buffered=5000    25830  12975  6527  3594  2739  2642

Greetings,

Andres Freund

[1] https://postgr.es/m/CAD21AoAEwHTLYhuQ6PaBRPXKWN-CgW9iw%2B4hm%3D2EOFXbJQ3tOg%40mail.gmail.com
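
To make the "small number of pages" point concrete, here is a rough back-of-the-envelope sketch of how many heap pages one batch of MAX_BUFFERED_TUPLES narrow rows fills. It is not PostgreSQL code. The layout constants are assumptions for a stock build (8192-byte blocks, 24-byte page header, 4-byte line pointer, 24-byte aligned tuple header, 8-byte alignment), the 4-byte row width is hypothetical since the test's row width is not given above, and every `sketch_` / `SKETCH_` name is invented for illustration.

```c
#include <stddef.h>

/* Assumed defaults for a stock PostgreSQL build; not taken from the email. */
#define SKETCH_BLCKSZ        8192  /* block size in bytes */
#define SKETCH_PAGE_HEADER   24    /* page header bytes */
#define SKETCH_LINE_POINTER  4     /* per-tuple line pointer on the page */
#define SKETCH_TUPLE_HEADER  24    /* heap tuple header after alignment */
#define SKETCH_ALIGN         8     /* maximum alignment */

/* Round n up to the next multiple of SKETCH_ALIGN. */
static size_t
sketch_align_up(size_t n)
{
    return (n + SKETCH_ALIGN - 1) & ~(size_t) (SKETCH_ALIGN - 1);
}

/* Approximate number of tuples with data_bytes of user data that fit on a page. */
static size_t
sketch_tuples_per_page(size_t data_bytes)
{
    size_t per_tuple = SKETCH_LINE_POINTER +
        sketch_align_up(SKETCH_TUPLE_HEADER + data_bytes);

    return (SKETCH_BLCKSZ - SKETCH_PAGE_HEADER) / per_tuple;
}

/* Pages needed to hold one flushed batch of buffered_tuples such tuples. */
static size_t
sketch_pages_per_flush(size_t buffered_tuples, size_t data_bytes)
{
    size_t per_page = sketch_tuples_per_page(data_bytes);

    return (buffered_tuples + per_page - 1) / per_page;   /* ceiling division */
}
```

Under these assumptions, a hypothetical 4-byte row gives 226 tuples per page, so a batch of 1000 buffered tuples covers about 5 pages and a batch of 5000 covers about 23. That is only meant to show the order of magnitude behind the two "buffered=" settings in the table; it ignores the byte-based flush limit and anything else the real multi-insert code does.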