Re: [PATCH] pg_dump: lock tables in batches - Mailing list pgsql-hackers
From | Fabrízio de Royes Mello |
---|---|
Subject | Re: [PATCH] pg_dump: lock tables in batches |
Date | |
Msg-id | CAFcNs+rwNf7oUYg6T76_in8a8Ct5oHP7MspG3JO83fJHQ314fA@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] pg_dump: lock tables in batches (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [PATCH] pg_dump: lock tables in batches
|
List | pgsql-hackers |
On Wed, Dec 7, 2022 at 2:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-12-07 10:44:33 -0500, Tom Lane wrote:
> >> I have a strong sense of deja vu here. I'm pretty sure I experimented
> >> with this idea last year and gave up on it. I don't recall exactly
> >> why, but either it didn't show any meaningful performance improvement
> >> for me or there was some actual downside (that I'm not remembering
> >> right now).
>
> > IIRC the case we were looking at around 989596152 were CPU bound workloads,
> > rather than latency bound workloads. It'd not be surprising to have cases
> > where batching LOCKs helps latency, but not CPU bound.
>
> Yeah, perhaps. Anyway my main point is that I don't want to just assume
> this is a win; I want to see some actual performance tests.
>
Here we have some numbers about the Aleksander's patch:
1) Setup script
CREATE DATABASE t1000;
CREATE DATABASE t10000;
CREATE DATABASE t100000;
\c t1000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 1000) AS i \gexec
\c t10000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 10000) AS i \gexec
\c t100000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 100000) AS i \gexec
2) Execution script
CREATE DATABASE t10000;
CREATE DATABASE t100000;
\c t1000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 1000) AS i \gexec
\c t10000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 10000) AS i \gexec
\c t100000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 100000) AS i \gexec
2) Execution script
time pg_dump -s t1000 > /dev/null
time pg_dump -s t10000 > /dev/null
time pg_dump -s t100000 > /dev/null
3) HEAD execution
time pg_dump -s t10000 > /dev/null
time pg_dump -s t100000 > /dev/null
3) HEAD execution
$ time pg_dump -s t1000 > /dev/null
0.02user 0.01system 0:00.36elapsed 8%CPU (0avgtext+0avgdata 11680maxresident)k
0inputs+0outputs (0major+1883minor)pagefaults 0swaps
$ time pg_dump -s t10000 > /dev/null
0.30user 0.10system 0:05.04elapsed 8%CPU (0avgtext+0avgdata 57772maxresident)k
0inputs+0outputs (0major+14042minor)pagefaults 0swaps
$ time pg_dump -s t100000 > /dev/null
3.42user 2.13system 7:50.09elapsed 1%CPU (0avgtext+0avgdata 517900maxresident)k
0inputs+0outputs (0major+134636minor)pagefaults 0swaps
4) PATCH execution
$ time pg_dump -s t1000 > /dev/null
0.02user 0.00system 0:00.28elapsed 9%CPU (0avgtext+0avgdata 11700maxresident)k
0inputs+0outputs (0major+1886minor)pagefaults 0swaps
$ time pg_dump -s t10000 > /dev/null
0.18user 0.03system 0:02.17elapsed 10%CPU (0avgtext+0avgdata 57592maxresident)k
0inputs+0outputs (0major+14072minor)pagefaults 0swaps
$ time pg_dump -s t100000 > /dev/null
1.97user 0.32system 0:21.39elapsed 10%CPU (0avgtext+0avgdata 517932maxresident)k
0inputs+0outputs (0major+134892minor)pagefaults 0swaps
0.02user 0.01system 0:00.36elapsed 8%CPU (0avgtext+0avgdata 11680maxresident)k
0inputs+0outputs (0major+1883minor)pagefaults 0swaps
$ time pg_dump -s t10000 > /dev/null
0.30user 0.10system 0:05.04elapsed 8%CPU (0avgtext+0avgdata 57772maxresident)k
0inputs+0outputs (0major+14042minor)pagefaults 0swaps
$ time pg_dump -s t100000 > /dev/null
3.42user 2.13system 7:50.09elapsed 1%CPU (0avgtext+0avgdata 517900maxresident)k
0inputs+0outputs (0major+134636minor)pagefaults 0swaps
4) PATCH execution
$ time pg_dump -s t1000 > /dev/null
0.02user 0.00system 0:00.28elapsed 9%CPU (0avgtext+0avgdata 11700maxresident)k
0inputs+0outputs (0major+1886minor)pagefaults 0swaps
$ time pg_dump -s t10000 > /dev/null
0.18user 0.03system 0:02.17elapsed 10%CPU (0avgtext+0avgdata 57592maxresident)k
0inputs+0outputs (0major+14072minor)pagefaults 0swaps
$ time pg_dump -s t100000 > /dev/null
1.97user 0.32system 0:21.39elapsed 10%CPU (0avgtext+0avgdata 517932maxresident)k
0inputs+0outputs (0major+134892minor)pagefaults 0swaps
5) Summary
HEAD patch
1k tables 0:00.36 0:00.28
10k tables 0:05.04 0:02.17
100k tables 7:50.09 0:21.39
1k tables 0:00.36 0:00.28
10k tables 0:05.04 0:02.17
100k tables 7:50.09 0:21.39
Seems we get very good performance gain using Aleksander's patch. I used the "-s" to not waste time issuing COPY for each relation (even all is empty) and evidence the difference due to roundtrip for LOCK TABLE. This patch will also improve the pg_upgrade execution over database with thousands of relations.
Regards,
--
Fabrízio de Royes Mello
Fabrízio de Royes Mello
pgsql-hackers by date: