Re: Horribly slow pg_upgrade performance with many Large Objects - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Horribly slow pg_upgrade performance with many Large Objects
Date
Msg-id CAMT0RQQmMESUdZ9bAb4aKOOcbj-=wkPBAb6M71zCe2m29vyGkA@mail.gmail.com
In response to Re: Horribly slow pg_upgrade performance with many Large Objects  (Hannu Krosing <hannuk@google.com>)
List pgsql-hackers
On re-reading, the order in my last message looked a bit illogical, so
I want to make clear that the pg_upgrade-like test showing 100 min for
100 million LOs is at the end of that message and the proposed solution
is at the beginning.

On Tue, Apr 8, 2025 at 9:15 AM Hannu Krosing <hannuk@google.com> wrote:
>
> I was testing on version 17
>
>
> On Tue, Apr 8, 2025 at 6:52 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Mon, Apr 07, 2025 at 05:25:32PM -0400, Tom Lane wrote:
> > > What version are you testing?  We did some work in that area in the
> > > v17 cycle (a45c78e32).
> >
> > I am puzzled by the target version used here, as well.
>
> I was testing on version 17
>
> Here is how you can easily test this too (as --binary-upgrade does not
> dump the actual LO data, it is OK for the test to not put anything there)
>
> hannuk@db01-c1a:~/work/lo-testing$ createdb -p 5433 lodb
> hannuk@db01-c1a:~/work/lo-testing$ psql -p 5433 lodb
> psql (17.4 (Ubuntu 17.4-1.pgdg22.04+2))
> Type "help" for help.
>
> lodb=# insert into pg_largeobject_metadata(oid, lomowner) SELECT i,
> 16384 FROM generate_series(1, 100_000_000) g(i);
> INSERT 0 100000000
> Time: 162414.216 ms (02:42.414)
> lodb=# \q
>
> hannuk@db01-c1a:~/work/lo-testing$ time pg_dump --data-only -t
> pg_largeobject_metadata -p 5433 lodb | gzip >
> pg_largeobject_metadata.data.gz
> real 0m22.094s
> user 0m20.741s
> sys 0m2.085s
>
> hannuk@db01-c1a:~/work/lo-testing$ time pg_dump --data-only -t
> pg_largeobject_metadata --format=custom -p 5433 lodb -f
> pg_largeobject_metadata.dump
> real 0m20.226s
> user 0m18.068s
> sys 0m0.824s
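>
> (For completeness: loading the plain-format dump back on the target
> side would presumably just be a matter of piping it through psql,
> something like the line below; I did not time this variant.)
>
> gunzip -c pg_largeobject_metadata.data.gz | psql -p 5434 lodb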
>
> >  If there is
> > more that can be improved, v19 would be the version to consider for
> > future improvements at this stage.
>
> If the internal format has changed in 16, the correct way would be to
> go through the data-only dump of pg_largeobject_metadata in all cases,
> even for the 100M case, where you get the restore in 2 minutes instead
> of 100 minutes:
>
> hannuk@db01-c1a:~/work/lo-testing$ createdb -p 5434 lodb
> hannuk@db01-c1a:~/work/lo-testing$ time pg_restore -p 5434
> --exit-on-error --transaction-size=1000 --dbname lodb
> pg_largeobject_metadata.dump
>
> real 2m2.277s
> user 0m2.594s
> sys 0m0.549s
>
> And even in the case of a user-visible change in the acl format, it is
> most likely that changing the visible format using some regexp magic,
> or even a dedicated function, would still be much faster than creating
> all the LOs through creation commands.
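>
> (As a rough sketch of what I mean by the regexp route -- the sed
> pattern below is just a placeholder, the real rewrite would depend on
> what actually changed in the aclitem text format:)
>
> pg_dump --data-only -t pg_largeobject_metadata -p 5433 lodb \
>   | sed -E 's/OLD_ACL_TEXT/NEW_ACL_TEXT/g' \
>   | psql -p 5434 lodb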
>
> ------
> The commands I used to do the pg_upgrade-like test were
>
> hannuk@db01-c1a:~/work/lo-testing$ time pg_dump --schema-only
> --quote-all-identifiers --binary-upgrade --format=custom
> --file=lodb100m.dump -p 5433 lodb
> real 1m58.241s
> user 0m35.229s
> sys 0m17.854s
>
> hannuk@db01-c1a:~/work/lo-testing$ time pg_restore -p 5434
> --exit-on-error --transaction-size=1000 --dbname lodb lodb100m.dump
> real    100m54.878s
> user    3m23.885s
> sys     20m33.761s
>
> (I left out the --verbose part that pg_upgrade also sets, as I did not
> want to get 100M lines of "large object created" messages)
>
> Also, the postgres server at -p 5434 needs to be started with the -b
> flag to accept loading a dump made with --binary-upgrade. In
> Debian/Ubuntu this can be passed directly to pg_ctlcluster as follows:
>
> sudo pg_ctlcluster 17 target -o -b
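>
> (Outside Debian/Ubuntu the same should be doable with plain pg_ctl,
> roughly:)
>
> pg_ctl -D /path/to/target/datadir -o "-b" start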
>
> ----
> Hannu
>
>
>
>
> > --
> > Michael


