Horribly slow pg_upgrade performance with many Large Objects - Mailing list pgsql-hackers
From | Hannu Krosing |
---|---|
Subject | Horribly slow pg_upgrade performance with many Large Objects |
Date | |
Msg-id | CAMT0RQSS-6qLH+zYsOeUbAYhop3wmQTkNmQpo5--QRDUR+qYmQ@mail.gmail.com |
Responses | Re: Horribly slow pg_upgrade performance with many Large Objects; Re: Horribly slow pg_upgrade performance with many Large Objects |
List | pgsql-hackers |
Hi Hackers

## The issue

I have now met a not insignificant number of cases where pg_upgrade performance is really bad when the database has a large number of Large Objects.

The average time to `pg_dump --binary-upgrade --format=custom ...` a database and then `pg_restore ...` it back is 1 minute per 1 million LOs. This does not seem too bad until you realize what that means:

- 100 minutes, or 1h40min, for 100M LOs
- 16h40min for 1 billion large objects
- 66h40min, or 2 days 18 hours and 40 minutes, for those unfortunate enough to have the full 4 billion LOs

My direct tests took 10m5s for 10M LOs and 100m54s for 100M. Also, a smaller server simply runs out of shared memory when restoring 100M LOs.

I know that our semi-official recommendation for using Large Objects is "Don't!", but people still use them and sometimes they have a lot of them. One of the reasons seems to be an older version of Hibernate which converted any column of type CLOB into a Large Object, so we see cases where the average LO size is a few hundred to a few thousand bytes.

I read through the old discussions from around the release of v12, and the consensus at the time seemed to be that dumping the LOs without data was OK "because there will not be too many LOs". That has unfortunately turned out to be wishful thinking, as there are lots of Tiny Large Objects out there, and they really slow down pg_upgrade.

## So what to do about this?

The obvious solution would be to handle the table `pg_largeobject_metadata` the same way we currently handle `pg_largeobject`: do nothing with it in `pg_dump --binary-upgrade` and handle its contents in pg_upgrade itself, just like we do for user tables. This should work fine for all source database versions starting from PgSQL v12.

For the older supported versions 9.2 to 11, where `oid` was a system column, we should just dump the three fields of `pg_largeobject_metadata` directly, so the dump command would be `COPY (SELECT oid, lomowner, lomacl FROM pg_largeobject_metadata) TO stdout`, and the restore would then happen more or less automatically.

Or we could just modify `pg_restore` so that it restores LOs using COPY instead of doing the current dance of three separate SELECTs of the lo_xxx() functions.

`pg_dump --binary-upgrade --format=custom` is relatively fast for LOs, 1m58s for 100M objects, though that is still much slower than `pg_dump --data-only -t pg_largeobject_metadata`, which takes 20 seconds, or 23 when also run through gzip.

## Preferred solution for --binary-upgrade

- link/copy the data files when the source is v12+
- dump the (oid, lomowner, lomacl) triplet of pg_largeobject_metadata columns when the source is v9.2 - v11

An acceptable, simpler solution would be to just always dump pg_largeobject_metadata (oid, lomowner, lomacl), as this would need no changes to `pg_upgrade`, only to `pg_dump --binary-upgrade`. It would not be as fast as linking/copying, but it would still be several orders of magnitude faster than the current behaviour.

Does anyone have any objections to this?

Would anyone be willing to take up fixing this? Or perhaps advise where in the code the change should be done?

I started looking at this, and at a high level it looks like I need to create a task that copies out specific data columns rather than the structure of pg_largeobject_metadata, but as pg_dump.c is 582,673 bytes it will take some time to understand where exactly to do this and all the possible interactions with flags like --binary-upgrade and the specific formats.

---

Cheers,
Hannu
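For concreteness, below is a rough, hypothetical sketch of the COPY-based metadata round trip described in the mail. The database name, port numbers, and output file are made up for illustration, it assumes superuser access on both clusters, and it ignores details a real in-server implementation would have to handle (for example the pg_shdepend entries that record large-object ownership); the actual fix would live inside pg_dump/pg_upgrade rather than in a script like this.

```bash
# Hypothetical sketch: move pg_largeobject_metadata with one COPY out and one
# COPY in, instead of one lo_create()/ALTER/GRANT round trip per large object.

OLD_PORT=5432   # source cluster, v9.2 .. v11 (oid is a system column there)
NEW_PORT=5433   # target cluster, v12+ (oid is an ordinary column)
DB=mydb

# Dump the three columns; oid must be selected explicitly on pre-v12 servers.
psql -p "$OLD_PORT" -d "$DB" -c \
  "COPY (SELECT oid, lomowner, lomacl FROM pg_largeobject_metadata) TO STDOUT" \
  > lo_metadata.tsv

# Load the rows back in a single COPY.  Writing the catalog directly like this
# is only an illustration of the idea; pg_upgrade's restore would do the
# equivalent in binary-upgrade mode with the proper safeguards.
psql -p "$NEW_PORT" -d "$DB" \
  -c "COPY pg_largeobject_metadata (oid, lomowner, lomacl) FROM STDIN" \
  < lo_metadata.tsv
```

The point of the sketch is simply that one COPY per table scales with row count far better than millions of per-object client/server round trips, which is where the roughly one minute per million LOs appears to go today.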