Re: Horribly slow pg_upgrade performance with many Large Objects - Mailing list pgsql-hackers

From: Nitin Motiani
Subject: Re: Horribly slow pg_upgrade performance with many Large Objects
Msg-id: CAH5HC942o6j1qyVysiBtO+Vpb=XOBPbf3rKqsOLtpDpp0SzjkA@mail.gmail.com
In response to: Re: Horribly slow pg_upgrade performance with many Large Objects (Hannu Krosing <hannuk@google.com>)
Responses: Re: Horribly slow pg_upgrade performance with many Large Objects
List: pgsql-hackers

Hi, 

I have a couple of comments/questions. 

> There might be an existing issue
> here, because dbObjectTypePriorities has the following comment:
>
> * NOTE: object-type priorities must match the section assignments made in
> * pg_dump.c; that is, PRE_DATA objects must sort before DO_PRE_DATA_BOUNDARY,
> * POST_DATA objects must sort after DO_POST_DATA_BOUNDARY, and DATA objects
> * must sort between them.
>
> But dumpLO() puts large objects in SECTION_DATA, and PRIO_LARGE_OBJECT is
> before PRIO_PRE_DATA_BOUNDARY. I admittedly haven't spent too much time
> investigating this, though.

I looked through the history to see how this happened and whether it could be an existing issue. Prior to a45c78e3284b, dumpLO put large objects in SECTION_PRE_DATA. That commit changed dumpLO and also changed addBoundaryDependencies to move DO_LARGE_OBJECT from the pre-data section to the data section. It seems that since then this has been inconsistent with pg_dump_sort.c. I think the change in pg_dump_sort.c should be backported to PG17 and PG18, independent of the state of the larger patch.

But even with the inconsistency, there doesn't appear to be an existing issue. Because the dependencies were changed in addBoundaryDependencies, they should take precedence over the order in pg_dump_sort.c. dbObjectTypePriorities is used by sortDumpableObjectsByTypeName, but right after that sortDumpableObjects sorts the objects based on dependencies, so the change in boundary dependencies should ensure that this works as intended. Still, I think dbObjectTypePriorities should be made consistent with the rest.
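
To illustrate why the dependency pass wins, here is a toy sketch in C. It is not the pg_dump code: ToyObject, toy_objs, sort_by_priority, and sort_by_dependencies are hypothetical names, the priority numbers are made up, and the second pass is a naive dependency-respecting scan rather than pg_dump's actual topological sort. It only assumes what is described above: the large object has a type priority that sorts it before the pre-data boundary, but it also carries a dependency on that boundary.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_OBJS 8

/* Hypothetical, much-simplified stand-in for a dumpable object. */
typedef struct
{
	const char *name;
	int			priority;		/* type priority; lower sorts first */
	int			ndeps;			/* number of entries used in deps[] */
	int			deps[MAX_OBJS]; /* indexes that must be emitted first */
} ToyObject;

/* Made-up data: the large object's priority is (wrongly) below the boundary's. */
static const ToyObject toy_objs[] = {
	{"large object", 1, 1, {1}},	/* index 0, depends on index 1 */
	{"pre-data boundary", 2, 0, {0}},	/* index 1 */
	{"table", 0, 0, {0}},			/* index 2 */
};

/* Pass 1: order object indexes by type priority (stable insertion sort). */
static void
sort_by_priority(const ToyObject *objs, int n, int *order)
{
	for (int i = 0; i < n; i++)
	{
		int			j = i;

		while (j > 0 && objs[order[j - 1]].priority > objs[i].priority)
		{
			order[j] = order[j - 1];
			j--;
		}
		order[j] = i;
	}
}

/*
 * Pass 2: repeatedly emit the first not-yet-emitted object (in pass-1 order)
 * whose dependencies have all been emitted.  Returns false on a dependency
 * loop.
 */
static bool
sort_by_dependencies(const ToyObject *objs, int n, const int *order,
					 int *result)
{
	bool		emitted[MAX_OBJS] = {false};
	int			nout = 0;

	assert(n <= MAX_OBJS);
	while (nout < n)
	{
		bool		progressed = false;

		for (int k = 0; k < n && !progressed; k++)
		{
			int			idx = order[k];
			bool		ready = !emitted[idx];

			for (int d = 0; ready && d < objs[idx].ndeps; d++)
				ready = emitted[objs[idx].deps[d]];
			if (ready)
			{
				emitted[idx] = true;
				result[nout++] = idx;
				progressed = true;
			}
		}
		if (!progressed)
			return false;		/* unresolvable dependency loop */
	}
	return true;
}
```

Running sort_by_priority and then sort_by_dependencies over toy_objs, the priority pass alone gives table, large object, pre-data boundary; after the dependency pass the order is table, pre-data boundary, large object. So in this toy model the boundary dependency overrides the inconsistent priority.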


Also, regarding this change in the patch:

-		 * pg_largeobject_metadata, after the dump is restored.
+		 * pg_largeobject_metadata, after the dump is restored.  In versions
+		 * before v12, this is done via proper large object commands.  In
+		 * newer versions, we dump the content of pg_largeobject_metadata and
+		 * any associated pg_shdepend rows, which is faster to restore.
 		 */


Should the comment provide further detail on why this is only being done for v12 and above? 

Thanks & Regards,
Nitin Motiani
Google
