Re: Horribly slow pg_upgrade performance with many Large Objects - Mailing list pgsql-hackers

From: Nathan Bossart
Subject: Re: Horribly slow pg_upgrade performance with many Large Objects
Msg-id: aBkQLSkx1zUJ-LwJ@nathan
In response to: Re: Horribly slow pg_upgrade performance with many Large Objects (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Tue, Apr 08, 2025 at 01:07:09PM -0400, Tom Lane wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
>> I do think it's worth considering going back to copying
>> pg_largeobject_metadata's files for upgrades from v16 and newer.
> 
> (If we do this) I don't see why we'd need to stop at v16.  I'm
> envisioning that we'd use COPY, which will be dealing in the
> text representation of aclitems, and I don't think that's changed
> in a long time.  The sort of thing that would break it is changes
> in the set of available/default privilege bits for large objects.
> 
> That is, where the dump currently contains something like
> 
> SELECT pg_catalog.lo_create('2121');
> ALTER LARGE OBJECT 2121 OWNER TO postgres;
> GRANT ALL ON LARGE OBJECT 2121 TO joe;
> 
> we'd have
> 
> COPY pg_largeobject_metadata FROM STDIN;
> ...
> 2121    10    {postgres=rw/postgres,joe=rw/postgres}
> ...
> 
> and some appropriate COPY data for pg_shdepend too.

Attached is a proof-of-concept-grade patch that uses COPY for
pg_largeobject_metadata and the relevant pg_shdepend entries.  On my
laptop, pg_upgrade with 10M large objects (each with a non-bootstrap-superuser
owner and another role with SELECT rights) goes from ~8.5 minutes to ~1
minute with this patch.

I originally set out to invent a new lo_create_with_owner() function and
teach pg_dump to batch those together in large groups, but as I started the
required pg_dump surgery, I was quickly scared away by the complexity.
Next, I gave COPY a try.  The improvements from using COPY will likely be
limited to the pg_upgrade case, but that's the only case I regularly hear
complaints about for zillions of large objects, so maybe it's good enough
for now.
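Concretely, instead of three SQL commands per large object, the dump then
carries one COPY stream per database, along the lines of the following
(column list shown for clarity; the exact form the patched pg_dump emits may
differ):

```sql
-- One COPY per database instead of three commands per large object.
-- Column names are from the v12+ catalog; 2121 and the ACL are example data.
COPY pg_largeobject_metadata (oid, lomowner, lomacl) FROM stdin;
2121	10	{postgres=rw/postgres,joe=rw/postgres}
\.
```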

For the COPY approach, I modified pg_dump to dump the contents of
pg_largeobject_metadata.  This is easy enough, but I ran into problems with
the dependent comments and security labels.  It turns out that even before
v12, we run all the lo_create() commands just so that creating the comments
and security labels works (AFAICT).  So I suspect upgrading with many large
objects has always been slow.  The comment/security label dependency issue
can be fixed (at least well enough for the tests) by moving
PRIO_LARGE_OBJECT below PRIO_TABLE_DATA.  There might be an existing issue
here, because dbObjectTypePriorities has the following comment:

 * NOTE: object-type priorities must match the section assignments made in
 * pg_dump.c; that is, PRE_DATA objects must sort before DO_PRE_DATA_BOUNDARY,
 * POST_DATA objects must sort after DO_POST_DATA_BOUNDARY, and DATA objects
 * must sort between them.

But dumpLO() puts large objects in SECTION_DATA, and PRIO_LARGE_OBJECT is
before PRIO_PRE_DATA_BOUNDARY.  I admittedly haven't spent too much time
investigating this, though.  In any case, it might be a good idea to also
make sure we explicitly mark the large objects and their comments/seclabels
as dependent on the pg_largeobject_metadata data.
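For illustration, these are the sorts of dependent commands that can only
succeed once the corresponding pg_largeobject_metadata row exists (the
comment and label text here are made up):

```sql
-- Both of these fail with "large object 2121 does not exist" unless
-- the metadata row has already been restored:
COMMENT ON LARGE OBJECT 2121 IS 'example comment';
SECURITY LABEL ON LARGE OBJECT 2121 IS 'example label';
```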

That leaves pg_shdepend.  For now, I've just instructed pg_upgrade to COPY
the relevant pg_shdepend rows as an independent step, but perhaps there's a
reasonably straightforward way to put that in pg_dump, too.
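A rough sketch of what that independent step amounts to, with a placeholder
spool-file path and a filter condition that is my assumption about which
rows matter:

```sql
-- In the old cluster: spool the rows tying large objects to their
-- owners/grantees ('/tmp/lo_shdepend.dat' is a placeholder path).
COPY (
    SELECT * FROM pg_shdepend
    WHERE classid = 'pg_largeobject'::regclass
) TO '/tmp/lo_shdepend.dat';

-- In the new cluster (writing into a system catalog like this
-- presumably requires binary-upgrade mode):
COPY pg_shdepend FROM '/tmp/lo_shdepend.dat';
```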

-- 
nathan

