Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible - Mailing list pgsql-hackers

From Andres Freund
Subject Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible
Date
Msg-id 3yd2ss6n7xywo6pmhd7jjh3bqwgvx35bflzgv3ag4cnzfkik7m@hiyadppqxx6w
Whole thread Raw
In response to Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: pg_upgrade: transfer pg_largeobject_metadata's files when possible
List pgsql-hackers
Hi,

On 2025-09-08 14:20:33 -0500, Nathan Bossart wrote:
> On Sat, Sep 06, 2025 at 10:12:11AM +0900, Michael Paquier wrote:
> > Yep.  This plan is safe to rely on.
>
> Committed, thanks for reviewing!

I just had occasion to look at this code and I found the way this works quite
confusing:

When upgrading from a new enough server, we migrate
pg_largeobject_metadata. So far so good. But we *also* do a COPY FROM
COPY "pg_catalog"."pg_largeobject_metadata" ("oid", "lomowner", "lomacl") FROM stdin;
...

for all the large objects that have a description or a security label.


For a while I was somewhat baffled, because that sure looks like it ought to
lead to uniqueness violations.  But it doesn't.

The reason, I think, is that the COPY is happening into a relfilenode that
will be overwritten later, it doesn't yet contain the contents of the old
cluster.

Presumably we do this because we need the temporary pg_largeobject_metadata to
make COMMENT ON and security label commands not fail.


If this is the reasoning / how it works, shouldn't there be a comment in the
code or the commit message explaining that? Because it sure seems non-obvious
to me.

It's also not entirely obvious to me that this is safe - after all
(bbe08b8869bd, revised in 0e758ae89) appeared to have taken some pains to
ensure that the file gets unlinked immediately during the "binary upgrade
mode" TRUNCATE. But now we are actually filling that file again, after the
relation had been truncated?



Separately, if we don't handle large objects that don't have comments or
labels via pg_dump, why do we do dependency tracking for all LOs in getLOs(),
rather than just the ones that have a comment / label? Given how much memory
both the query results and the dependency tracking take...


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: IPC::Run::time[r|out] vs our TAP tests
Next
From: Nikolay Samokhvalov
Date:
Subject: Re: IO wait events for COPY FROM/TO PROGRAM or file