On Mon, Jul 11, 2022 at 9:16 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I am not saying we shouldn't try to fix this up more thoroughly, just
> that I think you are overestimating the consequences.
I spent a bunch of time looking at this today and I have more sympathy
for Justin's previous proposal now. I found it somewhat hacky that he
was relying on the hard-coded value of LargeObjectRelationId and
LargeObjectLOidPNIndexId, but I discovered that it's harder to do
better than I had assumed. Suppose we don't want to compare against a
hard-coded constant but against the value that is actually present
before the dump overwrites the pg_class row's relfilenode. Well, we
can't get that value from the database in question before restoring
the dump, because restoring either the dump creates or recreates the
database in all cases. The CREATE DATABASE command that will be part
of the dump always specifies TEMPLATE template0, so if we want
something other than a hard-coded constant, we need the
pg_class.relfilenode values from template0 for pg_largeobject and
pg_largeobject_loid_pn_index. But we can't connect to that database to
query those values, because it has datallowconn = false. Oops.
I have a few more ideas to try here. It occurs to me that we could fix
this more cleanly if we could get the dump itself to set the
relfilenode for pg_largeobject to the desired value. Right now, it's
just overwriting the relfilenode stored in the catalog without
actually doing anything that would cause a change on disk. But if we
could make it change the relfilenode in a more principled way that
would actually cause an on-disk change, then the orphaned-file problem
would be fixed, because we'd always be installing the new file over
top of the old file. I'm going to investigate how hard it would be to
make that work.
--
Robert Haas
EDB: http://www.enterprisedb.com