Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)
Date
Msg-id CA+TgmoY5e07+0aJYJ7Hx6bvh8EPVFYy=yN-3Y0Ka96v_z0SsVQ@mail.gmail.com
Whole thread Raw
In response to Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)  (Bruce Momjian <bruce@momjian.us>)
Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Tue, Aug 17, 2021 at 2:50 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > Less sure that this is a good idea, though.  In particular, I do not
> > think that you can make it work in the face of
> >         alter database template1 rename to oops;
> >         create database template1;
>
> That is a really good point. If we can't categorically force the OID
> of those databases to have a particular, fixed value, and based on
> this example that seems to be impossible, then there's always a
> possibility that we might find a value in the old cluster that doesn't
> happen to match what is present in the new cluster. Seen from that
> angle, the problem is really with databases that are pre-existent in
> the new cluster but whose contents still need to be dumped. Maybe we
> could (optionally? conditionally?) drop those databases from the new
> cluster and then recreate them with the OID that we want them to have.

Actually, we do that already. create_new_objects() runs pg_restore
with --create for most databases, but with --clean --create for
template1 and postgres. This means that template1 and postgres will
always be recreated in the new cluster, and other databases are
assumed not to exist in the new cluster and the upgrade will fail if
they unexpectedly do. And the reason why pg_upgrade does that is that
it wants to "propagate [the] database-level properties" of postgres
and template1. So suppose we just make the database OID one of the
database-level properties that we want to propagate. That should
mostly just work, but where can things go wrong?

The only real failure mode is we try to create a database in the new
cluster and find out that the OID is already in use. If the new OID
that collides >64k, then the user has messed with the new cluster
before doing that. And since pg_upgrade is pretty clearly already
assuming that you shouldn't do that, it's fine to also make that
assumption in this case. We can disregard such cases as user error.

If the new OID that collides is <64k, then it must be colliding with
template0, template1, or postgres in the new cluster, because those
are the only databases that can have such OIDs since, currently, we
don't allow users to specify an OID for a new database. And the
problem cannot be with template1, because we hard-code its OID to 1.
If there is a database with OID 1 in either cluster, it must be
template1, and if there is a database with OID 1 in both clusters, it
must be template1 in both cases, and we'll just drop and recreate it
with OID 1 and everything is fine. So we need only consider template0
and postgres, which are created with system-generated OIDs. And, it
would be no issue if either of those databases had the same OID in the
old and new cluster, so the only possible OID collision is one where
the same system-generated OID was assigned to one of those databases
in the old cluster and to the other in the new cluster.

First consider the case where template0 has OID, say, 13000, in the
old cluster, and postgres has that OID in the new cluster. No problem
occurs, because template0 isn't transferred anyway. The reverse
direction is a problem, though. If postgres had been assigned OID
13000 in the old cluster and, by sheer chance, template0 had that OID
in the new cluster, then the upgrade would fail, because it wouldn't
be able to recreate the postgres database with the correct OID.

But that doesn't seem very difficult to fix. I think all we need to do
is have initdb assign a fixed OID to template0 at creation time. Then,
in any new release to which someone might be trying to upgrade, the
system-generated OID assigned to postgres in the old release can't
match the fixed OID assigned to template0 in the new release, so the
one problem case is ruled out. We do need, however, to make sure that
the assign-my-database-a-fixed-OID syntax is either entirely
restricted to initdb & pg_upgrade or at least that OIDS < 64k can only
be assigned in one of those modes. Otherwise, some creative person
could manufacture new problem cases by setting up the source database
so that the OID of one of their databases matches the fixed OID we
gave to template0 or template1, or the system-generated OID for
postgres in the new cluster.

In short, as far as I can see, all we need to do to preserve database
OIDs across pg_upgrade is:

1. Add a new syntax for creating a database with a given OID, and use
it in pg_dump --binary-upgrade.
2. Don't let users use it at least for OIDs <64k, or maybe just don't
let them use it at all.
3. But let initdb use it, and have initdb set the initial OID for
template0 to a fixed value < 10000. If the user changes it later, no
problem; the cluster into which they are upgrading won't contain any
databases with high-numbered OIDs.

Anyone see a flaw in that analysis?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Printing backtrace of postgres processes
Next
From: John Naylor
Date:
Subject: Re: badly calculated width of emoji in psql