Thread: [HACKERS] Should `pg_upgrade --check` check relation filenodes are present?

From: Craig de Stigter

Hi list

We attempted to pg_upgrade a database on a replication slave, and got the error:

error while creating link for relation "<schema>.<tablename>" ("/var/lib/postgresql-ext/PG_9.2_201204301/19171/141610397" to "/var/lib/postgresql-ext/PG_9.5_201510051/16401/9911696"): No such file or directory


The missing table turned out to be an unlogged table, and the data file for it was not present on the slave machine. That's reasonable. In our case we are able to start over from a snapshot and drop all the unlogged tables before trying again.

However, this problem was not caught by the `--check` command. I'm looking at the source code and it appears that pg_upgrade does not attempt to verify relation filenodes actually exist before proceeding, whether using --check or not.

Should it? I assume the reasoning is because it would take a long time and perhaps the benefit of doing so would be minimal?
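
As a rough illustration of the kind of pre-flight check being asked about, the sketch below verifies that each relation's data file exists before an upgrade is attempted. It is not pg_upgrade's code. The function name `find_missing_relation_files` and its input format are hypothetical, and it assumes the caller has already collected each relation's path relative to the data directory (for example with `pg_relation_filepath()` on the old cluster).

```python
import os
from typing import Dict, List


def find_missing_relation_files(data_dir: str, rel_paths: Dict[str, str]) -> List[str]:
    """Return the names of relations whose first data file is absent.

    rel_paths maps a relation name to its file path relative to data_dir.
    This input format is an assumption made for the example.
    """
    missing = []
    for name, rel_path in sorted(rel_paths.items()):
        # Only the first segment file is checked; a fuller check would
        # also look at additional segments and forks.
        if not os.path.isfile(os.path.join(data_dir, rel_path)):
            missing.append(name)
    return missing
```

A check like this costs one stat call per relation, so the runtime concern raised above would mostly matter for clusters with very many relations.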


--
Regards,
Craig de Stigter

Developer
Koordinates

Craig de Stigter <craig.destigter@koordinates.com> writes:
> We attempted to pg_upgrade a database on a replication slave, and got the
> error:

> error while creating link for relation "<schema>.<tablename>"
> ("/var/lib/postgresql-ext/PG_9.2_201204301/19171/141610397" to
> "/var/lib/postgresql-ext/PG_9.5_201510051/16401/9911696"): No such file or
> directory

> The missing table turned out to be an unlogged table, and the data file for
> it was not present on the slave machine. That's reasonable. In our case we
> are able to start over from a snapshot and drop all the unlogged tables
> before trying again.

> However, this problem was not caught by the `--check` command. I'm looking
> at the source code and it appears that pg_upgrade does not attempt to
> verify relation filenodes actually exist before proceeding, whether using
> --check or not.

> Should it? I assume the reasoning is because it would take a long time and
> perhaps the benefit of doing so would be minimal?

This failure would occur before we'd done anything irretrievable to the
source DB, so I'm not all that concerned.  You could have just re-initdb'd
the target directory and started over (after dropping the unlogged tables
of course).
        regards, tom lane
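
The "drop the unlogged tables" step could be sketched as below. This is an illustration only: the helper names `quote_ident` and `drop_statements` are hypothetical, the catalog query relies on `pg_class.relpersistence = 'u'` marking unlogged relations, and the generated statements would have to be run on a writable copy of the old cluster (such as the snapshot mentioned above) before re-running pg_upgrade.

```python
from typing import Iterable, List, Tuple

# Catalog query listing unlogged ordinary tables in the current database.
LIST_UNLOGGED_SQL = """
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relpersistence = 'u' AND c.relkind = 'r'
ORDER BY 1, 2;
"""


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_statements(tables: Iterable[Tuple[str, str]]) -> List[str]:
    """Build one DROP TABLE statement per (schema, table) pair."""
    return [
        f"DROP TABLE {quote_ident(schema)}.{quote_ident(table)};"
        for schema, table in tables
    ]
```

The rows returned by `LIST_UNLOGGED_SQL` can be passed straight to `drop_statements`; reviewing the generated statements before executing them is advisable.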



Re: [HACKERS] Should `pg_upgrade --check` check relation filenodes are present?

From: Peter Eisentraut

On 1/31/17 4:57 PM, Craig de Stigter wrote:
> However, this problem was not caught by the `--check` command. I'm
> looking at the source code and it appears that pg_upgrade does not
> attempt to verify relation filenodes actually exist before proceeding,
> whether using --check or not.

The purpose of --check is to see if there is anything in your database
that pg_upgrade cannot upgrade.  Its purpose is not to detect general
damage in a database.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Should `pg_upgrade --check` check relation filenodes are present?

From: Bruce Momjian

On Wed, Feb  1, 2017 at 10:57:01AM +1300, Craig de Stigter wrote:
> Hi list
> 
> We attempted to pg_upgrade a database on a replication slave, and got the
> error:
> 
>         error while creating link for relation "<schema>.<tablename>"
>         ("/var/lib/postgresql-ext/PG_9.2_201204301/19171/141610397" to
>         "/var/lib/postgresql-ext/PG_9.5_201510051/16401/9911696"): No such
>         file or directory
> 
> The missing table turned out to be an unlogged table, and the data file for it
> was not present on the slave machine. That's reasonable. In our case we are
> able to start over from a snapshot and drop all the unlogged tables before
> trying again.
> 
> However, this problem was not caught by the `--check` command. I'm looking at
> the source code and it appears that pg_upgrade does not attempt to verify
> relation filenodes actually exist before proceeding, whether using --check or
> not.
> 
> Should it? I assume the reasoning is because it would take a long time and
> perhaps the benefit of doing so would be minimal?

I think pg_upgrade needs to be improved in this area, but I am not sure
how yet.  Clearly the check should detect this or the upgrade should
succeed.

First, you are not using the standby upgrade instructions in step 10
here, right?
https://www.postgresql.org/docs/9.6/static/pgupgrade.html

I assume you don't want this standby to rejoin the primary, you just
want to upgrade it.

Second, I thought unlogged tables had empty files on the standby, not
_missing_ files.  Is that correct?  Should pg_upgrade just allow missing
unlogged table files?  I don't see any way to detect we are running on a
standby since the server is in write mode to run pg_upgrade.
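
One way to picture the "allow missing unlogged table files" option is the sketch below. It is only an illustration of a possible policy, not a proposed patch: the function name `classify_relation_file` and the result labels are hypothetical, and whether a relation is unlogged is assumed to be known already from the catalogs.

```python
import os


def classify_relation_file(path: str, is_unlogged: bool) -> str:
    """Classify one relation file under an assumed, illustrative policy."""
    if os.path.isfile(path):
        # An empty file is the state expected above for an unlogged
        # table on a standby; anything larger counts as present.
        return "empty" if os.path.getsize(path) == 0 else "present"
    # Assumed policy: a missing file is tolerated only for unlogged tables.
    return "missing-tolerated" if is_unlogged else "missing-error"
```

Under this policy only a "missing-error" result would need to stop the upgrade or be reported by `--check`.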

I can develop a patch once I have answers to these questions.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +