Binary upgrade from <12 to 12 creates toast table for partitionedtables - Mailing list pgsql-hackers

From Andres Freund
Subject Binary upgrade from <12 to 12 creates toast table for partitionedtables
Date
Msg-id 20190306204104.yle5jfbnqkcwykni@alap3.anarazel.de
Whole thread Raw
Responses Re: Binary upgrade from <12 to 12 creates toast table for partitionedtables
Re: Binary upgrade from <12 to 12 creates toast table for partitioned tables
List pgsql-hackers
Hi,

After my tableam patch Andrew's buildfarm animal started failing in the
cross-version upgrades:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-03-06%2019%3A32%3A24

But I actually don't think that't really fault of the tableam patch. The
reason for the assertion is that we assert that
        Assert(relation->rd_rel->relam != InvalidOid);
for tables that have storage.  The table in question is a toast table.

The reason that table doesn't have an AM is that it's the toast table
for a partitioned relation, and for toast tables we just copy the AM
from the main table.

The backtrace shows:

#7  0x0000558b4bdbecfc in create_toast_table (rel=0x7fdab64289e0, toastOid=0, toastIndexOid=0, reloptions=0,
lockmode=8,check=false)
 
    at /home/andres/src/postgresql/src/backend/catalog/toasting.c:263
263        toast_relid = heap_create_with_catalog(toast_relname,
(gdb) p *rel->rd_rel
$2 = {oid = 80244, relname = {
    data =
"partitioned_table\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"},
relnamespace= 2200, reltype = 80246, reloftype = 0, relowner = 10, relam = 0, relfilenode = 0, 
 
  reltablespace = 0, relpages = 0, reltuples = 0, relallvisible = 0, reltoastrelid = 0, relhasindex = false,
relisshared= false, relpersistence = 112 'p', 
 
  relkind = 112 'p', relnatts = 2, relchecks = 0, relhasrules = false, relhastriggers = false, relhassubclass = false,
relrowsecurity= false, 
 
  relforcerowsecurity = false, relispopulated = true, relreplident = 100 'd', relispartition = false, relrewrite = 0,
relfrozenxid= 0, relminmxid = 0}
 

that were trying to create a toast table for a partitioned table. Which
seems wrong to me, given that recent commits made partitioned tables
have no storage.

The reason we're creating a storage is that we're upgrading from a
version of PG where partitioned tables *did* have storage. And thus the
dump looks like:

-- For binary upgrade, must preserve pg_type oid
SELECT pg_catalog.binary_upgrade_set_next_pg_type_oid('80246'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_type array oid
SELECT pg_catalog.binary_upgrade_set_next_array_pg_type_oid('80245'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_type toast oid
SELECT pg_catalog.binary_upgrade_set_next_toast_pg_type_oid('80248'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_class oids
SELECT pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('80244'::pg_catalog.oid);
SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('80247'::pg_catalog.oid);
SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('80249'::pg_catalog.oid);

CREATE TABLE "public"."partitioned_table" (
    "a" integer,
    "b" "text"
)
PARTITION BY LIST ("a");


and create_toast_table() has logic like:

    {
        /*
         * In binary-upgrade mode, create a TOAST table if and only if
         * pg_upgrade told us to (ie, a TOAST table OID has been provided).
         *
         * This indicates that the old cluster had a TOAST table for the
         * current table.  We must create a TOAST table to receive the old
         * TOAST file, even if the table seems not to need one.
         *
         * Contrariwise, if the old cluster did not have a TOAST table, we
         * should be able to get along without one even if the new version's
         * needs_toast_table rules suggest we should have one.  There is a lot
         * of daylight between where we will create a TOAST table and where
         * one is really necessary to avoid failures, so small cross-version
         * differences in the when-to-create heuristic shouldn't be a problem.
         * If we tried to create a TOAST table anyway, we would have the
         * problem that it might take up an OID that will conflict with some
         * old-cluster table we haven't seen yet.
         */
        if (!OidIsValid(binary_upgrade_next_toast_pg_class_oid) ||
            !OidIsValid(binary_upgrade_next_toast_pg_type_oid))
            return false;


I think we probably should have pg_dump suppress emitting information
about the toast table of partitioned tables?

While I'm not hugely bothered by binary upgrade mode creating
inconsistent states - there's plenty of ways to crash the server that
way - it probably also would be a good idea to have heap_create()
elog(ERROR) when accessmtd is invalid.

Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Should we increase the default vacuum_cost_limit?
Next
From: Robert Haas
Date:
Subject: Re: Tighten error control for OpenTransientFile/CloseTransientFile