Thread: Binary upgrade from <12 to 12 creates toast table for partitionedtables

Binary upgrade from <12 to 12 creates toast table for partitionedtables

From
Andres Freund
Date:
Hi,

After my tableam patch Andrew's buildfarm animal started failing in the
cross-version upgrades:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-03-06%2019%3A32%3A24

But I actually don't think that't really fault of the tableam patch. The
reason for the assertion is that we assert that
        Assert(relation->rd_rel->relam != InvalidOid);
for tables that have storage.  The table in question is a toast table.

The reason that table doesn't have an AM is that it's the toast table
for a partitioned relation, and for toast tables we just copy the AM
from the main table.

The backtrace shows:

#7  0x0000558b4bdbecfc in create_toast_table (rel=0x7fdab64289e0, toastOid=0, toastIndexOid=0, reloptions=0,
lockmode=8,check=false)
 
    at /home/andres/src/postgresql/src/backend/catalog/toasting.c:263
263        toast_relid = heap_create_with_catalog(toast_relname,
(gdb) p *rel->rd_rel
$2 = {oid = 80244, relname = {
    data =
"partitioned_table\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"},
relnamespace= 2200, reltype = 80246, reloftype = 0, relowner = 10, relam = 0, relfilenode = 0, 
 
  reltablespace = 0, relpages = 0, reltuples = 0, relallvisible = 0, reltoastrelid = 0, relhasindex = false,
relisshared= false, relpersistence = 112 'p', 
 
  relkind = 112 'p', relnatts = 2, relchecks = 0, relhasrules = false, relhastriggers = false, relhassubclass = false,
relrowsecurity= false, 
 
  relforcerowsecurity = false, relispopulated = true, relreplident = 100 'd', relispartition = false, relrewrite = 0,
relfrozenxid= 0, relminmxid = 0}
 

that were trying to create a toast table for a partitioned table. Which
seems wrong to me, given that recent commits made partitioned tables
have no storage.

The reason we're creating a storage is that we're upgrading from a
version of PG where partitioned tables *did* have storage. And thus the
dump looks like:

-- For binary upgrade, must preserve pg_type oid
SELECT pg_catalog.binary_upgrade_set_next_pg_type_oid('80246'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_type array oid
SELECT pg_catalog.binary_upgrade_set_next_array_pg_type_oid('80245'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_type toast oid
SELECT pg_catalog.binary_upgrade_set_next_toast_pg_type_oid('80248'::pg_catalog.oid);


-- For binary upgrade, must preserve pg_class oids
SELECT pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('80244'::pg_catalog.oid);
SELECT pg_catalog.binary_upgrade_set_next_toast_pg_class_oid('80247'::pg_catalog.oid);
SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('80249'::pg_catalog.oid);

CREATE TABLE "public"."partitioned_table" (
    "a" integer,
    "b" "text"
)
PARTITION BY LIST ("a");


and create_toast_table() has logic like:

    {
        /*
         * In binary-upgrade mode, create a TOAST table if and only if
         * pg_upgrade told us to (ie, a TOAST table OID has been provided).
         *
         * This indicates that the old cluster had a TOAST table for the
         * current table.  We must create a TOAST table to receive the old
         * TOAST file, even if the table seems not to need one.
         *
         * Contrariwise, if the old cluster did not have a TOAST table, we
         * should be able to get along without one even if the new version's
         * needs_toast_table rules suggest we should have one.  There is a lot
         * of daylight between where we will create a TOAST table and where
         * one is really necessary to avoid failures, so small cross-version
         * differences in the when-to-create heuristic shouldn't be a problem.
         * If we tried to create a TOAST table anyway, we would have the
         * problem that it might take up an OID that will conflict with some
         * old-cluster table we haven't seen yet.
         */
        if (!OidIsValid(binary_upgrade_next_toast_pg_class_oid) ||
            !OidIsValid(binary_upgrade_next_toast_pg_type_oid))
            return false;


I think we probably should have pg_dump suppress emitting information
about the toast table of partitioned tables?

While I'm not hugely bothered by binary upgrade mode creating
inconsistent states - there's plenty of ways to crash the server that
way - it probably also would be a good idea to have heap_create()
elog(ERROR) when accessmtd is invalid.

Greetings,

Andres Freund


Re: Binary upgrade from <12 to 12 creates toast table for partitionedtables

From
Andrew Dunstan
Date:
On 3/6/19 3:41 PM, Andres Freund wrote:
> Hi,
>
> After my tableam patch Andrew's buildfarm animal started failing in the
> cross-version upgrades:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-03-06%2019%3A32%3A24


Incidentally, I just fixed a bug that was preventing the module from
picking up its log files. The latest report has them all.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Binary upgrade from <12 to 12 creates toast table forpartitioned tables

From
Andres Freund
Date:
Hi,

On 2019-03-06 16:46:07 -0500, Andrew Dunstan wrote:
> 
> On 3/6/19 3:41 PM, Andres Freund wrote:
> > Hi,
> >
> > After my tableam patch Andrew's buildfarm animal started failing in the
> > cross-version upgrades:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-03-06%2019%3A32%3A24
> 
> 
> Incidentally, I just fixed a bug that was preventing the module from
> picking up its log files. The latest report has them all.

Ah, cool. I was wondering about that...

One thing I noticed is that it might need a more aggressive separator,
that report has:

\0\0\0rls_tbl_force\0\0\0\0ROW SECURITY\0\0\0\0\0K\0\0\0ALTER TABLE "regress_rls_schema"."rls_tbl_force" ENABLE ROW
LEVELSECURITY;\0\0\0\0\0\0\0\0\0\0\0\0regress_rls_schema\0\0\0\0\0\0\0
\0\0\0buildfarm\0\0\0\0false\0\0\0\0350\0\0\0\0\0\0\0\0\0\0\0=====================
REL_10_STABLE-pg_upgrade_dump_16384.log==============
 


Greetings,

Andres Freund


On Wed, Mar 6, 2019 at 3:41 PM Andres Freund <andres@anarazel.de> wrote:
> I think we probably should have pg_dump suppress emitting information
> about the toast table of partitioned tables?

+1.  That seems like the right fix.

> While I'm not hugely bothered by binary upgrade mode creating
> inconsistent states - there's plenty of ways to crash the server that
> way - it probably also would be a good idea to have heap_create()
> elog(ERROR) when accessmtd is invalid.

Not sure about this part.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Binary upgrade from <12 to 12 creates toast table forpartitioned tables

From
Andres Freund
Date:
Hi,

On 2019-03-07 13:08:35 -0500, Robert Haas wrote:
> On Wed, Mar 6, 2019 at 3:41 PM Andres Freund <andres@anarazel.de> wrote:
> > I think we probably should have pg_dump suppress emitting information
> > about the toast table of partitioned tables?
>
> +1.  That seems like the right fix.

Cool. Alvaro, Kyatoro, Michael, are either of you planning to tackle
that? Afaict it's caused by

commit 807ae415c54628ade937cb209f0fc9913e6b0cf5
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date:   2019-01-04 14:51:17 -0300

    Don't create relfilenode for relations without storage

    Some relation kinds had relfilenode set to some non-zero value, but
    apparently the actual files did not really exist because creation was
    prevented elsewhere.  Get rid of the phony pg_class.relfilenode values.

    Catversion bumped, but only because the sanity_test check will fail if
    run in a system initdb'd with the previous version.

    Reviewed-by: Kyotaro HORIGUCHI, Michael Paquier
    Discussion: https://postgr.es/m/20181206215552.fm2ypuxq6nhpwjuc@alvherre.pgsql


> > While I'm not hugely bothered by binary upgrade mode creating
> > inconsistent states - there's plenty of ways to crash the server that
> > way - it probably also would be a good idea to have heap_create()
> > elog(ERROR) when accessmtd is invalid.
>
> Not sure about this part.

As in, we shouldn't elog out? Or we should have an ereport with a proper
error, or ...?

Greetings,

Andres Freund


Re: Binary upgrade from <12 to 12 creates toast table forpartitioned tables

From
Alvaro Herrera
Date:
On 2019-Mar-07, Andres Freund wrote:

> Hi,
> 
> On 2019-03-07 13:08:35 -0500, Robert Haas wrote:
> > On Wed, Mar 6, 2019 at 3:41 PM Andres Freund <andres@anarazel.de> wrote:
> > > I think we probably should have pg_dump suppress emitting information
> > > about the toast table of partitioned tables?
> >
> > +1.  That seems like the right fix.
> 
> Cool. Alvaro, Kyatoro, Michael, are either of you planning to tackle
> that? Afaict it's caused by

I'll have a look.

Thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Thu, Mar 7, 2019 at 1:17 PM Andres Freund <andres@anarazel.de> wrote:
> > > While I'm not hugely bothered by binary upgrade mode creating
> > > inconsistent states - there's plenty of ways to crash the server that
> > > way - it probably also would be a good idea to have heap_create()
> > > elog(ERROR) when accessmtd is invalid.
> >
> > Not sure about this part.
>
> As in, we shouldn't elog out? Or we should have an ereport with a proper
> error, or ...?

As in, I don't understand the problem well enough to have an informed opinion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Binary upgrade from <12 to 12 creates toast table forpartitioned tables

From
Alvaro Herrera
Date:
On 2019-Mar-07, Robert Haas wrote:

> On Wed, Mar 6, 2019 at 3:41 PM Andres Freund <andres@anarazel.de> wrote:
> > I think we probably should have pg_dump suppress emitting information
> > about the toast table of partitioned tables?
> 
> +1.  That seems like the right fix.

This patch fixes the upgrade problem for me.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Binary upgrade from <12 to 12 creates toast table forpartitioned tables

From
Andres Freund
Date:
Hi,

On 2019-03-08 20:18:27 -0300, Alvaro Herrera wrote:
> On 2019-Mar-07, Robert Haas wrote:
> 
> > On Wed, Mar 6, 2019 at 3:41 PM Andres Freund <andres@anarazel.de> wrote:
> > > I think we probably should have pg_dump suppress emitting information
> > > about the toast table of partitioned tables?
> > 
> > +1.  That seems like the right fix.
> 
> This patch fixes the upgrade problem for me.

Thanks!


> -- 
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

> diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> index e962ae7e913..1de8da59361 100644
> --- a/src/bin/pg_dump/pg_dump.c
> +++ b/src/bin/pg_dump/pg_dump.c
> @@ -4359,9 +4359,9 @@ binary_upgrade_set_type_oids_by_rel_oid(Archive *fout,
>                        "SELECT c.reltype AS crel, t.reltype AS trel "
>                        "FROM pg_catalog.pg_class c "
>                        "LEFT JOIN pg_catalog.pg_class t ON "
> -                      "  (c.reltoastrelid = t.oid) "
> +                      "  (c.reltoastrelid = t.oid AND c.relkind <> '%c') "
>                        "WHERE c.oid = '%u'::pg_catalog.oid;",
> -                      pg_rel_oid);
> +                      RELKIND_PARTITIONED_TABLE, pg_rel_oid);

Hm, I know this code isn't generally well documented, but perhaps we
could add a comment as to why we're excluding partitioned tables?

Greetings,

Andres Freund


Re: Binary upgrade from <12 to 12 creates toast table forpartitioned tables

From
Alvaro Herrera
Date:
Hello

On 2019-Mar-08, Andres Freund wrote:

> > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > index e962ae7e913..1de8da59361 100644
> > --- a/src/bin/pg_dump/pg_dump.c
> > +++ b/src/bin/pg_dump/pg_dump.c
> > @@ -4359,9 +4359,9 @@ binary_upgrade_set_type_oids_by_rel_oid(Archive *fout,
> >                        "SELECT c.reltype AS crel, t.reltype AS trel "
> >                        "FROM pg_catalog.pg_class c "
> >                        "LEFT JOIN pg_catalog.pg_class t ON "
> > -                      "  (c.reltoastrelid = t.oid) "
> > +                      "  (c.reltoastrelid = t.oid AND c.relkind <> '%c') "
> >                        "WHERE c.oid = '%u'::pg_catalog.oid;",
> > -                      pg_rel_oid);
> > +                      RELKIND_PARTITIONED_TABLE, pg_rel_oid);
> 
> Hm, I know this code isn't generally well documented, but perhaps we
> could add a comment as to why we're excluding partitioned tables?

I added a short comment nearby.  Hopefully that's sufficient.  Let's see
what the buildfarm members have to say now.

(I wondered about putting the comment between two lines of the string
literal, but decided against it ...)

Thanks!

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Binary upgrade from <12 to 12 creates toast table forpartitioned tables

From
Andres Freund
Date:
On 2019-03-10 13:29:06 -0300, Alvaro Herrera wrote:
> Hello
> 
> On 2019-Mar-08, Andres Freund wrote:
> 
> > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > > index e962ae7e913..1de8da59361 100644
> > > --- a/src/bin/pg_dump/pg_dump.c
> > > +++ b/src/bin/pg_dump/pg_dump.c
> > > @@ -4359,9 +4359,9 @@ binary_upgrade_set_type_oids_by_rel_oid(Archive *fout,
> > >                        "SELECT c.reltype AS crel, t.reltype AS trel "
> > >                        "FROM pg_catalog.pg_class c "
> > >                        "LEFT JOIN pg_catalog.pg_class t ON "
> > > -                      "  (c.reltoastrelid = t.oid) "
> > > +                      "  (c.reltoastrelid = t.oid AND c.relkind <> '%c') "
> > >                        "WHERE c.oid = '%u'::pg_catalog.oid;",
> > > -                      pg_rel_oid);
> > > +                      RELKIND_PARTITIONED_TABLE, pg_rel_oid);
> > 
> > Hm, I know this code isn't generally well documented, but perhaps we
> > could add a comment as to why we're excluding partitioned tables?
> 
> I added a short comment nearby.  Hopefully that's sufficient.  Let's see
> what the buildfarm members have to say now.

Thanks!  Looks like crake's doing better.

Greetings,

Andres Freund