Re: BUG #9616: Materialized view with indexes unable to load from pg_dump - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #9616: Materialized view with indexes unable to load from pg_dump |
Date | |
Msg-id | 3867.1395250733@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #9616: Materialized view with indexes unable to load from pg_dump (Jesse Denardo <jesse.denardo@myfarms.com>) |
List | pgsql-bugs |
Jesse Denardo <jesse.denardo@myfarms.com> writes: > [ pg_dump fails to dump this sanely: ] > CREATE SCHEMA test_mat; > CREATE TABLE test_mat.a ( > id integer, > name character varying > ); > ALTER TABLE ONLY test_mat.a ADD CONSTRAINT a_pkey PRIMARY KEY (id); > CREATE MATERIALIZED VIEW test_mat.mat AS ( > SELECT id, name FROM test_mat.a GROUP BY id > ); OK, the reason why this is odd is that the view definition requires that a.id be a primary key. Because primary keys are supposed to be emitted in the post-data section of the dump, while the CREATE MAT VIEW command will be emitted pre-data, we have a dump-object sorting failure, which manifests as a dependency loop involving the matview, its rule, a's primary key index, and the section boundary pseudo-objects. The code in pg_dump_sort.c assumes that any relation with an ON SELECT rule must be a regular view, and so it tries to break the loop by splitting the view into a CREATE TABLE command followed by a CREATE RULE command. Of course, that results in SQL that recreates a plain view not a matview; not to mention your original complaint that it fails when there are indexes on the matview. I've developed the attached draft patch that fixes this by allowing the CREATE MATERIALIZED VIEW command to be postponed into the post-data section. I think this is pretty darn ugly, since people won't expect such a categorization, but I'm not sure we have any choice so far as existing branches are concerned. If we had a CREATE OR REPLACE MATERIALIZED VIEW type of command, we could imagine fixing this by emitting a dummy create command in the pre-data section, along the lines of CREATE MATERIALIZED VIEW mv AS SELECT NULL::coltype1 AS colname1, NULL::coltype2 AS colname2, ... WITH NO DATA; and then overwriting that with CREATE OR REPLACE MATERIALIZED VIEW at the point where it's safe to give the real view definition. (If we were to write code for that, I'd be pretty inclined to change the dumping of separated views to look similar, instead of the hack they are now.) Of course, you could argue that that wouldn't be materially less confusing than the other way. But I suspect we are going to end up doing something of the sort anyhow, because I have little faith that the technique in this patch will fix every case of circularities involving matviews. Comments? Anybody want to try to fix this in another way? regards, tom lane diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 4001f3f..2653ef0 100644 *** a/src/bin/pg_dump/pg_dump.c --- b/src/bin/pg_dump/pg_dump.c *************** getTables(Archive *fout, int *numTables) *** 4796,4801 **** --- 4796,4803 ---- selectDumpableTable(&tblinfo[i]); tblinfo[i].interesting = tblinfo[i].dobj.dump; + tblinfo[i].postponed_def = false; /* might get set during sort */ + /* * Read-lock target tables to make sure they aren't DROPPED or altered * in schema before we get around to dumping them. *************** dumpTableSchema(Archive *fout, TableInfo *** 13611,13617 **** (tbinfo->relkind == RELKIND_VIEW) ? NULL : tbinfo->reltablespace, tbinfo->rolname, (strcmp(reltypename, "TABLE") == 0) ? tbinfo->hasoids : false, ! reltypename, SECTION_PRE_DATA, q->data, delq->data, NULL, NULL, 0, NULL, NULL); --- 13613,13620 ---- (tbinfo->relkind == RELKIND_VIEW) ? NULL : tbinfo->reltablespace, tbinfo->rolname, (strcmp(reltypename, "TABLE") == 0) ? tbinfo->hasoids : false, ! reltypename, ! tbinfo->postponed_def ? SECTION_POST_DATA : SECTION_PRE_DATA, q->data, delq->data, NULL, NULL, 0, NULL, NULL); diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h index 398eca1..e01015e 100644 *** a/src/bin/pg_dump/pg_dump.h --- b/src/bin/pg_dump/pg_dump.h *************** typedef struct _tableInfo *** 241,247 **** char *reltablespace; /* relation tablespace */ char *reloptions; /* options specified by WITH (...) */ char *checkoption; /* WITH CHECK OPTION */ ! char *toast_reloptions; /* ditto, for the TOAST table */ bool hasindex; /* does it have any indexes? */ bool hasrules; /* does it have any rules? */ bool hastriggers; /* does it have any triggers? */ --- 241,247 ---- char *reltablespace; /* relation tablespace */ char *reloptions; /* options specified by WITH (...) */ char *checkoption; /* WITH CHECK OPTION */ ! char *toast_reloptions; /* WITH options for the TOAST table */ bool hasindex; /* does it have any indexes? */ bool hasrules; /* does it have any rules? */ bool hastriggers; /* does it have any triggers? */ *************** typedef struct _tableInfo *** 254,262 **** /* these two are set only if table is a sequence owned by a column: */ Oid owning_tab; /* OID of table owning sequence */ int owning_col; /* attr # of column owning sequence */ ! int relpages; bool interesting; /* true if need to collect more data */ /* * These fields are computed only if we decide the table is interesting --- 254,263 ---- /* these two are set only if table is a sequence owned by a column: */ Oid owning_tab; /* OID of table owning sequence */ int owning_col; /* attr # of column owning sequence */ ! int relpages; /* table's size in pages (from pg_class) */ bool interesting; /* true if need to collect more data */ + bool postponed_def; /* matview must be postponed into post-data */ /* * These fields are computed only if we decide the table is interesting diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c index bda3535..3590932 100644 *** a/src/bin/pg_dump/pg_dump_sort.c --- b/src/bin/pg_dump/pg_dump_sort.c *************** repairTypeFuncLoop(DumpableObject *typeo *** 798,803 **** --- 798,804 ---- * will be an implicit dependency in the other direction, we need to break * the loop. If there are no other objects in the loop then we can remove * the implicit dependency and leave the ON SELECT rule non-separate. + * This applies to matviews, as well. */ static void repairViewRuleLoop(DumpableObject *viewobj, *************** repairViewRuleLoop(DumpableObject *viewo *** 814,820 **** * Because findLoop() finds shorter cycles before longer ones, it's likely * that we will have previously fired repairViewRuleLoop() and removed the * rule's dependency on the view. Put it back to ensure the rule won't be ! * emitted before the view... */ static void repairViewRuleMultiLoop(DumpableObject *viewobj, --- 815,823 ---- * Because findLoop() finds shorter cycles before longer ones, it's likely * that we will have previously fired repairViewRuleLoop() and removed the * rule's dependency on the view. Put it back to ensure the rule won't be ! * emitted before the view. ! * ! * Note: this approach does *not* work for matviews, at the moment. */ static void repairViewRuleMultiLoop(DumpableObject *viewobj, *************** repairViewRuleMultiLoop(DumpableObject * *** 842,847 **** --- 845,874 ---- } /* + * If a matview is involved in a multi-object loop, we can't currently fix + * that by splitting off the rule. As a stopgap, we try to fix it by + * dropping the constraint that the matview be dumped in the pre-data section. + * This is sufficient to handle cases where a matview depends on some unique + * index, as can happen if it has a GROUP BY for example. + * + * Note that the "next object" is not necessarily the matview itself; + * it could be the matview's rowtype, for example. We may come through here + * several times while removing all the pre-data linkages. + */ + static void + repairMatViewBoundaryMultiLoop(DumpableObject *matviewobj, + DumpableObject *boundaryobj, + DumpableObject *nextobj) + { + TableInfo *matviewinfo = (TableInfo *) matviewobj; + + /* remove boundary's dependency on object after it in loop */ + removeObjectDependency(boundaryobj, nextobj->dumpId); + /* mark matview as postponed into post-data section */ + matviewinfo->postponed_def = true; + } + + /* * Because we make tables depend on their CHECK constraints, while there * will be an automatic dependency in the other direction, we need to break * the loop. If there are no other objects in the loop then we can remove *************** repairDependencyLoop(DumpableObject **lo *** 956,965 **** return; } ! /* View and its ON SELECT rule */ if (nLoop == 2 && loop[0]->objType == DO_TABLE && loop[1]->objType == DO_RULE && ((RuleInfo *) loop[1])->ev_type == '1' && ((RuleInfo *) loop[1])->is_instead && ((RuleInfo *) loop[1])->ruletable == (TableInfo *) loop[0]) --- 983,994 ---- return; } ! /* View (including matview) and its ON SELECT rule */ if (nLoop == 2 && loop[0]->objType == DO_TABLE && loop[1]->objType == DO_RULE && + (((TableInfo *) loop[0])->relkind == 'v' || /* RELKIND_VIEW */ + ((TableInfo *) loop[0])->relkind == 'm') && /* RELKIND_MATVIEW */ ((RuleInfo *) loop[1])->ev_type == '1' && ((RuleInfo *) loop[1])->is_instead && ((RuleInfo *) loop[1])->ruletable == (TableInfo *) loop[0]) *************** repairDependencyLoop(DumpableObject **lo *** 970,975 **** --- 999,1006 ---- if (nLoop == 2 && loop[1]->objType == DO_TABLE && loop[0]->objType == DO_RULE && + (((TableInfo *) loop[1])->relkind == 'v' || /* RELKIND_VIEW */ + ((TableInfo *) loop[1])->relkind == 'm') && /* RELKIND_MATVIEW */ ((RuleInfo *) loop[0])->ev_type == '1' && ((RuleInfo *) loop[0])->is_instead && ((RuleInfo *) loop[0])->ruletable == (TableInfo *) loop[1]) *************** repairDependencyLoop(DumpableObject **lo *** 978,989 **** return; } ! /* Indirect loop involving view and ON SELECT rule */ if (nLoop > 2) { for (i = 0; i < nLoop; i++) { ! if (loop[i]->objType == DO_TABLE) { for (j = 0; j < nLoop; j++) { --- 1009,1021 ---- return; } ! /* Indirect loop involving view (but not matview) and ON SELECT rule */ if (nLoop > 2) { for (i = 0; i < nLoop; i++) { ! if (loop[i]->objType == DO_TABLE && ! ((TableInfo *) loop[i])->relkind == 'v') /* RELKIND_VIEW */ { for (j = 0; j < nLoop; j++) { *************** repairDependencyLoop(DumpableObject **lo *** 1000,1005 **** --- 1032,1061 ---- } } + /* Indirect loop involving matview and data boundary */ + if (nLoop > 2) + { + for (i = 0; i < nLoop; i++) + { + if (loop[i]->objType == DO_TABLE && + ((TableInfo *) loop[i])->relkind == 'm') /* RELKIND_MATVIEW */ + { + for (j = 0; j < nLoop; j++) + { + if (loop[j]->objType == DO_PRE_DATA_BOUNDARY) + { + DumpableObject *nextobj; + + nextobj = (j < nLoop - 1) ? loop[j + 1] : loop[0]; + repairMatViewBoundaryMultiLoop(loop[i], loop[j], + nextobj); + return; + } + } + } + } + } + /* Table and CHECK constraint */ if (nLoop == 2 && loop[0]->objType == DO_TABLE &&
pgsql-bugs by date: