Re: Bug in pg_restore with EventTrigger in parallel mode - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bug in pg_restore with EventTrigger in parallel mode
Date
Msg-id 8858.1583624536@sss.pgh.pa.us
Whole thread Raw
In response to Re: Bug in pg_restore with EventTrigger in parallel mode  (vignesh C <vignesh21@gmail.com>)
Responses Re: Bug in pg_restore with EventTrigger in parallel mode  (Fabrízio de Royes Mello <fabriziomello@gmail.com>)
List pgsql-hackers
vignesh C <vignesh21@gmail.com> writes:
> I'm not sure if we can add a test for this, can you have a thought
> about this to check if we can add a test.

Yeah, I'm not quite sure if a test is worth the trouble or not.

We clearly do need to restore event triggers later than we do now, even
without considering parallel restore: they should not be able to prevent
us from executing other restore actions.  This is just like the rule that
we don't restore DML triggers until after we've loaded data.

However, I think that the existing code is correct to restore event
triggers before matview refreshes, not after as this patch would have us
do.  The basic idea for matview refresh is that it should happen in the
normal running state of the database.  If an event trigger interferes with
that, it would've done so in normal running as well.

I'm also not terribly on board with loading more functionality onto the
RestorePass mechanism.  That's a crock that should go away someday,
because it basically duplicates and overrides pg_dump's normal object
sorting mechanism.  So we don't want it doing more than it absolutely
has to.  But in this case, I don't see any reason why we can't just
restore event triggers and matviews in the same post-ACL restore pass.
In a serial restore, that will make the event triggers come first
because of the existing sort rules.  In a parallel restore, it's possible
that they'd be intermixed, but that doesn't bother me.  Again, if your
event triggers have side-effects on your matview refreshes, you're
going to have some issues anyway.

So that leads me to the attached, which renames the "RESTORE_PASS_REFRESH"
symbol for clarity, and updates the pg_dump_sort.c code and comments
to match what's really going on.

            regards, tom lane

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 15900ff..d3f85d6 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -670,11 +670,11 @@ RestoreArchive(Archive *AHX)
     {
         /*
          * In serial mode, process everything in three phases: normal items,
-         * then ACLs, then matview refresh items.  We might be able to skip
-         * one or both extra phases in some cases, eg data-only restores.
+         * then ACLs, then post-ACL items.  We might be able to skip one or
+         * both extra phases in some cases, eg data-only restores.
          */
         bool        haveACL = false;
-        bool        haveRefresh = false;
+        bool        havePostACL = false;

         for (te = AH->toc->next; te != AH->toc; te = te->next)
         {
@@ -689,8 +689,8 @@ RestoreArchive(Archive *AHX)
                 case RESTORE_PASS_ACL:
                     haveACL = true;
                     break;
-                case RESTORE_PASS_REFRESH:
-                    haveRefresh = true;
+                case RESTORE_PASS_POST_ACL:
+                    havePostACL = true;
                     break;
             }
         }
@@ -705,12 +705,12 @@ RestoreArchive(Archive *AHX)
             }
         }

-        if (haveRefresh)
+        if (havePostACL)
         {
             for (te = AH->toc->next; te != AH->toc; te = te->next)
             {
                 if ((te->reqs & (REQ_SCHEMA | REQ_DATA)) != 0 &&
-                    _tocEntryRestorePass(te) == RESTORE_PASS_REFRESH)
+                    _tocEntryRestorePass(te) == RESTORE_PASS_POST_ACL)
                     (void) restore_toc_entry(AH, te, false);
             }
         }
@@ -3082,8 +3082,9 @@ _tocEntryRestorePass(TocEntry *te)
         strcmp(te->desc, "ACL LANGUAGE") == 0 ||
         strcmp(te->desc, "DEFAULT ACL") == 0)
         return RESTORE_PASS_ACL;
-    if (strcmp(te->desc, "MATERIALIZED VIEW DATA") == 0)
-        return RESTORE_PASS_REFRESH;
+    if (strcmp(te->desc, "EVENT TRIGGER") == 0 ||
+        strcmp(te->desc, "MATERIALIZED VIEW DATA") == 0)
+        return RESTORE_PASS_POST_ACL;
     return RESTORE_PASS_MAIN;
 }

diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 6768f92..67f3474 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -209,10 +209,14 @@ typedef enum
  * data restore failures.  On the other hand, matview REFRESH commands should
  * come out after ACLs, as otherwise non-superuser-owned matviews might not
  * be able to execute.  (If the permissions at the time of dumping would not
- * allow a REFRESH, too bad; we won't fix that for you.)  These considerations
- * force us to make three passes over the TOC, restoring the appropriate
- * subset of items in each pass.  We assume that the dependency sort resulted
- * in an appropriate ordering of items within each subset.
+ * allow a REFRESH, too bad; we won't fix that for you.)  We also want event
+ * triggers to be restored after ACLs, so that they can't mess those up.
+ *
+ * These considerations force us to make three passes over the TOC,
+ * restoring the appropriate subset of items in each pass.  We assume that
+ * the dependency sort resulted in an appropriate ordering of items within
+ * each subset.
+ *
  * XXX This mechanism should be superseded by tracking dependencies on ACLs
  * properly; but we'll still need it for old dump files even after that.
  */
@@ -220,9 +224,9 @@ typedef enum
 {
     RESTORE_PASS_MAIN = 0,        /* Main pass (most TOC item types) */
     RESTORE_PASS_ACL,            /* ACL item types */
-    RESTORE_PASS_REFRESH        /* Matview REFRESH items */
+    RESTORE_PASS_POST_ACL        /* Event trigger and matview refresh items */

-#define RESTORE_PASS_LAST RESTORE_PASS_REFRESH
+#define RESTORE_PASS_LAST RESTORE_PASS_POST_ACL
 } RestorePass;

 typedef enum
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 1432dd7..654e2ec 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -24,8 +24,15 @@
  * Sort priority for database object types.
  * Objects are sorted by type, and within a type by name.
  *
- * Because materialized views can potentially reference system views,
- * DO_REFRESH_MATVIEW should always be the last thing on the list.
+ * Triggers, event triggers, and materialized views are intentionally sorted
+ * late.  Triggers must be restored after all data modifications, so that
+ * they don't interfere with loading data.  Event triggers are restored
+ * next-to-last so that they don't interfere with object creations of any
+ * kind.  Matview refreshes are last because they should execute in the
+ * database's normal state (e.g., they must come after all ACLs are restored;
+ * also, if they choose to look at system catalogs, they should see the final
+ * restore state).  If you think to change this, see also the RestorePass
+ * mechanism in pg_backup_archiver.c.
  *
  * NOTE: object-type priorities must match the section assignments made in
  * pg_dump.c; that is, PRE_DATA objects must sort before DO_PRE_DATA_BOUNDARY,
@@ -66,18 +73,18 @@ static const int dbObjectTypePriority[] =
     15,                            /* DO_TSCONFIG */
     16,                            /* DO_FDW */
     17,                            /* DO_FOREIGN_SERVER */
-    33,                            /* DO_DEFAULT_ACL */
+    38,                            /* DO_DEFAULT_ACL --- done in ACL pass */
     3,                            /* DO_TRANSFORM */
     21,                            /* DO_BLOB */
     25,                            /* DO_BLOB_DATA */
     22,                            /* DO_PRE_DATA_BOUNDARY */
     26,                            /* DO_POST_DATA_BOUNDARY */
-    34,                            /* DO_EVENT_TRIGGER */
-    39,                            /* DO_REFRESH_MATVIEW */
-    35,                            /* DO_POLICY */
-    36,                            /* DO_PUBLICATION */
-    37,                            /* DO_PUBLICATION_REL */
-    38                            /* DO_SUBSCRIPTION */
+    39,                            /* DO_EVENT_TRIGGER --- next to last! */
+    40,                            /* DO_REFRESH_MATVIEW --- last! */
+    34,                            /* DO_POLICY */
+    35,                            /* DO_PUBLICATION */
+    36,                            /* DO_PUBLICATION_REL */
+    37                            /* DO_SUBSCRIPTION */
 };

 StaticAssertDecl(lengthof(dbObjectTypePriority) == (DO_SUBSCRIPTION + 1),

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: What's difference among insert/write/flush lsn?
Next
From: Tom Lane
Date:
Subject: Re: range_agg