Re: ALTER TABLE ALTER COLUMN SET TYPE crash - Mailing list pgsql-bugs

From Tom Lane
Subject Re: ALTER TABLE ALTER COLUMN SET TYPE crash
Date
Msg-id 2944490.1598473508@sss.pgh.pa.us
Whole thread Raw
In response to Re: ALTER TABLE ALTER COLUMN SET TYPE crash  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
I wrote:
> The core issue here is that the table's catalog entries, as visible within
> this transaction, don't match what's in its disk file.  ALTER TABLE knows
> that and is careful not to touch the old table using the new rowtype ---
> but nothing else knows that.  So I think we need to block other code
> from touching the table-under-modification.  As you say, there's not
> any existing infrastructure that will serve for that directly.  We might
> be able to invent something comparable to the existing relcache entry
> refcount, but counting exclusive opens ("there can be only one").

I initially tried to fix this with a hard restriction that you can't open
a relation that's marked as being exclusively accessed, as in the 0001
patch below.  That crashed and burned most spectacularly.  To understand
the second attempt, it's helpful to enumerate some of the problems:

1. We have felt free to allow ALTER TABLE and similar commands to call
subroutines that re-open the target relation.  Refactoring to the point
where that wouldn't happen (by means of always passing down an open
Relation) seems entirely impractical.

2. Keeping the exclusive-access state in relcache entries, as I first
tried to do, simply doesn't work, because ALTER TABLE doesn't
hold the relcache entry open throughout --- for instance, the initial
pin is released just after Phase 1 in ATController.  I did try not
doing that, but it breaks other things.  Besides I'm pretty sure that
that's designed that way intentionally, to reduce the number of forced
rebuilds of the relcache entry.

3. We can't have CheckTableNotInUse enforcing this either, because trying
to do so fails in numerous situation.  ALTER TABLE itself is recursive to
some extent, and it turns out that most of the other callers neither need
nor want any such restriction.  For instance there's actually a regression
test that fails if TRUNCATE tries to lock out accesses by called code.

In view of point 1, I have set this up so that we only check for
disallowed re-entrancy in the parse/rewrite/plan/execute code path.
(The checks added there are in the places where we first acquire lock
on any given relation.)  Perhaps there are other places where we need
to check, but this at least plugs the primary hole.

In view of point 2, there's a new separate hashtable that holds the
exclusive-marking state.  The extra cycles to probe it are slightly
annoying, but in the big scheme of things I doubt it's measurable.
(In any case, in a typical session that has never done ALTER TABLE,
no lookups are required.)

As for point 3, right now only ALTER TABLE is applying such marking
at all, and it's careful to release it as soon as consistency has
been restored.  I looked through the other CheckTableNotInUse
callers and tentatively concluded that none of the other ones have
a problem; but more eyeballs on that would be a good thing.

Anyway, 0001 is just for amusement's sake; 0002 is the proposed patch.

            regards, tom lane

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d2b15a3387..9f3f200e7b 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3594,6 +3594,10 @@ CheckTableNotInUse(Relation rel, const char *stmt)
         /* translator: first %s is a SQL command, eg ALTER TABLE */
                  errmsg("cannot %s \"%s\" because it has pending trigger events",
                         stmt, RelationGetRelationName(rel))));
+
+    /* Mark the relation to disallow re-entrant access */
+    Assert(rel->rd_exclusive == NULL);
+    rel->rd_exclusive = stmt;
 }

 /*
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index a2453cf1f4..beff48accf 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1095,6 +1095,7 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
      * new relations, it won't be new.  It could be temp though.
      */
     relation->rd_refcnt = 0;
+    relation->rd_exclusive = NULL;
     relation->rd_isnailed = false;
     relation->rd_createSubid = InvalidSubTransactionId;
     relation->rd_newRelfilenodeSubid = InvalidSubTransactionId;
@@ -2069,6 +2070,9 @@ RelationIdGetRelation(Oid relationId)
 void
 RelationIncrementReferenceCount(Relation rel)
 {
+    if (rel->rd_exclusive)
+        elog(ERROR, "relation %s already in use by %s",
+             RelationGetRelationName(rel), rel->rd_exclusive);
     ResourceOwnerEnlargeRelationRefs(CurrentResourceOwner);
     rel->rd_refcnt += 1;
     if (!IsBootstrapProcessingMode())
@@ -2086,6 +2090,8 @@ RelationDecrementReferenceCount(Relation rel)
     rel->rd_refcnt -= 1;
     if (!IsBootstrapProcessingMode())
         ResourceOwnerForgetRelationRef(CurrentResourceOwner, rel);
+    if (rel->rd_refcnt == 0)
+        rel->rd_exclusive = NULL;
 }

 /*
@@ -2617,8 +2623,9 @@ RelationClearRelation(Relation relation, bool rebuild)

         /* rd_smgr must not be swapped, due to back-links from smgr level */
         SWAPFIELD(SMgrRelation, rd_smgr);
-        /* rd_refcnt must be preserved */
+        /* rd_refcnt and rd_exclusive must be preserved */
         SWAPFIELD(int, rd_refcnt);
+        SWAPFIELD(const char *, rd_exclusive);
         /* isnailed shouldn't change */
         Assert(newrel->rd_isnailed == relation->rd_isnailed);
         /* creation sub-XIDs must be preserved */
@@ -5917,6 +5924,7 @@ load_relcache_init_file(bool shared)
             rel->rd_refcnt = 1;
         else
             rel->rd_refcnt = 0;
+        rel->rd_exclusive = NULL;
         rel->rd_indexvalid = false;
         rel->rd_indexlist = NIL;
         rel->rd_pkindex = InvalidOid;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 0b5957ba02..22cf7d4e70 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -57,6 +57,7 @@ typedef struct RelationData
     struct SMgrRelationData *rd_smgr;    /* cached file handle, or NULL */
     int            rd_refcnt;        /* reference count */
     BackendId    rd_backend;        /* owning backend id, if temporary relation */
+    const char *rd_exclusive;    /* if not NULL, disallow re-entrant access */
     bool        rd_islocaltemp; /* rel is a temp rel of this session */
     bool        rd_isnailed;    /* rel is nailed in cache */
     bool        rd_isvalid;        /* relcache entry is valid */
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d2b15a3387..ddcbf88c38 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3571,6 +3571,9 @@ RenameRelationInternal(Oid myrelid, const char *newrelname, bool is_internal, bo
  * we are worried about active indexscans on the index.  The trigger-event
  * check can be skipped, since we are doing no damage to the parent table.
  *
+ * Callers should consider also calling RelationMarkExclusiveAccess.
+ * We don't force that, because for many it's safe not to require it.
+ *
  * The statement name (eg, "ALTER TABLE") is passed for use in error messages.
  */
 void
@@ -3594,6 +3597,11 @@ CheckTableNotInUse(Relation rel, const char *stmt)
         /* translator: first %s is a SQL command, eg ALTER TABLE */
                  errmsg("cannot %s \"%s\" because it has pending trigger events",
                         stmt, RelationGetRelationName(rel))));
+
+    /*
+     * Ideally, we'd also do RelationCheckExclusiveAccess(rel) here, but
+     * currently ALTER TABLE can recurse in ways that would trigger an error.
+     */
 }

 /*
@@ -4001,22 +4009,45 @@ ATController(AlterTableStmt *parsetree,
     List       *wqueue = NIL;
     ListCell   *lcmd;

-    /* Phase 1: preliminary examination of commands, create work queue */
-    foreach(lcmd, cmds)
+    /*
+     * Tables are marked for exclusive access when they are added to the work
+     * queue.  We must be certain that the marks are released on exit, so use
+     * a PG_TRY block.  (If we get through without error, ATRewriteTables will
+     * have released the marks, so we needn't do so again.)
+     */
+    PG_TRY();
     {
-        AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);
+        /* Phase 1: preliminary examination of commands, create work queue */
+        foreach(lcmd, cmds)
+        {
+            AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);

-        ATPrepCmd(&wqueue, rel, cmd, recurse, false, lockmode, context);
-    }
+            ATPrepCmd(&wqueue, rel, cmd, recurse, false, lockmode, context);
+        }

-    /* Close the relation, but keep lock until commit */
-    relation_close(rel, NoLock);
+        /* Close the relation, but keep lock until commit */
+        relation_close(rel, NoLock);
+
+        /* Phase 2: update system catalogs */
+        ATRewriteCatalogs(&wqueue, lockmode, context);
+
+        /* Phase 3: scan/rewrite tables as needed, and run afterStmts */
+        ATRewriteTables(parsetree, &wqueue, lockmode, context);
+    }
+    PG_CATCH();
+    {
+        /* Release exclusive-access marks after an error */
+        ListCell   *ltab;

-    /* Phase 2: update system catalogs */
-    ATRewriteCatalogs(&wqueue, lockmode, context);
+        foreach(ltab, wqueue)
+        {
+            AlteredTableInfo *tab = (AlteredTableInfo *) lfirst(ltab);

-    /* Phase 3: scan/rewrite tables as needed, and run afterStmts */
-    ATRewriteTables(parsetree, &wqueue, lockmode, context);
+            RelationReleaseExclusiveAccess(tab->relid);
+        }
+        PG_RE_THROW();
+    }
+    PG_END_TRY();
 }

 /*
@@ -5064,6 +5095,19 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode,
         }
     }

+    /*
+     * At this point, the table(s) are self-consistent again, so it's safe to
+     * release our exclusive-access marks.  We must do so, in fact, because
+     * foreign key revalidation will use the executor which would fail;
+     * likewise, any afterStmts could fail.
+     */
+    foreach(ltab, *wqueue)
+    {
+        AlteredTableInfo *tab = (AlteredTableInfo *) lfirst(ltab);
+
+        RelationReleaseExclusiveAccess(tab->relid);
+    }
+
     /*
      * Foreign key constraints are checked in a final pass, since (a) it's
      * generally best to examine each one separately, and (b) it's at least
@@ -5536,6 +5580,14 @@ ATGetQueueEntry(List **wqueue, Relation rel)

     *wqueue = lappend(*wqueue, tab);

+    /*
+     * Mark the table to disallow re-entrant access by any user-defined
+     * queries that might be called from ALTER TABLE actions.  Now that the
+     * table is entered in the work queue, we can be sure that the marking
+     * will be released on error exit.
+     */
+    RelationMarkExclusiveAccess(relid, "ALTER TABLE");
+
     return tab;
 }

diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 3132fd35a5..0bb4c50c96 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -181,9 +181,15 @@ expand_inherited_rtentry(PlannerInfo *root, RelOptInfo *rel,
             RangeTblEntry *childrte;
             Index        childRTindex;

-            /* Open rel if needed; we already have required locks */
+            /*
+             * Open rel if needed.  We already have required locks, but for
+             * child tables, we must also check for exclusive-access marks.
+             */
             if (childOID != parentOID)
+            {
                 newrelation = table_open(childOID, NoLock);
+                RelationCheckExclusiveAccess(newrelation);
+            }
             else
                 newrelation = oldrelation;

@@ -360,6 +366,9 @@ expand_partitioned_rtentry(PlannerInfo *root, RelOptInfo *relinfo,
         /* Open rel, acquiring required locks */
         childrel = table_open(childOID, lockmode);

+        /* Check for exclusive-access marks, too */
+        RelationCheckExclusiveAccess(childrel);
+
         /*
          * Temporary partitions belonging to other sessions should have been
          * disallowed at definition, but for paranoia's sake, let's double
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index b875a50646..7e6416bc79 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1336,6 +1336,8 @@ buildNSItemFromLists(RangeTblEntry *rte, Index rtindex,
  * This is essentially just the same as table_openrv(), except that it caters
  * to some parser-specific error reporting needs, notably that it arranges
  * to include the RangeVar's parse location in any resulting error.
+ * Also, we check to see if we're inside some command that has marked the
+ * table for exclusive access.
  *
  * Note: properly, lockmode should be declared LOCKMODE not int, but that
  * would require importing storage/lock.h into parse_relation.h.  Since
@@ -1379,6 +1381,14 @@ parserOpenTable(ParseState *pstate, const RangeVar *relation, int lockmode)
                                 relation->relname)));
         }
     }
+
+    /*
+     * Verify that table isn't being munged by, eg, ALTER TABLE.  Our lock
+     * ensures that that's not happening in other sessions, but it doesn't
+     * protect against an ALTER in our own session.
+     */
+    RelationCheckExclusiveAccess(rel);
+
     cancel_parser_errposition_callback(&pcbstate);
     return rel;
 }
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index fe777c3103..356c1c9e1a 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -182,6 +182,14 @@ AcquireRewriteLocks(Query *parsetree,

                 rel = table_open(rte->relid, lockmode);

+                /*
+                 * Verify that table isn't being munged by, eg, ALTER TABLE.
+                 * Our lock ensures that that's not happening in other
+                 * sessions, but it doesn't protect against an ALTER in our
+                 * own session.
+                 */
+                RelationCheckExclusiveAccess(rel);
+
                 /*
                  * While we have the relation open, update the RTE's relkind,
                  * just in case it changed since this rule was made.
@@ -3051,6 +3059,11 @@ rewriteTargetView(Query *parsetree, Relation view)
      */
     base_rel = table_open(base_rte->relid, RowExclusiveLock);

+    /*
+     * Verify that table isn't being munged by, eg, ALTER TABLE.
+     */
+    RelationCheckExclusiveAccess(base_rel);
+
     /*
      * While we have the relation open, update the RTE's relkind, just in case
      * it changed since this view was made (cf. AcquireRewriteLocks).
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 50d6ad28b4..fe47bab8f3 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1779,7 +1779,11 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
              * acquire a non-conflicting lock.
              */
             if (acquire)
+            {
                 LockRelationOid(rte->relid, rte->rellockmode);
+                /* ... and check for exclusive-access marks */
+                RelidCheckExclusiveAccess(rte->relid);
+            }
             else
                 UnlockRelationOid(rte->relid, rte->rellockmode);
         }
@@ -1839,7 +1843,11 @@ ScanQueryForLocks(Query *parsetree, bool acquire)
             case RTE_RELATION:
                 /* Acquire or release the appropriate type of lock */
                 if (acquire)
+                {
                     LockRelationOid(rte->relid, rte->rellockmode);
+                    /* ... and check for exclusive-access marks */
+                    RelidCheckExclusiveAccess(rte->relid);
+                }
                 else
                     UnlockRelationOid(rte->relid, rte->rellockmode);
                 break;
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index a2453cf1f4..6fada587f7 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -182,6 +182,17 @@ static TupleDesc *EOXactTupleDescArray;
 static int    NextEOXactTupleDescNum = 0;
 static int    EOXactTupleDescArrayLen = 0;

+/*
+ * Table of relation OIDs that are marked for exclusive access.
+ */
+typedef struct
+{
+    Oid            reloid;            /* lookup key: OID of marked relation */
+    const char *stmt;            /* statement name of marking code */
+} ExclusiveMarkEnt;
+
+static HTAB *ExclusiveMarkTab;
+
 /*
  *        macros to manipulate the lookup hashtable
  */
@@ -6397,3 +6408,130 @@ unlink_initfile(const char *initfilename, int elevel)
                             initfilename)));
     }
 }
+
+
+/*
+ * RelationMarkExclusiveAccess
+ *        Mark relation as not being available for re-entrant access.
+ *
+ * This function and its siblings allow us to prevent access to a table
+ * that's being modified by other code in the same session.  Taking
+ * AccessExclusiveLock prevents *other* sessions from touching the table,
+ * but not our own session.  Therefore, logic that might call arbitrary code
+ * while making critical changes to a table needs to use this to ensure that
+ * the table is not accessed by user code while it's in an inconsistent state.
+ *
+ * The marking persists until RelationReleaseExclusiveAccess is called.
+ * It is caller's responsibility to be sure that happens even after an
+ * error; there is no transaction-related cleanup logic!
+ *
+ * "stmt" should be a literal constant string, or otherwise guaranteed to
+ * remain stable until exclusive access is released.
+ */
+void
+RelationMarkExclusiveAccess(Oid relid, const char *stmt)
+{
+    ExclusiveMarkEnt *entry;
+
+    if (ExclusiveMarkTab == NULL)
+    {
+        /* First time through: initialize the hash table */
+        HASHCTL        ctl;
+
+        MemSet(&ctl, 0, sizeof(ctl));
+        ctl.keysize = sizeof(Oid);
+        ctl.entrysize = sizeof(ExclusiveMarkEnt);
+        ExclusiveMarkTab = hash_create("Exclusive access table", 16,
+                                       &ctl, HASH_ELEM | HASH_BLOBS);
+    }
+
+    /*
+     * Create an entry.  We don't worry about whether there already was one.
+     */
+    entry = (ExclusiveMarkEnt *) hash_search(ExclusiveMarkTab,
+                                             (void *) &relid,
+                                             HASH_ENTER, NULL);
+    entry->stmt = stmt;
+}
+
+/*
+ * RelationReleaseExclusiveAccess
+ *        Release exclusive-access marking of relation.
+ *
+ * It's not an error for the relation not to be marked.
+ */
+void
+RelationReleaseExclusiveAccess(Oid relid)
+{
+    /* Nothing to do if no table */
+    if (ExclusiveMarkTab == NULL)
+        return;
+    /* Delete entry, if any */
+    (void) hash_search(ExclusiveMarkTab,
+                       (void *) &relid,
+                       HASH_REMOVE, NULL);
+
+}
+
+/*
+ * RelationCheckExclusiveAccess
+ *        Throws error if relation is marked for exclusive access.
+ */
+void
+RelationCheckExclusiveAccess(Relation rel)
+{
+    ExclusiveMarkEnt *entry;
+
+    /* Nothing to do if no table */
+    if (ExclusiveMarkTab == NULL)
+        return;
+    /* Is rel in the table? */
+    entry = (ExclusiveMarkEnt *) hash_search(ExclusiveMarkTab,
+                                             (void *) &(rel->rd_id),
+                                             HASH_FIND, NULL);
+    if (entry)
+        ereport(ERROR,
+                (errcode(ERRCODE_OBJECT_IN_USE),
+        /* translator: second %s is a SQL command, eg ALTER TABLE */
+                 errmsg("relation \"%s\" is already in use by %s",
+                        RelationGetRelationName(rel), entry->stmt)));
+}
+
+/*
+ * RelidCheckExclusiveAccess
+ *        As above, but we only have a relation OID.
+ */
+void
+RelidCheckExclusiveAccess(Oid relid)
+{
+    ExclusiveMarkEnt *entry;
+
+    /* Nothing to do if no table */
+    if (ExclusiveMarkTab == NULL)
+        return;
+    /* Is rel in the table? */
+    entry = (ExclusiveMarkEnt *) hash_search(ExclusiveMarkTab,
+                                             (void *) &relid,
+                                             HASH_FIND, NULL);
+    if (entry)
+    {
+        /*
+         * If the OID is in the mark table, it seems reasonably safe to assume
+         * that get_rel_name will succeed.  But just in case ...
+         */
+        char       *relname = get_rel_name(relid);
+
+        if (relname)
+            ereport(ERROR,
+                    (errcode(ERRCODE_OBJECT_IN_USE),
+            /* translator: second %s is a SQL command, eg ALTER TABLE */
+                     errmsg("relation \"%s\" is already in use by %s",
+                            relname, entry->stmt)));
+        else
+            ereport(ERROR,
+                    (errcode(ERRCODE_OBJECT_IN_USE),
+            /* translator: %s is a SQL command, eg ALTER TABLE */
+                     errmsg("relation with OID %u is already in use by %s",
+                            relid, entry->stmt)));
+    }
+}
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 0b5957ba02..0887b53a38 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -640,5 +640,9 @@ typedef struct ViewOptions
 /* routines in utils/cache/relcache.c */
 extern void RelationIncrementReferenceCount(Relation rel);
 extern void RelationDecrementReferenceCount(Relation rel);
+extern void RelationMarkExclusiveAccess(Oid relid, const char *stmt);
+extern void RelationReleaseExclusiveAccess(Oid relid);
+extern void RelationCheckExclusiveAccess(Relation rel);
+extern void RelidCheckExclusiveAccess(Oid relid);

 #endif                            /* REL_H */
diff --git a/src/test/regress/expected/transactions.out b/src/test/regress/expected/transactions.out
index 1b03310029..9ef09c3a15 100644
--- a/src/test/regress/expected/transactions.out
+++ b/src/test/regress/expected/transactions.out
@@ -534,6 +534,14 @@ select * from xacttest;
 (5 rows)

 rollback;
+-- test case for trying to access a table within an ALTER TABLE on it
+alter table xacttest alter column a set data type text
+  using max_xacttest()::text;
+ERROR:  relation "xacttest" is already in use by ALTER TABLE
+LINE 1: SELECT max(a) from xacttest
+                           ^
+QUERY:  SELECT max(a) from xacttest
+CONTEXT:  PL/pgSQL function max_xacttest() line 1 at RETURN
 -- test case for problems with dropping an open relation during abort
 BEGIN;
     savepoint x;
diff --git a/src/test/regress/sql/transactions.sql b/src/test/regress/sql/transactions.sql
index bf1016489d..1ca9b335aa 100644
--- a/src/test/regress/sql/transactions.sql
+++ b/src/test/regress/sql/transactions.sql
@@ -316,6 +316,10 @@ select * from xacttest;
 rollback;


+-- test case for trying to access a table within an ALTER TABLE on it
+alter table xacttest alter column a set data type text
+  using max_xacttest()::text;
+
 -- test case for problems with dropping an open relation during abort
 BEGIN;
     savepoint x;

pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #16594: DROP INDEX CONCURRENTLY fails on partitioned table with a non helpful error message.
Next
From: Alvaro Herrera
Date:
Subject: Re: ALTER TABLE ALTER COLUMN SET TYPE crash