Incorrect order of database-locking operations in InitPostgres() - Mailing list pgsql-hackers

From Tom Lane
Subject Incorrect order of database-locking operations in InitPostgres()
Msg-id 5880.1433470526@sss.pgh.pa.us
List pgsql-hackers
I've been chasing the intermittent "cache lookup failed for access method
403" failure at session startup that's been seen lately in the buildfarm,
for instance here:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-06-04%2019%3A22%3A46

(Axolotl has shown this 3 times in the last 90 days, not sure if any
others have seen it.)  I hypothesized that this was triggered by the
"VACUUM FULL pg_am" in the concurrently running vacuum.sql regression
test, so I started running the regression tests in parallel with a
shell script doing

while sleep 0.1; do psql -c 'vacuum full pg_am' regression; done

and sure enough, I can reproduce it once in a while.  I've not tracked
it down yet, but I have gotten pretty annoyed by an unrelated problem:
every so often the "DROP DATABASE regression" at the start of the
regression test sequence will hang up for 5 seconds and then fail,
complaining there's another session already in the DB.  Meanwhile, the
current "psql -c" call also hangs up for 5 seconds.  What is happening
is a sort of deadlock between InitPostgres, which does this:

    /* Now we can mark our PGPROC entry with the database ID */
    /* (We assume this is an atomic store so no lock is needed) */
    MyProc->databaseId = MyDatabaseId;

and then this:

    if (!bootstrap)
        LockSharedObject(DatabaseRelationId, MyDatabaseId, 0,
                         RowExclusiveLock);

and dropdb/CountOtherDBBackends, which first gets the database lock and
then looks to see if any other sessions are advertising the target
database OID in their PGPROC.  If such sessions exist and don't go away
within 5 seconds then CountOtherDBBackends fails.  So, if an incoming
connection sets MyProc->databaseId between the time that dropdb acquires
the database lock and the time that CountOtherDBBackends looks, we have
an effective deadlock because that incoming session will then block on
the database lock.
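
To make the interleaving concrete, here's a toy standalone sketch (plain C
with pthreads, not PostgreSQL code; the names and the unsynchronized flag
are just stand-ins, mirroring the "atomic store" assumption in the real
code).  One thread plays dropdb()/CountOtherDBBackends(), the other an
incoming session using the current InitPostgres() ordering.  The dropper
burns its whole 5-second window waiting for an advertisement that can't go
away, because the advertiser is stuck behind the very lock the dropper
holds:

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static pthread_mutex_t db_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the database lock */
    static volatile int advertised_dbid = 0;    /* plays MyProc->databaseId */

    static void *
    incoming_session(void *arg)
    {
        (void) arg;
        /* InitPostgres() as it stands today: advertise the database first ... */
        advertised_dbid = 42;
        /* ... then take the database lock; blocks behind the dropper */
        pthread_mutex_lock(&db_lock);
        printf("session: finally got the database lock\n");
        pthread_mutex_unlock(&db_lock);
        return NULL;
    }

    int
    main(void)
    {
        pthread_t   tid;
        int         i;

        /* dropdb(): grab the database lock first */
        pthread_mutex_lock(&db_lock);

        /* an incoming connection shows up while we hold the lock */
        pthread_create(&tid, NULL, incoming_session, NULL);
        usleep(100 * 1000);     /* give it time to advertise and block */

        /* CountOtherDBBackends(): wait up to ~5 seconds for the entry to clear */
        for (i = 0; i < 50 && advertised_dbid != 0; i++)
            usleep(100 * 1000);

        if (advertised_dbid != 0)
            printf("dropdb: other sessions still present -> ERROR\n");

        pthread_mutex_unlock(&db_lock);
        pthread_join(tid, NULL);
        return 0;
    }

With the ordering swapped (lock before advertising), the incoming session
simply queues behind the dropper without ever appearing in the scan, so
CountOtherDBBackends doesn't wait; the session then rechecks the database
once it has the lock and fails cleanly if it's gone.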

AFAICS, this is easy to fix by swapping the order of the above-mentioned
operations, ie get the lock before setting MyProc->databaseId, as per
attached patch.  Processes examining the PGPROC array must acquire the
database lock before looking for sessions in the target DB, so they'll
still see any conflicting session.  On the other hand, the incoming
session already has to recheck that the target DB is still there once
it's got the lock, so there's no risk of problems on that side either.

Unless somebody can see a problem with this, I propose to apply and
back-patch this change.  The behavior is infrequent, but it's pretty
nasty when it does occur, since all incoming connections for the
target database are hung up for 5 seconds.  The case of DROP DATABASE
might not be a big deal, but this would also cause, for example, unwanted
failures in CREATE DATABASE if there were short-term connections to the
template database.

            regards, tom lane

diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index aa67f75..a5d88c1 100644
*** a/src/backend/utils/init/postinit.c
--- b/src/backend/utils/init/postinit.c
*************** InitPostgres(const char *in_dbname, Oid
*** 848,865 ****
              strcpy(out_dbname, dbname);
      }

-     /* Now we can mark our PGPROC entry with the database ID */
-     /* (We assume this is an atomic store so no lock is needed) */
-     MyProc->databaseId = MyDatabaseId;
-
-     /*
-      * We established a catalog snapshot while reading pg_authid and/or
-      * pg_database; but until we have set up MyDatabaseId, we won't react to
-      * incoming sinval messages for unshared catalogs, so we won't realize it
-      * if the snapshot has been invalidated.  Assume it's no good anymore.
-      */
-     InvalidateCatalogSnapshot();
-
      /*
       * Now, take a writer's lock on the database we are trying to connect to.
       * If there is a concurrently running DROP DATABASE on that database, this
--- 848,853 ----
*************** InitPostgres(const char *in_dbname, Oid
*** 867,875 ****
       * pg_database).
       *
       * Note that the lock is not held long, only until the end of this startup
!      * transaction.  This is OK since we are already advertising our use of
!      * the database in the PGPROC array; anyone trying a DROP DATABASE after
!      * this point will see us there.
       *
       * Note: use of RowExclusiveLock here is reasonable because we envision
       * our session as being a concurrent writer of the database.  If we had a
--- 855,868 ----
       * pg_database).
       *
       * Note that the lock is not held long, only until the end of this startup
!      * transaction.  This is OK since we will advertise our use of the
!      * database in the PGPROC array before dropping the lock (in fact, that's
!      * the next thing to do).  Anyone trying a DROP DATABASE after this point
!      * will see us in PGPROC once they have the lock.
!      *
!      * Ordering is important here because we don't want to advertise ourselves
!      * as being in this database until we have the lock; otherwise we create
!      * what amounts to a deadlock with CountOtherDBBackends().
       *
       * Note: use of RowExclusiveLock here is reasonable because we envision
       * our session as being a concurrent writer of the database.  If we had a
*************** InitPostgres(const char *in_dbname, Oid
*** 881,886 ****
--- 874,891 ----
          LockSharedObject(DatabaseRelationId, MyDatabaseId, 0,
                           RowExclusiveLock);

+     /* Now we can mark our PGPROC entry with the database ID */
+     /* (We assume this is an atomic store so no lock is needed) */
+     MyProc->databaseId = MyDatabaseId;
+
+     /*
+      * We established a catalog snapshot while reading pg_authid and/or
+      * pg_database; but until we have set up MyDatabaseId, we won't react to
+      * incoming sinval messages for unshared catalogs, so we won't realize it
+      * if the snapshot has been invalidated.  Assume it's no good anymore.
+      */
+     InvalidateCatalogSnapshot();
+
      /*
       * Recheck pg_database to make sure the target database hasn't gone away.
       * If there was a concurrent DROP DATABASE, this ensures we will die
