ALTER TABLE lock strength reduction patch is unsafe - Mailing list pgsql-hackers

If you set up a pgbench test case that hits the database with a lot of
concurrent selects and non-exclusive-locking ALTER TABLEs, 9.1 soon
falls over.  For example:

$ cat foo.script
alter table pgbench_accounts set (fillfactor = 100);
SELECT abalance FROM pgbench_accounts WHERE aid = 525212;

$ createdb bench
$ pgbench -i -s 10 bench
...
$ pgbench -c 50 -t 1000000 -f foo.script bench
starting vacuum...end.
Client 10 aborted in state 0: ERROR:  relation "pgbench_accounts" does not exist
Client 5 aborted in state 1: ERROR:  cache lookup failed for relation 46260
Client 44 aborted in state 0: ERROR:  relation "pgbench_accounts" does not exist
Client 3 aborted in state 1: ERROR:  relation "pgbench_accounts" does not exist
LINE 1: SELECT abalance FROM pgbench_accounts WHERE aid = 525212;                            ^
Client 45 aborted in state 1: ERROR:  could not open relation with OID 46260
LINE 1: SELECT abalance FROM pgbench_accounts WHERE aid = 525212;                            ^
Client 15 aborted in state 1: ERROR:  cache lookup failed for relation 46260
Client 34 aborted in state 1: ERROR:  could not open relation with OID 46260
LINE 1: SELECT abalance FROM pgbench_accounts WHERE aid = 525212;                            ^
Client 43 aborted in state 1: ERROR:  cache lookup failed for relation 46260
Client 49 aborted in state 1: ERROR:  relation "pgbench_accounts" does not exist
LINE 1: SELECT abalance FROM pgbench_accounts WHERE aid = 525212;                            ^
Client 12 aborted in state 0: ERROR:  relation "pgbench_accounts" does not exist
Client 23 aborted in state 0: ERROR:  relation "pgbench_accounts" does not exist
Client 14 aborted in state 0: ERROR:  relation "pgbench_accounts" does not exist
Client 6 aborted in state 1: ERROR:  could not open relation with OID 46260
LINE 1: SELECT abalance FROM pgbench_accounts WHERE aid = 525212;                            ^
Client 11 aborted in state 1: ERROR:  could not open relation with OID 46260
LINE 1: SELECT abalance FROM pgbench_accounts WHERE aid = 525212;                            ^
Client 4 aborted in state 0: ERROR:  relation "pgbench_accounts" does not exist
... etc etc ...

On my four-core workstation, the failures are infrequent at up to 30
clients but come pretty fast and furious at 50.

What is happening here is this:

1. Some backend commits an ALTER TABLE and sends out an sinval message.

2. In response, some other backend starts to reload its relcache entry
for pgbench_accounts when it begins its next command.  It does an
indexscan with SnapshotNow on pg_class to find the updated pg_class row.

3. Meanwhile, some third backend commits another ALTER TABLE, updating
the pg_class row another time.  Since we have removed the
AccessExclusiveLock that all variants of ALTER TABLE used to take, this
commit can happen while backend #2 is in process of scanning pg_class.

4. Backend #2 visits the new, about-to-be-committed version of
pgbench_accounts' pg_class row just before backend #3 commits.
It sees the row as not good and keeps scanning.  By the time it
reaches the previous version of the row, however, backend #3
*has* committed.  So that version isn't good according to SnapshotNow
either.

5. Thus, backend #2 fails to find any version of the pg_class row
that satisfies SnapshotNow, and it reports an error.  Depending on just
when this happens during the cache load process, you can get any of
the errors displayed above, or probably some other ones.

The particular case I'm showing here only updates pg_class, but other
non-exclusive-lock variants of ALTER TABLE can probably provoke similar
failures with respect to other catalogs, leading to yet different
misbehaviors.

In typical cases where both versions of the row are on the same page,
the window for the concurrent commit to happen is very narrow --- that's
why you need so many clients to make it happen easily.  With enough
clients there's a good chance of losing the CPU between tuple visits.
But of course Murphy's Law says this will happen in production
situations even if the load isn't so high.

I believe that this is fundamentally unavoidable so long as we use
SnapshotNow to read catalogs --- which is something we've talked about
changing, but it will require a pretty major R&D effort to make it
happen.  In the meantime, we have to go back to using
AccessExclusiveLock for table alterations.  It doesn't help to have
a lower lock level if that means that concurrent transactions will
unpredictably fail instead of waiting.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Cédric Villemain
Date:
Subject: Re: Re: patch review : Add ability to constrain backend temporary file space
Next
From: Florian Pflug
Date:
Subject: Re: Boolean operators without commutators vs. ALL/ANY