Adding foreign key constraint holds exclusive lock for too long (on production database) - Mailing list pgsql-performance

From Ben Hoyt
Subject Adding foreign key constraint holds exclusive lock for too long (on production database)
Date
Msg-id CAL9jXCFwKXt67+fSGq2N4NDMEadAe=DtKdUGU2gFSS0t-crFxA@mail.gmail.com
Whole thread Raw
Responses Re: Adding foreign key constraint holds exclusive lock for too long (on production database)
List pgsql-performance
Hi folks,

We're adding a foreign key constraint to a 20-million row table on our
production database, and it's taking about 7 minutes. Because it's an
ALTER TABLE, Postgres acquires an ACCESS EXCLUSIVE lock that prevents
any reads/writes (though this particular table is very write-heavy, so
even a read lock wouldn't help here).

For context: we do this whenever we deploy our site, because our
database is split across two schemas ("live" and "content"), and the
"content" schema we dump from our office database and restore into our
production database. To achieve this we restore it as "contentnew"
into the production db, then rename the "content" schema to
"contentold" and the "contentnew" schema to "content".

This completes the actual deployment, however, now our live-to-content
foreign keys are pointing to "contentold", so the final step is to go
through and drop all the live-to-content foreign keys and recreate
them (against the new content schema). Most of the tables are small
and re-adding the constraint is quick, except for this one table,
which is 20M rows and basically pauses our live website for 7 minutes.

A couple of questions about the ADD CONSTRAINT. The foreign key column
on the local table is indexed, and there are only ~50 unique values,
so the db *could* come up with the unique values pretty quickly and
then check them. Or, even if it needs to do a full scan of the 20M-big
table ("ratesrequests") and join with the referenced table
("provider") on the foreign key, which is I think the most it should
have to do to check the foreign key, the following query only takes
~20s, not 7 minutes:

select p.name
from ratesrequests r
join provider p on r.providerid = p.providerid

I'm guessing the ADD CONSTRAINT logic bypasses some of the query
optimization used for SELECT queries. So I suppose my questions are:

1) Are there ways to speed up adding the constraint? Just speeding it
up a little bit won't really help -- for this purpose it'll need to be
an order of magnitude or so. I'm aware of a couple of possibilities:

a) Upgrade to Postgres 9.1 and use ADD CONSTRAINT NOT VALID. However,
this doesn't really help, as you need to run VALIDATE CONSTRAINT at
some later stage, which still grabs the exclusive lock.

b) Delete old rows from the table so it's not so big. Feels a bit
hacky just to fix this issue.

c) Get rid of this foreign key constraint entirely and just check it
in code when we insert. Pragmatic solution, but not ideal.

2) Is there a better way to do the "content" schema dump/restore that
avoids dropping and recreating the inter-schema foreign keys?

Other notes and research:

* We're running "PostgreSQL 9.0.2, compiled by Visual C++ build 1500,
64-bit" on 64-bit Windows Server 2008 SP1 (6.0.6001)
* The "ratesrequests" table has two text columns, one of which often
contains a few hundred to a couple of KB of data in the field. It is
added to rapidly. We regularly VACCUM ANALYZE it.
* As expected, the ADD CONSTRAINT has gotten slower over time as this
table grew. However -- I'm not 100% sure of this, but it seems to have
jumped recently (from 3-4 minutes to 7 minutes).
* http://www.postgresql.org/message-id/20030323112241.W14634-100000@megazone23.bigpanda.com
-- indicates that ADD CONSTRAINT isn't optimized as well as it could
be
* http://www.postgresql.org/message-id/51A11C97.90209@iol.ie --
indicates that the db ignores the index when add constraints

Thanks,
Ben.


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Problems with hash join over nested loop
Next
From: David Johnston
Date:
Subject: Re: Adding foreign key constraint holds exclusive lock for too long (on production database)