Dear Postgres Gurus;
I've just diagnosed a PostgreSQL 8.3.4 server which, about once a month,
would deadlock shortly after 11pm. It had been doing this for years,
and the prior response was simply to reboot everything. The culprit
boils down to:
# create table cache_table_20090921 (
site_key int NOT NULL REFERENCES contexts(context_key),
blah_blah TEXT NULL,
PRIMARY KEY (site_key)
);
Without the REFERENCES this is a very fast operation and always works.
Adding "DEFERRABLE INITIALLY DEFERRED" or "ON DELETE CASCADE" (as the
original did) makes no difference. With any REFERENCES, the create
table may blocks a long time (up to 45 minutes and counting), all while
attempting to get an AccessExclusiveLock. The server quickly backs up,
simple select statements block, the server maxes out the number of
incoming apache connections, things start swapping, and generally it all
gets into a huge tangle:
# SELECT l.locktype,c.relname,l.pid,l.mode,granted from pg_locks
l,pg_class c where l.relation=c.oid order by relname,granted; locktype | relname | pid |
mode | grantedrelation | article_key_idx | 18891 |
AccessShareLock | trelation | articles | 18891 |
AccessShareLock | trelation | articles_editorid_idx | 18891 |
AccessShareLock | trelation | articles_pkey | 18891 |
AccessShareLock | trelation | articles_response_to_key_idx | 18891 |
AccessShareLock | trelation | articles_state_idx | 18891 |
AccessShareLock | trelation | article_words | 18891 |
AccessShareLock | trelation | article_words_cw | 18891 |
AccessShareLock | trelation | article_words_wc | 18891 |
AccessShareLock | trelation | collection_context_key_idx | 18891 |
AccessShareLock | trelation | collection_owner_key_idx | 18891 |
AccessShareLock | trelation | collections | 18891 |
AccessShareLock | trelation | context_publication_key_idx | 18891 |
AccessShareLock | trelation | contexts | 18891 |
AccessShareLock | t <--------- HERErelation | contexts | 3879 |
AccessExclusiveLock | frelation | contexts | 5477 |
AccessShareLock | frelation | contexts | 5484 |
AccessShareLock | frelation | contexts | 5485 |
AccessShareLock | frelation | contexts | 5486 |
AccessShareLock | frelation | contexts | 5487 |
AccessShareLock | frelation | contexts | 5489 |
AccessShareLock | frelation | contexts | 5493 |
AccessShareLock | frelation | contexts | 5494 |
AccessShareLock | frelation | contexts | 5496 |
AccessShareLock | frelation | contexts | 5497 |
AccessShareLock | frelation | contexts | 5498 |
AccessShareLock | frelation | contexts | 5499 |
AccessShareLock | frelation | contexts | 5500 |
AccessShareLock | frelation | contexts | 5502 |
AccessShareLock | frelation | contexts | 5503 |
AccessShareLock | frelation | contexts | 5504 |
AccessShareLock | frelation | contexts | 5505 |
AccessShareLock | frelation | contexts | 5506 |
AccessShareLock | frelation | contexts | 5507 |
AccessShareLock | frelation | contexts | 5508 |
AccessShareLock | frelation | contexts | 5509 |
AccessShareLock | frelation | contexts | 5510 |
AccessShareLock | frelation | contexts | 5511 |
AccessShareLock | frelation | contexts | 5512 |
AccessShareLock | frelation | contexts | 5515 |
AccessShareLock | frelation | contexts | 5516 |
AccessShareLock | frelation | contexts | 5517 |
AccessShareLock | frelation | contexts | 5518 |
AccessShareLock | frelation | contexts | 5519 |
AccessShareLock | frelation | contexts | 5520 |
AccessShareLock | frelation | contexts | 5521 |
AccessShareLock | frelation | contexts | 5523 |
AccessShareLock | frelation | contexts | 5524 |
AccessShareLock | frelation | contexts_id_key | 18891 |
AccessShareLock | trelation | contexts_pkey | 18891 |
AccessShareLock | trelation | contexts_publication_date_idx | 18891 |
AccessShareLock | trelation | contexts_site_key_ct_id_journal_key | 18891 |
AccessShareLock | trelation | contexts_site_key_ct_type_idx | 18891 |
AccessShareLock | trelation | group_key_idx | 18891 |
AccessShareLock | trelation | parent_key_idx | 18891 |
AccessShareLock | trelation | pg_class | 3911 |
AccessShareLock | trelation | pg_class_oid_index | 3911 |
AccessShareLock | trelation | pg_class_relname_nsp_index | 3911 |
AccessShareLock | trelation | pg_locks | 3911 |
AccessShareLock | trelation | site_key_idx | 18891 |
AccessShareLock | trelation | virtual_ancestor_key_idx | 18891 |
AccessShareLock | trelation | virtual_ancestors | 18891 |
AccessShareLock | trelation | virtual_ancestors_pkey | 18891 |
AccessShareLock | trelation | virtual_context_key_idx | 18891 |
AccessShareLock | trelation | words | 18891 |
AccessShareLock | trelation | words_pkey | 18891 |
AccessShareLock | trelation | words_word | 18891 |
AccessShareLock | t
# select pg_size_pretty(pg_total_relation_size('contexts_pkey'));
35MB
# select pg_size_pretty(pg_total_relation_size('contexts'));
787MB
# select pg_cancel_backend(3879)
t
And all is well again.
-----------------------------------------------------------------------------------------------------------------
So my questions are various
1) Why the AccessExclusiveLock on create table?
2) Why is the foreign key check a heavy operation, since a new table
will have zero foreign keys, it can't possibly violate the constraint yet.
3) Other than eliminating dynamic table creation, how can this operation
be altered?
4) Is there more data I should collect about the lock (e.g. do I have a
good lock summary select statement)?
Note that process 18891 in the example is doing a select:
--------------
SELECT contexts.context_key FROM contexts : JOIN articles ON
(articles.context_key=contexts.context_key) : WHERE contexts.context_key IN
(SELECT context_key FROM article_words JOIN words using (word_key) WHERE
word = 'four') : AND contexts.context_key IN
(SELECT context_key FROM article_words JOIN words using (word_key) WHERE
word = 'step') : AND contexts.context_key IN
(SELECT context_key FROM article_words JOIN words using (word_key) WHERE
word = 'control') : AND contexts.context_key IN
(SELECT context_key FROM article_words JOIN words using (word_key) WHERE
word = 'process') : AND contexts.context_key IN
(SELECT context_key FROM article_words JOIN words using (word_key) WHERE
word = 'business') : AND contexts.context_key IN
: (SELECT DISTINCT
a1.context_key FROM virtual_ancestors a1, collections, virtual_ancestors a2 :
WHERE a1.ancestor_key
= collections.context_key AND collections.owner_key = a2.context_key AND
a2.ancestor_key = '8
2034') : AND articles.indexed : ORDER BY
contexts.publication_date DESC;