Re: concurrent drop table with fkeys corrupt pg_trigger - Mailing list pgsql-bugs

From Tom Lane
Subject Re: concurrent drop table with fkeys corrupt pg_trigger
Date
Msg-id 25228.1130855274@sss.pgh.pa.us
In response to Re: concurrent drop table with fkeys corrupt pg_trigger  (Brandon Black <blblack@gmail.com>)
List pgsql-bugs
Brandon Black <blblack@gmail.com> writes:
> ERROR:  too many trigger records found for relation "important_table"

> But then it all goes back to normal until it happens again the next
> morning.  Remember, "important_table" has no triggers that I know of.

... except all the foreign-key triggers.

I think what's happening here is that a backend reads the pg_class entry
for "important_table", sees it has some triggers (because reltriggers is
nonzero), and then goes to scan pg_trigger to find them.  By the time
it manages to do the scan, somebody else has committed an addition of a
trigger.  Since we use SnapshotNow for reading system catalogs, the
added row is visible immediately, and so you get the complaint that the
contents of pg_trigger don't match up with what we saw in
pg_class.reltriggers.
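
The interleaving described above can be sketched as a toy, single-process
simulation.  This is not PostgreSQL code: the Catalog class, the trigger
names, and the between_reads hook are hypothetical stand-ins that only
model "read the count, then scan rows that already include a newer commit".

```python
class Catalog:
    """Toy stand-in for pg_class.reltriggers and pg_trigger (assumed model)."""

    def __init__(self):
        self.reltriggers = {"important_table": 2}
        self.pg_trigger = [
            ("important_table", "fk_trig_1"),  # hypothetical trigger names
            ("important_table", "fk_trig_2"),
        ]

    def add_trigger(self, rel, name):
        # A committing writer updates the row list and the counter together.
        self.pg_trigger.append((rel, name))
        self.reltriggers[rel] += 1


def build_trigger_list(catalog, rel, between_reads=None):
    """Read the expected count first, then scan the trigger rows."""
    expected = catalog.reltriggers[rel]  # step 1: the pg_class read
    if between_reads is not None:
        between_reads()  # another backend commits at this point
    # Step 2: the scan sees every committed row, including the new one.
    found = [name for r, name in catalog.pg_trigger if r == rel]
    if len(found) > expected:
        raise RuntimeError(
            f'too many trigger records found for relation "{rel}"'
        )
    if len(found) < expected:
        # Message text here is illustrative only.
        raise RuntimeError(f'trigger count mismatch for relation "{rel}"')
    return found
```

Calling build_trigger_list with a between_reads hook that adds a trigger
raises the "too many trigger records" error, while a later call with no
concurrent writer succeeds, which matches the "goes back to normal" report.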

What's not immediately clear though is why this scenario isn't prevented
by high-level relation locking.  We require the addition of the trigger
to take exclusive lock on the table, so how come the reader isn't
blocked until that finishes?

[ checks old notes... ]  Hm, it seems this has already come up:
http://archives.postgresql.org/pgsql-hackers/2002-10/msg01413.php
When loading a relcache entry, we really ought to obtain some lock on
the relation *before* reading the catalogs.  My recollection is that
this would have been pretty painful back in 2002, but maybe with
subsequent restructuring it wouldn't be so bad now.
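
A minimal sketch of the lock-before-read idea, again as a toy model rather
than actual relcache code.  The Relation class and both function names are
hypothetical, and a single mutex stands in for the relation lock; real
shared and exclusive lock modes are not modeled here.

```python
import threading


class Relation:
    """Toy relation carrying its own lock and trigger bookkeeping."""

    def __init__(self, name, triggers):
        self.name = name
        self.lock = threading.Lock()  # stand-in for a relation-level lock
        self.reltriggers = len(triggers)
        self.trigger_rows = list(triggers)


def add_trigger(rel, trig, blocking=True):
    """Writer: must hold the relation lock to change the catalogs."""
    if not rel.lock.acquire(blocking=blocking):
        return False  # lock is held elsewhere; nothing was changed
    try:
        rel.trigger_rows.append(trig)
        rel.reltriggers += 1
        return True
    finally:
        rel.lock.release()


def load_relcache_entry(rel, between_reads=None):
    """Reader: take the lock *before* reading either catalog."""
    with rel.lock:
        expected = rel.reltriggers
        if between_reads is not None:
            between_reads()  # a writer tries to sneak in here
        found = list(rel.trigger_rows)
        if len(found) != expected:
            raise RuntimeError(f'trigger count mismatch for "{rel.name}"')
        return found
```

With the lock taken first, a writer that tries to add a trigger between
the two reads cannot proceed (the non-blocking attempt returns False), so
the count and the scanned rows stay consistent.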

            regards, tom lane
