Thread: Re: concurrent drop table with fkeys corrupt pg_trigger

Re: concurrent drop table with fkeys corrupt pg_trigger

From
Brandon Black
Date:
>Subject: Re: concurrent drop table with fkeys corrupt pg_trigger
>Date: Thu, 26 May 2005 09:47:25 +0800
>"Qingqing Zhou" <zhouqq ( at ) cs ( dot ) toronto ( dot ) edu> writes:
>> If we concurrently perform drop/create table (with foreign keys) commands
>> several times, we could corrupt the pg_trigger system table.
>>
>
>Anybody reproduced it?
>
>Regards,
>Qingqing

There might be something to this.  I'm running roughly 200 writing
transactions per second, 24/7, on PostgreSQL 8.1beta4 at the moment,
and seeing some related symptoms.

There is a table "important_table", whose primary key is referenced
by foreign keys from many, many other tables in the database.  This
table has no triggers on it.  Every morning at roughly 7am, a
cronjob kicks off and does a
reasonably large number of "CREATE TABLE", "CREATE TRIGGER" (on the
new table), and "DROP TABLE" statements (they're part of an
inheritance-based table partitioning scheme based on timestamps - it's
dropping outdated tables and making new tables for the upcoming
timeframes).  There are no transactions actually directly using the
tables being created or dropped at the time (since they're outside the
reasonable range of possible current timestamps).  All of the
created/dropped tables of course reference the primary key in
"important_table".
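
For concreteness, a minimal sketch of what such a nightly job might
look like (all table, column, and function names below are invented
for illustration; the actual DDL isn't shown in this thread):

```sql
-- Hypothetical partition maintenance, run from cron each morning.
BEGIN;

-- Create a partition for an upcoming timeframe.
CREATE TABLE measurements_20051027 (
    CHECK (stamp >= '2005-10-27' AND stamp < '2005-10-28')
) INHERITS (measurements);

-- The new partition references important_table; behind the scenes
-- this also installs RI triggers on important_table itself.
ALTER TABLE measurements_20051027
    ADD FOREIGN KEY (item_id) REFERENCES important_table (id);

-- Per-partition trigger, as described above.
CREATE TRIGGER measurements_20051027_ins
    BEFORE INSERT ON measurements_20051027
    FOR EACH ROW EXECUTE PROCEDURE partition_insert_check();

-- Drop a partition that has aged out of the usable timestamp range;
-- its RI triggers on important_table go away with it.
DROP TABLE measurements_20051010;

COMMIT;
```

Note that each ADD FOREIGN KEY and DROP TABLE in a job like this adds
or removes pg_trigger rows attached to "important_table", which is
what puts that table in the middle of things.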

I keep a log of the (very few) failed transactions we get, and every
morning at the same time that cron job runs, we get a handful of
client transactions failing out during a SELECT statement, with the
error:

ERROR:  too many trigger records found for relation "important_table"

But then it all goes back to normal until it happens again the next
morning.  Remember, "important_table" has no triggers that I know of.
I suspect that when tables with fkeys referencing "important_table"
are in the process of being created or dropped, some kind of internal
trigger is created on "important_table", and that there's a bug in
there somewhere?

Re: concurrent drop table with fkeys corrupt pg_trigger

From
Tom Lane
Date:
Brandon Black <blblack@gmail.com> writes:
> ERROR:  too many trigger records found for relation "important_table"

> But then it all goes back to normal until it happens again the next
> morning.  Remember, "important_table" has no triggers that I know of.

... except all the foreign-key triggers.
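
Those referenced-side RI triggers are easy to see in the catalog; in
the 8.x era they carry generated names of the form
"RI_ConstraintTrigger_<oid>":

```sql
-- List the triggers kept on important_table to enforce the foreign
-- keys pointing at it (a pair per referencing constraint, covering
-- the ON UPDATE and ON DELETE actions).
SELECT tgname, tgisconstraint
FROM pg_trigger
WHERE tgrelid = 'important_table'::regclass;
```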

I think what's happening here is that a backend reads the pg_class entry
for "important_table", sees it has some triggers (because reltriggers is
nonzero), and then goes to scan pg_trigger to find them.  By the time
it manages to do the scan, somebody else has committed an addition of a
trigger.  Since we use SnapshotNow for reading system catalogs, the
added row is visible immediately, and so you get the complaint that the
contents of pg_trigger don't match up with what we saw in
pg_class.reltriggers.
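
Roughly speaking, the failing sanity check compares a cached count
against a fresh catalog scan; the two sides can be inspected by hand:

```sql
-- The trigger count cached in pg_class (what the backend expects)...
SELECT reltriggers FROM pg_class
WHERE relname = 'important_table';

-- ...versus the rows actually found in pg_trigger under SnapshotNow.
-- If a concurrent commit makes the second number exceed the first,
-- the backend complains "too many trigger records found".
SELECT count(*) FROM pg_trigger
WHERE tgrelid = 'important_table'::regclass;
```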

What's not immediately clear though is why this scenario isn't prevented
by high-level relation locking.  We require the addition of the trigger
to take exclusive lock on the table, so how come the reader isn't
blocked until that finishes?

[ checks old notes... ]  Hm, it seems this has already come up:
http://archives.postgresql.org/pgsql-hackers/2002-10/msg01413.php
When loading a relcache entry, we really ought to obtain some lock on
the relation *before* reading the catalogs.  My recollection is that
this would have been pretty painful back in 2002, but maybe with
subsequent restructuring it wouldn't be so bad now.

            regards, tom lane