Thread: "too many trigger records found for relation xyz"

"too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:
Hello all

Postgres version 8.3.1

I just created a bunch of tables (~10) with identical structure; all tables have 6 foreign key references to other tables and a primary key. To my surprise, some of the tables were created ok, some missed primary key and some didn't get created at all.

Postgres complained:

2008-04-08 08:50:41 EEST [721]: [27-1] ERROR:  relation 543058 deleted while still in use

and

2008-04-08 08:50:41 EEST [721]: [36-1] ERROR:  could not find tuple for relation 54309

and

2008-04-08 08:50:42 EEST [721]: [114-1] ERROR:  duplicate key value violates unique constraint "pg_class_oid_index"

When I tried to remove the tables, the same errors kept coming up

2008-04-08 08:53:15 EEST [27794]: [5-1] ERROR:  duplicate key value violates unique constraint "pg_class_oid_index"
2008-04-08 08:53:15 EEST [27794]: [6-1] STATEMENT:  drop table xyz ;

I searched google for a while, and then tried again and this time I was able to drop all but one of the tables. When I try to drop the one remaining table, I get this error message:

2008-04-08 10:09:31 EEST [27794]: [95-1] ERROR:  too many trigger records found for relation "xyz"

The database had a bunch of autovacuum runs and an import process copying data to temporary tables at the time I tried to create the tables. What could cause such behaviour, and how can I remove the one remaining table?

Regards

Mikko







Re: "too many trigger records found for relation xyz"

From
Zdenek Kotala
Date:

Mikko Partio napsal(a):
> Hello all
>
> Postgres version 8.3.1
>
> I just created a bunch of tables (~10) with identical structure; all tables
> have 6 foreign key references to other tables and a primary key. To my
> surprise, some of the tables were created ok, some missed primary key and
> some didn't get created at all.

Can you provide self contained reproduction scenario?

        Zdenek

Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 11:27 AM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:


Mikko Partio napsal(a):

Hello all

Postgres version 8.3.1

I just created a bunch of tables (~10) with identical structure; all tables
have 6 foreign key references to other tables and a primary key. To my
surprise, some of the tables were created ok, some missed primary key and
some didn't get created at all.

Can you provide self contained reproduction scenario?


Unfortunately not -- if I try to dump one the referenced tables I get this error:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  more than one row returned by a subquery used as an expression
pg_dump: The command was: SELECT tableoid, oid, typname, typnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) as rolname, typinput::oid as typinput, typoutput::oid as typoutput, typelem, typrelid, CASE WHEN typrelid = 0 THEN ' '::"char" ELSE (SELECT relkind FROM pg_class WHERE oid = typrelid) END as typrelkind, typtype, typisdefined, typname[0] = '_' AND typelem != 0 AND (SELECT typarray FROM pg_type te WHERE oid = pg_type.typelem) = oid AS isarray FROM pg_type

Regards

Mikko

Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 12:20 PM, Mikko Partio <mpartio@gmail.com> wrote:


On Tue, Apr 8, 2008 at 11:27 AM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:


Mikko Partio napsal(a):

Hello all

Postgres version 8.3.1

I just created a bunch of tables (~10) with identical structure; all tables
have 6 foreign key references to other tables and a primary key. To my
surprise, some of the tables were created ok, some missed primary key and
some didn't get created at all.

Can you provide self contained reproduction scenario?


Unfortunately not -- if I try to dump one the referenced tables I get this error:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  more than one row returned by a subquery used as an expression
pg_dump: The command was: SELECT tableoid, oid, typname, typnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) as rolname, typinput::oid as typinput, typoutput::oid as typoutput, typelem, typrelid, CASE WHEN typrelid = 0 THEN ' '::"char" ELSE (SELECT relkind FROM pg_class WHERE oid = typrelid) END as typrelkind, typtype, typisdefined, typname[0] = '_' AND typelem != 0 AND (SELECT typarray FROM pg_type te WHERE oid = pg_type.typelem) = oid AS isarray FROM pg_type


I little investigation showed that there is duplicate row in pg_class:

# select oid from pg_class group by oid having count(*) > 1 ;
  oid
--------
 294397
(1 row)

# \x
Expanded display is on.

# select * from pg_class where oid = 294397;
-[ RECORD 1 ]--+-----------------------------------------
relname        | abc
relnamespace   | 2200
reltype        | 294399
relowner       | 16385
relam          | 0
relfilenode    | 294397
reltablespace  | 0
relpages       | 1
reltuples      | 3
reltoastrelid  | 294400
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relkind        | r
relnatts       | 4
relchecks      | 0
reltriggers    | 6
relukeys       | 0
relfkeys       | 0
relrefs        | 0
relhasoids     | f
relhaspkey     | t
relhasrules    | f
relhassubclass | f
relfrozenxid   | 3840767676
relacl         | {removed}
reloptions     |

-[ RECORD 2 ]--+-----------------------------------------
relname        | abc
relnamespace   | 2200
reltype        | 294399
relowner       | 16385
relam          | 0
relfilenode    | 294397
reltablespace  | 0
relpages       | 1
reltuples      | 3
reltoastrelid  | 294400
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relkind        | r
relnatts       | 4
relchecks      | 0
reltriggers    | 12
relukeys       | 0
relfkeys       | 0
relrefs        | 0
relhasoids     | f
relhaspkey     | t
relhasrules    | f
relhassubclass | f
relfrozenxid   | 3840767676
relacl         | {removed}
reloptions     |

Regards

Mikko

Re: "too many trigger records found for relation xyz"

From
Zdenek Kotala
Date:
Mikko Partio napsal(a):
>
>
> I little investigation showed that there is duplicate row in pg_class:
>
> # select oid from pg_class group by oid having count(*) > 1 ;
>   oid
> --------
>  294397
> (1 row)

Could you check if pg_attribute is doubled for this relation?

        Zdenek

Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 1:19 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
Mikko Partio napsal(a):



I little investigation showed that there is duplicate row in pg_class:

# select oid from pg_class group by oid having count(*) > 1 ;
 oid
--------
 294397
(1 row)

Could you check if pg_attribute is doubled for this relation?

It looks like pg_attribute has only a single value:

# select * from pg_attribute where attrelid = 294397 ;
 attrelid | attname  | atttypid | attstattarget | attlen | attnum | attndims | attcacheoff | atttypmod | attbyval | attstorage | attalign | attnotnull | atthasdef | attisdropped | attislocal | attinhcount
----------+----------+----------+---------------+--------+--------+----------+-------------+-----------+----------+------------+----------+------------+-----------+--------------+------------+-------------
   294397 | tableoid |       26 |             0 |      4 |     -7 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         | f            | t          |           0
   294397 | cmax     |       29 |             0 |      4 |     -6 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         | f            | t          |           0
   294397 | xmax     |       28 |             0 |      4 |     -5 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         | f            | t          |           0
   294397 | cmin     |       29 |             0 |      4 |     -4 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         | f            | t          |           0
   294397 | xmin     |       28 |             0 |      4 |     -3 |        0 |          -1 |        -1 | t        | p          | i        | t          | f         | f            | t          |           0
   294397 | ctid     |       27 |             0 |      6 |     -1 |        0 |          -1 |        -1 | f        | p          | s        | t          | f         | f            | t          |           0
   294397 | id       |       21 |            -1 |      2 |      1 |        0 |          -1 |        -1 | t        | p          | s        | t          | f         | f            | t          |           0
   294397 | name     |     1043 |            -1 |     -1 |      2 |        0 |          -1 |        16 | f        | x          | i        | t          | f         | f            | t          |           0
   294397 | desc_fi  |       25 |            -1 |     -1 |      3 |        0 |          -1 |        -1 | f        | x          | i        | t          | f         | f            | t          |           0
   294397 | desc_en  |       25 |            -1 |     -1 |      4 |        0 |          -1 |        -1 | f        | x          | i        | t          | f         | f            | t          |           0
(10 rows)

Should I try remove one of the duplicate rows from pg_class?

Regards

Mikko

Re: "too many trigger records found for relation xyz"

From
Zdenek Kotala
Date:
Mikko Partio napsal(a):

> Should I try remove one of the duplicate rows from pg_class?

Try it with caution. You should use ctid column to refer to exact row.

Try before:

select oid, ctid, xmin, xmax, cmin, cmax, relname from pg_class where oid = 294397;

If both row are identical or not.

        Zdenek



Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 1:45 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
Mikko Partio napsal(a):


Should I try remove one of the duplicate rows from pg_class?

Try it with caution. You should use ctid column to refer to exact row.

Try before:

select oid, ctid, xmin, xmax, cmin, cmax, relname from pg_class where oid = 294397;

If both row are identical or not.


 # select oid, ctid, xmin, xmax, cmin, cmax, relname from pg_class where oid = 294397;
  oid   |   ctid    |    xmin    |    xmax    | cmin | cmax |   relname
--------+-----------+------------+------------+------+------+--------------
 294397 | (1125,36) | 3944654557 |          0 |   35 |   35 | abc
 294397 | (1124,55) | 3943984380 | 3943984642 |    3 |    3 | abc
(2 rows)

Which one should I remove? What is the worst case scenario for the delete? The database is 1,5TB so I would rather not restore it from backups :-)

Regards

Mikko

Re: "too many trigger records found for relation xyz"

From
Zdenek Kotala
Date:
Mikko Partio napsal(a):
> On Tue, Apr 8, 2008 at 1:45 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
>
>> Mikko Partio napsal(a):
>>
>>  Should I try remove one of the duplicate rows from pg_class?
>> Try it with caution. You should use ctid column to refer to exact row.
>>
>> Try before:
>>
>> select oid, ctid, xmin, xmax, cmin, cmax, relname from pg_class where oid
>> = 294397;
>>
>> If both row are identical or not.
>>
>
>
>  # select oid, ctid, xmin, xmax, cmin, cmax, relname from pg_class where oid
> = 294397;
>   oid   |   ctid    |    xmin    |    xmax    | cmin | cmax |   relname
> --------+-----------+------------+------------+------+------+--------------
>  294397 | (1125,36) | 3944654557 |          0 |   35 |   35 | abc
>  294397 | (1124,55) | 3943984380 | 3943984642 |    3 |    3 | abc
> (2 rows)
>
> Which one should I remove? What is the worst case scenario for the delete?
> The database is 1,5TB so I would rather not restore it from backups :-)

(1124,55) is deleted one. This row should not be visible. But it seems a problem
with visibility - transaction wraparound. Do you have any warning message in the
log about vacuuming? Or clog corruption when transaction 3943984642 is marked as
rollbacked.

        Zdenek

Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 2:49 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
Mikko Partio napsal(a):

On Tue, Apr 8, 2008 at 1:45 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:

Mikko Partio napsal(a):

 Should I try remove one of the duplicate rows from pg_class?
Try it with caution. You should use ctid column to refer to exact row.

Try before:

select oid, ctid, xmin, xmax, cmin, cmax, relname from pg_class where oid
= 294397;

If both row are identical or not.



 # select oid, ctid, xmin, xmax, cmin, cmax, relname from pg_class where oid
= 294397;
 oid   |   ctid    |    xmin    |    xmax    | cmin | cmax |   relname
--------+-----------+------------+------------+------+------+--------------
 294397 | (1125,36) | 3944654557 |          0 |   35 |   35 | abc
 294397 | (1124,55) | 3943984380 | 3943984642 |    3 |    3 | abc
(2 rows)

Which one should I remove? What is the worst case scenario for the delete?
The database is 1,5TB so I would rather not restore it from backups :-)

(1124,55) is deleted one. This row should not be visible. But it seems a problem with visibility - transaction wraparound. Do you have any warning message in the log about vacuuming? Or clog corruption when transaction 3943984642 is marked as rollbacked.


There are no such messages in the logfile, but  autovacuum is working hard  to prevent xid wraparound (we have an import process which consums huge amounts of xids).

# select max(age(relfrozenxid)) from pg_class where relkind = 'r' ;
    max
------------
 204500712
(1 row)

How can I tell if there is clog corruption?

Thanks a lot for your help.

Regards

Mikko

Re: "too many trigger records found for relation xyz"

From
Zdenek Kotala
Date:
Mikko Partio napsal(a):

> How can I tell if there is clog corruption?
>

Unfortunatelly, I'm afraid that there is not way how to check it. CLOG file is
only array of bits and it does not contains any CRC or backup copy. Try to check
filesystem integrity.

    Zdenek


Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 1:45 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
Mikko Partio napsal(a):


Should I try remove one of the duplicate rows from pg_class?

Try it with caution. You should use ctid column to refer to exact row.



Ok I removed the faulty tuple and nothing catastrophical happened. I can do a pg_dump now, but I still can't remove the one remaining table:

# drop table xyz ;
ERROR:  too many trigger records found for relation "xyz"

Any insight for this problem?

Regards

Mikko

Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 5:26 PM, Mikko Partio <mpartio@gmail.com> wrote:

Ok I removed the faulty tuple and nothing catastrophical happened. I can do a pg_dump now, but I still can't remove the one remaining table:

# drop table xyz ;

ERROR:  too many trigger records found for relation "xyz"

Any insight for this problem?



I managed to solve the situation with the help of this thread: http://archives.postgresql.org/pgsql-bugs/2007-02/msg00167.php

Everything seems to be ok for now, but I'm still curious what caused to whole mess to begin with.

Thanks a lot for your help again.

Regards

Mikko

Re: "too many trigger records found for relation xyz"

From
Zdenek Kotala
Date:
Mikko Partio napsal(a):
> On Tue, Apr 8, 2008 at 5:26 PM, Mikko Partio <mpartio@gmail.com> wrote:
>
>> Ok I removed the faulty tuple and nothing catastrophical happened. I can
>> do a pg_dump now, but I still can't remove the one remaining table:
>>
>> # drop table xyz ;
>> ERROR:  too many trigger records found for relation "xyz"
>>
>> Any insight for this problem?
>>
>>
>
> I managed to solve the situation with the help of this thread:
> http://archives.postgresql.org/pgsql-bugs/2007-02/msg00167.php
>
> Everything seems to be ok for now, but I'm still curious what caused to
> whole mess to begin with.

It is really strange. It would be nice to have reproduce scenario. Can you run
same DDL command which invoke a catalog corruption on test database cluster?

        Zdenek

Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 6:38 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
Mikko Partio napsal(a):

On Tue, Apr 8, 2008 at 5:26 PM, Mikko Partio <mpartio@gmail.com> wrote:

Ok I removed the faulty tuple and nothing catastrophical happened. I can
do a pg_dump now, but I still can't remove the one remaining table:

# drop table xyz ;
ERROR:  too many trigger records found for relation "xyz"

Any insight for this problem?



I managed to solve the situation with the help of this thread:
http://archives.postgresql.org/pgsql-bugs/2007-02/msg00167.php

Everything seems to be ok for now, but I'm still curious what caused to
whole mess to begin with.

It is really strange. It would be nice to have reproduce scenario. Can you run same DDL command which invoke a catalog corruption on test database cluster?

I tried it on a fresh instance but it went through with no errors  :(

Regards

Mikko

Re: "too many trigger records found for relation xyz"

From
Tom Lane
Date:
"Mikko Partio" <mpartio@gmail.com> writes:
> On Tue, Apr 8, 2008 at 6:38 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
>> It is really strange. It would be nice to have reproduce scenario. Can you
>> run same DDL command which invoke a catalog corruption on test database
>> cluster?

> I tried it on a fresh instance but it went through with no errors  :(

Is it possible that anything was doing a VACUUM FULL on pg_class
concurrently with your schema changes on these tables?  The behavior
looks suspiciously like some problems we've seen before wherein rows
ended up with the wrong commit-bit markings...

            regards, tom lane

Re: "too many trigger records found for relation xyz"

From
"Mikko Partio"
Date:


On Tue, Apr 8, 2008 at 9:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Mikko Partio" <mpartio@gmail.com> writes:
> On Tue, Apr 8, 2008 at 6:38 PM, Zdenek Kotala <Zdenek.Kotala@sun.com> wrote:
>> It is really strange. It would be nice to have reproduce scenario. Can you
>> run same DDL command which invoke a catalog corruption on test database
>> cluster?

> I tried it on a fresh instance but it went through with no errors  :(

Is it possible that anything was doing a VACUUM FULL on pg_class
concurrently with your schema changes on these tables?  The behavior
looks suspiciously like some problems we've seen before wherein rows
ended up with the wrong commit-bit markings...

I may have had a VACUUM FULL on some of my own tables but definitely not on any system tables. There was also a bunch (maybe 10) autovacuum processes working at that time.

Regards

Mikko