Thread: major database breakdown

major database breakdown

From
Ulrich Wisser
Date:
Hi,

last night I tried to reindex my database with some really ugly results.
I can no longer use the database.

backend> REINDEX DATABASE "CLIX1"
NOTICE:  relation 16416 was reindexed
NOTICE:  relation 1255 was reindexed
NOTICE:  relation 16410 was reindexed
NOTICE:  relation 1247 was reindexed
NOTICE:  relation 1249 was reindexed
NOTICE:  relation 1259 was reindexed
NOTICE:  relation 16388 was reindexed
NOTICE:  relation 16390 was reindexed
NOTICE:  relation 16392 was reindexed
NOTICE:  relation 16394 was reindexed
NOTICE:  relation 16396 was reindexed
NOTICE:  relation 16398 was reindexed
NOTICE:  relation 16400 was reindexed
NOTICE:  relation 16402 was reindexed
NOTICE:  relation 16404 was reindexed
NOTICE:  relation 16406 was reindexed
NOTICE:  relation 16412 was reindexed
NOTICE:  relation 16418 was reindexed
NOTICE:  relation 16594 was reindexed
NOTICE:  relation 16596 was reindexed
NOTICE:  relation 16598 was reindexed
NOTICE:  relation 16384 was reindexed
NOTICE:  relation 16386 was reindexed
ERROR:  Cannot create unique index. Table contains non-unique values
backend>

Which relation is that? How can I find that relation?
When I try to use the database  this error shows up:

-bash-2.05b$ postgres -D /var/lib/pgsql/data/ -O -P CLIX1
LOG:  database system was shut down at 2004-08-15 23:04:53 CEST
LOG:  checkpoint record is at 11C/19619E54
LOG:  redo record is at 11C/19619E54; undo record is at 0/0; shutdown TRUE
LOG:  next transaction id: 51038367; next oid: 128363530
LOG:  database system is ready

POSTGRES backend interactive interface
$Revision: 1.307.2.1 $ $Date: 2003/01/01 21:57:18 $

backend> vacuum full analyze verbose
WARNING:  Index pg_class_oid_index: NUMBER OF INDEX' TUPLES (278) IS NOT
THE SAME AS HEAP' (281).
         Recreate the index.
WARNING:  Index pg_class_relname_nsp_index: NUMBER OF INDEX' TUPLES
(278) IS NOT THE SAME AS HEAP' (281).
         Recreate the index.
ERROR:  _mdfd_getrelnfd: cannot open relation
pg_statistic_relid_att_index: No such file or directory
backend>

TIA

Ulrich


Re: major database breakdown

From
Tom Lane
Date:
Ulrich Wisser <ulrich.wisser@relevanttraffic.se> writes:
> backend> REINDEX DATABASE "CLIX1"
> NOTICE:  relation 16416 was reindexed
> NOTICE:  relation 1255 was reindexed
> ...
> NOTICE:  relation 16384 was reindexed
> NOTICE:  relation 16386 was reindexed
> ERROR:  Cannot create unique index. Table contains non-unique values
> backend>

> Which relation is that? How can I find that relation?

I think this process will run in order of entries in pg_class.  Try
    SELECT oid, relname FROM pg_class
    WHERE relkind = 'r' AND relnamespace = 11
and see if the OIDs seem to match up with the NOTICEs.  If so you can
probably assume that the next one after 16386 is the problem.

Another approach is just to individually reindex the tables that are
giving you trouble (pg_class, pg_statistic).  Also keep in mind that
pg_statistic is noncritical data --- if all else fails, just do a
    DELETE FROM pg_statistic
and then reindex it.

            regards, tom lane

Re: major database breakdown

From
Ulrich Wisser
Date:
Hi Tom,

I did delete everything from pg_statistic. It helped somewhat but
still there is something strange. Please see the transcript of my
session below. How can I fix these broken indexes? I can't drop and
recreate them.

Would be nice to get back online.

Thanks

Ulrich




bash-2.05b$ postgres -D /var/lib/pgsql/data-broken/ -O -P CLIX1
LOG:  database system was shut down at 2004-08-16 11:21:57 CEST
LOG:  checkpoint record is at 11C/363F610C
LOG:  redo record is at 11C/363F610C; undo record is at 0/0; shutdown TRUE
LOG:  next transaction id: 51039682; next oid: 128363647
LOG:  database system is ready

POSTGRES backend interactive interface
$Revision: 1.307.2.1 $ $Date: 2003/01/01 21:57:18 $

backend> delete from pg_statistic
blank
          1: ctid        (typeid = 27, len = 6, typmod = -1, byval = f)
         ----
backend> vacuum full analyze verbose
ERROR:  _mdfd_getrelnfd: cannot open relation
pg_statistic_relid_att_index: No such file or directory
backend> drop index pg_statistic_relid_att_index
ERROR:  Cannot drop index pg_statistic_relid_att_index because it is
required by the database system
backend> reindex table pg_statistic force
WARNING:  cannot unlink 16977/128363646: No such file or directory
backend> reindex table pg_statistic force
backend> vacuum full analyze verbose
WARNING:  Index pg_statistic_relid_att_index: NUMBER OF INDEX' TUPLES
(0) IS NOT THE SAME AS HEAP' (182).
         Recreate the index.
WARNING:  Index pg_statistic_relid_att_index: NUMBER OF INDEX' TUPLES
(0) IS NOT THE SAME AS HEAP' (182).
         Recreate the index.
WARNING:  Index pg_toast_16408_index: NUMBER OF INDEX' TUPLES (0) IS NOT
THE SAME AS HEAP' (5).
         Recreate the index.
backend>
backend> reindex index pg_statistic_relid_att_index force
backend> reindex index pg_toast_16408_index force
ERROR:  Relation "pg_toast_16408_index" does not exist
backend> vacuum full analyze verbose
WARNING:  Index pg_statistic_relid_att_index: NUMBER OF INDEX' TUPLES
(159) IS NOT THE SAME AS HEAP' (341).
         Recreate the index.
WARNING:  Index pg_statistic_relid_att_index: NUMBER OF INDEX' TUPLES
(146) IS NOT THE SAME AS HEAP' (341).
         Recreate the index.
WARNING:  Index pg_toast_16408_index: NUMBER OF INDEX' TUPLES (6) IS NOT
THE SAME AS HEAP' (11).
         Recreate the index.




Re: major database breakdown

From
Tom Lane
Date:
Ulrich Wisser <ulrich.wisser@relevanttraffic.se> writes:
> I did delete everything from pg_statistic. It helped somewhat but
> still there is something strange. Please see the transcript of my
> session below. How can I fix these broken indexes? I can't drop and
> recreate them.

I am not sure what's going on with pg_statistic, but it seems like
maybe there are some tuples that aren't getting deleted.  Try
    delete from pg_statistic
    vacuum verbose pg_statistic
(In a standalone backend it seems you need "set server_min_messages = debug"
to see anything from the vacuum verbose.)  If that shows a nonzero
number of remaining tuples then try "truncate pg_statistic" instead.
(I think you'll need to have started the backend with -O to be allowed
to do this.  Also it might be interesting to look at the remaining rows
with pg_filedump before you truncate.)

I suspect that the ultimate answer may be "dump and reload the
database" :-(.  If there are undeletable rows in pg_statistic, the only
explanation I can think of is transaction ID corruption (ie, their xmax
values are in the future) and the problem may affect other data too.

            regards, tom lane