Thread: major database breakdown
Hi, last night I tried to reindex my database with some really ugly results. I can no longer use the database. backend> REINDEX DATABASE "CLIX1" NOTICE: relation 16416 was reindexed NOTICE: relation 1255 was reindexed NOTICE: relation 16410 was reindexed NOTICE: relation 1247 was reindexed NOTICE: relation 1249 was reindexed NOTICE: relation 1259 was reindexed NOTICE: relation 16388 was reindexed NOTICE: relation 16390 was reindexed NOTICE: relation 16392 was reindexed NOTICE: relation 16394 was reindexed NOTICE: relation 16396 was reindexed NOTICE: relation 16398 was reindexed NOTICE: relation 16400 was reindexed NOTICE: relation 16402 was reindexed NOTICE: relation 16404 was reindexed NOTICE: relation 16406 was reindexed NOTICE: relation 16412 was reindexed NOTICE: relation 16418 was reindexed NOTICE: relation 16594 was reindexed NOTICE: relation 16596 was reindexed NOTICE: relation 16598 was reindexed NOTICE: relation 16384 was reindexed NOTICE: relation 16386 was reindexed ERROR: Cannot create unique index. Table contains non-unique values backend> Which relation is that? How can I find that relation? When I try to use the database this error shows up: -bash-2.05b$ postgres -D /var/lib/pgsql/data/ -O -P CLIX1 LOG: database system was shut down at 2004-08-15 23:04:53 CEST LOG: checkpoint record is at 11C/19619E54 LOG: redo record is at 11C/19619E54; undo record is at 0/0; shutdown TRUE LOG: next transaction id: 51038367; next oid: 128363530 LOG: database system is ready POSTGRES backend interactive interface $Revision: 1.307.2.1 $ $Date: 2003/01/01 21:57:18 $ backend> vacuum full analyze verbose WARNING: Index pg_class_oid_index: NUMBER OF INDEX' TUPLES (278) IS NOT THE SAME AS HEAP' (281). Recreate the index. WARNING: Index pg_class_relname_nsp_index: NUMBER OF INDEX' TUPLES (278) IS NOT THE SAME AS HEAP' (281). Recreate the index. ERROR: _mdfd_getrelnfd: cannot open relation pg_statistic_relid_att_index: No such file or directory backend> TIA Ulrich
Ulrich Wisser <ulrich.wisser@relevanttraffic.se> writes: > backend> REINDEX DATABASE "CLIX1" > NOTICE: relation 16416 was reindexed > NOTICE: relation 1255 was reindexed > ... > NOTICE: relation 16384 was reindexed > NOTICE: relation 16386 was reindexed > ERROR: Cannot create unique index. Table contains non-unique values > backend> > Which relation is that? How can I find that relation? I think this process will run in order of entries in pg_class. Try SELECT oid, relname FROM pg_class WHERE relkind = 'r' AND relnamespace = 11 and see if the OIDs seem to match up with the NOTICEs. If so you can probably assume that the next one after 16386 is the problem. Another approach is just to individually reindex the tables that are giving you trouble (pg_class, pg_statistic). Also keep in mind that pg_statistic is noncritical data --- if all else fails, just do a DELETE FROM pg_statistic and then reindex it. regards, tom lane
Hi Tom, I did delete everything from pg_statistic. It helped somewhat but still there is something strange. Please see the transcript of my session below. How can I fix these broken indexes? I can't drop and recreate them. Would be nice to get back online. Thanks Ulrich bash-2.05b$ postgres -D /var/lib/pgsql/data-broken/ -O -P CLIX1 LOG: database system was shut down at 2004-08-16 11:21:57 CEST LOG: checkpoint record is at 11C/363F610C LOG: redo record is at 11C/363F610C; undo record is at 0/0; shutdown TRUE LOG: next transaction id: 51039682; next oid: 128363647 LOG: database system is ready POSTGRES backend interactive interface $Revision: 1.307.2.1 $ $Date: 2003/01/01 21:57:18 $ backend> delete from pg_statistic blank 1: ctid (typeid = 27, len = 6, typmod = -1, byval = f) ---- backend> vacuum full analyze verbose ERROR: _mdfd_getrelnfd: cannot open relation pg_statistic_relid_att_index: No such file or directory backend> drop index pg_statistic_relid_att_index ERROR: Cannot drop index pg_statistic_relid_att_index because it is required by the database system backend> reindex table pg_statistic force WARNING: cannot unlink 16977/128363646: No such file or directory backend> reindex table pg_statistic force backend> vacuum full analyze verbose WARNING: Index pg_statistic_relid_att_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (182). Recreate the index. WARNING: Index pg_statistic_relid_att_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (182). Recreate the index. WARNING: Index pg_toast_16408_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (5). Recreate the index. backend> backend> reindex index pg_statistic_relid_att_index force backend> reindex index pg_toast_16408_index force ERROR: Relation "pg_toast_16408_index" does not exist backend> vacuum full analyze verbose WARNING: Index pg_statistic_relid_att_index: NUMBER OF INDEX' TUPLES (159) IS NOT THE SAME AS HEAP' (341). Recreate the index. WARNING: Index pg_statistic_relid_att_index: NUMBER OF INDEX' TUPLES (146) IS NOT THE SAME AS HEAP' (341). Recreate the index. WARNING: Index pg_toast_16408_index: NUMBER OF INDEX' TUPLES (6) IS NOT THE SAME AS HEAP' (11). Recreate the index.
Ulrich Wisser <ulrich.wisser@relevanttraffic.se> writes: > I did delete everything from pg_statistic. It helped somewhat but > still there is something strange. Please see the transcript of my > session below. How can I fix these broken indexes? I can't drop and > recreate them. I am not sure what's going on with pg_statistic, but it seems like maybe there are some tuples that aren't getting deleted. Try delete from pg_statistic vacuum verbose pg_statistic (In a standalone backend it seems you need "set server_min_messages = debug" to see anything from the vacuum verbose.) If that shows a nonzero number of remaining tuples then try "truncate pg_statistic" instead. (I think you'll need to have started the backend with -O to be allowed to do this. Also it might be interesting to look at the remaining rows with pg_filedump before you truncate.) I suspect that the ultimate answer may be "dump and reload the database" :-(. If there are undeletable rows in pg_statistic, the only explanation I can think of is transaction ID corruption (ie, their xmax values are in the future) and the problem may affect other data too. regards, tom lane