Thread: unique index corruption
Hi all,
I am wondering if autovacuum will periodically rebuild indexes?
If not, how advisable is it to reindex periodically?
We recently had a case of unique index corruption which ended up allowing duplicate
primary key IDs to get inserted and caused widespread data model integrity havoc.
The system ran for several days before it was noticed.
After the problem was discovered we found that a reindex would fail because there were
duplicate ID values.
Our thought is that if we had a scheduled cron job that would periodically do a reindex this
corruption might have been caught sooner by detecting the reindex failure.
If a reindex is something that should be done frequently as part of regular maintenance
why isn't there a mode of autovacuum that does this automatically? Or maybe there is and
I just don't know about it..?
Thank you.
On Wed, Jul 24, 2013 at 1:50 PM, pg noob <pgnube@gmail.com> wrote: > > Hi all, > > In PostgreSQL 8.4... > > I am wondering if autovacuum will periodically rebuild indexes? it will not. REINDEX requires a heavy lock and for most applications it would be just plain untenable to be run without some type of application coordination, especially if the index is large. > If not, how advisable is it to reindex periodically? no harm in that at all -- notwithstanding the locking issues. > We recently had a case of unique index corruption which ended up allowing > duplicate > primary key IDs to get inserted and caused widespread data model integrity > havoc. > > The system ran for several days before it was noticed. > > After the problem was discovered we found that a reindex would fail because > there were > duplicate ID values. > > Our thought is that if we had a scheduled cron job that would periodically > do a reindex this > corruption might have been caught sooner by detecting the reindex failure. > > If a reindex is something that should be done frequently as part of regular > maintenance > why isn't there a mode of autovacuum that does this automatically? Or maybe > there is and > I just don't know about it..? REINDEX to remove corruption is essentially dealing with the symptoms of a problem and IMNSHO is not a real long term solution. First thing to figure out is if we have a postgres bug or some other failure (in particular, hardware). Unusual circumstances, for example any unplanned shutdowns around the time corruption started to appear, might give some clues in terms of figuring where the real problem lies. Essentially this is not supposed to happen. merlin
On Wed, Jul 24, 2013 at 11:50 AM, pg noob <pgnube@gmail.com> wrote: > In PostgreSQL 8.4... > > I am wondering if autovacuum will periodically rebuild indexes? It doesn't rebuild indexes, it marks empty index pages for reuse. > If not, how advisable is it to reindex periodically? Here described the recommendations and explanations for 8.4 http://www.postgresql.org/docs/8.4/static/routine-reindex.html > After the problem was discovered we found that a reindex would fail because > there were > duplicate ID values. > > Our thought is that if we had a scheduled cron job that would periodically > do a reindex this > corruption might have been caught sooner by detecting the reindex failure. Periodical reindex is a very dubious technique to monitor database corruption. Honestly, I have never heard of any standard or recommended practice of doing it. However, there is a tool that pretends to do so (https://github.com/tvondra/pg_check), but I do not now what state it is currently and if it is production ready. > If a reindex is something that should be done frequently as part of regular > maintenance > why isn't there a mode of autovacuum that does this automatically? Or maybe > there is and > I just don't know about it..? It is not necessary to reindex to be a part of regular maintenance. The main goal of autovacuum is to effectively reuse space and to update statistics. If autovacuum is configured properly reindex is not required. However, if you have some high/bulk-update/delete operations autovacuum might not manage with bloat, and in this case you can use this tool pgcompactor (https://code.google.com/p/pgtoolkit/) in conjunction with pgstattuple extension or pg_repack (https://github.com/reorg/pg_repack). -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA Profile: http://www.linkedin.com/in/grayhemp Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979 Skype: gray-hemp Jabber: gray.ru@gmail.com