Thread: Data corruption / recover

Data corruption / recover

From
Daniel Paval
Date:
Hi,

We're dealing with the following situation (PG version is 7.4.30): somehow, a DB has gone missing from pg_database (psql \l listing doesn't show it anymore). More, while you can still \c to it, a \d table listing only shows a fraction of the tables. A \d on the existing tables shows missing columns as well.. However, all the tables (still visible or not) may be queried by selects via psql and the actual data and columns in the result sets seem just fine.

We backed up the data folder and for the last 24h tried to recover the situation to a point where pg_dump works. Unfortunately we haven't been fully successful yet although we tried tons of suggestions in previous similar posts. Reindex didn't help, but a "VACUUM FULL FREEZE [ANALYZE]" did free 70% (14GB to 4) of the used disk space. Of course, the database wasn't vacuumed regularly as it should have been so:

WARNING:  some databases have not been vacuumed in over 2 billion transactions
DETAIL:  You may have already suffered transaction-wraparound data loss.

A full vacuum without the freeze option does make all databases, tables and users visible, but duplicated, i.e. \l then shows the database twice and a \d on the database lists all tables twice. The postgres user appears twice in pg_user, etc. pg_dump won't work if the database is not listed and won't either if the postgres user is duplicated..

Does this sound familiar to anyone? It's a pity that all the data seems to be there, but we're not able to recover and restore it to a clean DB. We'd appreciate any suggestions.

Thank you!

Re: Data corruption / recover

From
Tom Lane
Date:
Daniel Paval <dani@scorpionsoftware.ro> writes:
> We're dealing with the following situation (PG version is 7.4.30):

7.4?  I sure hope you're going to migrate onto something less obsolete
once you get out of this problem.  In any still-in-support release
series, autovacuum would have saved you from this.

> somehow, a DB has gone missing from pg_database (psql \l listing doesn't
> show it anymore). More, while you can still \c to it, a \d table listing
> only shows a fraction of the tables.

Transaction IDs in the system catalogs have wrapped around, so that some
rows appear to be "in the future" and thus invisible.

> A full vacuum _without_ the freeze option does make all databases,
> tables and users visible, but duplicated, i.e. \l then shows the
> database twice and a \d on the database lists all tables twice.

Not too surprising.  You'll need to examine each pair of duplicate
catalog rows, figure out which version is newer, and then delete the
older version.  (Use the ctid column to disambiguate.)  More than
likely, there are some duplicates in your data as well as in the system
catalogs, so I'm afraid you've got quite a bit of manual repair in front
of you.

            regards, tom lane


Re: Data corruption / recover

From
Scott Marlowe
Date:
On Thu, Feb 28, 2013 at 7:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Daniel Paval <dani@scorpionsoftware.ro> writes:
>> We're dealing with the following situation (PG version is 7.4.30):
>
> 7.4?  I sure hope you're going to migrate onto something less obsolete
> once you get out of this problem.  In any still-in-support release
> series, autovacuum would have saved you from this.

Also don't later versions refuse to start in anything other than
single user mode once they get close to wrap around?


Re: Data corruption / recover

From
Daniel Paval
Date:
On 28.02.2013 18:48, Scott Marlowe wrote:
> On Thu, Feb 28, 2013 at 7:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Daniel Paval <dani@scorpionsoftware.ro> writes:
>>> We're dealing with the following situation (PG version is 7.4.30):
>> 7.4?  I sure hope you're going to migrate onto something less obsolete
>> once you get out of this problem.  In any still-in-support release
>> series, autovacuum would have saved you from this.
> Also don't later versions refuse to start in anything other than
> single user mode once they get close to wrap around?
>
>
Yes, I found out that newer versions play on the safe side and won't
allow wrap-around at all.