Thread: Loss of data and info from system tables!!

Loss of data and info from system tables!!

From
Noel Faux
Date:
Hi all,

Sorry if you have received this before. I've tried to post the complete
mail, but it is not going through.
Warning this is a long e-mail and is across two posts.
I'm hoping someone out there can help.  If this is not the most
appropriate mailing list, please point me in the direction of one which is.

Recently I noticed that my database was missing a huge amount of data.
Which made me very ill!!
before we get to the problem here's the system info:
Postgres 7.4.2
IBM x345 server
OS 'Red Hat Enterprise Linux AS release 3'
It's 2-way box with hyper threading enabled based on Intel(R) Xeon(TM)
CPU 2.40GHz processor.
There are total 2G memory.
Using IBM serveraid controller

The most import thing for me is data recovery and data integrity! It
most of my PhD!! :'(
Ok the problem.
On the 27th March I notice that one of my web sites was giving me some
weird errors.  When I looked at the SQL statements generating these
error I found that there were no results from these queries.  When I
selected on the table 'protein_database' I discovered that the database
the web site refers to did not exist!! This was all fine 3 days prior!
After further selects on more tables it became apparent there was data
loss across a large number of tables. Now I feel extremely ill!  It
seemed that data which was added to the database when it was initially
created was lost.  As data which I have recently added was still there.
So I though, do a dump of the database now and may be I can use an old
dump (Dec '04) with the new dump and get close to a complete state.
Well to my surprise pg_dump told me that the database did not exist.  So
I logged in via psql and did 'SELECT * FROM pg_database;' which returned
'another persons database' but not mine.  I then performed  'SELECT *
FROM pg_user;'  and it returned 'postgres' and 'another user' but not
me.  Well that was a surprise, how I'm able to log in if I'm not in the
user table??

The rest of the original post to follow...

Many thanks
Noel

--
PhD student
Department of Biochemistry and Molecular Biology
Monash University
Clayton
Victoria 3800
Australia
Ph: +61 3 9905 1418
e-mail: noel.faux@med.monash.edu.au
web site: http;//vbc.med.monash.edu.au/~fauxn/


Attachment

Re: Loss of data and info from system tables!!

From
Michael Fuhr
Date:
On Thu, Mar 31, 2005 at 03:12:53AM +0000, Noel Faux wrote:
>
> On the 27th March I notice that one of my web sites was giving me some
> weird errors.  When I looked at the SQL statements generating these
> error I found that there were no results from these queries.  When I
> selected on the table 'protein_database' I discovered that the database
> the web site refers to did not exist!! This was all fine 3 days prior!
> After further selects on more tables it became apparent there was data
> loss across a large number of tables. Now I feel extremely ill!  It
> seemed that data which was added to the database when it was initially
> created was lost.  As data which I have recently added was still there.

Are you doing regular VACUUMs of the database?  What's the result of
the following query?

SELECT oid, xmin, cmin, xmax, cmax, datname,
       age(datvacuumxid) AS vxid,
       age(datfrozenxid) AS fxid
FROM pg_database;

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Loss of data and info from system tables!!

From
Noel Faux
Date:
Michael Fuhr wrote:
On Thu, Mar 31, 2005 at 03:12:53AM +0000, Noel Faux wrote: 
On the 27th March I notice that one of my web sites was giving me some 
weird errors.  When I looked at the SQL statements generating these 
error I found that there were no results from these queries.  When I 
selected on the table 'protein_database' I discovered that the database 
the web site refers to did not exist!! This was all fine 3 days prior! 
After further selects on more tables it became apparent there was data 
loss across a large number of tables. Now I feel extremely ill!  It 
seemed that data which was added to the database when it was initially 
created was lost.  As data which I have recently added was still there.     
Are you doing regular VACUUMs of the database? 
Only after the completion of large jobs (inserts) and when ever data is deleted and new tables and indexes are added to the schema. So the last one was some time at the start of the year. We generally only vacuum tables which are affected by deletes, inserts and updates.  All vacuums performed as  'vacuum analyze full'.
 What's the result of
the following query?

SELECT oid, xmin, cmin, xmax, cmax, datname,      age(datvacuumxid) AS vxid,      age(datfrozenxid) AS fxid
FROM pg_database; 
monashprotein=> SELECT oid, xmin, cmin, xmax, cmax, datname, age(datvacuumxid) AS vxid, age(datfrozenxid) AS fxid FROM pg_database;
    oid     |   xmin   | cmin | xmax | cmax | datname |    vxid    |    fxid
------------+----------+------+------+------+---------+------------+-------------
 1021343158 | 63521101 |    0 |    0 |    0 | lcm     | 1485252959 | -1736329086
(1 row)
my database is missing 'monashprotein' under datname.

Hope this helps.

Cheers
Noel
-- 
PhD student
Department of Biochemistry and Molecular Biology
Monash University
Clayton 
Victoria 3800
Australia
Ph: +61 3 9905 1418
e-mail: noel.faux@med.monash.edu.au
web site: http;//vbc.med.monash.edu.au/~fauxn/
Attachment

Re: Loss of data and info from system tables!!

From
Michael Fuhr
Date:
On Thu, Mar 31, 2005 at 04:41:04AM +0000, Noel Faux wrote:
> Michael Fuhr wrote:
> >
> >Are you doing regular VACUUMs of the database?
> >
> Only after the completion of large jobs (inserts) and when ever data is
> deleted and new tables and indexes are added to the schema. So the last
> one was some time at the start of the year. We generally only vacuum
> tables which are affected by deletes, inserts and updates.  All vacuums
> performed as  'vacuum analyze full'.

Oh dear.  You might wish to read the "Routine Database Maintenance
Tasks" chapter in the documentation, in particular the "Routine
Vacuuming" section, and *especially* "Preventing transaction ID
wraparound failures":

http://www.postgresql.org/docs/7.4/interactive/maintenance.html

> monashprotein=> SELECT oid, xmin, cmin, xmax, cmax, datname,
> age(datvacuumxid) AS vxid, age(datfrozenxid) AS fxid FROM pg_database;
>    oid     |   xmin   | cmin | xmax | cmax | datname |    vxid    |    fxid
> ------------+----------+------+------+------+---------+------------+-------------
> 1021343158 | 63521101 |    0 |    0 |    0 | lcm     | 1485252959 | -1736329086
> (1 row)
> my database is missing 'monashprotein' under datname.

I'd suggest searching the archives for messages that talk about
recovering from transaction ID wraparound (I think that's what's
happened -- somebody please correct me if I'm mistaken).  I've seen
it talked about but haven't had to do it myself, so any advice I
could offer would just be repeating what others have suggested.
Maybe somebody will come along who's actually gone through it and
offer to help.  Or, if you're lucky, somebody will correct my
diagnosis....

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Loss of data and info from system tables!!

From
Tom Lane
Date:
Noel Faux <noel.faux@med.monash.edu.au> writes:
> We generally only vacuum
> tables which are affected by deletes, inserts and updates.

You can't do that as an exclusive practice :-(.  In particular I suppose
that you never vacuumed your system catalogs at all, and now you are
behind the eight ball because transaction IDs wrapped around.  See
http://www.postgresql.org/docs/7.4/static/maintenance.html#VACUUM-FOR-WRAPAROUND

            regards, tom lane

Re: Loss of data and info from system tables!!

From
Noel Faux
Date:
Tom Lane wrote:
Noel Faux <noel.faux@med.monash.edu.au> writes: 
We generally only vacuum 
tables which are affected by deletes, inserts and updates.   
You can't do that as an exclusive practice :-(.  In particular I suppose
that you never vacuumed your system catalogs at all, and now you are
behind the eight ball because transaction IDs wrapped around.  See
http://www.postgresql.org/docs/7.4/static/maintenance.html#VACUUM-FOR-WRAPAROUND 
Thanks heaps for the info.  So from reading the following post http://archives.postgresql.org/pgsql-hackers/2005-02/msg00407.php
the only way to retrieve the data is by 'could use pg_resetxlog to back up the NextXID counter enough to
make your tables and databases reappear (and thereby lose the effects of
however many recent transactions you back up over).

Once you've found a NextXID setting you like, I'd suggest an immediate
pg_dumpall/initdb/reload to make sure you have a consistent set of data.
Don't VACUUM, or indeed modify the DB at all, until you have gotten a
satisfactory dump.'

How much of the data am I likely to get back??
Is there any other way to recovery the data??

Many thanks
Noel
		regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your     joining column's datatypes do not match 


-- 
PhD student
Department of Biochemistry and Molecular Biology
Monash University
Clayton 
Victoria 3800
Australia
Ph: +61 3 9905 1418
e-mail: noel.faux@med.monash.edu.au
web site: http;//vbc.med.monash.edu.au/~fauxn/
Attachment