Thread: Loss of data and info from system tables!!
Hi all,

Sorry if you have received this before; I've tried to post the complete mail, but it is not going through. Warning: this is a long e-mail and is split across two posts. I'm hoping someone out there can help. If this is not the most appropriate mailing list, please point me in the direction of one which is.

Recently I noticed that my database was missing a huge amount of data, which made me very ill!! Before we get to the problem, here's the system info:

Postgres 7.4.2
IBM x345 server
OS: Red Hat Enterprise Linux AS release 3
2-way box with hyperthreading enabled, Intel(R) Xeon(TM) 2.40GHz CPUs
2 GB memory in total
IBM ServeRAID controller

The most important thing for me is data recovery and data integrity! It's most of my PhD!! :'(

OK, the problem. On the 27th of March I noticed that one of my web sites was giving me some weird errors. When I looked at the SQL statements generating these errors I found that there were no results from these queries. When I selected on the table 'protein_database' I discovered that the database the web site refers to did not exist!! This was all fine 3 days prior! After further selects on more tables it became apparent there was data loss across a large number of tables. Now I feel extremely ill! It seemed that the data which was added to the database when it was initially created was lost, while data which I have recently added was still there.

So I thought: do a dump of the database now, and maybe I can use an old dump (Dec '04) together with the new dump and get close to a complete state. Well, to my surprise pg_dump told me that the database did not exist. So I logged in via psql and did 'SELECT * FROM pg_database;', which returned another person's database but not mine. I then performed 'SELECT * FROM pg_user;' and it returned 'postgres' and another user, but not me. Well, that was a surprise: how am I able to log in if I'm not in the user table??

The rest of the original post to follow...
Many thanks,
Noel

--
PhD student
Department of Biochemistry and Molecular Biology
Monash University, Clayton, Victoria 3800, Australia
Ph: +61 3 9905 1418
e-mail: noel.faux@med.monash.edu.au
web site: http://vbc.med.monash.edu.au/~fauxn/
On Thu, Mar 31, 2005 at 03:12:53AM +0000, Noel Faux wrote:
> On the 27th March I notice that one of my web sites was giving me some
> weird errors. When I looked at the SQL statements generating these
> error I found that there were no results from these queries. When I
> selected on the table 'protein_database' I discovered that the database
> the web site refers to did not exist!! This was all fine 3 days prior!
> After further selects on more tables it became apparent there was data
> loss across a large number of tables. Now I feel extremely ill! It
> seemed that data which was added to the database when it was initially
> created was lost. As data which I have recently added was still there.

Are you doing regular VACUUMs of the database?

What's the result of the following query?

SELECT oid, xmin, cmin, xmax, cmax, datname,
       age(datvacuumxid) AS vxid, age(datfrozenxid) AS fxid
FROM pg_database;

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
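[For readers finding this thread later: "regular VACUUMs" on 7.4 usually means a scheduled database-wide vacuum, which also covers the system catalogs. A minimal sketch — the schedule, paths, and log file are illustrative assumptions, not something from this thread:]

```shell
# Hypothetical crontab entry: nightly database-wide vacuum.
# vacuumdb ships with PostgreSQL; -a = all databases, -z = analyze.
# A database-wide VACUUM is what advances datfrozenxid and prevents
# transaction ID wraparound; per-table vacuums alone do not.
30 2 * * *  /usr/local/pgsql/bin/vacuumdb -a -z >> /var/log/pg_vacuum.log 2>&1
```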
Michael Fuhr wrote:
> Are you doing regular VACUUMs of the database?

Only after the completion of large jobs (inserts) and whenever data is deleted or new tables and indexes are added to the schema. So the last one was some time at the start of the year. We generally only vacuum tables which are affected by deletes, inserts and updates. All vacuums are performed as 'VACUUM FULL ANALYZE'.

> What's the result of the following query?

monashprotein=> SELECT oid, xmin, cmin, xmax, cmax, datname, age(datvacuumxid) AS vxid, age(datfrozenxid) AS fxid FROM pg_database;
    oid     |   xmin   | cmin | xmax | cmax | datname |    vxid    |    fxid
------------+----------+------+------+------+---------+------------+-------------
 1021343158 | 63521101 |    0 |    0 |    0 | lcm     | 1485252959 | -1736329086
(1 row)

My database 'monashprotein' is missing under datname.

Hope this helps.
Cheers
Noel
On Thu, Mar 31, 2005 at 04:41:04AM +0000, Noel Faux wrote:
> Only after the completion of large jobs (inserts) and when ever data is
> deleted and new tables and indexes are added to the schema. So the last
> one was some time at the start of the year. We generally only vacuum
> tables which are affected by deletes, inserts and updates. All vacuums
> performed as 'vacuum analyze full'.

Oh dear. You might wish to read the "Routine Database Maintenance Tasks" chapter in the documentation, in particular the "Routine Vacuuming" section, and *especially* "Preventing transaction ID wraparound failures":

http://www.postgresql.org/docs/7.4/interactive/maintenance.html

> monashprotein=> SELECT oid, xmin, cmin, xmax, cmax, datname,
> age(datvacuumxid) AS vxid, age(datfrozenxid) AS fxid FROM pg_database;
>     oid     |   xmin   | cmin | xmax | cmax | datname |    vxid    |    fxid
> ------------+----------+------+------+------+---------+------------+-------------
>  1021343158 | 63521101 |    0 |    0 |    0 | lcm     | 1485252959 | -1736329086
> (1 row)
>
> my database is missing 'monashprotein' under datname.

I'd suggest searching the archives for messages that talk about recovering from transaction ID wraparound (I think that's what's happened -- somebody please correct me if I'm mistaken). I've seen it talked about but haven't had to do it myself, so any advice I could offer would just be repeating what others have suggested. Maybe somebody will come along who's actually gone through it and offer to help. Or, if you're lucky, somebody will correct my diagnosis....

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
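[A note on why fxid shows up negative: XIDs live in a 32-bit counter compared modulo 2^31, and age() is effectively (current XID - datfrozenxid) read back as a signed 32-bit number. Once the gap exceeds 2^31 (~2.1 billion transactions), the age goes negative and pre-wraparound rows become invisible. A minimal sketch of that arithmetic — the 'frozen' value is a hypothetical XID chosen so the example reproduces the -1736329086 from the query output above, not a number taken from the server:]

```shell
#!/bin/sh
# Signed 32-bit XID age, roughly as PostgreSQL 7.4's age() computes it.
current=63521101      # the xmin seen in the pg_database row above
frozen=1799850187     # hypothetical datfrozenxid (illustrative only)

diff=$(( (current - frozen) & 0xFFFFFFFF ))   # wrap difference into 32 bits
if [ "$diff" -ge 2147483648 ]; then
    diff=$(( diff - 4294967296 ))             # reinterpret as signed 32-bit
fi
echo "age(datfrozenxid) = $diff"              # negative => wraparound has happened
```

The point of the sketch: a large *positive* gap between the current XID and datfrozenxid wraps into a negative age, which is exactly the symptom in the query output.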
Noel Faux <noel.faux@med.monash.edu.au> writes:
> We generally only vacuum tables which are affected by deletes, inserts
> and updates.

You can't do that as an exclusive practice :-(. In particular I suppose that you never vacuumed your system catalogs at all, and now you are behind the eight ball because transaction IDs wrapped around. See
http://www.postgresql.org/docs/7.4/static/maintenance.html#VACUUM-FOR-WRAPAROUND

			regards, tom lane
Tom Lane wrote:
> You can't do that as an exclusive practice :-(. In particular I suppose
> that you never vacuumed your system catalogs at all, and now you are
> behind the eight ball because transaction IDs wrapped around. See
> http://www.postgresql.org/docs/7.4/static/maintenance.html#VACUUM-FOR-WRAPAROUND

Thanks heaps for the info. So, from reading the following post
http://archives.postgresql.org/pgsql-hackers/2005-02/msg00407.php
it seems the only way to retrieve the data is to 'use pg_resetxlog to back up the NextXID counter enough to make your tables and databases reappear (and thereby lose the effects of however many recent transactions you back up over). Once you've found a NextXID setting you like, I'd suggest an immediate pg_dumpall/initdb/reload to make sure you have a consistent set of data. Don't VACUUM, or indeed modify the DB at all, until you have gotten a satisfactory dump.'

How much of the data am I likely to get back?
Is there any other way to recover the data?

Many thanks
Noel
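[For anyone who lands on this thread later, the recovery quoted above boils down to the procedure below. This is a hedged sketch only: the data directory, binary paths, and the -x value are assumptions you must adapt to your installation, finding a workable NextXID may take several attempts, and running pg_resetxlog against the wrong cluster or a running postmaster can destroy data:]

```shell
#!/bin/sh
# Sketch of the 7.4 wraparound recovery described above.
# ALL values here are illustrative assumptions -- adapt before use.
PGDATA=/usr/local/pgsql/data       # assumed data directory
PGBIN=/usr/local/pgsql/bin         # assumed binary directory

"$PGBIN"/pg_ctl -D "$PGDATA" stop              # postmaster must be down first
"$PGBIN"/pg_resetxlog -x 63521101 "$PGDATA"    # back NextXID up; the value is a
                                               # guess -- adjust until the missing
                                               # tables and databases reappear
"$PGBIN"/pg_ctl -D "$PGDATA" start
"$PGBIN"/pg_dumpall > /backup/recovered.sql    # dump immediately; do NOT vacuum
                                               # or modify anything until the dump
                                               # is satisfactory
```

After a satisfactory dump, the advice above is to initdb a fresh cluster and reload from the dump rather than keep running on the reset cluster.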