Thread: Disappearing Records
Hi all What is the first thing you would do, when you find that your system has been losing information? Information is there at one stage, and later it's not. I tried checking the logs, but there isn't a delete or drop there anywhere, nor anything else that seems relevent. I tried googling for various rewordings of "PostgreSQL disappearing records", but didn't get anything useful. Since this isn't something I can recreate, I'm not sure what to do about it. The fact that I never really used PostgreSQL that much before may be a hindering factor, but I can't think why information would just mysteriously disappear. Regards Rory
Do you have any cascading deletes that could be doing this by performing a delete on a different table and cascading to the table in question? Terry Rory Browne wrote: > Hi all > > What is the first thing you would do, when you find that your system > has been losing information? Information is there at one stage, and > later it's not. > > I tried checking the logs, but there isn't a delete or drop there > anywhere, nor anything else that seems relevent. I tried googling for > various rewordings of "PostgreSQL disappearing records", but didn't > get anything useful. > > Since this isn't something I can recreate, I'm not sure what to do about it. > > The fact that I never really used PostgreSQL that much before may be a > hindering factor, but I can't think why information would just > mysteriously disappear. > > Regards > Rory > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > -- Terry Fielder terry@greatgulfhomes.com Associate Director Software Development and Deployment Great Gulf Homes / Ashton Woods Homes Fax: (416) 441-9085
Rory Browne <rory.browne@gmail.com> writes: > What is the first thing you would do, when you find that your system > has been losing information? Information is there at one stage, and > later it's not. Has your system been used long enough that it could be subject to transaction ID wraparound? regards, tom lane
Hi, Has your system been used long enough that it could be subject to transaction ID wraparound? what is this can you give me more information on this or some pointers from where I can get more information on this and how to solve this. because I too has experinced this problem disappearinf records. regards Venki -------Original Message------- Rory Browne <rory.browne@gmail.com> writes: > What is the first thing you would do, when you find that your system > has been losing information? Information is there at one stage, and > later it's not. Has your system been used long enough that it could be subject to transaction ID wraparound? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? | |||
Execute the following SQL on your pg cluster: SELECT datname, age(datfrozenxid) FROM pg_database; datname | age --------------+------------ bp_live | 1075940691 template1 | 1130066178 template0 | 56361936 (3 rows) Apart from template0 which is a special case (provided its frozen and readonly (which it is by default)), you want the numbers in the age column to be less than 2 billion. This is achieved by vacuuming EACH database including template1 regularly. You don't need to perform a full vacuum either. You just need to do it regularly. From my understanding, if numbers in that column have gone negative than you have already experienced transaction wraparound. This may then be seen as "data loss" or missing records. It's also worth checking your pg server log in pgdata/data/serverlog to see if you are seeing messages like this: 2005-10-27 05:55:55 WARNING: some databases have not been vacuumed in 2129225822 transactions HINT: Better vacuum them within 18257825 transactions, or you may have a wraparound failure. Here's a excerpt from the pg 7.4 manual: <quote> With the standard freezing policy, the age column will start at one billion for a freshly-vacuumed database. When the age approaches two billion, the database must be vacuumed again to avoid risk of wraparound failures. Recommended practice is to vacuum each database at least once every half-a-billion (500 million) transactions, so as to provide plenty of safety margin. To help meet this rule, each database-wide VACUUM automatically delivers a warning if there are any pg_database entries showing an age of more than 1.5 billion transactions, for example: play=# VACUUM; WARNING: some databases have not been vacuumed in 1613770184 transactions HINT: Better vacuum them within 533713463 transactions, or you may have a wraparound failure. VACUUM </quote> If you have suffered data loss for this reason, then you'll need to get help from the developers to see whether it can be recovered, or what you can do to reconstruct the data. Good luck! John Venki wrote: > > > Hi, > > Has your system been used long enough that it could be subject to > transaction ID wraparound? > > > > what is this can you give me more information on this or some pointers from > where I can get more information on this and how to solve this. because I > too has experinced this problem disappearinf records. > > > > > > regards > > Venki > > -------Original Message------- > > > > From: Tom Lane > > Date: 11/01/05 20:30:51 > > To: Rory Browne > > Cc: pgsql-general@postgresql.org > > Subject: Re: [GENERAL] Disappearing Records > > > > Rory Browne <rory.browne@gmail.com> writes: > > >>What is the first thing you would do, when you find that your system > > >>has been losing information? Information is there at one stage, and > > >>later it's not. > > > > > Has your system been used long enough that it could be subject to > > transaction ID wraparound? > > > > regards, tom lane > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 4: Have you searched our list archives? > > > > http://archives.postgresql.org
On Wed, Nov 02, 2005 at 07:40:29AM +0000, John Sidney-Woollett wrote: > If you have suffered data loss for this reason, then you'll need to get > help from the developers to see whether it can be recovered, or what you > can do to reconstruct the data. The really nasty thing about it is that because the records are now considered really old, as soon as you do run VACUUM it'll start removing the rows you want to save... What you need is a tool to go through and change any transaction ID significantly in the future and freeze it... You know, like VACUUM... Oh well.. In 8.1 autovacuum will make this kind of thing obsolete. Have a nice day. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Hi, >The really nasty thing about it is that because the records are now >considered really old, as soon as you do run VACUUM it'll start >removing the rows you want to save... So does this mean that when we do a vacuum for the first time there will still be data loss or Am I wrong in this? regards Venki -------Original Message------- On Wed, Nov 02, 2005 at 07:40:29AM +0000, John Sidney-Woollett wrote: > If you have suffered data loss for this reason, then you'll need to get > help from the developers to see whether it can be recovered, or what you > can do to reconstruct the data. The really nasty thing about it is that because the records are now considered really old, as soon as you do run VACUUM it'll start removing the rows you want to save... What you need is a tool to go through and change any transaction ID significantly in the future and freeze it... You know, like VACUUM... Oh well.. In 8.1 autovacuum will make this kind of thing obsolete. Have a nice day. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. | |||
On Wed, Nov 02, 2005 at 05:00:35PM +0530, Venki wrote: > >The really nasty thing about it is that because the records are now > >considered really old, as soon as you do run VACUUM it'll start > >removing the rows you want to save... > So does this mean that when we do a vacuum for the first time there will > still be data loss or Am I wrong in this? VACUUM cannot recover data from transaction wraparound. But we havn't even determined if this has happened as you have not yet posted the output of this query: SELECT datname, age(datfrozenxid) FROM pg_database; Have a ncie day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Thanks guys for your suggestions, but the problem turned out to be my lack of experience(with PostgreSQL), combined with a bug in our PHP Code. Coming from a MySQL background, I assumed that if you "select x from y", then y would be the name of a table. It turned out that in the case that y was actually a view. It has a complex defination, but for explanitory sake, lets say it was defined as: select u.username, g.groupname from users u, groups g where u.group_id=g.id (assuming users are in exactly one group) If the group_id field in the users table was corrupted, and set to a value that isn't in the groups table, then that view wouldn't return anything. Something like that(except that our view wasn't quite as simple) happened to me.
Rory Browne <rory.browne@gmail.com> writes: > select u.username, g.groupname from users u, groups g where u.group_id=g.id > (assuming users are in exactly one group) > > If the group_id field in the users table was corrupted, and set to a > value that isn't in the groups table, then that view wouldn't return > anything. That's why foreign key constraints are good. :) -Doug
Martijn van Oosterhout <kleptog@svana.org> writes: > The really nasty thing about it is that because the records are now > considered really old, as soon as you do run VACUUM it'll start > removing the rows you want to save... You sure about that? I think VACUUM just tests for "committed or not". I'm too lazy to set up a test case, but it's possible that VACUUM FREEZE would resurrect wrapped-around tuples, or could be made to with only a small code tweak. regards, tom lane
On Wed, Nov 02, 2005 at 09:46:38AM -0500, Tom Lane wrote: > Martijn van Oosterhout <kleptog@svana.org> writes: > > The really nasty thing about it is that because the records are now > > considered really old, as soon as you do run VACUUM it'll start > > removing the rows you want to save... > > You sure about that? I think VACUUM just tests for "committed or not". > > I'm too lazy to set up a test case, but it's possible that VACUUM FREEZE > would resurrect wrapped-around tuples, or could be made to with only a > small code tweak. Well, that would be really nice to be able to tell people. But looking at the code of HeapTupleSatisfiesVacuum it spends a lot of time checking hint bits. While we might be able to fiddle the one function, checking all the places involving the hint bits could be nasty. Or not. Making a test case would be interesting, I'll look into it sometime I guess... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Martijn van Oosterhout <kleptog@svana.org> writes: > On Wed, Nov 02, 2005 at 09:46:38AM -0500, Tom Lane wrote: >> You sure about that? I think VACUUM just tests for "committed or not". >> >> I'm too lazy to set up a test case, but it's possible that VACUUM FREEZE >> would resurrect wrapped-around tuples, or could be made to with only a >> small code tweak. > Well, that would be really nice to be able to tell people. But looking > at the code of HeapTupleSatisfiesVacuum it spends a lot of time > checking hint bits. While we might be able to fiddle the one function, > checking all the places involving the hint bits could be nasty. Or not. You're missing the forest for the trees. The hint bits don't do anything except save a visit to pg_clog. It's still going to come back with HEAPTUPLE_LIVE. The question is whether VACUUM can or should be tweaked to substitute FrozenTransactionId when the xmin is "in the future". Looking at the code, I think that actually a regular, non-FREEZE VACUUM would do the "right thing" for tuples up to about 1 billion xacts past wrap, which is probably enough. So the answer may be "just VACUUM". I'm still too lazy to test it though. regards, tom lane
On Wed, Nov 02, 2005 at 10:28:56AM -0500, Tom Lane wrote: > Looking at the code, I think that actually a regular, non-FREEZE VACUUM > would do the "right thing" for tuples up to about 1 billion xacts past > wrap, which is probably enough. So the answer may be "just VACUUM". > I'm still too lazy to test it though. Well, I tested on 8.1beta which gets downright obnoxious about wraparound. It refuses to do anything unless you're running in a standalone backend. Anyway, your theory seems correct, for this version anyway. The xmin is changed to FrozenTransactionId so it will be visible forever more. By plan or by accident, this certainly helps those people who run into this, as long as previous versions work like this also... Oh, the large positive numbers seems odd to me. I would have thought: WARNING: database "test" must be vacuumed within -49 transactions would be clearer (and more alarming) than: WARNING: database "test" must be vacuumed within 4294967247 transactions Have a nice day, test=# insert into test values (1); INSERT 0 1 test=# select ctid, xmin, xmax, cmin, cmax, * from test; ctid | xmin | xmax | cmin | cmax | value -------+-------+------+------+------+------- (0,1) | 20138 | 0 | 0 | 0 | 1 (1 row) test=# \q -- resetxlog to a new value, create the clog so you can actually start backend> select ctid, xmin, xmax, cmin, cmax, * from test; WARNING: database "template1" must be vacuumed within 45 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "template1". 1: ctid (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax (typeid = 29, len = 4, typmod = -1, byval = t) 6: value (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,1)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "20138" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "1" (typeid = 23, len = 4, typmod = -1, byval = t) ---- backend> insert into test values (2); WARNING: database "template1" must be vacuumed within 44 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "template1". 1: value (typeid = 23, len = 4, typmod = -1, byval = t) ---- backend> select ctid, xmin, xmax, cmin, cmax, * from test; WARNING: database "template1" must be vacuumed within 43 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "template1". 1: ctid (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax (typeid = 29, len = 4, typmod = -1, byval = t) 6: value (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,1)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "20138" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "1" (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,2)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "2147484102" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "2" (typeid = 23, len = 4, typmod = -1, byval = t) ---- -- It's complaining about template1, so just out and fix them first -- After coming back, it's decided it's already wrapped around, even -- though I did a VACUUM FREEZE before adding the values. The really -- large numbers are misleading also. backend> insert into test values(3); WARNING: database "test" must be vacuumed within 4294967248 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "test". 1: value (typeid = 23, len = 4, typmod = -1, byval = t) ---- backend> select ctid, xmin, xmax, cmin, cmax, * from test; WARNING: database "test" must be vacuumed within 4294967247 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "test". 1: ctid (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax (typeid = 29, len = 4, typmod = -1, byval = t) 6: value (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,1)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "20138" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "1" (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,2)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "2147484102" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "2" (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,3)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "2147503777" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "3" (typeid = 23, len = 4, typmod = -1, byval = t) ---- -- Hit enter a few times backend> select ctid, xmin, xmax, cmin, cmax, * from test; WARNING: database "test" must be vacuumed within 4294967238 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "test". 1: ctid (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax (typeid = 29, len = 4, typmod = -1, byval = t) 6: value (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,2)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "2147484102" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "2" (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,3)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "2147503777" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "3" (typeid = 23, len = 4, typmod = -1, byval = t) ---- -- Oh no, my data's disappearing! backend> vacuum test; WARNING: database "test" must be vacuumed within 4294967232 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "test". WARNING: database "test" must be vacuumed within 4294967231 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "test". WARNING: database "test" must be vacuumed within 4294967230 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "test". backend> select ctid, xmin, xmax, cmin, cmax, * from test; WARNING: database "test" must be vacuumed within 4294967229 transactions HINT: To avoid a database shutdown, execute a full-database VACUUM in "test". 1: ctid (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax (typeid = 29, len = 4, typmod = -1, byval = t) 6: value (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,1)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "2" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "1" (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,2)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "2147484102" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "2" (typeid = 23, len = 4, typmod = -1, byval = t) ---- 1: ctid = "(0,3)" (typeid = 27, len = 6, typmod = -1, byval = f) 2: xmin = "2147503777" (typeid = 28, len = 4, typmod = -1, byval = t) 3: xmax = "0" (typeid = 28, len = 4, typmod = -1, byval = t) 4: cmin = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 5: cmax = "0" (typeid = 29, len = 4, typmod = -1, byval = t) 6: value = "3" (typeid = 23, len = 4, typmod = -1, byval = t) ---- -- Yay, it's back!!! > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
In article <m23bmff78i.fsf@Douglas-McNaughts-Powerbook.local>, Douglas McNaught <doug@mcnaught.org> writes: > Rory Browne <rory.browne@gmail.com> writes: >> select u.username, g.groupname from users u, groups g where u.group_id=g.id >> (assuming users are in exactly one group) >> >> If the group_id field in the users table was corrupted, and set to a >> value that isn't in the groups table, then that view wouldn't return >> anything. > That's why foreign key constraints are good. :) Well, Rory already confessed that he came from a MySQL background :-)
Hi Martijn van Oosterhout , This is the output that I get by running the query SELECT datname, age(datfrozenxid) FROM pg_database; datname | age ----------------+---------- MyProd | 10014107 MyProdtest | 10014107 template1 | 10014107 template0 | 10014107 MyDb | 10014107 (5 rows) Regards Venki -------Original Message------- From: Martijn van Oosterhout Date: 11/02/05 17:47:26 To: Venki Subject: Re: [GENERAL] Disappearing Records On Wed, Nov 02, 2005 at 05:00:35PM +0530, Venki wrote: > >The really nasty thing about it is that because the records are now > >considered really old, as soon as you do run VACUUM it'll start > >removing the rows you want to save... > So does this mean that when we do a vacuum for the first time there will > still be data loss or Am I wrong in this? VACUUM cannot recover data from transaction wraparound. But we havn't even determined if this has happened as you have not yet posted the output of this query: SELECT datname, age(datfrozenxid) FROM pg_database; Have a ncie day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. | |||