Thread: data loss after vacuum
I'm not sure if this is the right list to send this, but any help would be appreciated. We recently encountered a problem running postgres where, after a vacuum, all the data in one of our tables was gone. Now, I guess technically we don't know for sure if it was indeed vacuum that caused the data loss, but it seems likely. Data inserted minutes before the vacuum is gone, while data inserted minutes after is still in the table. This is the relevant part of the vacuum command's output: VACUUM FULL VERBOSE ANALYZE ... NOTICE: --Relation cartitems-- NOTICE: Pages 10101: Changed 3, reaped 20, Empty 0, New 0; Tup 690632: Vac 27, Keep/VTL 0/0, UnUsed 18, MinLen 96, MaxLen134; Re-using: Fr ee/Avail. Space 708848/327904; EndEmpty/Avail. Pages 0/1602. CPU 1.24s/0.45u sec elapsed 2.88 sec. NOTICE: Index _cartitems_index: Pages 3241; Tuples 690632: Deleted 27. CPU 0.38s/0.77u sec elapsed 1.64 sec. NOTICE: Index cartitems_orderstatus_index: Pages 2057; Tuples 690632: Deleted 27. CPU 0.19s/0.88u sec elapsed 1.71 sec. NOTICE: Rel cartitems: Pages: 10101 --> 10081; Tuple(s) moved: 663. CPU 0.05s/0.13u sec elapsed 1.06 sec. NOTICE: Index _cartitems_index: Pages 3241; Tuples 690632: Deleted 663. CPU 0.30s/0.67u sec elapsed 2.32 sec. NOTICE: Index cartitems_orderstatus_index: Pages 2060; Tuples 690632: Deleted 663. CPU 0.17s/0.64u sec elapsed 1.08 sec. NOTICE: Analyzing cartitems All the data in table "cartitems" from before the vacuum run is gone. Data inserted after the vacuum run looks fine. The table's data file size was still around 80MB, so I thought maybe this might be a transaction wraparound problem (even though we run vacuum every night), but when I looked at the file contents, it was almost completely null'ed, so it looks like the data is really gone (though shouldn't a full vacuum reclaim the space?). Does anyone have an idea what the problem might be? We do have backups of the data, but it would still be nice to know what caused this. - Allan
Allan Tong <actong@www.quateams.com> writes: > I'm not sure if this is the right list to send this, but any help > would be appreciated. We recently encountered a problem running > postgres where, after a vacuum, all the data in one of our tables > was gone. Now, I guess technically we don't know for sure if it > was indeed vacuum that caused the data loss, but it seems likely. The vacuum output shows that it thought it was removing only 27 out of the nearly 700K rows. So I don't think vacuum is directly to blame. However, it would very possibly have rewritten many of the pages in your table, as a byproduct of moving rows, updating tuple commit bits, etc. > ... when I looked at the file contents, it was almost > completely null'ed, so it looks like the data is really gone (though > shouldn't a full vacuum reclaim the space?). You mean the pages were all-zero? It sounds to me like a serious hardware failure, or possibly kernel/filesystem misfeasance. Postgres would certainly not have written zeroes, but apparently what got dropped onto the disk platter was zeroes. Such failures are uncommon, but by no means un-heard-of. I'd suggest running some read/write disk tests to start with. Also check for kernel errata. regards, tom lane
Tom Lane wrote: > Allan Tong <actong@www.quateams.com> writes: > > I'm not sure if this is the right list to send this, but any help > > would be appreciated. We recently encountered a problem running > > postgres where, after a vacuum, all the data in one of our tables > > was gone. Now, I guess technically we don't know for sure if it > > was indeed vacuum that caused the data loss, but it seems likely. > > The vacuum output shows that it thought it was removing only 27 out > of the nearly 700K rows. So I don't think vacuum is directly to > blame. However, it would very possibly have rewritten many of the > pages in your table, as a byproduct of moving rows, updating tuple > commit bits, etc. > > > ... when I looked at the file contents, it was almost > > completely null'ed, so it looks like the data is really gone (though > > shouldn't a full vacuum reclaim the space?). > > You mean the pages were all-zero? It sounds to me like a serious > hardware failure, or possibly kernel/filesystem misfeasance. Postgres > would certainly not have written zeroes, but apparently what got dropped > onto the disk platter was zeroes. Such failures are uncommon, but > by no means un-heard-of. Isn't that what IDE gives you when it maps a new block to replace a corrupted one? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073