Truncation failure in autovacuum results in data corruption (duplicate keys) - Mailing list pgsql-hackers
From | MauMau
Subject | Truncation failure in autovacuum results in data corruption (duplicate keys)
Date |
Msg-id | 5BBC590AE8DF4ED1A170E4D48F1B53AC@tunaPC
Responses | Re: Truncation failure in autovacuum results in data corruption (duplicate keys)
List | pgsql-hackers
Hello,

It seems to me that our customer may have hit an unresolved data corruption issue that is already known on this mailing list, but I can't figure out why it happened. I'd appreciate it if you could give me your thoughts. Depending on the severity of this issue and the customer's request, I think I'll submit a patch to solve the unresolved issue.

The customer is using PostgreSQL 9.2 on Windows. Autovacuum failed to truncate the pages at the end of a table, probably due to anti-virus software. FYI, an automatic checkpoint was in progress when this error occurred.

ERROR: could not truncate file "base/aaa/bbb" to 58 blocks: Permission denied

After a while, an application got a unique key violation error. The customer says there shouldn't be any duplicate keys.

ERROR: duplicate key value violates unique constraint "pk_xxx"

The output of pg_dump on that table certainly includes multiple pairs of rows with the same primary key values... data corruption.

Another Japanese user, who is not our customer, hit the same problem with 9.4, and it was not solved (note: the mail is in Japanese). He said he repeatedly encountered the same error even after REINDEXing with 9.4, but it does not happen with 9.1. I wonder whether something introduced in 9.2, such as the index-only scan work, causes the issue:

https://ml.postgresql.jp/pipermail/pgsql-jp/2016-October/016865.html

The problem with truncation failure was found in 2010. The user reported it as a different phenomenon on 9.0. It could not be solved even by leading hackers here -- Tom, Robert, Alvaro, Heikki, Greg Stark, etc.

TODO
https://wiki.postgresql.org/wiki/Todo
----------------------------------------
Restructure truncation logic to be more resistant to failure
This also involves not writing dirty buffers for a truncated or dropped relation
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01032.php
----------------------------------------

Tom's comments in the above thread
----------------------------------------
Imagine that we have some rows at the end of a table, we delete them, we vacuum before the next checkpoint. Vacuum decides it can now truncate away the last pages, but fails to do so. The original page state is still on disk, which means we have lost the fact of the deletion --- the rows are now effectively live again, though their index entries are probably gone.
...
Still, we have a live issue with heap truncation during plain VACUUM. However, the scope of the problem seems a lot less than I was thinking. Maybe write-the-buffers-first is a sufficient longterm solution.
...
So it seems like the only case where there is really grounds for PANIC on failure is the VACUUM case. And there we might apply Heikki's idea of trying to zero the untruncatable pages first.
...
I'm thinking that we need some sort of what-to-do-on-error flag passed into RelationTruncate, plus at least order-of-operations fixes in several other places, if not a wholesale refactoring of this whole call stack. But I'm running out of steam and don't have a concrete proposal to make right now. In any case, we've got more problems here than just the original one of forgetting dirty buffers too soon.
----------------------------------------

However, I have a question. How does a truncation failure in autovacuum lead to duplicate keys? The pages that failed to be truncated should contain only dead tuples, so pg_dump's table scan should ignore the tuples in those pages. (I've put a rough SQL sketch of the scenario below my signature.)

Regards
MauMau
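P.S. To make the sequence in Tom's quote concrete, here is a minimal SQL sketch of how I understand a failed truncation could end up producing duplicate keys. The table, column, and key names are invented for illustration, and the comments reflect my reading of the 2010 thread rather than the actual code path, so please correct me if I've got it wrong.

-- Hypothetical illustration only; names are made up, not from the customer's schema.
CREATE TABLE t (id int PRIMARY KEY, payload text);

-- Suppose rows with these keys happen to sit on the last pages of the heap.
INSERT INTO t SELECT g, 'old' FROM generate_series(1, 100) AS g;

-- The application deletes them, so the tail pages now hold only dead tuples.
DELETE FROM t WHERE id BETWEEN 1 AND 100;

-- (Auto)vacuum removes the index entries, decides the tail pages can be
-- truncated away, and the physical truncate then fails (e.g. "Permission
-- denied" from anti-virus software on Windows). If the buffers recording
-- the deletions were already discarded, as the TODO entry suggests, the
-- pre-deletion page images survive on disk and the deleted rows can later
-- be read as live again.
VACUUM t;

-- If the application then re-inserts the same keys (the resurrected old
-- versions have no index entries, so the inserts succeed), the heap holds
-- two live tuples per id. A later pg_dump would show the duplicates, and
-- rebuilding the unique index would fail with a duplicate key error.
INSERT INTO t SELECT g, 'new' FROM generate_series(1, 100) AS g;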