Thread: BUG #5727: Indexes broken in streaming replication
The following bug has been logged online: Bug reference: 5727 Logged by: Jan Kantert Email address: jan-postgres@kantert.net PostgreSQL version: 9.0.1 Operating system: Ubuntu 10.04 x86_64 2.6.32-22-server #33-Ubuntu SMP x86_64 GNU/Linux Description: Indexes broken in streaming replication Details: Hi, we have set up streaming replication. It works fine in normal cases. We found out that one query did not work anymore on our slaves. We have verified that the slaves were up to date and contained all data. master=# SELECT user_id FROM users WHERE LOWER(login) = LOWER('my_login'); user_id --------- 1234 (1 row) slave=# SELECT user_id FROM users WHERE LOWER(login) = LOWER('my_login'); user_id --------- (0 rows) This seemed to be strange. It turned out that it worked if we change the LOWER(login) to login: slave=# SELECT user_id FROM users WHERE login = LOWER('my_login'); user_id --------- 1234 (1 row) We found out that there existed an index on LOWER(login). So we dropped the index. Our query worked on master and slave as long as there existed no indexes: master=# DROP INDEX index_user_lower_login; DROP INDEX master=# SELECT user_id FROM users WHERE LOWER(login) = LOWER('my_login'); user_id --------- 1234 (1 row) slave=# SELECT user_id FROM users WHERE LOWER(login) = LOWER('my_login'); user_id --------- 1234 (1 row) After we created the index again, we saw strange problems on the slave: master=# CREATE INDEX index_user_lower_login ON users USING hash (lower(login::text)); CREATE INDEX master=# SELECT user_id FROM users WHERE LOWER(login) = LOWER('my_login'); user_id --------- 1234 (1 row) slave=# SELECT user_id FROM users WHERE LOWER(login) = LOWER('my_login'); ERROR: could not read block 0 in file "base/16408/98928848": read only 0 of 8192 bytes If we remove the index, it will work again. Looks like some kind of bug in the replication. Regards, Jan Our Postgresbuild: PostgreSQL 9.0.1 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
"Jan Kantert" <jan-postgres@kantert.net> writes: > After we created the index again, we saw strange problems on the slave: > master=# CREATE INDEX index_user_lower_login ON users USING hash > (lower(login::text)); Hash indexes are not replicated. There's seldom any very good reason to use them in practice, because they also have no WAL protection and don't perform very well anyway. Why did you pick a hash index for a production application? regards, tom lane
On 26.10.2010 20:04, Jan Kantert wrote: > we have set up streaming replication. It works fine in normal cases. We > found out that one query did not work anymore on our slaves. We have > verified that the slaves were up to date and contained all data. >... > master=# CREATE INDEX index_user_lower_login ON users USING hash > (lower(login::text)); Hash indexes are not WAL-logged, and therefore are not replicated either. For the same reason, they are not crash-safe, and won't work after recovery from a continuous WAL archive. Don't use them. There's a note in the docs about this: > Note: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX aftera database crash. For this reason, hash index use is presently discouraged. though it doesn't explicitly mention replication. Perhaps it should be updated to something like: Note: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. They are also not replicated over streaming or file-based replication. For this reason, hash index use is presently discouraged. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > There's a note in the docs about this: >> Note: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX aftera database crash. For this reason, hash index use is presently discouraged. > though it doesn't explicitly mention replication. Perhaps it should be > updated to something like: Yeah, I too just noticed that that caveat hadn't been updated. > Note: Hash index operations are not presently WAL-logged, so hash > indexes might need to be rebuilt with REINDEX after a database crash. > They are also not replicated over streaming or file-based replication. > For this reason, hash index use is presently discouraged. s/this reason/these reasons/. Also maybe we should change this to a <caution> or <warning> instead of just a <note>? Otherwise +1. regards, tom lane
On 26.10.2010 20:47, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >> There's a note in the docs about this: > >>> Note: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX aftera database crash. For this reason, hash index use is presently discouraged. > >> though it doesn't explicitly mention replication. Perhaps it should be >> updated to something like: > > Yeah, I too just noticed that that caveat hadn't been updated. > >> Note: Hash index operations are not presently WAL-logged, so hash >> indexes might need to be rebuilt with REINDEX after a database crash. >> They are also not replicated over streaming or file-based replication. >> For this reason, hash index use is presently discouraged. > > s/this reason/these reasons/. Also maybe we should change this to a > <caution> or<warning> instead of just a<note>? Otherwise +1. Will do. I also noticed that the hot standby docs are not up-to-date on this: > 25.5.5. Caveats > > There are several limitations of Hot Standby. These can and probably will be fixed in future releases: > > * Operations on hash indexes are not presently WAL-logged, so replay will not update these indexes. Hash indexes will notbe used for query plans during recovery. The initial patch indeed had a special-case in the planner to ignore hash indexes during hot standby, but it was left out because the lack of WAL-logging is a general problem with hash indexes, not a hot standby issue. I'd be tempted to remove that caveat altogether for the same reason, but it's probably good to have that note there and just remove the sentence about query plans: * Operations on hash indexes are not presently WAL-logged, so replay will not update these indexes. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> * Operations on hash indexes are not presently WAL-logged, so replay will not update these indexes. Hash indexes willnot be used for query plans during recovery. > The initial patch indeed had a special-case in the planner to ignore > hash indexes during hot standby, but it was left out because the lack of > WAL-logging is a general problem with hash indexes, not a hot standby > issue. Yeah, and also the index would still be broken after the slave exits hot standby and becomes live; so that hack didn't cure the problem anyway. > I'd be tempted to remove that caveat altogether for the same > reason, but it's probably good to have that note there and just remove > the sentence about query plans: Agreed. regards, tom lane
On Tue, 2010-10-26 at 14:08 -0400, Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > >> * Operations on hash indexes are not presently WAL-logged, so replay will not update these indexes. Hash indexes willnot be used for query plans during recovery. > > > The initial patch indeed had a special-case in the planner to ignore > > hash indexes during hot standby, but it was left out because the lack of > > WAL-logging is a general problem with hash indexes, not a hot standby > > issue. > > Yeah, and also the index would still be broken after the slave exits hot > standby and becomes live; so that hack didn't cure the problem anyway. OK, that's a good argument. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services