Thread: (Re)building index using itself or another index of the same table
Hi, Our customer encountered a curious scenario. They have a table with GIN index on expression, which performs multiple joins with this table itself. These joins employ another btree index for efficiency. VACUUM FULL on this table fails with error like ERROR: could not read block 3534 in file "base/41366676/56697497": read only 0 of 8192 bytes It happens because order in which indexes are rebuilt is not specified. GIN index is being rebuilt when btree index is not reconstructed yet; an attempt to use old index with rewritten heap crashes. A problem of similar nature can be reproduced with the following stripped-down scenario: CREATE TABLE pears(f1 int primary key, f2 int); INSERT INTO pears SELECT i, i+1 FROM generate_series(1, 100) i; CREATE OR REPLACE FUNCTION pears_f(i int) RETURNS int LANGUAGE SQL IMMUTABLE AS $$ SELECT f1 FROM pears WHERE pears.f2 = 42 $$; CREATE index ON pears ((pears_f(f1))); Here usage of not-yet-created index on pears_f(f1) for its own construction is pointless, however planner in principle considers it in get_relation_info, tries to get btree height (_bt_getrootheight) -- and fails. There is already a mechanism which prevents usage of indexes during reindex -- ReindexIsProcessingIndex et al. However, to the contrary of what index.c:3664 comment say, these protect only indexes on system catalogs, not user tables: the only real caller is genam.c. Attached patch extends it: the same check is added to get_relation_info. Also SetReindexProcessing is cocked in index_create to defend from index self usage during creation as in stripped example above. There are some other still unprotected callers of index_build; concurrent index creation doesn't need it because index is 'not indisvalid' during the build, and in RelationTruncateIndexes table is empty, so it looks like it can be omitted. One might argue that function selecting from table can hardly be called immutable, and immutability is required for index expressions. However, if user is sure table contents doesn't change, why not? Also, the possiblity of triggering "could not read block" error with plain SQL is definitely not nice. From 5942a3a5b2c90056119b9873c81f30dfa9e003af Mon Sep 17 00:00:00 2001 From: Arseny Sher <sher-ars@yandex.ru> Date: Thu, 12 Sep 2019 17:35:16 +0300 Subject: [PATCH] Avoid touching user indexes while they are being (re)built. Existing ReindexIsProcessingIndex check is consulted only in genam.c and thus enforced only for system catalogs. Check it also in the planner, so that indexes which are currently being rebuilt are never used. Also cock SetReindexProcessing in index_create to defend from index self usage during its creation. Without this, VACUUM FULL or just CREATE INDEX might fail with something like ERROR: could not read block 3534 in file "base/41366676/56697497": read only 0 of 8192 bytes if there are indexes which usage can be considered during these very indexes (re)building, i.e. index expression scans indexed table. --- src/backend/catalog/index.c | 22 ++++++++++++++++++++-- src/backend/optimizer/util/plancat.c | 5 +++++ src/test/regress/expected/create_index.out | 12 ++++++++++++ src/test/regress/sql/create_index.sql | 13 +++++++++++++ 4 files changed, 50 insertions(+), 2 deletions(-) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 3e1d40662d..5bc764ce46 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -1174,7 +1174,22 @@ index_create(Relation heapRelation, } else { - index_build(heapRelation, indexRelation, indexInfo, false, true); + /* ensure SetReindexProcessing state isn't leaked */ + PG_TRY(); + { + /* Suppress use of the target index while building it */ + SetReindexProcessing(heapRelationId, indexRelationId); + + index_build(heapRelation, indexRelation, indexInfo, false, true); + } + PG_CATCH(); + { + /* Make sure flag gets cleared on error exit */ + ResetReindexProcessing(); + PG_RE_THROW(); + } + PG_END_TRY(); + ResetReindexProcessing(); } /* @@ -1379,7 +1394,10 @@ index_concurrently_build(Oid heapRelationId, indexInfo->ii_Concurrent = true; indexInfo->ii_BrokenHotChain = false; - /* Now build the index */ + /* + * Now build the index + * SetReindexProcessing is not required since indisvalid is false anyway + */ index_build(heapRel, indexRelation, indexInfo, false, true); /* Close both the relations, but keep the locks */ diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index cf1761401d..9d58cd2574 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -27,6 +27,7 @@ #include "access/xlog.h" #include "catalog/catalog.h" #include "catalog/dependency.h" +#include "catalog/index.h" #include "catalog/heap.h" #include "catalog/pg_am.h" #include "catalog/pg_proc.h" @@ -193,6 +194,10 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, nkeycolumns; int i; + /* Don't chase own tail */ + if (ReindexIsProcessingIndex(indexoid)) + continue; + /* * Extract info from the relation descriptor for the index. */ diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out index 324db1b6ae..1706964277 100644 --- a/src/test/regress/expected/create_index.out +++ b/src/test/regress/expected/create_index.out @@ -1323,6 +1323,18 @@ create unique index hash_f8_index_1 on hash_f8_heap(abs(random)); create unique index hash_f8_index_2 on hash_f8_heap((seqno + 1), random); create unique index hash_f8_index_3 on hash_f8_heap(random) where seqno > 1000; -- +-- Create an index which might consider using this very index during the build. +-- +-- primary key ensures relhasindex is set +CREATE TABLE pears(f1 int primary key, f2 int); +INSERT INTO pears SELECT i, i+1 FROM generate_series(1, 100) i; +CREATE FUNCTION pears_f(i int) RETURNS int LANGUAGE SQL IMMUTABLE AS $$ + SELECT f1 FROM pears WHERE pears.f2 = 42 +$$; +CREATE index ON pears ((pears_f(f1))); +DROP TABLE pears; +DROP FUNCTION pears_f; +-- -- Try some concurrent index builds -- -- Unfortunately this only tests about half the code paths because there are diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql index f96bebf410..76a781f6b0 100644 --- a/src/test/regress/sql/create_index.sql +++ b/src/test/regress/sql/create_index.sql @@ -446,6 +446,19 @@ create unique index hash_f8_index_2 on hash_f8_heap((seqno + 1), random); create unique index hash_f8_index_3 on hash_f8_heap(random) where seqno > 1000; -- +-- Create an index which might consider using this very index during the build. +-- +-- primary key ensures relhasindex is set +CREATE TABLE pears(f1 int primary key, f2 int); +INSERT INTO pears SELECT i, i+1 FROM generate_series(1, 100) i; +CREATE FUNCTION pears_f(i int) RETURNS int LANGUAGE SQL IMMUTABLE AS $$ + SELECT f1 FROM pears WHERE pears.f2 = 42 +$$; +CREATE index ON pears ((pears_f(f1))); +DROP TABLE pears; +DROP FUNCTION pears_f; + +-- -- Try some concurrent index builds -- -- Unfortunately this only tests about half the code paths because there are -- 2.11.0 -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Arseny Sher <a.sher@postgrespro.ru> writes: > A problem of similar nature can be reproduced with the following > stripped-down scenario: > CREATE TABLE pears(f1 int primary key, f2 int); > INSERT INTO pears SELECT i, i+1 FROM generate_series(1, 100) i; > CREATE OR REPLACE FUNCTION pears_f(i int) RETURNS int LANGUAGE SQL IMMUTABLE AS $$ > SELECT f1 FROM pears WHERE pears.f2 = 42 > $$; > CREATE index ON pears ((pears_f(f1))); We've seen complaints about this sort of thing before, and rejected them because, as you say, that function is NOT immutable. When you lie to the system like that, you should not be surprised if things break. > There is already a mechanism which prevents usage of indexes during > reindex -- ReindexIsProcessingIndex et al. However, to the contrary of > what index.c:3664 comment say, these protect only indexes on system > catalogs, not user tables: the only real caller is genam.c. > Attached patch extends it: the same check is added to > get_relation_info. Also SetReindexProcessing is cocked in index_create > to defend from index self usage during creation as in stripped example > above. There are some other still unprotected callers of index_build; > concurrent index creation doesn't need it because index is > 'not indisvalid' during the build, and in RelationTruncateIndexes > table is empty, so it looks like it can be omitted. I have exactly no faith that this fixes things enough to make such cases supportable. And I have no interest in opening that can of worms anyway. I'd rather put in some code to reject database accesses in immutable functions. > One might argue that function selecting from table can hardly be called > immutable, and immutability is required for index expressions. However, > if user is sure table contents doesn't change, why not? If the table contents never change, why are you doing VACUUM FULL on it? regards, tom lane
On Thu, Sep 12, 2019 at 11:08:28AM -0400, Tom Lane wrote: >Arseny Sher <a.sher@postgrespro.ru> writes: >> A problem of similar nature can be reproduced with the following >> stripped-down scenario: > >> CREATE TABLE pears(f1 int primary key, f2 int); >> INSERT INTO pears SELECT i, i+1 FROM generate_series(1, 100) i; >> CREATE OR REPLACE FUNCTION pears_f(i int) RETURNS int LANGUAGE SQL IMMUTABLE AS $$ >> SELECT f1 FROM pears WHERE pears.f2 = 42 >> $$; >> CREATE index ON pears ((pears_f(f1))); > >We've seen complaints about this sort of thing before, and rejected >them because, as you say, that function is NOT immutable. When you >lie to the system like that, you should not be surprised if things >break. > >> There is already a mechanism which prevents usage of indexes during >> reindex -- ReindexIsProcessingIndex et al. However, to the contrary of >> what index.c:3664 comment say, these protect only indexes on system >> catalogs, not user tables: the only real caller is genam.c. >> Attached patch extends it: the same check is added to >> get_relation_info. Also SetReindexProcessing is cocked in index_create >> to defend from index self usage during creation as in stripped example >> above. There are some other still unprotected callers of index_build; >> concurrent index creation doesn't need it because index is >> 'not indisvalid' during the build, and in RelationTruncateIndexes >> table is empty, so it looks like it can be omitted. > >I have exactly no faith that this fixes things enough to make such >cases supportable. And I have no interest in opening that can of >worms anyway. I'd rather put in some code to reject database >accesses in immutable functions. > Same here. My hunch is a non-trivaial fraction of applications using this "trick" is silently broken in various subtle ways. >> One might argue that function selecting from table can hardly be called >> immutable, and immutability is required for index expressions. However, >> if user is sure table contents doesn't change, why not? > >If the table contents never change, why are you doing VACUUM FULL on it? > It's possible the columns referenced by the index expression are not changing, but some additional columns are updated. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On Thu, Sep 12, 2019 at 11:08:28AM -0400, Tom Lane wrote: >>I have exactly no faith that this fixes things enough to make such >>cases supportable. And I have no interest in opening that can of >>worms anyway. I'd rather put in some code to reject database >>accesses in immutable functions. >> > > Same here. My hunch is a non-trivaial fraction of applications using > this "trick" is silently broken in various subtle ways. Ok, I see the point. However, "could not read block" error might seem quite scary to the users; it looks like data corruption. How about ERRORing out then in get_relation_info instead of skipping reindexing indexes, like in attached? Even if this doesn't cover all cases, at least one scenario observed in the field would have better error message. Rejecting database access completely in immutable functions would be unfortunate for our particular case, because this GIN index on expression joining the very indexed table multiple times (and using thus btree index) is, well, useful. Here is a brief description of the case. Indexed table stores postal addresses, which are of hierarchical nature (e.g. country-region-city-street-house). Single row is one element of any depth (e.g. region or house); each row stores link to its parent in parent_guid column, establishing thus the hierarchy (e.g. house has link to the street). The task it to get the full address by typing random parts of it (imagine typing hints in Google Maps). For that, FTS is used. GIN index is built on full addresses, and to get the full address table is climbed up about six times (hierarchy depth) by following parent_guid chain. We could materialize full addresses in the table and eliminate the need to form them in the index expression, but that would seriously increase amount of required storage -- GIN doesn't store indexed columns fully, and thus it is cheaper to 'materialize' full addresses inside it only. Surely this is a hack which cheats the system. We might imagine creating some functionality (kinda index referring to multiple rows of the table -- or even rows of different tables) making it unneccessary, but such functionality doesn't exist today, and the hack is useful, if you understand the risk. >>> One might argue that function selecting from table can hardly be called >>> immutable, and immutability is required for index expressions. However, >>> if user is sure table contents doesn't change, why not? >> >>If the table contents never change, why are you doing VACUUM FULL on it? >> > > It's possible the columns referenced by the index expression are not > changing, but some additional columns are updated. Yeah. Also table can be CLUSTERed without VACUUM FULL. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company From f5b9c433bf387a9ddbe318dfea2b96c02c4a945e Mon Sep 17 00:00:00 2001 From: Arseny Sher <sher-ars@yandex.ru> Date: Thu, 12 Sep 2019 17:35:16 +0300 Subject: [PATCH] ERROR out early on attempt to touch user indexes while they are being (re)built. Existing ReindexIsProcessingIndex check is consulted only in genam.c and thus enforced only for system catalogs. Check it also in the planner, so that indexes which are currently being rebuilt are never tried. Also cock SetReindexProcessing in index_create to defend from index self usage during its creation. Without this, VACUUM FULL or just CREATE INDEX might fail with something like ERROR: could not read block 3534 in file "base/41366676/56697497": read only 0 of 8192 bytes if there are indexes which usage can be considered during these very indexes (re)building, i.e. index expression scans indexed table. Such error might seem scary, so catch this earlier. --- src/backend/catalog/index.c | 22 ++++++++++++++++++++-- src/backend/optimizer/util/plancat.c | 9 +++++++++ src/test/regress/expected/create_index.out | 15 +++++++++++++++ src/test/regress/sql/create_index.sql | 13 +++++++++++++ 4 files changed, 57 insertions(+), 2 deletions(-) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 3e1d40662d..5bc764ce46 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -1174,7 +1174,22 @@ index_create(Relation heapRelation, } else { - index_build(heapRelation, indexRelation, indexInfo, false, true); + /* ensure SetReindexProcessing state isn't leaked */ + PG_TRY(); + { + /* Suppress use of the target index while building it */ + SetReindexProcessing(heapRelationId, indexRelationId); + + index_build(heapRelation, indexRelation, indexInfo, false, true); + } + PG_CATCH(); + { + /* Make sure flag gets cleared on error exit */ + ResetReindexProcessing(); + PG_RE_THROW(); + } + PG_END_TRY(); + ResetReindexProcessing(); } /* @@ -1379,7 +1394,10 @@ index_concurrently_build(Oid heapRelationId, indexInfo->ii_Concurrent = true; indexInfo->ii_BrokenHotChain = false; - /* Now build the index */ + /* + * Now build the index + * SetReindexProcessing is not required since indisvalid is false anyway + */ index_build(heapRel, indexRelation, indexInfo, false, true); /* Close both the relations, but keep the locks */ diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index cf1761401d..e42471cac8 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -27,6 +27,7 @@ #include "access/xlog.h" #include "catalog/catalog.h" #include "catalog/dependency.h" +#include "catalog/index.h" #include "catalog/heap.h" #include "catalog/pg_am.h" #include "catalog/pg_proc.h" @@ -199,6 +200,14 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, indexRelation = index_open(indexoid, lmode); index = indexRelation->rd_index; + if (ReindexIsProcessingIndex(indexoid)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("cannot use index %s which is currently being reindexed", + RelationGetRelationName(indexRelation)), + errhint("Probably index expressions include functions " + "accessing indexed table itself, but they must be immutable."))); + /* * Ignore invalid indexes, since they can't safely be used for * queries. Note that this is OK because the data structure we diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out index 324db1b6ae..f9aeb97ed8 100644 --- a/src/test/regress/expected/create_index.out +++ b/src/test/regress/expected/create_index.out @@ -1323,6 +1323,21 @@ create unique index hash_f8_index_1 on hash_f8_heap(abs(random)); create unique index hash_f8_index_2 on hash_f8_heap((seqno + 1), random); create unique index hash_f8_index_3 on hash_f8_heap(random) where seqno > 1000; -- +-- Create an index which might consider using this very index during the build. +-- +-- primary key ensures relhasindex is set +CREATE TABLE pears(f1 int primary key, f2 int); +INSERT INTO pears SELECT i, i+1 FROM generate_series(1, 100) i; +CREATE FUNCTION pears_f(i int) RETURNS int LANGUAGE SQL IMMUTABLE AS $$ + SELECT f1 FROM pears WHERE pears.f2 = 42 +$$; +CREATE index ON pears ((pears_f(f1))); +ERROR: cannot use index pears_pears_f_idx which is currently being reindexed +HINT: Probably index expressions include functions accessing indexed table itself, but they must be immutable. +CONTEXT: SQL function "pears_f" during startup +DROP TABLE pears; +DROP FUNCTION pears_f; +-- -- Try some concurrent index builds -- -- Unfortunately this only tests about half the code paths because there are diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql index f96bebf410..76a781f6b0 100644 --- a/src/test/regress/sql/create_index.sql +++ b/src/test/regress/sql/create_index.sql @@ -446,6 +446,19 @@ create unique index hash_f8_index_2 on hash_f8_heap((seqno + 1), random); create unique index hash_f8_index_3 on hash_f8_heap(random) where seqno > 1000; -- +-- Create an index which might consider using this very index during the build. +-- +-- primary key ensures relhasindex is set +CREATE TABLE pears(f1 int primary key, f2 int); +INSERT INTO pears SELECT i, i+1 FROM generate_series(1, 100) i; +CREATE FUNCTION pears_f(i int) RETURNS int LANGUAGE SQL IMMUTABLE AS $$ + SELECT f1 FROM pears WHERE pears.f2 = 42 +$$; +CREATE index ON pears ((pears_f(f1))); +DROP TABLE pears; +DROP FUNCTION pears_f; + +-- -- Try some concurrent index builds -- -- Unfortunately this only tests about half the code paths because there are -- 2.11.0