Thread: REINDEX CONCURRENTLY causes ALTER TABLE to fail
Hi everyone, Consider the following statement sequence: CREATE TABLE t0(c0 INTEGER , c1 BOOLEAN); INSERT INTO t0(c0, c1) VALUES(1369652450, FALSE), (414515746, TRUE), (897778963, FALSE); CREATE UNIQUE INDEX i0 ON t0((1 / t0.c0)) WHERE ('-H') >= (t0.c1::TEXT) COLLATE "C"; REINDEX TABLE CONCURRENTLY t0; ALTER TABLE t0 ALTER c1 TYPE TEXT; -- could not create unique index "i0" DETAIL: Key ((1 / c0))=(0) is duplicated. The REINDEX TABLE CONCURRENTLY causes the ALTER TABLE to fail, which is unexpected. Without the CONCURRENTLY, the ALTER TABLE works as expected. When first executing a concurrent reindex, which is followed by a non-current one, the error still occurs, which might suggest that the concurrent reindex breaks some internal state. I tested this on trunk. On the latest release version, the CONCURRENTLY option is not available.
On Wed, Jul 17, 2019 at 02:53:54PM +0200, Manuel Rigger wrote: > The REINDEX TABLE CONCURRENTLY causes the ALTER TABLE to fail, which > is unexpected. Without the CONCURRENTLY, the ALTER TABLE works as > expected. When first executing a concurrent reindex, which is followed > by a non-current one, the error still occurs, which might suggest that > the concurrent reindex breaks some internal state. Thanks. The issue comes from the handling of the collation part within the index predicates in pg_index.indpred. I can see a slight difference in the tree with a switch from COLLATE to REFLABEL for example. In your test case, if you actually remove the collation part, this fails similarly: =# CREATE UNIQUE INDEX i0 ON t0((1 / t0.c0)) WHERE ('-H') >= (t0.c1::TEXT); ERROR: 23505: could not create unique index "i0" DETAIL: Key ((1 / c0))=(0) is duplicated. The root of the problem comes from the fact that we rely on BuildIndexInfo() for the index information. This uses RelationGetIndexExpressions() to fetch the set of expressions and RelationGetIndexPredicate() for the predicates. However, the issue is that those routines apply some extra flattening operations for the planner when storing this data in the relation cache, causing the expressions stored in the catalogs for the copy to be changed. So what we store are the expressions that the planner uses. No wonder that this gets broken. There are a couple of approaches that we could do to fix that. The first one I could think about is to change the relcache level so as we don't apply planner-level optimizations when creating a concurrent index entry. Another one, which is less invasive, is to just update the list of expressions and predicates after calling BuildIndexInfo(). Still that means overriding the contents of the relcache with what has been optimized for the planner to what we want to use for the reindex build and I think that this weakens the logic of index_build. -- Michael
Attachment
On Thu, Jul 18, 2019 at 12:02:32PM +0900, Michael Paquier wrote: > There are a couple of approaches that we could do to fix that. The > first one I could think about is to change the relcache level so as we > don't apply planner-level optimizations when creating a concurrent > index entry. Another one, which is less invasive, is to just update > the list of expressions and predicates after calling > BuildIndexInfo(). Still that means overriding the contents of the > relcache with what has been optimized for the planner to what we want > to use for the reindex build and I think that this weakens the logic > of index_build. I have done some work on that, and for now I am finishing with the attached. This takes the approach of updating BuildIndexInfo() so as the optimizations for the planner are not done. That's not completely right yet though as even if it passes your initial test case, this does not allow the index expressions and predicates to have an exact match. I have designed a test case for this purpose which stores the contents of pg_node_tree before and after the REINDEX and even if the collation gets right, all the location fields from pg_node_tree get reset. The solution is not complete, still the test case is quite useful. So even if the solution happens to not be like the attached, I'd rather keep the test case and perhaps extend it. -- Michael
Attachment
Manuel Rigger <rigger.manuel@gmail.com> writes: > Consider the following statement sequence: > CREATE TABLE t0(c0 INTEGER , c1 BOOLEAN); > INSERT INTO t0(c0, c1) VALUES(1369652450, FALSE), (414515746, TRUE), > (897778963, FALSE); > CREATE UNIQUE INDEX i0 ON t0((1 / t0.c0)) WHERE ('-H') >= > (t0.c1::TEXT) COLLATE "C"; > REINDEX TABLE CONCURRENTLY t0; > ALTER TABLE t0 ALTER c1 TYPE TEXT; -- could not create unique index > "i0" DETAIL: Key ((1 / c0))=(0) is duplicated. > The REINDEX TABLE CONCURRENTLY causes the ALTER TABLE to fail, which > is unexpected. BTW, a note for anybody trying to follow along at home --- this example does not reproduce if the database's default collation is "C". regards, tom lane
Michael Paquier <michael@paquier.xyz> writes: > I have done some work on that, and for now I am finishing with the > attached. This takes the approach of updating BuildIndexInfo() so as > the optimizations for the planner are not done. This seems like a really bad idea. As you have it, whether the relcache contains optimized or unoptimized expressions will be basically chance, depending on which sort of call happened first after the last cache flush. That will break things, because for most use-cases it's *not* optional to apply eval_const_expressions. Even without that, changing the APIs of the functions you've touched here seems pretty invasive. I think the correct fix is to nuke the code in REINDEX CONCURRENTLY that thinks it should use any of this infrastructure to fetch the existing index properties, and just have it read the catalog directly instead. regards, tom lane
On Mon, Jul 22, 2019 at 05:41:26PM -0400, Tom Lane wrote: > I think the correct fix is to nuke the code in REINDEX CONCURRENTLY that > thinks it should use any of this infrastructure to fetch the existing > index properties, and just have it read the catalog directly instead. Okay. Sharing a maximum the build of IndexInfo done as part of DefineIndex is something that we should do then. Attached is a patch that I have been working on, introducing a new makeIndexInfo() in makefuncs.c. Still as IndexInfo is not a parse node but an exec node, I am including execnodes.h in makefuncs.h for now (perhaps somebody has a better idea?). This fills the index information of the new entry directly from the catalog of the old entry for expressions and predicates. I have let BuildIndexInfo out of that as ii_ReadyForInserts can change for system catalogs during bootstrap. I have also reworked the test case to just use pg_get_indexdef() which is enough to show the original bug as after the initial REINDEX CONCURRENTLY the collation part goes missing when testing on HEAD. Predicates need to properly be formatted with an implicit-AND format by the way. Thoughts? -- Michael
Attachment
On Tue, Jul 23, 2019 at 01:28:38PM +0900, Michael Paquier wrote: > Okay. Sharing a maximum the build of IndexInfo done as part of > DefineIndex is something that we should do then. Attached is a patch > that I have been working on, introducing a new makeIndexInfo() in > makefuncs.c. Still as IndexInfo is not a parse node but an exec node, > I am including execnodes.h in makefuncs.h for now (perhaps somebody > has a better idea?). This fills the index information of the new > entry directly from the catalog of the old entry for expressions and > predicates. I have been looking at this code today and extended the test cases with more expression and predicate indexes to stress more the patch, and it happens that I am still stuck for with makeIndexInfo(). The header comments of makefuncs.h and makefuncs.c mention primitive nodes, so I would need at least to update that part to mention execution nodes. Still I'd rather not have people scream at me as that could be considered invasive in terms of included dependencies for makefuncs.c :) If we don't have the makeIndexInfo wrapper, please note that we finish by filling in three times IndexInfo with roughly the same data in different places, so I would like to keep it. So, are there any opinions or ideas about that? -- Michael
Attachment
On Thu, Jul 25, 2019 at 07:31:13PM +0900, Michael Paquier wrote: > I have been looking at this code today and extended the test cases > with more expression and predicate indexes to stress more the patch, > and it happens that I am still stuck for with makeIndexInfo(). The > header comments of makefuncs.h and makefuncs.c mention primitive > nodes, so I would need at least to update that part to mention > execution nodes. Still I'd rather not have people scream at me as > that could be considered invasive in terms of included dependencies > for makefuncs.c :) After an extra round of polishing, I am finishing with the attached and the following change for makefuncs.h: * makefuncs.h - * prototypes for the creator functions (for primitive nodes) + * prototypes for the creator functions for various nodes Any objections with this version? Please note that the predicates and expressions are fetched directly from the catalogs, and that we does not rely on the relcache anymore. -- Michael
Attachment
On Fri, Jul 26, 2019 at 09:34:13AM +0900, Michael Paquier wrote: > Any objections with this version? Please note that the predicates and > expressions are fetched directly from the catalogs, and that we does > not rely on the relcache anymore. Hearing nothing, applied and back-patched down to v12. Thanks Manuel for the report! -- Michael
Attachment
Thanks a lot for fixing this, Michael! Best, Manuel On Mon, Jul 29, 2019 at 3:10 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Jul 26, 2019 at 09:34:13AM +0900, Michael Paquier wrote: > > Any objections with this version? Please note that the predicates and > > expressions are fetched directly from the catalogs, and that we does > > not rely on the relcache anymore. > > Hearing nothing, applied and back-patched down to v12. Thanks Manuel > for the report! > -- > Michael