Re: REINDEX CONCURRENTLY unexpectedly fails - Mailing list pgsql-bugs
From | Michael Paquier |
---|---|
Subject | Re: REINDEX CONCURRENTLY unexpectedly fails |
Date | |
Msg-id | 20191120035408.GA4243@paquier.xyz Whole thread Raw |
In response to | Re: REINDEX CONCURRENTLY unexpectedly fails (Andres Freund <andres@anarazel.de>) |
Responses |
Re: REINDEX CONCURRENTLY unexpectedly fails
Re: REINDEX CONCURRENTLY unexpectedly fails Re: REINDEX CONCURRENTLY unexpectedly fails |
List | pgsql-bugs |
On Tue, Nov 19, 2019 at 05:36:49PM -0800, Andres Freund wrote: > On 2019-11-15 17:07:06 +0900, Michael Paquier wrote: >> So, here is a patch to address all that. I have also added a test for >> REINDEX SCHEMA on a temporary schema. While looking at the problem, I >> have found a crash if trying to reindex concurrently an index or a >> table using a temporary relation from a different session because we >> have been lacking checks with RELATION_IS_OTHER_TEMP() in the >> concurrent code paths. Thanks for providing comments. The next set of minor releases is in February, so we have some room until that. For now I'll just add this patch to the next CF as a bug fix to keeo track of it. > Probably worth fixing separately? There is already a check for RELATION_IS_OTHER_TEMP() in the non-concurrent reindex path, so by forcing temp relations to take this path then the issue gets fixed automatically, with the error message from reindex_index() generated. >> Please note that I have not changed index_drop() for the concurrent >> mode. Per my checks this does not actually cause any direct bugs as >> this code path just takes care of dropping the dependencies with the >> index. There could be an argument for changing that on HEAD though, >> but I am not sure that this is worth the complication either. > > I'm extremely doubtful this works correctly. E.g., what prevents the > tids for index_concurrently_set_dead() from being recycled and pointing > to a different row after one of the internal transactions? ON COMMIT DELETE ROWS does a physical truncation of the relation files. And as DROP INDEX CONCURRENTLY cannot be run in a transaction block, you would never face a case where you have no past TIDs which could be referred to when setting the index as invalid. Now I don't actually object to enforce the non-concurrent path in index_drop() for *all* temporary relations. Anyway, that makes sense in itself on performance grounds, similarly to the create path, so did that by enforcing the flag in index_drop() (doDeletion would be tempting but I took the problem at its root). And added some tests for the drop path and an extra assertion. > I don't think "ignore concurrent index creation" is a good description - > they're still created. I'd rephrase it as "For temporary tables build > index non-concurrently", or something in that vein. > > I think we also need to mention somewhere that it's actually crucial to > ignore them, as otherwise we'd run into problems with on commit truncate > / drop. Sure. I have expanded the comment section on that. > Probably worthwhile to centralize check and comments into one place, to > avoid duplication / them diverging. Perhaps something like > IndexCreationSupportsConcurrent() or IndexCreationForceNonConcurrent()? Good idea, done that. It reduces the explanations of the patch to be in a single location. >> @@ -2771,6 +2797,9 @@ ReindexRelationConcurrently(Oid relationOid, int options) >> /* Open relation to get its indexes */ >> heapRelation = table_open(relationOid, ShareUpdateExclusiveLock); >> >> + /* Temporary tables are not processed concurrently */ >> + Assert(heapRelation->rd_rel->relpersistence != RELPERSISTENCE_TEMP); > > s/are not/can not/? Okay. >> +-- Temporary tables with concurrent builds >> +CREATE TEMP TABLE concur_temp (f1 int, f2 text); >> +INSERT INTO concur_temp VALUES (1, 'foo'), (2, 'bar'); >> +CREATE INDEX CONCURRENTLY concur_temp_ind ON concur_temp(f1); >> +DROP TABLE concur_temp; >> +-- On-commit actions >> +CREATE TEMP TABLE concur_temp (f1 int, f2 text) >> + ON COMMIT DELETE ROWS; >> +INSERT INTO concur_temp VALUES (1, 'foo'), (2, 'bar'); >> +CREATE INDEX CONCURRENTLY concur_temp_ind ON concur_temp(f1); >> +DROP TABLE concur_temp; >> -- > > I'd also add a test for ON COMMIT DROP. Considered that, but ON COMMIT DROP does not make sense because it requires a transaction context, which is why I did not add one. And it seems to me that there is not much value to just check after CIC or REINDEX's restriction to not run in a transaction block? I added tests for these two, but I am of the opinion that they don't bring much. >> diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml >> index 10881ab03a..e5d2b1a06e 100644 >> --- a/doc/src/sgml/ref/reindex.sgml >> +++ b/doc/src/sgml/ref/reindex.sgml >> @@ -162,6 +162,9 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR >> — see <xref linkend="sql-reindex-concurrently" >> endterm="sql-reindex-concurrently-title"/>. >> </para> >> + <para> >> + This option is ignored for temporary relations. >> + </para> >> </listitem> >> </varlistentry> >> > > I think either this needs to be more verbose, explaining that there's no > harm from the option being ignored, or it should be ignored as an > implementation detail. I think that documenting it is good for the end-user as well. Just attempted something in the updated version for all the sections of the docs involved. -- Michael
Attachment
pgsql-bugs by date: