Thread: REINDEX CONCURRENTLY 2.0
Hi all,

Please find attached updated patches for the support of REINDEX CONCURRENTLY, renamed 2.0 for the occasion:
- 20131114_1_index_drop_comments.patch, a patch that updates a couple of comments in index_drop. It has not been committed yet, but it should be IMO...
- 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch providing a single API that can be used to wait for old snapshots
- 20131114_3_reindex_concurrently.patch, providing the core feature. Patch 3 needs to have patch 2 applied first. Regression tests, isolation tests and documentation are included with the patch.

This is the continuation of the previous thread that finished here: http://www.postgresql.org/message-id/CAB7nPqS+WYN021oQHd9GPe_5dSVcVXMvEBW_E2AV9OOEwggMHw@mail.gmail.com

This patch has been added to this commit fest.

Regards,
--
Michael
Attachment
On 11/14/13, 9:40 PM, Michael Paquier wrote: > Hi all, > > Please find attached updated patches for the support of REINDEX > CONCURRENTLY, renamed 2.0 for the occasion: > - 20131114_1_index_drop_comments.patch, patch that updates some > comments in index_drop. This updates only a couple of comments in > index_drop but has not been committed yet. It should be IMO... > - 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch > providing a single API that can be used to wait for old snapshots > - 20131114_3_reindex_concurrently.patch, providing the core feature. > Patch 3 needs to have patch 2 applied first. Regression tests, > isolation tests and documentation are included with the patch. The third patch needs to be rebased, because of a conflict in index.c.
Hi, On 2013-11-15 11:40:17 +0900, Michael Paquier wrote: > - 20131114_3_reindex_concurrently.patch, providing the core feature. > Patch 3 needs to have patch 2 applied first. Regression tests, > isolation tests and documentation are included with the patch. Have you addressed my concurrency concerns from the last version? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Nov 16, 2013 at 5:09 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2013-11-15 11:40:17 +0900, Michael Paquier wrote: >> - 20131114_3_reindex_concurrently.patch, providing the core feature. >> Patch 3 needs to have patch 2 applied first. Regression tests, >> isolation tests and documentation are included with the patch. > > Have you addressed my concurrency concerns from the last version? I have added documentation in the patch with a better explanation about why those choices of implementation are made. Thanks, -- Michael
Attachment
On 2013-11-18 19:52:29 +0900, Michael Paquier wrote:
> On Sat, Nov 16, 2013 at 5:09 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:
> >> - 20131114_3_reindex_concurrently.patch, providing the core feature.
> >> Patch 3 needs to have patch 2 applied first. Regression tests,
> >> isolation tests and documentation are included with the patch.
> >
> > Have you addressed my concurrency concerns from the last version?
> I have added documentation in the patch with a better explanation
> about why those choices of implementation are made.

The dropping still isn't safe: After phase 4 we are in the state:
old index: valid, live, !isdead
new index: !valid, live, !isdead

Then you do an index_concurrent_set_dead() from that state on in phase 5. There you do WaitForLockers(locktag, AccessExclusiveLock) before index_set_state_flags(INDEX_DROP_SET_DEAD). That's not sufficient. Consider what happens with the following sequence:

1) WaitForLockers(locktag, AccessExclusiveLock) -> GetLockConflicts() => virtualxact 1 -> VirtualXactLock(1)
2) virtualxact 2 starts, opens the *old* index since it's currently the only valid one.
3) virtualxact 1 finishes
4) index_concurrent_set_dead() does index_set_state_flags(DROP_SET_DEAD)
5) another transaction (vxid 3) starts inserting data into the relation, updates only the new index, the old index is dead
6) vxid 2 inserts data, updates only the old index. Since it had the index already open the cache invalidations won't be processed.

Now the indexes are out of sync: there are entries only in the old index and entries only in the new index. Not good.

I hate to repeat myself, but you really need to follow the current protocol for concurrently dropping indexes. Which involves *first* marking the index as invalid so it won't be used for querying anymore, then waiting for everyone possibly still seeing that entry to finish, and only *after* that marking the index as dead.
You cannot argue away correctness concerns with potential deadlocks. c.f. http://www.postgresql.org/message-id/20130926103400.GA2471420@alap2.anarazel.de

I am also still unconvinced that the logic in index_concurrent_swap() is correct. It very much needs to explain why no backend can see values that are inconsistent. E.g. what prevents a backend thinking the old and new indexes have the same relfilenode? MVCC snapshots don't seem to protect you against that. I am not sure there's a problem, but there certainly need to be more comments explaining why there are none. Something like the following might be possible:

Backend 1: start reindex concurrently, till phase 4
Backend 2: ExecOpenIndices() -> RelationGetIndexList (that list is consistent due to mvcc snapshots!)
Backend 2: -> index_open(old_index) (old relfilenode)
Backend 1: index_concurrent_swap() -> CommitTransaction() -> ProcArrayEndTransaction() (changes visible to others henceforth!)
Backend 2: -> index_open(new_index) => no cache invalidations yet, gets the old relfilenode
Backend 2: ExecInsertIndexTuples() => updates the same relation twice, corrupt!
Backend 1: still in CommitTransaction() -> AtEOXact_Inval() sends out invalidations

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
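The interleaving Andres describes can be made concrete with a toy sequential model (illustrative Python only; the class and method names are invented, not PostgreSQL APIs): a backend that opened the old index before it was marked dead keeps updating it from its stale cache, while a later backend updates only the new index, so the two silently diverge.

```python
class Index:
    def __init__(self, name):
        self.name = name
        self.dead = False
        self.entries = []

class Backend:
    """Caches its index list at open time; it never reprocesses
    invalidations while the indexes stay open (the hazard above)."""
    def __init__(self, indexes):
        self.open_indexes = list(indexes)

    def insert(self, value):
        for ix in self.open_indexes:
            ix.entries.append(value)

old, new = Index("old"), Index("new")

# 2) vxid 2 starts and opens the *old* index (the only valid one at the time)
vxid2 = Backend([old])
# 4) the old index is marked dead -- but vxid 2 never sees the invalidation
old.dead = True
# 5) vxid 3 starts afterwards and sees only the live new index
vxid3 = Backend([new])
vxid3.insert("row-a")
# 6) vxid 2 inserts using its stale cached list: only the dead old index
vxid2.insert("row-b")

# The two indexes have diverged: each holds a row the other lacks.
print(old.entries, new.entries)  # -> ['row-b'] ['row-a']
```

Marking the index invalid first and waiting before DROP_SET_DEAD, as the existing concurrent-drop protocol does, closes exactly this window.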
Michael Paquier escribió: > Hi all, > > Please find attached updated patches for the support of REINDEX > CONCURRENTLY, renamed 2.0 for the occasion: > - 20131114_1_index_drop_comments.patch, patch that updates some > comments in index_drop. This updates only a couple of comments in > index_drop but has not been committed yet. It should be IMO... Pushed this one, thanks. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Sorry for the lateness of this... On 11/14/13, 8:40 PM, Michael Paquier wrote: > + /* > + * Phase 4 of REINDEX CONCURRENTLY > + * > + * Now that the concurrent indexes have been validated could be used, > + * we need to swap each concurrent index with its corresponding old index. > + * Note that the concurrent index used for swaping is not marked as valid > + * because we need to keep the former index and the concurrent index with > + * a different valid status to avoid an implosion in the number of indexes > + * a parent relation could have if this operation fails multiple times in > + * a row due to a reason or another. Note that we already know thanks to > + * validation step that > + */ > + Was there supposed to be more to that comment? In the loop right below it... + /* Swap the indexes and mark the indexes that have the old data as invalid */ + forboth(lc, indexIds, lc2, concurrentIndexIds) ... + CacheInvalidateRelcacheByRelid(relOid); Do we actually need to invalidate the cache on each case? Is it because we're grabbing a new transaction each time through? -- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Hi, Thanks for your comments. On Fri, Jan 10, 2014 at 9:59 AM, Jim Nasby <jim@nasby.net> wrote: > Sorry for the lateness of this... > > On 11/14/13, 8:40 PM, Michael Paquier wrote: >> >> + /* >> + * Phase 4 of REINDEX CONCURRENTLY >> + * >> + * Now that the concurrent indexes have been validated could be >> used, >> + * we need to swap each concurrent index with its corresponding >> old index. >> + * Note that the concurrent index used for swaping is not marked >> as valid >> + * because we need to keep the former index and the concurrent >> index with >> + * a different valid status to avoid an implosion in the number of >> indexes >> + * a parent relation could have if this operation fails multiple >> times in >> + * a row due to a reason or another. Note that we already know >> thanks to >> + * validation step that >> + */ >> + > > > Was there supposed to be more to that comment? Not really, it seems that this chunk came out after writing multiple successive versions of this patch. > In the loop right below it... > > + /* Swap the indexes and mark the indexes that have the old data as > invalid */ > + forboth(lc, indexIds, lc2, concurrentIndexIds) > ... > + CacheInvalidateRelcacheByRelid(relOid); > > Do we actually need to invalidate the cache on each case? Is it because > we're grabbing a new transaction each time through? This is to force a refresh of the cached plans that have been using the old index before transaction of step 4 began. I have realigned this patch with latest head (d2458e3)... In case someone is interested at some point... Regards, -- Michael
Attachment
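Michael's point about forcing a refresh of cached plans can be sketched with a toy plan cache (a hedged illustration only; PostgreSQL's real plancache is driven by shared invalidation messages, and CacheInvalidateRelcacheByRelid is modeled here as a bare counter bump): without the invalidation after the swap, the plan built against the old index keeps being reused.

```python
class PlanCache:
    """Cached plans stay usable until an invalidation arrives
    for the underlying relation."""
    def __init__(self):
        self.plans = {}        # relid -> (plan, generation it was built at)
        self.generation = {}   # relid -> invalidation counter

    def invalidate(self, relid):
        # stand-in for CacheInvalidateRelcacheByRelid()
        self.generation[relid] = self.generation.get(relid, 0) + 1

    def get_plan(self, relid, planner):
        gen = self.generation.get(relid, 0)
        cached = self.plans.get(relid)
        if cached is None or cached[1] != gen:
            cached = (planner(), gen)          # replan after invalidation
            self.plans[relid] = cached
        return cached[0]

cache = PlanCache()
rel = 16384                                    # made-up table OID
p1 = cache.get_plan(rel, lambda: "scan via old index")
p2 = cache.get_plan(rel, lambda: "scan via new index")  # reused: no inval yet
cache.invalidate(rel)                          # sent after the relfilenode swap
p3 = cache.get_plan(rel, lambda: "scan via new index")
print(p1, "|", p2, "|", p3)
```

Note that p2 still returns the stale plan even though the planner would now pick the new index; only the explicit invalidation forces the replan.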
On Tue, Jan 21, 2014 at 10:12 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > I have realigned this patch with latest head (d2458e3)... In case > someone is interested at some point... Attached is a patch for REINDEX CONCURRENTLY rebased on HEAD (d7938a4), as some people are showing interest in it by reading recent discussions. Patch compiles and passes regression as well as isolation tests.. -- Michael
Attachment
On 2014-08-27 11:00:56 +0900, Michael Paquier wrote: > On Tue, Jan 21, 2014 at 10:12 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: > > I have realigned this patch with latest head (d2458e3)... In case > > someone is interested at some point... > Attached is a patch for REINDEX CONCURRENTLY rebased on HEAD > (d7938a4), as some people are showing interest in it by reading recent > discussions. Patch compiles and passes regression as well as isolation > tests.. Can you add it to the next CF? I'll try to look earlier, but can't promise anything. I very much would like this to get committed in some form or another. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com> wrote: > Can you add it to the next CF? I'll try to look earlier, but can't > promise anything. > > I very much would like this to get committed in some form or another. Added it here to keep track of it: https://commitfest.postgresql.org/action/patch_view?id=1563 Regards, -- Michael
On Wed, Aug 27, 2014 at 3:53 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Can you add it to the next CF? I'll try to look earlier, but can't
> > promise anything.
> >
> > I very much would like this to get committed in some form or another.
> Added it here to keep track of it:
> https://commitfest.postgresql.org/action/patch_view?id=1563
Attached is a fairly-refreshed patch that should be used as a base for the next commit fest. The following changes should be noticed:
- Use of AccessExclusiveLock when swapping relfilenodes of an index and its concurrent entry instead of ShareUpdateExclusiveLock for safety. At the limit of my understanding, that's the consensus reached until now.
- Cleanup of many comments and refresh of the documentation that was somewhat wrongly-formulated or shaped at some places
- Addition of support for autocommit off in psql for REINDEX [ TABLE | INDEX ] CONCURRENTLY
- Some more code cleanup..
I haven't been through the tab completion support for psql but looking at tab-complete.c this seems a bit tricky with the stuff related to CREATE INDEX CONCURRENTLY already present. Nothing huge though.
Regards,
Michael
Attachment
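The lock-level change described above (AccessExclusiveLock instead of ShareUpdateExclusiveLock for the swap) can be checked against the standard lock conflict table; the snippet below is a partial transcription of the matrix from the PostgreSQL documentation. The point: ShareUpdateExclusiveLock does not conflict with the AccessShareLock that plain readers hold on an index, so it cannot keep them out while the relfilenodes change underneath; AccessExclusiveLock can.

```python
# Partial lock-conflict table, per the PostgreSQL documentation.
CONFLICTS = {
    "AccessShareLock": {"AccessExclusiveLock"},
    "ShareUpdateExclusiveLock": {
        "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
    "AccessExclusiveLock": {
        "AccessShareLock", "RowShareLock", "RowExclusiveLock",
        "ShareUpdateExclusiveLock", "ShareLock", "ShareRowExclusiveLock",
        "ExclusiveLock", "AccessExclusiveLock",
    },
}

def conflicts(a, b):
    """Symmetric conflict check between two lock modes."""
    return b in CONFLICTS.get(a, set()) or a in CONFLICTS.get(b, set())

# A plain SELECT takes AccessShareLock on the indexes it uses.
# ShareUpdateExclusiveLock would NOT keep such a reader out during the swap:
print(conflicts("AccessShareLock", "ShareUpdateExclusiveLock"))  # False
# AccessExclusiveLock does:
print(conflicts("AccessShareLock", "AccessExclusiveLock"))       # True
```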
On 10/1/14, 2:00 AM, Michael Paquier wrote:
> On Wed, Aug 27, 2014 at 3:53 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> > On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Can you add it to the next CF? I'll try to look earlier, but can't
> > promise anything.
> >
> > I very much would like this to get committed in some form or another.
> Added it here to keep track of it:
> https://commitfest.postgresql.org/action/patch_view?id=1563
>
> Attached is a fairly-refreshed patch that should be used as a base for the next commit fest. The following changes should be noticed:
> - Use of AccessExclusiveLock when swapping relfilenodes of an index and its concurrent entry instead of ShareUpdateExclusiveLock for safety. At the limit of my understanding, that's the consensus reached until now.
> - Cleanup of many comments and refresh of the documentation that was somewhat wrongly-formulated or shaped at some places
> - Addition of support for autocommit off in psql for REINDEX [ TABLE | INDEX ] CONCURRENTLY
> - Some more code cleanup..
> I haven't been through the tab completion support for psql but looking at tab-complete.c this seems a bit tricky with the stuff related to CREATE INDEX CONCURRENTLY already present. Nothing huge though.

Patch applies against current HEAD and builds, but I'm getting 37 failed tests (mostly parallel, but also misc and WITH; results attached). Is that expected?

+ <para>
+ In a concurrent index build, a new index whose storage will replace the one
+ to be rebuild is actually entered into the system catalogs in one
+ transaction, then two table scans occur in two more transactions. Once this
+ is performed, the old and fresh indexes are swapped by taking a lock
+ <literal>ACCESS EXCLUSIVE</>. Finally two additional transactions
+ are used to mark the concurrent index as not ready and then drop it.
+ </para>

The "mark the concurrent index" bit is rather confusing; it sounds like it's referring to the new index instead of the old. Now that I've read the code I understand what's going on here between the concurrent index *entry* and the filenode swap, but I don't think the docs make this sufficiently clear to users.

How about something like this:

The following steps occur in a concurrent index build, each in a separate transaction. Note that if there are multiple indexes to be rebuilt then each step loops through all the indexes we're rebuilding, using a separate transaction for each one.

1. A new "temporary" index definition is entered into the catalog. This definition is only used to build the new index, and will be removed at the completion of the process.
2. A first pass index build is done.
3. A second pass is performed to add tuples that were added while the first pass build was running.
4. pg_class.relfilenode for the existing index definition and the "temporary" definition are swapped. This means that the existing index definition now uses the index data that we stored during the build, and the "temporary" definition is using the old index data.
5. The "temporary" index definition is marked as dead.
6. The "temporary" index definition and its data (which is now the data for the old index) are dropped.

+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into catalogs and needs to be built later
+ * on. This is called during concurrent index processing. The heap relation
+ * on which is based the index needs to be closed by the caller.

Last bit presumably should be "on which the index is based".

+ /* Build the list of column names, necessary for index_create */

Instead of all this work wouldn't it be easier to create a version of index_create/ConstructTupleDescriptor that will use the IndexInfo for the old index?
ISTM index_concurrent_create() is doing a heck of a lot of work to marshal data into one form just to have it get marshaled yet again. Worst case, if we do have to play this game, there should be a stand-alone function to get the columns/expressions for an existing index; you're duplicating a lot of code from pg_get_indexdef_worker().

index_concurrent_swap(): Perhaps it'd be better to create index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and refactor the duplicated code out... the actual function would then become:

ReindexRelationConcurrently()

+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each step of REINDEX
+ * CONCURRENTLY is done in parallel with all the table's indexes as well as
+ * its dependent toast indexes.

This comment is a bit misleading; we're not actually doing anything in parallel, right? AFAICT index_concurrent_build is going to block while each index is built the first time.

+ * relkind is an index, this index itself will be rebuilt. The locks taken
+ * parent relations and involved indexes are kept until this transaction
+ * is committed to protect against schema changes that might occur until
+ * the session lock is taken on each relation.

This comment is a bit unclear to me... at minimum I think it should be "* on parent relations" instead of "* parent relations", but I think it needs to elaborate on why/when we're also taking session level locks.

I also wordsmithed this comment a bit...

* Here begins the process for concurrently rebuilding the index entries.
* We need first to create an index which is based on the same data
* as the former index except that it will be only registered in catalogs
* and will be built later. It is possible to perform all the operations
* on all the indexes at the same time for a parent relation including
* indexes for its toast relation.

and this one...
* During this phase the concurrent indexes catch up with any new tuples that
* were created during the previous phase.
*
* We once again wait until no transaction can have the table open with
* the index marked as read-only for updates. Each index validation is done
* in a separate transaction to minimize how long we hold an open transaction.

+ * a different valid status to avoid an implosion in the number of indexes
+ * a parent relation could have if this operation step fails multiple times
+ * in a row due to a reason or another.

I'd change that to "explosion in the number of indexes a parent relation could have if this operation fails."

Phase 4, 5 and 6 are rather confusing if you don't understand that each "concurrent index" entry is meant to be thrown away. I think the Phase 4 comment should elaborate on that.

The comment in check_exclusion_constraint() is good; shouldn't the related comment on this line in index_create() mention that check_exclusion_constraint() needs to be changed if we ever support concurrent builds of exclusion indexes?

if (concurrent && is_exclusion && !is_reindex)

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Attachment
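Jim's six steps can be restated as a small state walkthrough (a sketch only; the dictionary fields loosely mirror pg_class/pg_index, and the `idx_cct` name plus all OID/relfilenode values are made up). The key property is that the user-visible index keeps its OID, so dependencies on it survive, while only its relfilenode changes.

```python
# Toy catalog: only the fields that matter here.
catalog = {
    "idx": {"oid": 16390, "relfilenode": 100, "live": True},
}

# 1. Enter the "temporary" index definition into the catalog.
catalog["idx_cct"] = {"oid": 16391, "relfilenode": 200, "live": True}

# 2./3. First- and second-pass builds fill relfilenode 200 (not modeled),
# then the freshly built data is validated.

# 4. Swap the relfilenodes: "idx" keeps its OID but now points at the new data.
catalog["idx"]["relfilenode"], catalog["idx_cct"]["relfilenode"] = (
    catalog["idx_cct"]["relfilenode"], catalog["idx"]["relfilenode"])

# 5. Mark the temporary definition (now carrying the old data) as dead.
catalog["idx_cct"]["live"] = False

# 6. Drop the temporary definition together with the old index data.
dropped = catalog.pop("idx_cct")

print(catalog["idx"], dropped["relfilenode"])
```

This is also why REINDEX CONCURRENTLY is preferable to a manual CREATE INDEX CONCURRENTLY plus DROP INDEX dance: the swap preserves the original index's identity.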
Thanks for your input, Jim!
On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Patch applies against current HEAD and builds, but I'm getting 37 failed
> tests (mostly parallel, but also misc and WITH; results attached). Is that
> expected?
This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.
> The "mark the concurrent index" bit is rather confusing; it sounds like it's
> referring to the new index instead of the old. Now that I've read the code I
> understand what's going on here between the concurrent index *entry* and the
> filenode swap, but I don't think the docs make this sufficiently clear to
> users.
>
> How about something like this:
>
> The following steps occur in a concurrent index build, each in a separate
> transaction. Note that if there are multiple indexes to be rebuilt then each
> step loops through all the indexes we're rebuilding, using a separate
> transaction for each one.
>
> 1. [blah]
Definitely a good idea! I took your text and made it more precise, listing the actions done for each step, the pg_index flags switched, using <orderedlist> to make the list of steps described in a separate paragraph more exhaustive for the user. At the same time I reworked the docs removing a part that was somewhat duplicated about dealing with the constraints having invalid index entries and how to drop them.
> + * index_concurrent_create
> + *
> + * Create an index based on the given one that will be used for concurrent
> + * operations. The index is inserted into catalogs and needs to be built
> later
> + * on. This is called during concurrent index processing. The heap relation
> + * on which is based the index needs to be closed by the caller.
>
> Last bit presumably should be "on which the index is based".
What about "Create a concurrent index based on the definition of the one provided by caller"?
> + /* Build the list of column names, necessary for index_create */
> Instead of all this work wouldn't it be easier to create a version of
> index_create/ConstructTupleDescriptor that will use the IndexInfo for the
> old index? ISTM index_concurrent_create() is doing a heck of a lot of work
> to marshal data into one form just to have it get marshaled yet again. Worst
> case, if we do have to play this game, there should be a stand-alone
> function to get the columns/expressions for an existing index; you're
> duplicating a lot of code from pg_get_indexdef_worker().
Yes, this definitely sucks and the approach creating a function to get all the column names is not productive as well. Then let's define an additional argument in index_create to pass a potential TupleDesc instead of this whole wart. I noticed as well that we need to actually reset attcacheoff to be able to use a fresh version of the tuple descriptor of the old index. I added a small API for this purpose in tupdesc.h called ResetTupleDescCache. Would it make sense instead to extend CreateTupleDescCopyConstr or CreateTupleDescCopy with a boolean flag?
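For readers following along, the gist of the ResetTupleDescCache helper proposed here can be sketched in Python (a stand-in for the C function; the dictionaries mimic pg_attribute rows, and -1 is pg_attribute's "offset not computed yet" marker): attcacheoff memoizes a column's byte offset within a tuple, and a copied descriptor must forget those memoized values so they are recomputed for the new index.

```python
def reset_tupledesc_cache(attrs):
    """Forget memoized attribute offsets on a copied tuple descriptor."""
    for att in attrs:
        att["attcacheoff"] = -1   # -1 means "not computed yet"
    return attrs

attrs = [
    {"attname": "a", "attcacheoff": 0},   # offset 0 had been cached
    {"attname": "b", "attcacheoff": 4},
]
reset_tupledesc_cache(attrs)
print([att["attcacheoff"] for att in attrs])  # -> [-1, -1]
```

Folding this into CreateTupleDescCopy via a flag, as Michael asks, would be the same operation performed during the copy rather than after it.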
> index_concurrent_swap(): Perhaps it'd be better to create
> index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
> refactor the duplicated code out... the actual function would then become:
This sentence is not finished :) IMO, index_concurrent_swap looks good as is, taking as arguments the index and its concurrent entry, and swapping their relfilenode after taking AccessExclusiveLock that will be held until the end of this transaction.
> ReindexRelationConcurrently()
>
> + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
> + * either an index or a table. If a table is specified, each step of
> REINDEX
> + * CONCURRENTLY is done in parallel with all the table's indexes as well as
> + * its dependent toast indexes.
> This comment is a bit misleading; we're not actually doing anything in
> parallel, right? AFAICT index_concurrent_build is going to block while each
> index is built the first time.
Yes, parallel may be misleading. What is meant here is that each step of the process is done one by one on all the valid indexes a table may have.
> + * relkind is an index, this index itself will be rebuilt. The locks
> taken
> + * parent relations and involved indexes are kept until this
> transaction
> + * is committed to protect against schema changes that might occur
> until
> + * the session lock is taken on each relation.
>
> This comment is a bit unclear to me... at minimum I think it should be "* on
> parent relations" instead of "* parent relations", but I think it needs to
> elaborate on why/when we're also taking session level locks.
Hum, done as follows:
@@ -896,9 +896,11 @@ ReindexRelationConcurrently(Oid relationOid)
* If the relkind of given relation Oid is a table, all its valid indexes
* will be rebuilt, including its associated toast table indexes. If
* relkind is an index, this index itself will be rebuilt. The locks taken
- * parent relations and involved indexes are kept until this transaction
+ * on parent relations and involved indexes are kept until this transaction
* is committed to protect against schema changes that might occur until
- * the session lock is taken on each relation.
+ * the session lock is taken on each relation, session lock used to
+ * similarly protect from any schema change that could happen within the
+ * multiple transactions that are used during this process.
*/
> I also wordsmithed this comment a bit...
> * Here begins the process for concurrently rebuilding the index
> and this one...
> * During this phase the concurrent indexes catch up with any new
Slight differences indeed. Thanks and included.
> I'd change that to "explosion in the number of indexes a parent relation
> could have if this operation fails."
Well, implosion was more... I don't recall my state of mind when writing that. So changed the way you recommend.
> Phase 4, 5 and 6 are rather confusing if you don't understand that each
> "concurrent index" entry is meant to be thrown away. I think the Phase 4
> comment should elaborate on that.
OK, done.
> The comment in check_exclusion_constraint() is good; shouldn't the related
> comment on this line in index_create() mention that
> check_exclusion_constraint() needs to be changed if we ever support
> concurrent builds of exclusion indexes?
>
> if (concurrent && is_exclusion && !is_reindex)
OK, what about that then:
/*
- * This case is currently not supported, but there's no way to ask for it
- * in the grammar anyway, so it can't happen.
+ * This case is currently only supported during a concurrent index
+ * rebuild, but there is no way to ask for it in the grammar otherwise
+ * anyway. If support for exclusion constraints is added in the future,
+ * the check similar to this one in check_exclusion_constraint should as
+ * well be changed accordingly.
Updated patch is attached.
Thanks again.
Regards,
--
Michael
Attachment
On 10/30/14, 3:19 AM, Michael Paquier wrote:
> Thanks for your input, Jim!
>
> On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> > Patch applies against current HEAD and builds, but I'm getting 37 failed
> > tests (mostly parallel, but also misc and WITH; results attached). Is that
> > expected?
> This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.
> > The "mark the concurrent index" bit is rather confusing; it sounds like it's
> > referring to the new index instead of the old. Now that I've read the code I
> > understand what's going on here between the concurrent index *entry* and the
> > filenode swap, but I don't think the docs make this sufficiently clear to
> > users.
> >
> > How about something like this:
> >
> > The following steps occur in a concurrent index build, each in a separate
> > transaction. Note that if there are multiple indexes to be rebuilt then each
> > step loops through all the indexes we're rebuilding, using a separate
> > transaction for each one.
> >
> > 1. [blah]
> Definitely a good idea! I took your text and made it more precise, listing the actions done for each step, the pg_index flags switched, using <orderedlist> to make the list of steps described in a separate paragraph more exhaustive for the user. At the same time I reworked the docs removing a part that was somewhat duplicated about dealing with the constraints having invalid index entries and how to drop them.

Awesome! Just a few items here:

+ Then a second pass is performed to add tuples that were added while
+ the first pass build was running. One the validation of the index

s/One the/Once the/

> > + * index_concurrent_create
> > + *
> > + * Create an index based on the given one that will be used for concurrent
> > + * operations. The index is inserted into catalogs and needs to be built later
> > + * on.
> > This is called during concurrent index processing. The heap relation
> > + * on which is based the index needs to be closed by the caller.
> >
> > Last bit presumably should be "on which the index is based".
> What about "Create a concurrent index based on the definition of the one provided by caller"?

That's good too, but my comment was on the last sentence, not the first.

> > + /* Build the list of column names, necessary for index_create */
> > Instead of all this work wouldn't it be easier to create a version of
> > index_create/ConstructTupleDescriptor that will use the IndexInfo for the
> > old index? ISTM index_concurrent_create() is doing a heck of a lot of work
> > to marshal data into one form just to have it get marshaled yet again. Worst
> > case, if we do have to play this game, there should be a stand-alone
> > function to get the columns/expressions for an existing index; you're
> > duplicating a lot of code from pg_get_indexdef_worker().
> Yes, this definitely sucks and the approach creating a function to get all the column names is not productive as well. Then let's define an additional argument in index_create to pass a potential TupleDesc instead of this whole wart. I noticed as well that we need to actually reset attcacheoff to be able to use a fresh version of the tuple descriptor of the old index. I added a small API for this purpose in tupdesc.h called ResetTupleDescCache. Would it make sense instead to extend CreateTupleDescCopyConstr or CreateTupleDescCopy with a boolean flag?

Perhaps there'd be other places that would want to reset the stats, so I lean slightly that direction.

The comment at the beginning of index_create needs to be modified to mention tupdesc and especially that setting tupdesc over-rides indexColNames.

> > index_concurrent_swap(): Perhaps it'd be better to create
> > index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
> > refactor the duplicated code out...
the actual function would then become: > This sentence is not finished :) IMO, index_concurrent_swap looks good as is, taking as arguments the index and its concurrent entry, and swapping their relfilenode after taking AccessExclusiveLock that will be held until the end of this transaction. Fair enough. > > ReindexRelationConcurrently() > > > > + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be > > + * either an index or a table. If a table is specified, each step of > > REINDEX > > + * CONCURRENTLY is done in parallel with all the table's indexes as well as > > + * its dependent toast indexes. > > This comment is a bit misleading; we're not actually doing anything in > > parallel, right? AFAICT index_concurrent_build is going to block while each > > index is built the first time. > Yes, parallel may be misleading. What is meant here is that each step of the process is done one by one on all the valid indexes a table may have. New comment looks good. > > + * relkind is an index, this index itself will be rebuilt. The locks > > taken > > + * parent relations and involved indexes are kept until this > > transaction > > + * is committed to protect against schema changes that might occur > > until > > + * the session lock is taken on each relation. > > > > This comment is a bit unclear to me... at minimum I think it should be "* on > > parent relations" instead of "* parent relations", but I think it needs to > > elaborate on why/when we're also taking session level locks. > Hum, done as follows: > @@ -896,9 +896,11 @@ ReindexRelationConcurrently(Oid relationOid) > * If the relkind of given relation Oid is a table, all its valid indexes > * will be rebuilt, including its associated toast table indexes. If > * relkind is an index, this index itself will be rebuilt.
The locks taken > - * parent relations and involved indexes are kept until this transaction > + * on parent relations and involved indexes are kept until this transaction > * is committed to protect against schema changes that might occur until > - * the session lock is taken on each relation. > + * the session lock is taken on each relation, session lock used to > + * similarly protect from any schema change that could happen within the > + * multiple transactions that are used during this process. > */ Cool. > OK, what about that then: > /* > - * This case is currently not supported, but there's no way to ask for it > - * in the grammar anyway, so it can't happen. > + * This case is currently only supported during a concurrent index > + * rebuild, but there is no way to ask for it in the grammar otherwise > + * anyway. If support for exclusion constraints is added in the future, > + * the check similar to this one in check_exclusion_constraint should as > + * well be changed accordingly. > > Updated patch is attached. Works for me. Keep in mind I'm not super familiar with the guts of index creation, so it'd be good for someone else to look at that bit (especially index_concurrent_create and ReindexRelationConcurrently). -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On 10/30/14, 3:19 AM, Michael Paquier wrote: > On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com <mailto:Jim.Nasby@bluetreble.com>> wrote: > > Patch applies against current HEAD and builds, but I'm getting 37 failed > > tests (mostly parallel, but also misc and WITH; results attached). Is that > > expected? > This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing inclusion of ruleutils.h in index.c. Sorry, forgot to mention patch now passes make check cleanly. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > Updated patch is attached.
Please find attached an updated patch with the following things changed:
- Addition of tab completion in psql for all new commands
- Addition of a call to WaitForLockers in index_concurrent_swap to ensure that there are no running transactions on the parent table running before exclusive locks are taken on the index and its concurrent entry. Previous patch versions created deadlocks because of that, issue spotted by the isolation tests integrated in the patch.
- Isolation tests for reindex concurrently are re-enabled by default.
Regards,
-- Michael
Attachment
On 10/1/14 3:00 AM, Michael Paquier wrote: > - Use of AccessExclusiveLock when swapping relfilenodes of an index and > its concurrent entry instead of ShareUpdateExclusiveLock for safety. At > the limit of my understanding, that's the consensus reached until now. I'm very curious about this point. I looked through all the previous discussions, and the only time I saw this mentioned was at the very beginning when it was said that we could review the patch while ignoring this issue and fix it later with MVCC catalog access. Then it got very technical, but it was never explicitly concluded whether it was possible to fix this or not. Also, in the thread "Concurrently option for reindexdb" you pointed out that requiring an exclusive lock isn't really concurrent and proposed an option like --minimum-locks. I will point out again that we specifically invented DROP INDEX CONCURRENTLY because holding an exclusive lock even briefly isn't good enough. If REINDEX cannot work without an exclusive lock, we should invent some other qualifier, like WITH FEWER LOCKS. It's still useful, but we shouldn't give people the idea that they can throw away their custom CREATE INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY scripts.
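For context, the hand-rolled CREATE INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY scripts mentioned above usually boil down to something like the following sketch (table, column, and index names are illustrative; note this only works for plain indexes, not for indexes backing constraints such as primary keys, which is precisely the gap REINDEX CONCURRENTLY is meant to close):

```sql
-- Build a replacement for my_index without blocking writes.
CREATE INDEX CONCURRENTLY my_index_new ON my_table (my_col);
-- Drop the old index without blocking writes.
DROP INDEX CONCURRENTLY my_index;
-- Take back the original name (the rename briefly locks the index).
ALTER INDEX my_index_new RENAME TO my_index;
```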
On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Updated patch is attached.
Please find attached an updated patch with the following things changed:
- Addition of tab completion in psql for all new commands
- Addition of a call to WaitForLockers in index_concurrent_swap to
ensure that there are no running transactions on the parent table
running before exclusive locks are taken on the index and its
concurrent entry. Previous patch versions created deadlocks because of
that, issue spotted by the isolation tests integrated in the patch.
- Isolation tests for reindex concurrently are re-enabled by default.
Regards,
It looks like this needs another rebase, I get failures on index.c, toasting.c, indexcmds.c, and index.h
Thanks,
Jeff
On Tue, Nov 11, 2014 at 3:24 AM, Jeff Janes <jeff.janes@gmail.com> wrote: > > On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier <michael.paquier@gmail.com> wrote: >> >> On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier >> <michael.paquier@gmail.com> wrote: >> > Updated patch is attached. >> Please find attached an updated patch with the following things changed: >> - Addition of tab completion in psql for all new commands >> - Addition of a call to WaitForLockers in index_concurrent_swap to >> ensure that there are no running transactions on the parent table >> running before exclusive locks are taken on the index and its >> concurrent entry. Previous patch versions created deadlocks because of >> that, issue spotted by the isolation tests integrated in the patch. >> - Isolation tests for reindex concurrently are re-enabled by default. >> Regards, > > > It looks like this needs another rebase, I get failures on index.c, toasting.c, indexcmds.c, and index.h Indeed. There are some conflicts created by the recent modification of index_create. Here is a rebased patch. -- Michael
Attachment
On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > If REINDEX cannot work without an exclusive lock, we should invent some > other qualifier, like WITH FEWER LOCKS. What he said. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> If REINDEX cannot work without an exclusive lock, we should invent some >> other qualifier, like WITH FEWER LOCKS. > > What he said. But more to the point .... why, precisely, can't this work without an AccessExclusiveLock? And can't we fix that instead of settling for something clearly inferior? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-11-12 16:11:58 -0500, Robert Haas wrote: > On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > >> If REINDEX cannot work without an exclusive lock, we should invent some > >> other qualifier, like WITH FEWER LOCKS. > > > > What he said. I'm unconvinced. A *short* exclusive lock (just to update two pg_class rows), still gives most of the benefits of CONCURRENTLY. Also, I do think we can get rid of that period in the not too far away future. > But more to the point .... why, precisely, can't this work without an > AccessExclusiveLock? And can't we fix that instead of settling for > something clearly inferior? It's nontrivial to fix, but I think we can fix it at some point. I just think we should get the *major* part of the feature before investing lots of time making it even better. There are *very* frequent questions about having this. And people do really dangerous stuff (like manually updating pg_class.relfilenode and such) to cope. The problem is that it's very hard to avoid the wrong index's relfilenode being used when swapping the relfilenodes between two indexes. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Nov 12, 2014 at 4:39 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-11-12 16:11:58 -0500, Robert Haas wrote: >> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> >> If REINDEX cannot work without an exclusive lock, we should invent some >> >> other qualifier, like WITH FEWER LOCKS. >> > >> > What he said. > > I'm unconvinced. A *short* exclusive lock (just to update two pg_class > rows), still gives most of the benefits of CONCURRENTLY. I am pretty doubtful about that. It's still going to require you to wait for all transactions to drain out of the table while new ones are blocked from entering. Which sucks. Unless all of your transactions are very short, but that's not necessarily typical. > The problem is that it's very hard to avoid the wrong index's > relfilenode being used when swapping the relfilenodes between two > indexes. How about storing both the old and new relfilenodes in the same pg_class entry?
1. Take a snapshot.
2. Index all the tuples in that snapshot.
3. Publish the new relfilenode to an additional pg_class column, relnewfilenode or similar.
4. Wait until everyone can see step #3.
5. Rescan the table and add any missing tuples to the index.
6. Set some flag in pg_class to mark the relnewfilenode as active and relfilenode as not to be used for queries.
7. Wait until everyone can see step #6.
8. Set some flag in pg_class to mark relfilenode as not even to be opened.
9. Wait until everyone can see step #8.
10. Drop old relfilenode.
11. Clean up by setting relfilenode = relnewfilenode, relnewfilenode = 0.
This is basically CREATE INDEX CONCURRENTLY (without the first step where we out-wait people who might create now-invalid HOT chains, because that can't arise with a REINDEX of an existing index) plus DROP INDEX CONCURRENTLY. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-11-12 18:23:38 -0500, Robert Haas wrote: > On Wed, Nov 12, 2014 at 4:39 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-11-12 16:11:58 -0500, Robert Haas wrote: > >> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote: > >> > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > >> >> If REINDEX cannot work without an exclusive lock, we should invent some > >> >> other qualifier, like WITH FEWER LOCKS. > >> > > >> > What he said. > > > > I'm unconvinced. A *short* exclusive lock (just to update two pg_class > > rows), still gives most of the benefits of CONCURRENTLY. > > I am pretty doubtful about that. It's still going to require you to > wait for all transactions to drain out of the table while new ones are > blocked from entering. Which sucks. Unless all of your transactions > are very short, but that's not necessarily typical. Yes, it sucks. But it beats not being able to reindex a relation with a primary key (referenced by a fkey) without waiting several hours by a couple magnitudes. And that's the current situation. > > The problem is that it's very hard to avoid the wrong index's > > relfilenode being used when swapping the relfilenodes between two > > indexes. > > How about storing both the old and new relfilenodes in the same pg_class entry? That's quite a cool idea [think a bit] But I think it won't work realistically. We have a *lot* of infrastructure that refers to indexes using its primary key. I don't think we want to touch all those places to also disambiguate on some other factor. All the relevant APIs are either just passing around oids or relcache entries. There's also the problem that we'd really need two different pg_index rows to make things work. Alternatively we can duplicate the three relevant columns (indisvalid, indisready, indislive) in there for the different filenodes. But that's not entirely pretty. > 1. Take a snapshot. > 2. Index all the tuples in that snapshot. > 3.
Publish the new relfilenode to an additional pg_class column, > relnewfilenode or similar. > 4. Wait until everyone can see step #3. Here all backends need to update both indexes, right? And all the indexing infrastructure can't deal with that without having separate oids & relcache entries. > 5. Rescan the table and add any missing tuples to the index. > 6. Set some flag in pg_class to mark the relnewfilenode as active and > relfilenode as not to be used for queries. > 7. Wait until everyone can see step #6. > 8. Set some flag in pg_class to mark relfilenode as not even to be opened. > 9. Wait until everyone can see step #8. > 10. Drop old relfilenode. > 11. Clean up by setting relfilenode = relnewfilenode, relfilenode = 0. Even that one isn't trivial - how do you deal with the fact that somebody looking at updating newrelfilenode might, in the midst of processing, see newrelfilenode = 0? I've earlier come up with a couple possible solutions, but I unfortunately found holes in all of them. And if I can find holes in them, there surely are more :(. I don't recall what the problem with just swapping the names was - but I'm pretty sure there was one... Hm. The index relation oids are referred to by constraints and dependencies. That's somewhat solvable. But I think there was something else as well... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote: > On 2014-11-12 18:23:38 -0500, Robert Haas wrote: > > > The problem is that it's very hard to avoid the wrong index's > > > relfilenode being used when swapping the relfilenodes between two > > > indexes. > > > > How about storing both the old and new relfilenodes in the same pg_class entry? > > That's quite a cool idea > > [think a bit] > > But I think it won't work realistically. We have a *lot* of > infrastructure that refers to indexes using its primary key. Hmm, can we make the relmapper do this job instead of having another pg_class column? Essentially the same sketch Robert proposed, but instead we would initially set relfilenode=0 and have all onlookers use the relmapper to obtain the correct relfilenode; switching to the new relfilenode can be done atomically, and un-relmap the index once the process is complete. The difference from what Robert proposes is that the transient state is known to cause failures for anyone not prepared to deal with it, so it should be easy to spot what places need adjustment. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 13, 2014 at 9:31 AM, Andres Freund <andres@2ndquadrant.com> wrote: > I don't recall what the problem with just swapping the names was - but > I'm pretty sure there was one... Hm. The index relation oids are > referred to by constraints and dependencies. That's somewhat > solvable. But I think there was something else as well... The reason given 2 years ago for not using relname was the fact that the oid of the index changes, and to it be refered by some pg_depend entries: http://www.postgresql.org/message-id/20121208133730.GA6422@awork2.anarazel.de http://www.postgresql.org/message-id/12742.1354977643@sss.pgh.pa.us Regards, -- Michael
On Thu, Nov 13, 2014 at 10:26 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Thu, Nov 13, 2014 at 9:31 AM, Andres Freund <andres@2ndquadrant.com> wrote: >> I don't recall what the problem with just swapping the names was - but >> I'm pretty sure there was one... Hm. The index relation oids are >> referred to by constraints and dependencies. That's somewhat >> solvable. But I think there was something else as well... > The reason given 2 years ago for not using relname was the fast that > the oid of the index changes, and to it be refered by some pg_depend > entries: Feel free to correct: "and that it could be referred". -- Michael
On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote: > But I think it won't work realistically. We have a *lot* of > infrastructure that refers to indexes using its primary key. I don't > think we want to touch all those places to also disambiguate on some > other factor. All the relevant APIs are either just passing around oids > or relcache entries. I'm not quite following this. The whole point is to AVOID having two indexes. You have one index which may at times have two sets of physical storage. > There's also the problem that we'd really need two different pg_index > rows to make things work. Alternatively we can duplicate the three > relevant columns (indisvalid, indisready, indislive) in there for the > different filenodes. But that's not entirely pretty. I think what you would probably end up with is a single "char" or int2 column that defines the state of the index. Certain states would be valid only when relnewfilenode != 0. >> 1. Take a snapshot. >> 2. Index all the tuples in that snapshot. >> 3. Publish the new relfilenode to an additional pg_class column, >> relnewfilenode or similar. >> 4. Wait until everyone can see step #3. > > Here all backends need to update both indexes, right? Yes. > And all the > indexing infrastructure can't deal with that without having separate > oids & relcache entries. Why not? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 11/12/14 7:31 PM, Andres Freund wrote: > Yes, it sucks. But it beats not being able to reindex a relation with a > primary key (referenced by a fkey) without waiting several hours by a > couple magnitudes. And that's the current situation. That's fine, but we have, for better or worse, defined CONCURRENTLY := does not take exclusive locks. Use a different adverb for an in-between facility.
On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote: >On 11/12/14 7:31 PM, Andres Freund wrote: >> Yes, it sucks. But it beats not being able to reindex a relation with >a >> primary key (referenced by a fkey) without waiting several hours by a >> couple magnitudes. And that's the current situation. > >That's fine, but we have, for better or worse, defined CONCURRENTLY := >does not take exclusive locks. Use a different adverb for an >in-between >facility. I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions. I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that. -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 11/13/14, 3:50 PM, Andres Freund wrote: > On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote: >> On 11/12/14 7:31 PM, Andres Freund wrote: >>> Yes, it sucks. But it beats not being able to reindex a relation with >> a >>> primary key (referenced by a fkey) without waiting several hours by a >>> couple magnitudes. And that's the current situation. >> >> That's fine, but we have, for better or worse, defined CONCURRENTLY := >> does not take exclusive locks. Use a different adverb for an >> in-between >> facility. > > I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions. It also means that if we ever found a way to get rid of the exclusive lock we'd then have an inconsistency anyway. Or we'd also create REINDEX CONCURRENT at that time, and then have 2 command syntaxes to support. > I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that. Having been responsible for a site where downtime was a 6 figure dollar amount per hour, I've spent a LOT of time worrying about lock problems. The really big issue here isn't grabbing an exclusive lock; it's grabbing one at some random time when no one is there to actively monitor what's happening. (If you can't handle *any* exclusive locks, that also means you can never do an ALTER TABLE ADD COLUMN either.) With that in mind, would it be possible to set this up so that the time-consuming process of building the new index file happens first, and then (optionally) some sort of DBA action is required to actually do the relfilenode swap? I realize that's not the most elegant solution, but it's WAY better than this feature not hitting 9.5 and people having to hand-code a solution.
Possible syntax:
REINDEX CONCURRENTLY -- Does what current patch does
REINDEX CONCURRENT BUILD -- Builds new files
REINDEX CONCURRENT SWAP -- Swaps new files in
This suffers from the syntax problems I mentioned above, but at least this way it's all limited to one command, and it probably allows a lot more people to use it. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
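To make the operational flow of the BUILD/SWAP split above concrete, usage might look like this (hypothetical syntax from the proposal above, not implemented; the index name is illustrative):

```sql
-- Phase 1: long-running, takes only weak locks, safe to run unattended.
REINDEX CONCURRENT BUILD my_index;
-- Phase 2: run later, while a DBA is actively watching; takes a brief
-- exclusive lock to swap the newly built files in.
REINDEX CONCURRENT SWAP my_index;
```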
On 2014-11-14 02:04:00 -0600, Jim Nasby wrote: > On 11/13/14, 3:50 PM, Andres Freund wrote: > Having been responsible for a site where downtime was a 6 figure > dollar amount per hour, I've spent a LOT of time worrying about lock > problems. The really big issue here isn't grabbing an exclusive lock; > it's grabbing one at some random time when no one is there to actively > monitor what's happening. (If you can't handle *any* exclusive locks, > that also means you can never do an ALTER TABLE ADD COLUMN either.) > With that in mind, would it be possible to set this up so that the > time-consuming process of building the new index file happens first, > and then (optionally) some sort of DBA action is required to actually > do the relfilenode swap? I realize that's not the most elegant > solution, but it's WAY better than this feature not hitting 9.5 and > people having to hand-code a solution. I don't think having a multi-step version of the feature and it not making it into 9.5 are synonymous. And I really don't want to make it even more complex before we have the basic version in. I think a split like your: > Possible syntax: > REINDEX CONCURRENTLY -- Does what current patch does > REINDEX CONCURRENT BUILD -- Builds new files > REINDEX CONCURRENT SWAP -- Swaps new files in could make sense, but it's really an additional feature on top. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2014-11-13 11:41:18 -0500, Robert Haas wrote: > On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote: > > But I think it won't work realistically. We have a *lot* of > > infrastructure that refers to indexes using its primary key. I don't > > think we want to touch all those places to also disambiguate on some > > other factor. All the relevant APIs are either just passing around oids > > or relcache entries. > > I'm not quite following this. The whole point is to AVOID having two > indexes. You have one index which may at times have two sets of > physical storage. Right. But how are we going to refer to these different relfilenodes? All the indexing infrastructure just uses oids and/or Relation pointers to refer to the index. How would you hand down the knowledge of which of the relfilenodes is supposed to be used in some callchain? There are ugly solutions like having a flag like 'bool rd_useotherfilenode' inside struct RelationData, but even ignoring the ugliness I don't think that'd work well - what if some function called inside the index code again starts an index lookup? I think I might just not be getting your idea here? > > And all the > > indexing infrastructure can't deal with that without having separate > > oids & relcache entries. Hopefully explained above? Greetings, Andres Freund
On Fri, Nov 14, 2014 at 11:47 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-11-13 11:41:18 -0500, Robert Haas wrote: >> On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> > But I think it won't work realistically. We have a *lot* of >> > infrastructure that refers to indexes using its primary key. I don't >> > think we want to touch all those places to also disambiguate on some >> > other factor. All the relevant APIs are either just passing around oids >> > or relcache entries. >> >> I'm not quite following this. The whole point is to AVOID having two >> indexes. You have one index which may at times have two sets of >> physical storage. > > Right. But how are we going to refer to these different relfilenodes? > All the indexing infrastructure just uses oids and/or Relation pointers > to refer to the index. How would you hand down the knowledge of which of > the relfilenodes is supposed to be used in some callchain? If you've got a Relation, you don't need someone to tell you which physical storage to use; you can figure that out for yourself by looking at the Relation. If you've got an OID, you're probably going to go conjure up a Relation, and then you can do the same thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Nov 13, 2014 at 10:25 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Andres Freund wrote: >> On 2014-11-12 18:23:38 -0500, Robert Haas wrote: > >> > > The problem is that it's very hard to avoid the wrong index's >> > > relfilenode being used when swapping the relfilenodes between two >> > > indexes. >> > >> > How about storing both the old and new relfilenodes in the same pg_class entry? >> >> That's quite a cool idea >> >> [think a bit] >> >> But I think it won't work realistically. We have a *lot* of >> infrastructure that refers to indexes using its primary key. > > Hmm, can we make the relmapper do this job instead of having another > pg_class column? Essentially the same sketch Robert proposed, but instead > we would initially set relfilenode=0 and have all onlookers use the > relmapper to obtain the correct relfilenode; switching to the new > relfilenode can be done atomically, and un-relmap the index once the > process is complete. > The difference from what Robert proposes is that the transient state is > known to cause failures for anyone not prepared to deal with it, so it > should be easy to spot what places need adjustment. How would the failure handling actually work? Would we need some extra process to remove the extra relfilenodes? Note that in the current patch the temporary concurrent entry is kept as INVALID all the time, giving the user a path to remove it with DROP INDEX even in the case of invalid toast indexes in catalog pg_toast. Note that I am on the side of using the exclusive lock when swapping relfilenodes for now in any case; that's what pg_repack does by renaming the indexes, and people use it. -- Michael
13.11.2014, 23:50, Andres Freund wrote: > On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote: >> On 11/12/14 7:31 PM, Andres Freund wrote: >>> Yes, it sucks. But it beats not being able to reindex a relation with >> a >>> primary key (referenced by a fkey) without waiting several hours by a >>> couple magnitudes. And that's the current situation. >> >> That's fine, but we have, for better or worse, defined CONCURRENTLY := >> does not take exclusive locks. Use a different adverb for an >> in-between >> facility. > > I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions. > > I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that. If the short-lived lock is the only blocker for this feature at the moment could we just require an additional qualifier for CONCURRENTLY (FORCE?) until the lock can be removed, something like:
tmp=# REINDEX INDEX CONCURRENTLY tmp_pkey;
ERROR: REINDEX INDEX CONCURRENTLY is not fully concurrent; use REINDEX INDEX CONCURRENTLY FORCE to perform reindex with a short-lived lock.
tmp=# REINDEX INDEX CONCURRENTLY FORCE tmp_pkey;
REINDEX
It's not optimal, but currently there's no way to reindex a primary key anywhere close to concurrently and a short lock would be a huge improvement over the current situation. / Oskari
On Tue, Dec 23, 2014 at 5:54 PM, Oskari Saarenmaa <os@ohmu.fi> wrote: > > If the short-lived lock is the only blocker for this feature at the > moment could we just require an additional qualifier for CONCURRENTLY > (FORCE?) until the lock can be removed, something like: > =# [blah] FWIW, I'd just keep only CONCURRENTLY with no fancy additional keywords even if we cheat on it, as long as it is made clear in the documentation that an exclusive lock is taken for a very short time, much shorter than what a normal REINDEX would take, btw. > It's not optimal, but currently there's no way to reindex a primary key > anywhere close to concurrently and a short lock would be a huge > improvement over the current situation. Yep. -- Michael
On 2014-11-12 16:11:58 -0500, Robert Haas wrote: > On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > >> If REINDEX cannot work without an exclusive lock, we should invent some > >> other qualifier, like WITH FEWER LOCKS. > > > > What he said. > > But more to the point .... why, precisely, can't this work without an > AccessExclusiveLock? And can't we fix that instead of settling for > something clearly inferior? So, here's an alternative approach of how to get rid of the AEL locks. They're required because we want to switch the relfilenodes around. I've pretty much no confidence in any of the schemes anybody has come up with to avoid that. So, let's not switch relfilenodes around. I think we should instead just use the new index, repoint the dependencies onto the new oid, and then afterwards, when dropping, rename the new index onto the old one. That means the oid of the index will change and some less than pretty grovelling around dependencies, but it still seems preferable to what we're discussing here otherwise. Does anybody see a fundamental problem with that approach? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Feb 2, 2015 at 9:10 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-11-12 16:11:58 -0500, Robert Haas wrote: >> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> >> If REINDEX cannot work without an exclusive lock, we should invent some >> >> other qualifier, like WITH FEWER LOCKS. >> > >> > What he said. >> >> But more to the point .... why, precisely, can't this work without an >> AccessExclusiveLock? And can't we fix that instead of setting for >> something clearly inferior? > > So, here's an alternative approach of how to get rid of the AEL > locks. They're required because we want to switch the relfilenodes > around. I've pretty much no confidence in any of the schemes anybody has > come up to avoid that. > > So, let's not switch relfilenodes around. > > I think if we should instead just use the new index, repoint the > dependencies onto the new oid, and then afterwards, when dropping, > rename the new index one onto the old one. That means the oid of the > index will change and some less than pretty grovelling around > dependencies, but it still seems preferrable to what we're discussing > here otherwise. > > Does anybody see a fundamental problem with that approach? I'm not sure whether that will work out, but it seems worth a try. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 02/02/2015 03:10 PM, Andres Freund wrote: > I think if we should instead just use the new index, repoint the > dependencies onto the new oid, and then afterwards, when dropping, > rename the new index one onto the old one. That means the oid of the > index will change and some less than pretty grovelling around > dependencies, but it still seems preferrable to what we're discussing > here otherwise.

I think that sounds like a good plan. The oid change does not seem like too big a deal to me, especially since that is what users get now too. Do you still think this is the right way to solve this?

I have attached my work in progress patch, which implements this and is very heavily based on Michael's previous work. There are some things left to do, but I think I should have a patch ready for the next commitfest if people still like this type of solution.

I also changed index_set_state_flags() to be transactional, since I wanted the old index to become invalid at exactly the same time as the new one becomes valid. From reviewing the code that seems like a safe change.

A couple of bike shedding questions:

- Is the syntax "REINDEX <type> CONCURRENTLY <object>" ok?
- What should we do with REINDEX DATABASE CONCURRENTLY and the system catalog? I do not think we can reindex the system catalog concurrently safely, so what should REINDEX DATABASE do with the catalog indexes? Skip them, reindex them while taking locks, or just error out?
- What level of information should be output in VERBOSE mode?

What remains to be implemented:

- Support for exclusion constraints
- Look more into how I handle constraints (currently the temporary index too seems to have the PRIMARY KEY flag)
- Support for the VERBOSE flag
- More testing to catch bugs

Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
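On the syntax question, the grammar being proposed would accept forms like the following (the table and index names here are hypothetical; the SCHEMA form with VERBOSE appears later in this thread):

```sql
REINDEX INDEX CONCURRENTLY my_index;
REINDEX TABLE CONCURRENTLY my_table;
REINDEX (VERBOSE) SCHEMA CONCURRENTLY public;
```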
On Sun, Feb 12, 2017 at 6:44 AM, Andreas Karlsson <andreas@proxel.se> wrote: > On 02/02/2015 03:10 PM, Andres Freund wrote: >> I think if we should instead just use the new index, repoint the >> dependencies onto the new oid, and then afterwards, when dropping, >> rename the new index one onto the old one. That means the oid of the >> index will change and some less than pretty grovelling around >> dependencies, but it still seems preferrable to what we're discussing >> here otherwise. > > I think that sounds like a good plan. The oid change does not seem like a > too big deal to me, especially since that is what users will get now too. Do > you still think this is the right way to solve this? That hurts mainly system indexes. Perhaps users with broken system indexes are not going to care about concurrency anyway. Thinking now about it I don't see how that would not work, but I did not think deeply about this problem lately. > I have attached my work in progress patch which implements and is very > heavily based on Michael's previous work. There are some things left to do > but I think I should have a patch ready for the next commitfest if people > still like this type of solution. Cool to see a rebase of this patch. It's been a long time... > I also changed index_set_state_flags() to be transactional since I wanted > the old index to become invalid at exactly the same time as the new becomes > valid. From reviewing the code that seems like a safe change. > > A couple of bike shedding questions: > > - Is the syntax "REINDEX <type> CONCUURENTLY <object>" ok? Yeah, that's fine. At least that's what has been concluded in previous threads. > - What should we do with REINDEX DATABASE CONCURRENTLY and the system > catalog? I so not think we can reindex the system catalog concurrently > safely, so what should REINDEX DATABASE do with the catalog indexes? Skip > them, reindex them while taking locks, or just error out? 
System indexes cannot have their OIDs changed as they are used in syscache lookups. So just logging a warning looks fine to me, and it is the price to pay to avoid taking an exclusive lock even for a short amount of time. > - What level of information should be output in VERBOSE mode? Er, something like that as well, no? DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > What remains to be implemented: > - Support for exclusion constraints > - Look more into how I handle constraints (currently the temporary index too > seems to have the PRIMARY KEY flag) > - Support for the VERBOSE flag > - More testing to catch bugs This is a crasher: create table aa (a int primary key); reindex (verbose) schema concurrently public ; For invalid indexes sometimes snapshots are still active (after issuing the previous crash for example): =# reindex (verbose) table concurrently aa; WARNING: XX002: cannot reindex concurrently invalid index "public.aa_pkey_cct", skipping LOCATION: ReindexRelationConcurrently, indexcmds.c:2119 WARNING: 01000: snapshot 0x7fde12003038 still active -- Michael
On 02/13/2017 06:31 AM, Michael Paquier wrote: >> - What should we do with REINDEX DATABASE CONCURRENTLY and the system >> catalog? I so not think we can reindex the system catalog concurrently >> safely, so what should REINDEX DATABASE do with the catalog indexes? Skip >> them, reindex them while taking locks, or just error out? > > System indexes cannot have their OIDs changed as they are used in > syscache lookups. So just logging a warning looks fine to me, and the > price to pay to avoid taking an exclusive lock even for a short amount > of time. Good idea, I think I will add one line of warning if it finds any system index in the schema. >> - What level of information should be output in VERBOSE mode? > > Er, something like that as well, no? > DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. REINDEX (VERBOSE) currently prints one such line per index, which does not really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes on a relation at the same time. It is not immediately obvious how this should work. Maybe one such detail line per table? > This is a crasher: > create table aa (a int primary key); > reindex (verbose) schema concurrently public ; > > For invalid indexes sometimes snapshots are still active (after > issuing the previous crash for example): > =# reindex (verbose) table concurrently aa; > WARNING: XX002: cannot reindex concurrently invalid index > "public.aa_pkey_cct", skipping > LOCATION: ReindexRelationConcurrently, indexcmds.c:2119 > WARNING: 01000: snapshot 0x7fde12003038 still active Thanks for testing the patch! The crash was caused by things being allocated in the wrong memory context when reindexing multiple tables and therefore freed on the first intermediate commit. I have created a new memory context to handle this, in which I only allocate the lists which need to survive between transactions. Hm, when writing the above I just realized why ReindexTable/ReindexIndex did not suffer from the same bug. 
It is because the first transaction there allocates in the PortalHeapMemory context, which survives commit. I really need to look at whether there is a clean way to handle memory contexts in my patch. I also found the "snapshot still active" bug; it seems to have been caused by REINDEX TABLE CONCURRENTLY leaving an open snapshot which cannot be popped by PortalRunUtility(). Thanks again! Andreas
On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andreas@proxel.se> wrote: > On 02/13/2017 06:31 AM, Michael Paquier wrote: >> Er, something like that as well, no? >> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > > REINDEX (VERBOSE) currently prints one such line per index, which does not > really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes > on a relation at the same time. It is not immediately obvious how this > should work. Maybe one such detail line per table? Hard to recall the details after all this time, given that a relation is reindexed by processing all the indexes once at each step. Hm... What if ReindexRelationConcurrently() actually were refactored in such a way that it processes all the steps for each index individually? This way you can monitor the time it takes to completely build each index, including its . This operation would consume more transactions, but in the event of a failure the amount of things to clean up is greatly reduced, particularly for relations with many indexes. This would also let VERBOSE print one line per index rebuilt. -- Michael
On Tue, Feb 14, 2017 at 12:56 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > This way you can monitor the time it takes to build > completely each index, including its . You can ignore the terms "including its" here. -- Michael
On 02/14/2017 04:56 AM, Michael Paquier wrote: > On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andreas@proxel.se> wrote: >> On 02/13/2017 06:31 AM, Michael Paquier wrote: >>> Er, something like that as well, no? >>> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. >> >> REINDEX (VERBOSE) currently prints one such line per index, which does not >> really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes >> on a relation at the same time. It is not immediately obvious how this >> should work. Maybe one such detail line per table? > > Hard to recall this thing in details with the time and the fact that a > relation is reindexed by processing all the indexes once at each step. > Hm... What if ReindexRelationConcurrently() actually is refactored in > such a way that it processes all the steps for each index > individually? This way you can monitor the time it takes to build > completely each index, including its . This operation would consume > more transactions but in the event of a failure the amount of things > to clean up is really reduced particularly for relations with many > indexes. This would as well reduce VERBOSE to print one line per index > rebuilt. I am actually thinking about going the opposite direction (by reducing the number of times we call WaitForLockers), because it is not just about consuming transaction IDs, we also do not want to wait too many times for transactions to commit. I am leaning towards only calling WaitForLockersMultiple three times per table. 1. Between building and validating the new indexes. 2. Between setting the old indexes to invalid and setting them to dead 3. Between setting the old indexes to dead and dropping them Right now my patch loops over the indexes in step 2 and 3 and waits for lockers once per index. This seems rather wasteful. 
I have thought about that the code might be cleaner if we just looped over all indexes (and as a bonus the VERBOSE output would be more obvious), but I do not think it is worth waiting for lockers all those extra times. Andreas
On 02/17/2017 01:53 PM, Andreas Karlsson wrote: > I am actually thinking about going the opposite direction (by reducing > the number of times we call WaitForLockers), because it is not just > about consuming transaction IDs, we also do not want to wait too many > times for transactions to commit. I am leaning towards only calling > WaitForLockersMultiple three times per table. > > 1. Between building and validating the new indexes. > 2. Between setting the old indexes to invalid and setting them to dead > 3. Between setting the old indexes to dead and dropping them > > Right now my patch loops over the indexes in step 2 and 3 and waits for > lockers once per index. This seems rather wasteful. > > I have thought about that the code might be cleaner if we just looped > over all indexes (and as a bonus the VERBOSE output would be more > obvious), but I do not think it is worth waiting for lockers all those > extra times. Thinking about this makes me wonder why you decided to use a transaction per index in many of the steps rather than a transaction per step. Most steps should be quick. The only steps where I think it makes sense to have a transaction per table are: 1) When building indexes, to avoid long running transactions. 2) When validating the new indexes, also to avoid long running transactions. But when swapping the indexes or when dropping the old indexes I do not see any reason not to just use one transaction per step, since we do not even have to wait for any locks (other than WaitForLockers, which we just want to call once anyway since all indexes relate to the same table). Andreas
On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote: > Thinking about this makes me wonder about why you decided to use a > transaction per index in many of the steps rather than a transaction per > step. Most steps should be quick. The only steps I think the makes sense to > have a transaction per table are. I don't recall all the details to be honest :) > 1) When building indexes to avoid long running transactions. > 2) When validating the new indexes, also to avoid long running transactions. > > But when swapping the indexes or when dropping the old indexes I do not see > any reason to not just use one transaction per step since we do not even > have to wait for any locks (other than WaitForLockers which we just want to > call once anyway since all indexes relate to the same table). Perhaps, this really needs a careful lookup. By the way, as this patch is showing up for the first time in this development cycle, would it be allowed in the last commit fest? That's not a patch in the easy category, far from that, but it does not present a new concept. -- Michael
On Fri, Feb 17, 2017 at 11:05:31PM +0900, Michael Paquier wrote: > On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote: > > Thinking about this makes me wonder about why you decided to use a > > transaction per index in many of the steps rather than a transaction per > > step. Most steps should be quick. The only steps I think the makes sense to > > have a transaction per table are. > > I don't recall all the details to be honest :) > > > 1) When building indexes to avoid long running transactions. > > 2) When validating the new indexes, also to avoid long running transactions. > > > > But when swapping the indexes or when dropping the old indexes I do not see > > any reason to not just use one transaction per step since we do not even > > have to wait for any locks (other than WaitForLockers which we just want to > > call once anyway since all indexes relate to the same table). > > Perhaps, this really needs a careful lookup. > > By the way, as this patch is showing up for the first time in this > development cycle, would it be allowed in the last commit fest? That's > not a patch in the easy category, far from that, but it does not > present a new concept. FYI, the thread started on 2013-11-15. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Tue, Feb 28, 2017 at 5:29 AM, Bruce Momjian <bruce@momjian.us> wrote: > On Fri, Feb 17, 2017 at 11:05:31PM +0900, Michael Paquier wrote: >> On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote: >> > Thinking about this makes me wonder about why you decided to use a >> > transaction per index in many of the steps rather than a transaction per >> > step. Most steps should be quick. The only steps I think the makes sense to >> > have a transaction per table are. >> >> I don't recall all the details to be honest :) >> >> > 1) When building indexes to avoid long running transactions. >> > 2) When validating the new indexes, also to avoid long running transactions. >> > >> > But when swapping the indexes or when dropping the old indexes I do not see >> > any reason to not just use one transaction per step since we do not even >> > have to wait for any locks (other than WaitForLockers which we just want to >> > call once anyway since all indexes relate to the same table). >> >> Perhaps, this really needs a careful lookup. >> >> By the way, as this patch is showing up for the first time in this >> development cycle, would it be allowed in the last commit fest? That's >> not a patch in the easy category, far from that, but it does not >> present a new concept. > > FYI, the thread started on 2013-11-15. I don't object to the addition of this patch to the next CF, as it presents no new concept. Still, because it is a complicated patch, I was wondering if people are fine with including it in this last CF. -- Michael
Michael Paquier <michael.paquier@gmail.com> writes: > I don't object to the addition of this patch in next CF as this > presents no new concept. Still per the complications this patch and > because it is a complicated patch I was wondering if people are fine > to include it in this last CF. The March CF is already looking pretty daunting. We can try to include this but I won't be too surprised if it gets punted to a future CF. regards, tom lane
On Mon, Feb 27, 2017 at 05:31:21PM -0500, Tom Lane wrote: > Michael Paquier <michael.paquier@gmail.com> writes: > > I don't object to the addition of this patch in next CF as this > > presents no new concept. Still per the complications this patch and > > because it is a complicated patch I was wondering if people are fine > > to include it in this last CF. > > The March CF is already looking pretty daunting. We can try to include > this but I won't be too surprised if it gets punted to a future CF. Yeah, that was my reaction too. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Hi, Here is a third take on this feature, heavily based on Michael Paquier's 2.0 patch. This time the patch does not attempt to preserve the index oids, but instead creates new indexes and moves all dependencies from the old indexes to the new ones before dropping the old ones. The only downside I can see to this approach is that we will no longer be able to reindex catalog tables concurrently, but in return it should be easier to confirm that this approach can be made to work. This patch relies on the fact that we can change the indisvalid flag of indexes transactionally, and as far as I can tell this is the case now that we have MVCC for the catalog updates. The code does some extra intermediate commits when building the indexes to avoid long running transactions.

How REINDEX CONCURRENTLY operates, for each table:

1. Create new indexes without populating them, and lock the tables and indexes for the session.
2. After waiting for all running transactions, populate each index in a separate transaction and set them to ready.
3. After waiting again for all running transactions, validate each index in a separate transaction (but without setting them to valid just yet).
4. Swap all dependencies over from each old index to the new index, rename the old and the new indexes (from <name> to <name>_ccold and from <name>_new to <name>), and set the isprimary and isexclusion flags. Here we also mark the new indexes as valid and the old indexes as invalid.
5. After waiting for all running transactions, change each old index from invalid to dead.
6. After waiting for all running transactions, drop each old index.
7. Drop all session locks.

Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
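For a plain, non-constraint index, the flow described above is roughly the concurrent swap one can already spell out by hand today; a sketch with hypothetical names (`ind` on table `t`, column `col`) — with the crucial difference that the patch swaps dependencies atomically and preserves constraints, which this manual version does not:

```sql
-- Steps 1-3: build and validate a replacement without blocking writes.
CREATE INDEX CONCURRENTLY ind_new ON t (col);

-- Steps 5-6: retire the old index without blocking writes.
DROP INDEX CONCURRENTLY ind;

-- Step 4 (name swap only): take over the old name.
ALTER INDEX ind_new RENAME TO ind;
```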
On 2/28/17 11:21 AM, Andreas Karlsson wrote: > The only downside I can see to this approach is that we no logner will > able to reindex catalog tables concurrently, but in return it should be > easier to confirm that this approach can be made work. Another downside is any stored regclass fields will become invalid. Admittedly that's a pretty unusual use case, but it'd be nice if there was at least a way to let users fix things during the rename phase (perhaps via an event trigger). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andreas@proxel.se> wrote: > For each table: > > 1. Create new indexes without populating them, and lock the tables and > indexes for the session. + /* + * Copy contraint flags for old index. This is safe because the old index + * guaranteed uniquness. + */ + newIndexForm->indisprimary = oldIndexForm->indisprimary; + oldIndexForm->indisprimary = false; + newIndexForm->indisexclusion = oldIndexForm->indisexclusion; + oldIndexForm->indisexclusion = false; [...] + deleteDependencyRecordsForClass(RelationRelationId, newIndexOid, + RelationRelationId, DEPENDENCY_AUTO); + deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid, + ConstraintRelationId, DEPENDENCY_INTERNAL); + + // TODO: pg_depend for old index? There is a lot of mumbo-jumbo in the patch to create the exact same index definition as the original one being reindexed, and that's a huge maintenance burden for the future. You can blame me for that in the current patch. I am wondering if it would not just be better to generate a CREATE INDEX query string and then use the SPI to create the index, and also do the following extensions at SQL level: - Add a sort of WITH NO DATA clause where the index is created, so the index is created empty, and is marked invalid and not ready. - Extend pg_get_indexdef_string() with an optional parameter to enforce the index name to something else, most likely it should be extended with the WITH NO DATA/INVALID clause, which should just be a storage parameter by the way. By doing something like that what the REINDEX CONCURRENTLY code path should just be careful about is that it chooses an index name that avoids any conflicts. -- Michael
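For context on the SPI idea: pg_get_indexdef() already produces a complete CREATE INDEX statement for an existing index, so generating the query string mostly means extending that output. The name-override and WITH NO DATA/INVALID clauses proposed above do not exist yet; only the plain form does:

```sql
SELECT pg_get_indexdef('aa_pkey'::regclass);
-- returns something like:
--   CREATE UNIQUE INDEX aa_pkey ON public.aa USING btree (a)
```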
On 2017-03-01 19:25:23 -0600, Jim Nasby wrote: > On 2/28/17 11:21 AM, Andreas Karlsson wrote: > > The only downside I can see to this approach is that we no logner will > > able to reindex catalog tables concurrently, but in return it should be > > easier to confirm that this approach can be made work. > > Another downside is any stored regclass fields will become invalid. > Admittedly that's a pretty unusual use case, but it'd be nice if there was > at least a way to let users fix things during the rename phase (perhaps via > an event trigger). I'm fairly confident that we don't want to invoke event triggers inside the CIC code... I'm also fairly confident that between index oids stored somewhere being invalidated - what'd be a realistic use case of that - and not having reindex concurrently, just about everyone will choose the former. Regards, Andres
On 03/02/2017 02:25 AM, Jim Nasby wrote: > On 2/28/17 11:21 AM, Andreas Karlsson wrote: >> The only downside I can see to this approach is that we no logner will >> able to reindex catalog tables concurrently, but in return it should be >> easier to confirm that this approach can be made work. > > Another downside is any stored regclass fields will become invalid. > Admittedly that's a pretty unusual use case, but it'd be nice if there > was at least a way to let users fix things during the rename phase > (perhaps via an event trigger). Good point, but I agree with Andres here. Having REINDEX CONCURRENTLY issue event triggers seems strange to me. While it does create and drop indexes as part of its implementation, it is actually just an index maintenance job. Andreas
On Thu, Mar 2, 2017 at 11:48 AM, Andres Freund <andres@anarazel.de> wrote: > On 2017-03-01 19:25:23 -0600, Jim Nasby wrote: >> On 2/28/17 11:21 AM, Andreas Karlsson wrote: >> > The only downside I can see to this approach is that we no logner will >> > able to reindex catalog tables concurrently, but in return it should be >> > easier to confirm that this approach can be made work. >> >> Another downside is any stored regclass fields will become invalid. >> Admittedly that's a pretty unusual use case, but it'd be nice if there was >> at least a way to let users fix things during the rename phase (perhaps via >> an event trigger). > > I'm fairly confident that we don't want to invoke event triggers inside > the CIC code... I'm also fairly confident that between index oids > stored somewhere being invalidated - what'd be a realistic use case of > that - and not having reindex concurrently, just about everyone will > choose the former. Maybe. But it looks to me like this patch is going to have considerably more than its share of user-visible warts, and I'm not very excited about that. I feel like what we ought to be doing is keeping the index OID the same and changing the relfilenode to point to a newly-created one, and I attribute our failure to make that design work thus far to insufficiently aggressive hacking. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On March 4, 2017 1:16:56 AM PST, Robert Haas <robertmhaas@gmail.com> wrote: >Maybe. But it looks to me like this patch is going to have >considerably more than its share of user-visible warts, and I'm not >very excited about that. I feel like what we ought to be doing is >keeping the index OID the same and changing the relfilenode to point >to a newly-created one, and I attribute our failure to make that >design work thus far to insufficiently aggressive hacking. We literally spent years and a lot of emails waiting for that to happen. Users now hack up solutions like this in userspace, which is obviously a bad solution. I agree that it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund <andres@anarazel.de> wrote: > I agree that'd it be nicer not to have this, but not having the feature at all is a lot worse than this wart. I, again, give that a firm "maybe". If the warts end up annoying 1% of the users who try to use this feature, then you're right. If they end up making a substantial percentage of people who try to use this feature give up on it, then we've added a bunch of complexity and future code maintenance for little real gain. I'm not ruling out the possibility that you're 100% correct, but I'm not nearly as convinced of that as you are. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Mar 4, 2017 at 9:34 AM, Andres Freund <andres@anarazel.de> wrote: > On March 4, 2017 1:16:56 AM PST, Robert Haas <robertmhaas@gmail.com> wrote: >>Maybe. But it looks to me like this patch is going to have >>considerably more than its share of user-visible warts, and I'm not >>very excited about that. I feel like what we ought to be doing is >>keeping the index OID the same and changing the relfilenode to point >>to a newly-created one, and I attribute our failure to make that >>design work thus far to insufficiently aggressive hacking. > > We literally spent years and a lot of emails waiting for that to happen. Users now hack up solutions like this in userspace,obviously a bad solution. > > I agree that'd it be nicer not to have this, but not having the feature at all is a lot worse than this wart. IMHO, REINDEX itself is implemented in a way that is conceptually pure, and yet quite user hostile. I tend to tell colleagues that ask about REINDEX something along the lines of: Just assume that REINDEX is going to block out even SELECT statements referencing the underlying table. It might not be that bad for you in practice, but the details are arcane such that it might as well be that simple most of the time. Even if you have time to listen to me explain it all, which you clearly don't, you're still probably not going to be able to apply what you've learned in a way that helps you. -- Peter Geoghegan
On 03/05/2017 07:56 PM, Robert Haas wrote: > On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund <andres@anarazel.de> wrote: >> I agree that'd it be nicer not to have this, but not having the feature at all is a lot worse than this wart. > > I, again, give that a firm "maybe". If the warts end up annoying 1% > of the users who try to use this feature, then you're right. If they > end up making a substantial percentage of people who try to use this > feature give up on it, then we've added a bunch of complexity and > future code maintenance for little real gain. I'm not ruling out the > possibility that you're 100% correct, but I'm not nearly as convinced > of that as you are. I agree that these warts are annoying, but I also think the limitations can be explained pretty easily to users (e.g. by explaining in the manual how REINDEX CONCURRENTLY creates a new index to replace the old one), and I think that is the important question when deciding whether a useful feature with warts should be merged or not. Does it make things substantially harder for the average user to grok? And I would argue that this feature is useful for quite many, based on my experience running a semi-large database. Index bloat happens, and without REINDEX CONCURRENTLY it can be really annoying to solve, especially for primary keys. Certainly more people have problems with index bloat than the number of people who store index oids in their database. Andreas
On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote: > And I would argue that his feature is useful for quite many, based on my > experience running a semi-large database. Index bloat happens and without > REINDEX CONCURRENTLY it can be really annoying to solve, especially for > primary keys. Certainly more people have problems with index bloat than the > number of people who store index oids in their database. Yeah, but that's not the only wart, I think. For example, I believe (haven't looked at this patch series in a while) that the patch takes a lock and later escalates the lock level. If so, that could lead to doing a lot of work to build the index and then getting killed by the deadlock detector. Also, if by any chance you think (or use any software that thinks) that OIDs for system objects are a stable identifier, this will be the first case where that ceases to be true. If the system is shut down or crashes or the session is killed, you'll be left with stray objects with names that you've never typed into the system. I'm sure you're going to say "don't worry, none of that is any big deal" and maybe you're right. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017-03-07 21:48:23 -0500, Robert Haas wrote: > On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote: > > And I would argue that his feature is useful for quite many, based on my > > experience running a semi-large database. Index bloat happens and without > > REINDEX CONCURRENTLY it can be really annoying to solve, especially for > > primary keys. Certainly more people have problems with index bloat than the > > number of people who store index oids in their database. > > Yeah, but that's not the only wart, I think. I don't really see any other warts that don't correspond to CREATE/DROP INDEX CONCURRENTLY. > For example, I believe (haven't looked at this patch series in a > while) that the patch takes a lock and later escalates the lock level. It shouldn't* - that was required precisely because we had to switch the relfilenodes when the oid stayed the same. Otherwise in-progress index lookups could end up using the wrong relfilenodes and/or switch in the middle of a lookup. * excepting the exclusive lock DROP INDEX CONCURRENTLY style dropping uses after marking the index as dead - but that shouldn't be much of a concern? > Also, if by any chance you think (or use any software that thinks) > that OIDs for system objects are a stable identifier, this will be the > first case where that ceases to be true. Can you come up with a halfway realistic scenario why an index oid, not a table, constraint, sequence oid, would be relied upon? > If the system is shut down or crashes or the session is killed, you'll > be left with stray objects with names that you've never typed into the > system. Given how relatively few complaints we have about CIC's possibility of ending up with invalid indexes - not that there are none - and its widespread usage, I'm not too concerned about this. Greetings, Andres Freund
On 03/08/2017 03:48 AM, Robert Haas wrote: > On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote: >> And I would argue that his feature is useful for quite many, based on my >> experience running a semi-large database. Index bloat happens and without >> REINDEX CONCURRENTLY it can be really annoying to solve, especially for >> primary keys. Certainly more people have problems with index bloat than the >> number of people who store index oids in their database. > > Yeah, but that's not the only wart, I think. The only two potential issues I see with the patch are: 1) That the index oid changes visibly to external users. 2) That the code for moving the dependencies will need to be updated when adding new things which refer to an index oid. Given how useful I find REINDEX CONCURRENTLY I think these warts are worth it, since the impact is quite limited. I am of course biased, since if I did not believe this I would not pursue this solution in the first place. > For example, I believe > (haven't looked at this patch series in a while) that the patch takes > a lock and later escalates the lock level. If so, that could lead to > doing a lot of work to build the index and then getting killed by the > deadlock detector. This version of the patch no longer does that. For my use case escalating the lock would make this patch much less interesting. The highest lock level taken is the same one as the initial one (SHARE UPDATE EXCLUSIVE). The current patch does, on a high level (very simplified), this: 1. CREATE INDEX CONCURRENTLY ind_new; 2. Atomically move all dependencies from ind to ind_new, rename ind to ind_old, and rename ind_new to ind; 3. DROP INDEX CONCURRENTLY ind_old. The actual implementation is a bit more complicated in reality, but no part escalates the lock level over what would be required by the steps for creating and dropping indexes concurrently. > Also, if by any chance you think (or use any > software that thinks) that OIDs for system objects are a stable > identifier, this will be the first case where that ceases to be true. > If the system is shut down or crashes or the session is killed, you'll > be left with stray objects with names that you've never typed into the > system. I'm sure you're going to say "don't worry, none of that is > any big deal" and maybe you're right. Hm, I cannot think of any real life scenario where this will be an issue based on my personal experience with PostgreSQL, but if you can think of one please provide it. I will try to ponder some more on this myself. Andreas
On 3/8/17 9:34 AM, Andreas Karlsson wrote: >> Also, if by any chance you think (or use any >> software that thinks) that OIDs for system objects are a stable >> identifier, this will be the first case where that ceases to be true. >> If the system is shut down or crashes or the session is killed, you'll >> be left with stray objects with names that you've never typed into the >> system. I'm sure you're going to say "don't worry, none of that is >> any big deal" and maybe you're right. > > Hm, I cannot think of any real life scenario where this will be an issue > based on my personal experience with PostgreSQL, but if you can think of > one please provide it. I will try to ponder some more on this myself. The case I currently have is to allow tracking database objects similar to (but not the same as) how we track the objects that belong to an extension[1]. That currently depends on event triggers to keep names updated if they're changed, as well as making use of the reg* types. If an event trigger fired as part of the index rename (essentially treating it like an ALTER INDEX) then I should be able to work around that. The ultimate reason for doing this is to provide something similar to extensions (create a bunch of database objects that are all bound together), but also similar to classes in OO languages (so you can have multiple instances).[2] Admittedly, this is pretty off the beaten path and I certainly wouldn't hold up the patch because of it. I am hoping that it'd be fairly easy to fire an event trigger as if someone had just renamed the index. 1: https://github.com/decibel/object_reference 2: https://github.com/decibel/pg_classy -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
On Wed, Mar 8, 2017 at 4:12 PM, Andres Freund <andres@anarazel.de> wrote: > Can you come up with an halfway realistic scenario why an index oid, not > a table, constraint, sequence oid, would be relied upon? Is there an implication for SIREAD locks? Predicate locks on index pages include the index OID in the tag. -- Thomas Munro http://www.enterprisedb.com
On Fri, Mar 10, 2017 at 9:36 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Wed, Mar 8, 2017 at 4:12 PM, Andres Freund <andres@anarazel.de> wrote: >> Can you come up with an halfway realistic scenario why an index oid, not >> a table, constraint, sequence oid, would be relied upon? > > Is there an implication for SIREAD locks? Predicate locks on index > pages include the index OID in the tag. Ah, yes, but that is covered by a call to TransferPredicateLocksToHeapRelation() in index_concurrent_set_dead(). -- Thomas Munro http://www.enterprisedb.com
On 03/02/2017 03:10 AM, Michael Paquier wrote: > On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andreas@proxel.se> wrote: > + /* > + * Copy contraint flags for old index. This is safe because the old index > + * guaranteed uniquness. > + */ > + newIndexForm->indisprimary = oldIndexForm->indisprimary; > + oldIndexForm->indisprimary = false; > + newIndexForm->indisexclusion = oldIndexForm->indisexclusion; > + oldIndexForm->indisexclusion = false; > [...] > + deleteDependencyRecordsForClass(RelationRelationId, newIndexOid, > + RelationRelationId, DEPENDENCY_AUTO); > + deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid, > + ConstraintRelationId, > DEPENDENCY_INTERNAL); > + > + // TODO: pg_depend for old index? Spotted one of my TODO comments there so I have attached a patch where I have cleaned up that function. I also fixed the code to properly support triggers. > There is a lot of mumbo-jumbo in the patch to create the exact same > index definition as the original one being reindexed, and that's a > huge maintenance burden for the future. You can blame me for that in > the current patch. I am wondering if it would not just be better to > generate a CREATE INDEX query string and then use the SPI to create > the index, and also do the following extensions at SQL level: > - Add a sort of WITH NO DATA clause where the index is created, so the > index is created empty, and is marked invalid and not ready. > - Extend pg_get_indexdef_string() with an optional parameter to > enforce the index name to something else, most likely it should be > extended with the WITH NO DATA/INVALID clause, which should just be a > storage parameter by the way. > By doing something like that what the REINDEX CONCURRENTLY code path > should just be careful about is that it chooses an index name that > avoids any conflicts. 
Hm, I am not sure how much that would help since a lot of the mumbo-jumbo is by necessity in the step where we move the constraints over from the old index to the new. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 03/13/2017 03:11 AM, Andreas Karlsson wrote: > I also fixed the the code to properly support triggers. And by "support triggers" I actually meant fixing the support for moving the foreign keys to the new index. Andreas
Hi, I had a look at this. On Mon, Mar 13, 2017 at 03:11:50AM +0100, Andreas Karlsson wrote: > Spotted one of my TODO comments there so I have attached a patch where I > have cleaned up that function. I also fixed the the code to properly support > triggers. The patch applies with quite a few offsets on top of current (2fd8685) master; I have not verified that those are all ok. Regression tests pass, as do the included isolation tests. I hope that Michael will post a full review as he worked on the code extensively, but here are some code comments, mostly on the comments (note that I'm not a native speaker, so I might be wrong on some of them as well): > diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml > index 3908ade37b..3449c0af73 100644 > --- a/doc/src/sgml/ref/reindex.sgml > +++ b/doc/src/sgml/ref/reindex.sgml > @@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea > An index build with the <literal>CONCURRENTLY</> option failed, leaving > an <quote>invalid</> index. Such indexes are useless but it can be > convenient to use <command>REINDEX</> to rebuild them. Note that > - <command>REINDEX</> will not perform a concurrent build. To build the > - index without interfering with production you should drop the index and > - reissue the <command>CREATE INDEX CONCURRENTLY</> command. > + <command>REINDEX</> will perform a concurrent build if <literal> > + CONCURRENTLY</> is specified. To build the index without interfering > + with production you should drop the index and reissue either the > + <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</> > + command. Indexes of toast relations can be rebuilt with <command>REINDEX > + CONCURRENTLY</>. I think the "To build the index[...]" part should be rephrased, the current diff makes it sound like you should drop the index first even if you reindex concurrently. 
What about "Note that <command>REINDEX</> will only perform a concurrent build if <literal> CONCURRENTLY</> is specified"? Anyway, this part is only about reindexing invalid indexes, so mentioning that REINDEX does not build concurrently, or the create-index-concurrently-then-rename workaround, only for this case is a bit weird; but this is a pre-existing condition. > diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c > index 8d42a347ea..c40ac0b154 100644 > --- a/src/backend/catalog/index.c > +++ b/src/backend/catalog/index.c > /* > + * index_concurrent_create_copy > + * > + * Create a concurrent index based on the definition of the one provided by > + * caller that will be used for concurrent operations. The index is inserted > + * into catalogs and needs to be built later on. This is called during > + * concurrent reindex processing. The heap relation on which is based the index > + * needs to be closed by the caller. > + */ That should be "The heap relation on which the index is based ..." I think. > + /* > + * We have to re-build the IndexInfo struct, since it was lost in > + * commit of transaction where this concurrent index was created > + * at the catalog level. > + */ > + indexInfo = BuildIndexInfo(indexRelation); Looks like indexInfo starts with lowercase, but the comment above has upper case `IndexInfo'. > +/* > + * index_concurrent_swap > + * > + * Swap name, dependencies and constraints of the old index over to the new > + * index, while marking the old index as invalid and the new as valid. > + */ The `while' looks slightly odd to me, ISTM this is just another operation this function performs, whereas "while" makes it sound like the marking happens concurrently; so maybe ". Also, mark the old index as invalid[...]"? > +void > +index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName) > +{ [...] > + /* > + * Copy contraint flags for old index. This is safe because the old index > + * guaranteed uniquness. 
> + */ "uniqueness". > + /* Mark old index as valid and new is invalid as index_set_state_flags */ "new as invalid". Also, this comment style is different to this one: > + /* > + * Move contstraints and triggers over to the new index > + */ I guess the latter could be changed to a one-line comment as the former, but maybe there is a deeper sense (locality of comment?) in this. > + /* make a modifiable copy */ I think comments should start capitalized? > +/* > + * index_concurrent_drop > + * > + * Drop a single index concurrently as the last step of an index concurrent > + * process. Deletion is done through performDeletion or dependencies of the > + * index would not get dropped. At this point all the indexes are already > + * considered as invalid and dead so they can be dropped without using any > + * concurrent options as it is sure that they will not interact with other > + * server sessions. > + */ I'd write "as it is certain" instead of "as it is sure", but I can't explain why. Maybe persons are sure, but situations are certain? > @@ -1483,41 +1939,13 @@ index_drop(Oid indexId, bool concurrent) > * Note: the reason we use actual lock acquisition here, rather than > * just checking the ProcArray and sleeping, is that deadlock is > * possible if one of the transactions in question is blocked trying > - * to acquire an exclusive lock on our table. The lock code will > + * to acquire an exclusive lock on our table. The lock code will Gratuitous whitespace change, seeing that other comments added in this patch have the extra whitespace after full stops as well. 
> diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c > index d0ee851215..9dce6420df 100644 > --- a/src/backend/catalog/pg_depend.c > +++ b/src/backend/catalog/pg_depend.c > @@ -377,6 +377,94 @@ changeDependencyFor(Oid classId, Oid objectId, > } > > /* > + * Adjust all dependency records to point to a different object of the same type > + * > + * refClassId/oldRefObjectId specify the old referenced object. > + * newRefObjectId is the new referenced object (must be of class refClassId). > + * > + * Returns the number of records updated. > + */ > +long > +changeDependencyForAll(Oid refClassId, Oid oldRefObjectId, > + Oid newRefObjectId) This one is mostly a copy-paste of changeDependencyFor(), did you consider refactoring that into handling the All case as well? > @@ -722,3 +810,58 @@ get_index_constraint(Oid indexId) > > return constraintId; > } > + > +/* > + * get_index_ref_constraints > + * Given the OID of an index, return the OID of all foreign key > + * constraints which reference the index. > + */ > +List * > +get_index_ref_constraints(Oid indexId) Same for this one, but there's two similar functions (get_constraint_index() and get_index_constraint()) already so I guess it's fine? > diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c > index 72bb06c760..7a51c25d98 100644 > --- a/src/backend/commands/indexcmds.c > +++ b/src/backend/commands/indexcmds.c > @@ -283,6 +285,87 @@ CheckIndexCompatible(Oid oldId, > return ret; > } > > + > +/* > + * WaitForOlderSnapshots > + * > + * Wait for transactions that might have older snapshot than the given xmin "an older snapshot" maybe? > + * limit, because it might not contain tuples deleted just before it has > + * been taken. Obtain a list of VXIDs of such transactions, and wait for them > + * individually. > + * > + * We can exclude any running transactions that have xmin > the xmin given; > + * their oldest snapshot must be newer than our xmin limit. 
> + * We can also exclude any transactions that have xmin = zero, since they Probably an empty line between the two paragraphs is in order, or just keep it one paragraph. > + * evidently have no live snapshot at all (and any one they might be in > + * process of taking is certainly newer than ours). Please rephrase what's in the paranthesis, I am not quite sure what it means, maybe "(and any they are currently taking is..."? > + * We can also exclude autovacuum processes and processes running manual > + * lazy VACUUMs, because they won't be fazed by missing index entries > + * either. (Manual ANALYZEs, however, can't be excluded because they > + * might be within transactions that are going to do arbitrary operations > + * later.) Weird punctuation, I think the full stop before the paranthesis should be put after it, the full stop at the end of the paranthesis be dropped and the beginning of the parenthesis should not be capitalized. Oh hrm, I now see that the patch just moves the comments, so maybe don't bother. > @@ -1739,7 +1735,7 @@ ChooseIndexColumnNames(List *indexElems) > * Recreate a specific index. > */ > Oid > -ReindexIndex(RangeVar *indexRelation, int options) > +ReindexIndex(RangeVar *indexRelation, int options, bool concurrent) > { > Oid indOid; > Oid heapOid = InvalidOid; > @@ -1751,8 +1747,9 @@ ReindexIndex(RangeVar *indexRelation, int options) > * obtain lock on table first, to avoid deadlock hazard. The lock level > * used here must match the index lock obtained in reindex_index(). > */ > - indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock, > - false, false, > + indOid = RangeVarGetRelidExtended(indexRelation, > + concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock, > + concurrent, concurrent, I find the way the bool is passed for the third and fourth argument a bit weird, but ok. 
I would still suggest to explain in the comment above why the two other arguments to RangeVarGetRelidExtended() (`missing_ok' and `nowait') are dependent on concurrent reindexing; it's not super-obvious to me. > > /* The lock level used here should match reindex_relation(). */ > - heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false, > + heapOid = RangeVarGetRelidExtended(relation, > + concurrent ? ShareUpdateExclusiveLock : ShareLock, > + concurrent, concurrent, > RangeVarCallbackOwnsTable, NULL); Same here. > + if (concurrent) > + result = ReindexRelationConcurrently(heapOid, options); > + else > + result = reindex_relation(heapOid, > + REINDEX_REL_PROCESS_TOAST | > + REINDEX_REL_CHECK_CONSTRAINTS, > + options); This somewhat begs the question why those two cases are so different (one is implemented in src/backend/catalog/index.c, the other in src/backend/commands/indexcmds.c, and their naming scheme is different). I guess that's ok, but it might also be a hint that ReindexRelationConcurrently() is implemented at the wrong level. > @@ -2011,3 +2040,597 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind, > > MemoryContextDelete(private_context); > } > + > + > +/* > + * ReindexRelationConcurrently > + * > + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be > + * either an index or a table. If a table is specified, each phase is processed > + * one by done for each table's indexes as well as its dependent toast indexes "one by one". > +static bool > +ReindexRelationConcurrently(Oid relationOid, int options) > +{ [...] > + /* > + * Extract the list of indexes that are going to be rebuilt based on the > + * list of relation Oids given by caller. For each element in given list, > + * If the relkind of given relation Oid is a table, all its valid indexes capitalization, "If" is in the middle of a sentence. > + * will be rebuilt, including its associated toast table indexes. 
If > + * relkind is an index, this index itself will be rebuilt. Maybe mention here that system catalogs and shared relations cannot be reindexed concurrently? > + /* Create concurrent index based on given index */ > + concurrentOid = index_concurrent_create_copy(indexParentRel, > + indOid, > + concurrentName); AIUI, this creates/copies some meta-data for the concurrent index, but does not yet create the index itself, right? If so, the comment is somewhat misleading. > + /* > + * Now open the relation of concurrent index, a lock is also needed on > + * it > + */ Multi-line comments should end with a full-stop I think? > + /* > + * Phase 3 of REINDEX CONCURRENTLY [...] > + /* > + * This concurrent index is now valid as they contain all the tuples > + * necessary. However, it might not have taken into account deleted tuples "as they contain" should be "as it contains" I guess, since the rest of the comment is talking about a singular index. > + /* > + * Phase 4 of REINDEX CONCURRENTLY > + * > + * Now that the concurrent indexes have been validated, it is necessary > + * to swap each concurrent index with its corresponding old index. > + * > + * We mark the new indexes as valid and the old indexes dead at the same > + * time to make sure we get only get constraint violations from the "we only get" > + /* > + * Phase 5 of REINDEX CONCURRENTLY > + * > + * The indexes hold now a fresh relfilenode of their respective concurrent I'd write "now hold" instead of "hold now". > + * entries indexes. It is time to mark the now-useless concurrent entries > + * as not ready so as they can be safely discarded from write operations > + * that may occur on them. So the "concurrent entries" is the original index, as that one should be now-useless? If so, that's a bit confusing terminology to me and it was called "old index" in the previous phases. 
> + /* > + * Phase 6 of REINDEX CONCURRENTLY > + * > + * Drop the concurrent indexes, with actually the same code path as Again, I'd have written "Drop the old indexes". Also, "with actually the same" sounds a bit awkward, maybe "actually using the same" would be better. > + /* > + * Last thing to do is to release the session-level lock on the parent table > + * and the indexes of table. "and on the indexes of the table"? Or what exactly is meant with the last bit? > diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c > index 20b5273405..c76dacc44a 100644 > --- a/src/backend/tcop/utility.c > +++ b/src/backend/tcop/utility.c > @@ -773,16 +773,20 @@ standard_ProcessUtility(PlannedStmt *pstmt, > - ReindexIndex(stmt->relation, stmt->options); > + ReindexIndex(stmt->relation, stmt->options, stmt->concurrent); > - ReindexTable(stmt->relation, stmt->options); > + ReindexTable(stmt->relation, stmt->options, stmt->concurrent); > - ReindexMultipleTables(stmt->name, stmt->kind, stmt->options); > + ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent); Those lines are now in excess of 80 chars. Cheers, Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.banck@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
On Mon, Mar 13, 2017 at 11:11 AM, Andreas Karlsson <andreas@proxel.se> wrote: > On 03/02/2017 03:10 AM, Michael Paquier wrote: >> There is a lot of mumbo-jumbo in the patch to create the exact same >> index definition as the original one being reindexed, and that's a >> huge maintenance burden for the future. You can blame me for that in >> the current patch. I am wondering if it would not just be better to >> generate a CREATE INDEX query string and then use the SPI to create >> the index, and also do the following extensions at SQL level: >> - Add a sort of WITH NO DATA clause where the index is created, so the >> index is created empty, and is marked invalid and not ready. >> - Extend pg_get_indexdef_string() with an optional parameter to >> enforce the index name to something else, most likely it should be >> extended with the WITH NO DATA/INVALID clause, which should just be a >> storage parameter by the way. >> By doing something like that what the REINDEX CONCURRENTLY code path >> should just be careful about is that it chooses an index name that >> avoids any conflicts. > > > Hm, I am not sure how much that would help since a lot of the mumb-jumbo is > by necessity in the step where we move the constraints over from the old > index to the new. Well, the idea is really to get rid of that as there are already facilities of this kind for CREATE TABLE LIKE in the parser and ALTER TABLE when rewriting a relation. It is not really attractive to have a 3rd method in the backend code to do the same kind of things, for a method that is even harder to maintain than the other two. -- Michael
On Thu, Mar 30, 2017 at 5:13 AM, Michael Banck <michael.banck@credativ.de> wrote: > On Mon, Mar 13, 2017 at 03:11:50AM +0100, Andreas Karlsson wrote: >> Spotted one of my TODO comments there so I have attached a patch where I >> have cleaned up that function. I also fixed the the code to properly support >> triggers. > > I hope that Michael will post a full review as he worked on the code > extensively, but here are some some code comments, mostly on the > comments (note that I'm not a native speaker, so I might be wrong on > some of them as well): Thanks, Michael. I have done a pass on it. > [review comments] Here are more comments: + <para> + <command>REINDEX SYSTEM</command> does not support + <command>CONCURRENTLY</command>. + </para> It would be nice to mention that REINDEX SCHEMA pg_catalog is not supported, or just state that concurrent reindexing of any system catalog index is not supported. When running REINDEX SCHEMA CONCURRENTLY public on the regression database I am bumping into a bunch of these warnings: WARNING: 01000: snapshot 0x7fa5e6000040 still active LOCATION: AtEOXact_Snapshot, snapmgr.c:1123 WARNING: 01000: snapshot 0x7fa5e6000040 still active LOCATION: AtEOXact_Snapshot, snapmgr.c:1123 + * Reset attcacheoff for a TupleDesc + */ +void +ResetTupleDescCache(TupleDesc tupdesc) +{ + int i; + + for (i = 0; i < tupdesc->natts; i++) + tupdesc->attrs[i]->attcacheoff = -1; +} I think that it would be better to merge that with TupleDescInitEntry to be sure that the initialization of a TupleDesc's attribute goes through only one code path. + /* + * Copy contraint flags for old index. This is safe because the old index + * guaranteed uniquness. + */ s/uniquness/uniqueness/ and s/contraint/constraint/. + /* + * Move contstraints and triggers over to the new index + */ s/contstraints/constraints/. 
-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable> +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> I am taking the war path with such a sentence... But what about adding CONCURRENTLY to the list of options in parentheses instead? With this patch, we are reaching the 9th boolean argument for create_index(). Any opinions about refactoring that into a set of bitwise flags? Fairly unrelated to this patch. + /* + * Move all dependencies on the old index to the new + */ Sentence unfinished. It has been years since I looked at this code (I wrote the majority of it), but here is what I would explore if I were to work on that for the next release cycle: - Explore the use of SQL-level interfaces to mark an index as inactive at creation. - Remove the work done in changeDependencyForAll, and replace it by something similar to what tablecmds.c does. There is, I think, some room for refactoring here if that's not done with CREATE TABLE LIKE. This requires the same work of creating, renaming and dropping the old triggers and constraints. - Do a per-index rebuild and not a per-relation rebuild for concurrent indexing. Doing a per-relation reindex has the disadvantage that many objects need to be created at the same time, and in the case of REINDEX CONCURRENTLY the time of the operation is not what matters, it is how intrusive the operation is. Relations with many indexes would also result in many object locks taken at each step. The first and second points require a bit of thought for sure, but in the long term that would pay in maintenance if we don't reinvent the wheel, or at least try not to. -- Michael
Thanks for the feedback. I will look at it when I get the time. On 03/31/2017 08:27 AM, Michael Paquier wrote: > - Do a per-index rebuild and not a per-relation rebuild for concurrent > indexing. Doing a per-relation reindex has the disadvantage that many > objects need to be created at the same time, and in the case of > REINDEX CONCURRENTLY time of the operation is not what matters, it is > how intrusive the operation is. Relations with many indexes would also > result in much object locks taken at each step. I am personally worried about the amount of time spent waiting for long running transactions if you reindex per index rather than per relation. Because while you wait on long running transactions for one index, nothing prevents new long transactions from starting, which we will have to wait for while reindexing the next index. If your database has many long running transactions more time will be spent waiting than working. Is the number of locks really a big deal compared to other costs involved here? REINDEX does a lot of expensive things like starting transactions, taking snapshots, scanning large tables, building a new index, etc. The trade-off I see is between temporary disk usage and time spent waiting for transactions, and doing the REINDEX per relation allows for flexibility since people can still explicitly reindex per index if they want to. Andreas
On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andreas@proxel.se> wrote: > Thanks for the feedback. I will look at it when I get the time. > > On 03/31/2017 08:27 AM, Michael Paquier wrote: >> >> - Do a per-index rebuild and not a per-relation rebuild for concurrent >> indexing. Doing a per-relation reindex has the disadvantage that many >> objects need to be created at the same time, and in the case of >> REINDEX CONCURRENTLY time of the operation is not what matters, it is >> how intrusive the operation is. Relations with many indexes would also >> result in much object locks taken at each step. > > > I am personally worried about the amount time spent waiting for long running > transactions if you reindex per index rather than per relation. Because when > you for one index wait on long running transactions nothing prevents new > long transaction from starting, which we will have to wait for while > reindexing the next index. If your database has many long running > transactions more time will be spent waiting than the time spent working. Yup, I am not saying that one approach or the other is bad; both are worth considering. It's a trade-off between waiting and potential manual cleanup in the event of a failure. > and doing the REINDEX per relation allows for flexibility > since people can still explicitly reindex per index of they want to. You have a point here. I am marking this patch as returned with feedback; this won't get in PG10. If I am freed from the SCRAM-related open items I'll try to give another shot at implementing this feature before the first CF of PG11. -- Michael
On 04/03/2017 07:57 AM, Michael Paquier wrote: > On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andreas@proxel.se> wrote: >> On 03/31/2017 08:27 AM, Michael Paquier wrote: >>> - Do a per-index rebuild and not a per-relation rebuild for concurrent >>> indexing. Doing a per-relation reindex has the disadvantage that many >>> objects need to be created at the same time, and in the case of >>> REINDEX CONCURRENTLY time of the operation is not what matters, it is >>> how intrusive the operation is. Relations with many indexes would also >>> result in much object locks taken at each step. >> >> I am personally worried about the amount time spent waiting for long running >> transactions if you reindex per index rather than per relation. Because when >> you for one index wait on long running transactions nothing prevents new >> long transaction from starting, which we will have to wait for while >> reindexing the next index. If your database has many long running >> transactions more time will be spent waiting than the time spent working. > > Yup, I am not saying that one approach or the other are bad, both are > worth considering. That's a deal between waiting and manual potential > cleanup in the event of a failure. Agreed, and which is worse probably depends heavily on your schema and workload. > I am marking this patch as returned with feedback, this won't get in > PG10. If I am freed from the SCRAM-related open items I'll try to give > another shot at implementing this feature before the first CF of PG11. Thanks! I also think I will have time to work on this before the first CF. Andreas
I have attached a new, rebased version of the patch with most of Banck's and some of your feedback incorporated. Thanks for the good feedback! On 03/31/2017 08:27 AM, Michael Paquier wrote: > When running REINDEX SCHEMA CONCURRENTLY public on the regression > database I am bumping into a bunch of these warnings: > WARNING: 01000: snapshot 0x7fa5e6000040 still active > LOCATION: AtEOXact_Snapshot, snapmgr.c:1123 > WARNING: 01000: snapshot 0x7fa5e6000040 still active > LOCATION: AtEOXact_Snapshot, snapmgr.c:1123 I failed to reproduce this. Do you have a reproducible test case? > + * Reset attcacheoff for a TupleDesc > + */ > +void > +ResetTupleDescCache(TupleDesc tupdesc) > +{ > + int i; > + > + for (i = 0; i < tupdesc->natts; i++) > + tupdesc->attrs[i]->attcacheoff = -1; > +} > I think that it would be better to merge that with TupleDescInitEntry > to be sure that the initialization of a TupleDesc's attribute goes > through only one code path. Sorry, but I am not sure I understand your suggestion. I do not like the ResetTupleDescCache function, so all suggestions are welcome. > -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM > } <replaceable class="PARAMETER">name</replaceable> > +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM > } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> > I am taking the war path with such a sentence... But what about adding > CONCURRENTLY to the list of options in parenthesis instead? I have thought some about this myself and I do not care strongly either way. > - Explore the use of SQL-level interfaces to mark an index as inactive > at creation. > - Remove work done in changeDependencyForAll, and replace it by > something similar to what tablecmds.c does. There is I think here some > place for refactoring if that's not with CREATE TABLE LIKE. This > requires to the same work of creation, renaming and drop of the old > triggers and constraints.
I am no fan of the current code duplication and how fragile it is, but I think these cases are sufficiently different to prevent meaningful code reuse. But it could just be me who is unfamiliar with that part of the code. > - Do a per-index rebuild and not a per-relation rebuild for concurrent > indexing. Doing a per-relation reindex has the disadvantage that many > objects need to be created at the same time, and in the case of > REINDEX CONCURRENTLY time of the operation is not what matters, it is > how intrusive the operation is. Relations with many indexes would also > result in much object locks taken at each step. I am still leaning towards my current trade-off, since waiting for all queries to stop using an index can take a lot of time, and if you only have to do that once per table it would be a huge benefit under some workloads; you can still reindex each index separately if you need to. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Here is a rebased version of the patch. Andreas
Attachment
On Wed, Nov 1, 2017 at 1:20 PM, Andreas Karlsson <andreas@proxel.se> wrote: > Here is a rebased version of the patch. The patch does not apply, and needs a rebase. I am moving it to next CF with waiting on author as status. -- Michael
Andreas Karlsson wrote: > Here is a rebased version of the patch. Is anybody working on rebasing this patch? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Michael Paquier wrote: > Well, the idea is really to get rid of that as there are already > facilities of this kind for CREATE TABLE LIKE in the parser and ALTER > TABLE when rewriting a relation. It is not really attractive to have a > 3rd method in the backend code to do the same kind of things, for a > method that is even harder to maintain than the other two. I dislike the backend code that uses SPI and manufactures nodes to re-create indexes. IMO we should get rid of it. Let's not call it "facilities", but rather "grotty hacks". I think before suggesting to add even more code to perpetuate that idea, we should think about going in the other direction. I have not tried to write the code, but it should be possible to have an intermediate function called by ProcessUtility* which transforms the IndexStmt into an internal representation, then calls DefineIndex. This way, all this code that wants to create indexes for backend-internal reasons can create the internal representation directly then call DefineIndex, instead of the horrible hacks they use today creating parse nodes by hand. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Michael Paquier wrote: >> Well, the idea is really to get rid of that as there are already >> facilities of this kind for CREATE TABLE LIKE in the parser and ALTER >> TABLE when rewriting a relation. It is not really attractive to have a >> 3rd method in the backend code to do the same kind of things, for a >> method that is even harder to maintain than the other two. > > I dislike the backend code that uses SPI and manufacturing node to > re-creates indexes. IMO we should get rid of it. Let's not call it > "facilities", but rather "grotty hacks". Aha. You are making my day here ;) > I think before suggesting to add even more code to perpetuate that idea, > we should think about going in the other direction. I have not tried to > write the code, but it should be possible to have an intermediate > function called by ProcessUtility* which transforms the IndexStmt into > an internal representation, then calls DefineIndex. This way, all this > code that wants to create indexes for backend-internal reasons can > create the internal representation directly then call DefineIndex, > instead of the horrible hacks they use today creating parse nodes by > hand. Yeah, that would be likely possible. I am not volunteering for that in the short term though.. -- Michael
On 21 December 2017 at 11:31, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> Michael Paquier wrote:
>> Well, the idea is really to get rid of that as there are already
>> facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
>> TABLE when rewriting a relation. It is not really attractive to have a
>> 3rd method in the backend code to do the same kind of things, for a
>> method that is even harder to maintain than the other two.
>
> I dislike the backend code that uses SPI and manufacturing node to
> re-creates indexes. IMO we should get rid of it. Let's not call it
> "facilities", but rather "grotty hacks".
Aha. You are making my day here ;)
> I think before suggesting to add even more code to perpetuate that idea,
> we should think about going in the other direction. I have not tried to
> write the code, but it should be possible to have an intermediate
> function called by ProcessUtility* which transforms the IndexStmt into
> an internal representation, then calls DefineIndex. This way, all this
> code that wants to create indexes for backend-internal reasons can
> create the internal representation directly then call DefineIndex,
> instead of the horrible hacks they use today creating parse nodes by
> hand.
Yeah, that would be likely possible. I am not volunteering for that in
the short term though..
It sounds like that'd make some of ALTER TABLE a bit less ... upsetting ... too.
Craig, Michael, all, * Craig Ringer (craig@2ndquadrant.com) wrote: > On 21 December 2017 at 11:31, Michael Paquier <michael.paquier@gmail.com> > wrote: > > > On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera > > <alvherre@alvh.no-ip.org> wrote: > > > Michael Paquier wrote: > > >> Well, the idea is really to get rid of that as there are already > > >> facilities of this kind for CREATE TABLE LIKE in the parser and ALTER > > >> TABLE when rewriting a relation. It is not really attractive to have a > > >> 3rd method in the backend code to do the same kind of things, for a > > >> method that is even harder to maintain than the other two. > > > > > > I dislike the backend code that uses SPI and manufacturing node to > > > re-creates indexes. IMO we should get rid of it. Let's not call it > > > "facilities", but rather "grotty hacks". > > > > Aha. You are making my day here ;) > > > > > I think before suggesting to add even more code to perpetuate that idea, > > > we should think about going in the other direction. I have not tried to > > > write the code, but it should be possible to have an intermediate > > > function called by ProcessUtility* which transforms the IndexStmt into > > > an internal representation, then calls DefineIndex. This way, all this > > > code that wants to create indexes for backend-internal reasons can > > > create the internal representation directly then call DefineIndex, > > > instead of the horrible hacks they use today creating parse nodes by > > > hand. > > > > Yeah, that would be likely possible. I am not volunteering for that in > > the short term though.. > > It sounds like that'd make some of ALTER TABLE a bit less ... upsetting ... > too. I'm a big fan of this patch but it doesn't appear to have made any progress in quite a while. Is there any chance we can get an updated patch and perhaps get another review before the end of this CF...? 
Refactoring this to have an internal representation between ProcessUtility() and DefineIndex doesn't sound too terrible and if it means the ability to reuse that, seems like it'd be awful nice to do so.. Thanks! Stephen
Attachment
On 01/26/2018 03:28 AM, Stephen Frost wrote: > I'm a big fan of this patch but it doesn't appear to have made any > progress in quite a while. Is there any chance we can get an updated > patch and perhaps get another review before the end of this CF...? Sorry, as you may have guessed I do not have the time right now to get this updated during this commitfest. > Refactoring this to have an internal representation between > ProcessUtility() and DefineIndex doesn't sound too terrible and if it > means the ability to reuse that, seems like it'd be awful nice to do > so.. I too like the concept, but have not had the time to look into it. Andreas
On Wed, Jan 31, 2018 at 01:48:00AM +0100, Andreas Karlsson wrote: > I too like the concept, but have not had the time to look into it. This may happen at some point; for now I am marking the patch as returned with feedback. -- Michael
Attachment
Here is a revival of this patch. This is Andreas Karlsson's v4 patch (2017-11-01) with some updates for conflicts and changed APIs. AFAICT from the discussions, there were no more conceptual concerns with this approach. Recall that with this patch REINDEX CONCURRENTLY creates a new index (with a new OID) and then switches the names and dependencies. I have done a review of this patch and it looks pretty solid to me. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
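For context, the name-and-dependency swap described above is conceptually close to what can already be done by hand for a plain index. A simplified sketch follows; the index and table names are illustrative only, and this manual recipe does not cover constraint-backing indexes or dependency transfer, which is precisely what the patch automates:

```sql
-- Manual "concurrent reindex" of a plain index test_i_idx on test (i):
CREATE INDEX CONCURRENTLY test_i_idx_new ON test (i);  -- build the replacement without blocking writes
DROP INDEX CONCURRENTLY test_i_idx;                    -- remove the old index
ALTER INDEX test_i_idx_new RENAME TO test_i_idx;       -- reclaim the original name
```

The RENAME at the end briefly takes an exclusive lock on the index, which is one of the details REINDEX CONCURRENTLY has to manage internally.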
Hello Thank you for working on this patch! I performed some tests and I think the behavior with partitioned tables is slightly inconsistent. postgres=# reindex table measurement; WARNING: REINDEX of partitioned tables is not yet implemented, skipping "measurement" NOTICE: table "measurement" has no indexes REINDEX postgres=# reindex table CONCURRENTLY measurement; ERROR: cannot reindex concurrently this type of relation Maybe we need to report a warning and skip partitioned tables, similar to plain REINDEX? This makes more sense for "reindex database" or "reindex schema": as far as I can see, concurrent reindex will stop working after the first partitioned table in the list. regards, Sergei
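For reference, a minimal setup that should reproduce the inconsistency above looks like this (only the table names come from the report; the partition layout is an assumption):

```sql
-- Hypothetical schema matching the reproducer above
CREATE TABLE measurement (logdate date NOT NULL, peaktemp int)
    PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2006m02 PARTITION OF measurement
    FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');

REINDEX TABLE measurement;               -- warns and skips, per the report
REINDEX TABLE CONCURRENTLY measurement;  -- errors out, per the report
```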
On 07/12/2018 17:40, Sergei Kornilov wrote: > I perform some tests and think behavior with partition tables is slightly inconsistent. > > postgres=# reindex table measurement; > WARNING: REINDEX of partitioned tables is not yet implemented, skipping "measurement" > NOTICE: table "measurement" has no indexes > REINDEX > postgres=# reindex table CONCURRENTLY measurement; > ERROR: cannot reindex concurrently this type of relation > > Maybe we need report warning and skip partitioned tables similar to plain reindex? OK, that should be easy to fix. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello I reviewed the code and documentation and I have a few notes. Did you register this patch in the CF app? I found one error in phase 4. Simple reproducer: > create table test (i int); > create index this_is_very_large_exactly_maxnamelen_index_name_wink_wink_wink on test (i); > create index this_is_very_large_exactly_maxnamelen_index_name_wink_winkccold on test (i); > reindex table CONCURRENTLY test; This fails with the error > ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index" > DETAIL: Key (relname, relnamespace)=(this_is_very_large_exactly_maxnamelen_index_name_wink_win_ccold, 2200) already exists. A CommandCounterIncrement() in (or after) index_concurrently_swap will fix this issue. > ReindexPartitionedIndex(Relation parentIdx) > ereport(ERROR, > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("REINDEX is not yet implemented for partitioned indexes"))); I think we need to add errhint("you can REINDEX each partition separately") or something similar. Also, can we omit this warning for reindex database? All partitions must be in the same database, and the warning in such a case is useless: we get a warning, but by reindexing each partition we reindex the partitioned table correctly. Another behavior issue I found with reindex (verbose) schema/database: the INFO ereport is printed twice for each table. > INFO: relation "measurement_y2006m02" was reindexed > DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.07 s. > INFO: table "public.measurement_y2006m02" was reindexed One comes from ReindexMultipleTables and the other (with pg_rusage_show) from ReindexRelationConcurrently. > ReindexRelationConcurrently > if (!indexRelation->rd_index->indisvalid) Would it be better to use the IndexIsValid macro here? And the same question about the added indexform->indisvalid in src/backend/commands/tablecmds.c > <para> > An index build with the <literal>CONCURRENTLY</literal> option failed, leaving > an <quote>invalid</quote> index.
Such indexes are useless but it can be >- convenient to use <command>REINDEX</command> to rebuild them. Note that >- <command>REINDEX</command> will not perform a concurrent build. To build the >- index without interfering with production you should drop the index and >- reissue the <command>CREATE INDEX CONCURRENTLY</command> command. >+ convenient to use <command>REINDEX</command> to rebuild them. > </para> This documentation change seems wrong to me: REINDEX CONCURRENTLY does not rebuild invalid indexes. To fix invalid indexes we still need REINDEX with a table lock, or to recreate the index concurrently. > + A first pass to build the index is done for each new index entry. > + Once the index is built, its flag <literal>pg_class.isready</literal> is > + switched to <quote>true</quote> > + At this point <literal>pg_class.indisvalid</literal> is switched to > + <quote>true</quote> for the new index and to <quote>false</quote> for the old, and > + Old indexes have <literal>pg_class.isready</literal> switched to <quote>false</quote> Should be pg_index.indisvalid and pg_index.indisready, right? regards, Sergei
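Since the patch under discussion does not rebuild invalid indexes, the pre-existing manual recipe for fixing a leftover invalid index still applies. A sketch, assuming an invalid index test_i_idx on test (i) left behind by a failed concurrent build:

```sql
-- Drop the invalid leftover and rebuild it without blocking writes
DROP INDEX CONCURRENTLY test_i_idx;
CREATE INDEX CONCURRENTLY test_i_idx ON test (i);
```

Plain REINDEX INDEX test_i_idx would also rebuild it, at the cost of a lock that blocks writes for the duration of the build.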
On 09/12/2018 19:55, Sergei Kornilov wrote: >> reindex table CONCURRENTLY test; By the way, does this syntax make sense? I haven't seen a discussion on this anywhere in the various threads. I keep thinking that reindex concurrently table test; would make more sense. How about in combination with (verbose)? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Greetings, * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote: > On 09/12/2018 19:55, Sergei Kornilov wrote: > >> reindex table CONCURRENTLY test; > > By the way, does this syntax make sense? I haven't seen a discussion on > this anywhere in the various threads. I keep thinking that > > reindex concurrently table test; > > would make more sense. How about in combination with (verbose)? I don't think it's a mistake that we have 'create index concurrently', and it certainly would seem odd to me for 'create index' and 'reindex table' to be different. Certainly, from my recollection of English, you'd say "I am going to reindex the table concurrently"; you wouldn't say "I am going to reindex concurrently the table." Based on at least a quick look around, the actual grammar rule seems to match my recollection[1]: adverbs should typically go AFTER the verb + object, and the adverb shouldn't ever be placed between the verb and the object. Thanks! Stephen [1]: http://www.grammar.cl/Notes/Adverbs.htm
Attachment
On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote: > Based on at least a quick looking around, the actual grammar rule seems > to match my recollection[1], adverbs should typically go AFTER the > verb + object, and the adverb shouldn't ever be placed between the verb > and the object. This part was already debated at length in 2012-2013 when I sent the first iterations of the patch, and my memories on the matter are that the grammar you are showing here matches the past agreement. -- Michael
Attachment
On 14/12/2018 01:14, Stephen Frost wrote: >>>> reindex table CONCURRENTLY test; >> >> By the way, does this syntax make sense? I haven't seen a discussion on >> this anywhere in the various threads. I keep thinking that >> >> reindex concurrently table test; >> >> would make more sense. How about in combination with (verbose)? > > I don't think it's a mistake that we have 'create index concurrently' > and it certainly would seem odd to me for 'create index' and 'reindex > table' to be different. > > Certainly, from my recollection of english, you'd say "I am going to > reindex the table concurrently", you wouldn't say "I am going to > reindex concurrently the table." > > Based on at least a quick looking around, the actual grammar rule seems > to match my recollection[1], adverbs should typically go AFTER the > verb + object, and the adverb shouldn't ever be placed between the verb > and the object. So it would be grammatical to say reindex table test concurrently or in a pinch reindex concurrently table test but I don't see anything grammatical about reindex table concurrently test (given that the object is "table test"). Where this gets really messy is stuff like this: reindex (verbose) database concurrently postgres Why would "concurrently" not be part of the options next to "verbose"? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 14/12/2018 01:23, Michael Paquier wrote: > On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote: >> Based on at least a quick looking around, the actual grammar rule seems >> to match my recollection[1], adverbs should typically go AFTER the >> verb + object, and the adverb shouldn't ever be placed between the verb >> and the object. > > This part has been a long debate already in 2012-2013 when I sent the > first iterations of the patch, and my memories on the matter are that > the grammar you are showing here matches with the past agreement. Do you happen to have a link for that? I didn't find anything. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Dec-14, Peter Eisentraut wrote: > On 14/12/2018 01:23, Michael Paquier wrote: > > On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote: > >> Based on at least a quick looking around, the actual grammar rule seems > >> to match my recollection[1], adverbs should typically go AFTER the > >> verb + object, and the adverb shouldn't ever be placed between the verb > >> and the object. > > > > This part has been a long debate already in 2012-2013 when I sent the > > first iterations of the patch, and my memories on the matter are that > > the grammar you are showing here matches with the past agreement. > > Do you happen to have a link for that? I didn't find anything. I think putting the CONCURRENTLY in the parenthesized list of options is most sensible. CREATE INDEX didn't have such an option list when we added this feature there; see https://www.postgresql.org/message-id/flat/200608011143.k71Bh9c22067%40momjian.us#029e9a7ee8cb38beee494ef7891bec1d for some discussion about that grammar. Our options were not great ... -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Greetings, * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote: > On 14/12/2018 01:14, Stephen Frost wrote: > >>>> reindex table CONCURRENTLY test; > >> > >> By the way, does this syntax make sense? I haven't seen a discussion on > >> this anywhere in the various threads. I keep thinking that > >> > >> reindex concurrently table test; > >> > >> would make more sense. How about in combination with (verbose)? > > > > I don't think it's a mistake that we have 'create index concurrently' > > and it certainly would seem odd to me for 'create index' and 'reindex > > table' to be different. > > > > Certainly, from my recollection of english, you'd say "I am going to > > reindex the table concurrently", you wouldn't say "I am going to > > reindex concurrently the table." > > > > Based on at least a quick looking around, the actual grammar rule seems > > to match my recollection[1], adverbs should typically go AFTER the > > verb + object, and the adverb shouldn't ever be placed between the verb > > and the object. > > So it would be grammatical to say > > reindex table test concurrently Yes, though I'm not really a fan of it. > or in a pinch > > reindex concurrently table test No, you can't put concurrently between reindex and table. > but I don't see anything grammatical about > > reindex table concurrently test I disagree, this does look reasonable to me and it's certainly much better than 'reindex concurrently table' which looks clearly incorrect. > Where this gets really messy is stuff like this: > > reindex (verbose) database concurrently postgres > > Why would "concurrently" not be part of the options next to "verbose"? That wasn't what was asked and I don't think I see a problem with having concurrently be allowed in the parentheses. For comparison, it's not like "explain analyze select ..." or "explain buffers select" is terribly good grammatical form. 
If you wanted to try to get to a better form for the spelled out sentence, I would think: concurrently reindex table test would probably be the approach to use, though that's not what we use for 'create index' and it'd be rather out of character for us to start a command with an adverb, making it ultimately a poor choice overall. Going back to what we already have done and have in released versions, we have 'create unique index concurrently test ...' and that's at least reasonable (the adverb isn't showing up between the verb and the object, and the adjective is between the verb and the object) and is what I vote to go with, with the caveat that if we want to also allow it inside the parentheses, I'm fine with that. Thanks! Stephen
Attachment
On 2018-Dec-14, Stephen Frost wrote: > That wasn't what was asked and I don't think I see a problem with having > concurrently be allowed in the parentheses. For comparison, it's not > like "explain analyze select ..." or "explain buffers select" is > terribly good grammatical form. ... and we don't allow EXPLAIN BUFFERS at all, and if we had had a parenthesized option list in EXPLAIN when we invented EXPLAIN ANALYZE, I bet we would have *not* made the ANALYZE keyword appear unadorned in that command. > If you wanted to try to get to a better form for the spelled out > sentence, I would think: > > concurrently reindex table test > > would probably be the approach to use, I think this is terrible from a command-completion perspective, and from a documentation perspective (Certainly we wouldn't have a manpage about the "concurrently" command, for starters). My vote goes to put the keyword inside of and exclusively in the parenthesized option list. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Greetings, * Alvaro Herrera (alvherre@2ndquadrant.com) wrote: > On 2018-Dec-14, Stephen Frost wrote: > > > That wasn't what was asked and I don't think I see a problem with having > > concurrently be allowed in the parentheses. For comparison, it's not > > like "explain analyze select ..." or "explain buffers select" is > > terribly good grammatical form. > > ... and we don't allow EXPLAIN BUFFERS at all, and if we had had a > parenthesized option list in EXPLAIN when we invented EXPLAIN ANALYZE, I > bet we would have *not* made the ANALYZE keyword appear unadorned in > that command. I'm not convinced of that- there is value in being able to write full and useful commands without having to always use parentheses. > > If you wanted to try to get to a better form for the spelled out > > sentence, I would think: > > > > concurrently reindex table test > > > > would probably be the approach to use, > > I think this is terrible from a command-completion perspective, and from > a documentation perspective (Certainly we wouldn't have a manpage about > the "concurrently" command, for starters). Right, I agreed that this had other downsides in the email you're replying to here. Glad we agree that it's not a good option. > My vote goes to put the keyword inside of and exclusively in the > parenthesized option list. I disagree with the idea of exclusively having concurrently be in the parentheses. 'explain buffers' is a much less frequently used option (though that might, in part, be because it's a bit annoying to write out explain (analyze, buffers) select...; I wonder if we could have a way to say "if I'm running analyze, I always want buffers"...), but concurrently reindexing a table (or index..) is going to almost certainly be extremely common, perhaps even more common than *not* reindexing concurrently. Thanks! Stephen
Attachment
On Fri, Dec 14, 2018 at 09:00:58AM -0300, Alvaro Herrera wrote: > On 2018-Dec-14, Peter Eisentraut wrote: >> Do you happen to have a link for that? I didn't find anything. The message I was thinking about is close to here: https://www.postgresql.org/message-id/20121210152856.GC16664@awork2.anarazel.de > I think putting the CONCURRENTLY in the parenthesized list of options is > most sensible. For new options of VACUUM and ANALYZE we tend to prefer that as well, and this simplifies the query parsing. -- Michael
Attachment
On 09/12/2018 19:55, Sergei Kornilov wrote: >> <para> >> An index build with the <literal>CONCURRENTLY</literal> option failed, leaving >> an <quote>invalid</quote> index. Such indexes are useless but it can be >> - convenient to use <command>REINDEX</command> to rebuild them. Note that >> - <command>REINDEX</command> will not perform a concurrent build. To build the >> - index without interfering with production you should drop the index and >> - reissue the <command>CREATE INDEX CONCURRENTLY</command> command. >> + convenient to use <command>REINDEX</command> to rebuild them. >> </para> > This documentation change seems wrong for me: reindex concurrently does not rebuild invalid indexes. To fix invalid indexeswe still need reindex with lock table or recreate this index concurrently. > The current patch prevents REINDEX CONCURRENTLY of invalid indexes, but I wonder why that is so. Anyone remember? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Dec 27, 2018 at 11:04:09AM +0100, Peter Eisentraut wrote: > The current patch prevents REINDEX CONCURRENTLY of invalid indexes, but > I wonder why that is so. Anyone remember? It should be around this time: https://www.postgresql.org/message-id/CAB7nPqRwVtQcHWErUf9o0hrRGFyQ9xArk7K7jCLxqKLy_6CXPQ@mail.gmail.com And if I recall correctly, the reason for not allowing reindexing of invalid entries was that when working on a table, schema or database, if a failure happens in the process, repeating the command would need to reindex twice the number of indexes. -- Michael
Attachment
On 2018-Dec-14, Stephen Frost wrote: > > My vote goes to put the keyword inside of and exclusively in the > > parenthesized option list. > > I disagree with the idea of exclusively having concurrently be in the > parentheses. 'explain buffers' is a much less frequently used option > (though that might, in part, be because it's a bit annoying to write out > explain (analyze, buffers) select...; I wonder if we could have a way to > say "if I'm running analyze, I always want buffers"...), I'm skeptical. I think EXPLAIN ANALYZE is more common because it has more than one decade of advantage compared to the more detailed option list. Yes, it's also easier, but IMO it's a brain thing (muscle memory), not a fingers thing. > but concurrently reindexing a table (or index..) is going to almost > certainly be extremely common, perhaps even more common than *not* > reindexing concurrently. Well, users can use the reindexdb utility and save some keystrokes. Anyway we don't typically add redundant ways to express the same things. Where we have them, it's just because the old way was there before, and we added the extensible way later. Adding two in the first appearance of a new feature seems absurd to me. After looking at the proposed grammar again today and in danger of repeating myself, IMO allowing the concurrency keyword to appear outside the parens would be a mistake. Valid commands: REINDEX (VERBOSE, CONCURRENTLY) TABLE foo; REINDEX (CONCURRENTLY) INDEX bar; -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Alvaro" == Alvaro Herrera <alvherre@2ndquadrant.com> writes: Alvaro> After looking at the proposed grammar again today and in danger Alvaro> of repeating myself, IMO allowing the concurrency keyword to Alvaro> appear outside the parens would be a mistake. Valid commands: Alvaro> REINDEX (VERBOSE, CONCURRENTLY) TABLE foo; Alvaro> REINDEX (CONCURRENTLY) INDEX bar; We burned that bridge with CREATE INDEX CONCURRENTLY; to make REINDEX require different syntax would be too inconsistent. If we didn't have all these existing uses of CONCURRENTLY without parens, your argument might have more merit; but we do. -- Andrew (irc:RhodiumToad)
On 2018-12-31 21:35:57 +0000, Andrew Gierth wrote: > >>>>> "Alvaro" == Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Alvaro> After looking at the proposed grammar again today and in danger > Alvaro> of repeating myself, IMO allowing the concurrency keyword to > Alvaro> appear outside the parens would be a mistake. Valid commands: > > Alvaro> REINDEX (VERBOSE, CONCURRENTLY) TABLE foo; > Alvaro> REINDEX (CONCURRENTLY) INDEX bar; > > We burned that bridge with CREATE INDEX CONCURRENTLY; to make REINDEX > require different syntax would be too inconsistent. > > If we didn't have all these existing uses of CONCURRENTLY without > parens, your argument might have more merit; but we do. +1
Greetings, * Alvaro Herrera (alvherre@2ndquadrant.com) wrote: > On 2018-Dec-14, Stephen Frost wrote: > > > > My vote goes to put the keyword inside of and exclusively in the > > > parenthesized option list. > > > > I disagree with the idea of exclusively having concurrently be in the > > parentheses. 'explain buffers' is a much less frequently used option > > (though that might, in part, be because it's a bit annoying to write out > > explain (analyze, buffers) select...; I wonder if we could have a way to > > say "if I'm running analyze, I always want buffers"...), > > I'm skeptical. I think EXPLAIN ANALYZE is more common because it has > more than one decade of advantage compared to the more detailed option > list. Yes, it's also easier, but IMO it's a brain thing (muscle > memory), not a fingers thing. I would argue that it's both. > > but concurrently reindexing a table (or index..) is going to almost > > certainly be extremely common, perhaps even more common than *not* > > reindexing concurrently. > > Well, users can use the reindexdb utility and save some keystrokes. That's a really poor argument as those unix utilities are hardly ever used, in my experience. > Anyway we don't typically add redundant ways to express the same things. > Where we have them, it's just because the old way was there before, and > we added the extensible way later. Adding two in the first appearance > of a new feature seems absurd to me. SQL allows many, many different ways to express the same thing. I agree that we haven't done that much in our utility commands, but I don't see that as an argument against doing so, just that we haven't (previously) really had the need- because most of the time we don't have a bunch of different options where we want to have a list. > After looking at the proposed grammar again today and in danger of > repeating myself, IMO allowing the concurrency keyword to appear outside > the parens would be a mistake. 
Valid commands: > > REINDEX (VERBOSE, CONCURRENTLY) TABLE foo; > REINDEX (CONCURRENTLY) INDEX bar; This discussion hasn't changed my opinion, and, though I'm likely repeating myself as well, I also agree with the down-thread comment that this ship really has already sailed. Thanks! Stephen
Updated patch for some merge conflicts and addressing most of your comments. (I did not do anything about the syntax.) On 09/12/2018 19:55, Sergei Kornilov wrote: > I found one error in phase 4. Simple reproducer: > >> create table test (i int); >> create index this_is_very_large_exactly_maxnamelen_index_name_wink_wink_wink on test (i); >> create index this_is_very_large_exactly_maxnamelen_index_name_wink_winkccold on test (i); >> reindex table CONCURRENTLY test; > > This fails with error > >> ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index" >> DETAIL: Key (relname, relnamespace)=(this_is_very_large_exactly_maxnamelen_index_name_wink_win_ccold, 2200) already exists. > > CommandCounterIncrement() in (or after) index_concurrently_swap will fix this issue. fixed >> ReindexPartitionedIndex(Relation parentIdx) >> ereport(ERROR, >> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), >> errmsg("REINDEX is not yet implemented for partitioned indexes"))); > > I think we need add errhint("you can REINDEX each partition separately") or something similar. > Also can we omit this warning for reindex database? All partition must be in same database and warning in such case is useless: we have warning, but doing reindex for each partition => we reindex partitioned table correctly. fixed by skipping in ReindexRelationConcurrently(), same as other unsupported relkinds > Another behavior issue i found with reindex (verbose) schema/database: INFO ereport is printed twice for each table. > >> INFO: relation "measurement_y2006m02" was reindexed >> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.07 s. >> INFO: table "public.measurement_y2006m02" was reindexed > > One from ReindexMultipleTables and another (with pg_rusage_show) from ReindexRelationConcurrently. fixed with some restructuring >> ReindexRelationConcurrently >> if (!indexRelation->rd_index->indisvalid) > > it is better use IndexIsValid macro here?
And same question about added indexform->indisvalid in src/backend/commands/tablecmds.c IndexIsValid() has been removed in the meantime. >> <para> >> An index build with the <literal>CONCURRENTLY</literal> option failed, leaving >> an <quote>invalid</quote> index. Such indexes are useless but it can be >> - convenient to use <command>REINDEX</command> to rebuild them. Note that >> - <command>REINDEX</command> will not perform a concurrent build. To build the >> - index without interfering with production you should drop the index and >> - reissue the <command>CREATE INDEX CONCURRENTLY</command> command. >> + convenient to use <command>REINDEX</command> to rebuild them. >> </para> > > This documentation change seems wrong for me: reindex concurrently does not rebuild invalid indexes. To fix invalid indexes we still need reindex with lock table or recreate this index concurrently. still being discussed elsewhere in this thread >> + A first pass to build the index is done for each new index entry. >> + Once the index is built, its flag <literal>pg_class.isready</literal> is >> + switched to <quote>true</quote> >> + At this point <literal>pg_class.indisvalid</literal> is switched to >> + <quote>true</quote> for the new index and to <quote>false</quote> for the old, and >> + Old indexes have <literal>pg_class.isready</literal> switched to <quote>false</quote> > > Should be pg_index.indisvalid and pg_index.indisready, right? fixed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Thank you! I reviewed the new patch version. It applies, builds and passes tests. The code looks good, but I noticed some new behavior: > postgres=# reindex (verbose) table CONCURRENTLY measurement ; > WARNING: REINDEX of partitioned tables is not yet implemented, skipping "measurement" > NOTICE: table "measurement" has no indexes > REINDEX > postgres=# \d measurement > Partitioned table "public.measurement" > Column | Type | Collation | Nullable | Default > -----------+---------+-----------+----------+--------- > city_id | integer | | not null | > logdate | date | | not null | > peaktemp | integer | | | > unitsales | integer | | | > Partition key: RANGE (logdate) > Indexes: > "measurement_logdate_idx" btree (logdate) > Number of partitions: 0 NOTICE seems unnecessary here. Unfortunately, concurrent reindex loses comments; reproducer: > create table testcomment (i int); > create index testcomment_idx1 on testcomment (i); > comment on index testcomment_idx1 IS 'test comment'; > \di+ testcomment_idx1 > reindex table testcomment ; > \di+ testcomment_idx1 # ok > reindex table CONCURRENTLY testcomment ; > \di+ testcomment_idx1 # we lose comment Also I think we need to change REINDEX to "<command>REINDEX</command> (without <option>CONCURRENTLY</option>)" in the ACCESS EXCLUSIVE section of the table-level lock modes documentation (to be similar to the REFRESH MATERIALIZED VIEW and CREATE INDEX descriptions). About reindexing invalid indexes - I found one good question in the archives [1]: what about toast indexes? I checked it now; I am able to drop an invalid toast index, but I cannot drop the redundant valid index. Reproducer: session 1: begin; select from test_toast ... for update; session 2: reindex table CONCURRENTLY test_toast ; session 2: interrupt by ctrl+C session 1: commit session 2: reindex table test_toast ; and now we have two toast indexes. DROP INDEX is able to remove only the invalid ones.
The valid index gives "ERROR: permission denied: "pg_toast_16426_index_ccnew" is a system catalog". About syntax: I vote for the current syntax "reindex table CONCURRENTLY tablename". This looks consistent with the existing CREATE INDEX CONCURRENTLY and REFRESH MATERIALIZED VIEW CONCURRENTLY. regards, Sergei [1]: https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com
> About syntax: I vote for the current syntax "reindex table CONCURRENTLY tablename". This looks consistent with the existing CREATE INDEX CONCURRENTLY and REFRESH MATERIALIZED VIEW CONCURRENTLY.

+1

Pavel
On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote: > NOTICE seems unnecessary here. > > Unfortunally concurrently reindex loses comments, reproducer: Yes, the NOTICE message makes little sense. I am getting back in touch with this stuff. It has been some time but the core of the patch has not actually changed in its base concept, so I am still very familiar with it as the original author. There are even typos I may have introduced a couple of years back, like "contraint". I have not yet spent much time on that, but there are at quick glance a bunch of things that could be retouched to get pieces of that committable. + The concurrent index created during the processing has a name ending in + the suffix ccnew, or ccold if it is an old index definiton which we failed + to drop. Invalid indexes can be dropped using <literal>DROP INDEX</literal>, + including invalid toast indexes. This needs <literal> markups for "ccnew" and "ccold". "definiton" is not correct. index_create does not actually need its extra argument with the tuple descriptor. I think that we had better grab the column name list from indexInfo and just pass that down to index_create() (patched on my local branch), so it is an overkill to take a full copy of the index's TupleDesc. The patch, standing as-is, is close to 2k lines long, so let's cut that first into more pieces refactoring the concurrent build code. Here are some preliminary notes: - WaitForOlderSnapshots() could be in its own patch. - index_concurrently_build() and index_concurrently_set_dead() can be in an independent patch. set_dead() had better be a wrapper on top of index_set_state_flags actually which is able to set any kind of flags. - A couple of pieces in index_create() could be cut as well. I can send patches for those things as first steps which could happen in this commit then, and commit them as needed. This way, we reduce the size of the main patch. 
Even if the main portion does not get into v12, we'd still have base pieces to build on next. Regarding the grammar, we have tended for the last couple of years to avoid complicating the main grammar and to move on to parenthesized grammars (see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that it would make sense to only support CONCURRENTLY within parentheses and just plug it in alongside the VERBOSE option. Does somebody mind if I jump into the ship after so long? I was the original author of the monster after all... -- Michael
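For clarity, the two candidate spellings under debate in this thread would look as follows (the table name is only an example):

```sql
-- Parenthesized option list, in the style of modern VACUUM/EXPLAIN:
REINDEX (VERBOSE, CONCURRENTLY) TABLE my_table;

-- Bare keyword, matching CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY:
REINDEX TABLE CONCURRENTLY my_table;
```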
On 1/16/19 9:27 AM, Michael Paquier wrote:> Regarding the grammar, we tend for the last couple of years to avoid > complicating the main grammar and move on to parenthesized grammars > (see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that > it would make sense to only support CONCURRENTLY within parenthesis > and just plugin that with the VERBOSE option. Personally I do not care, but there have been a lot of voices for keeping REINDEX CONCURRENTLY consistent with CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY. > Does somebody mind if I jump into the ship after so long? I was the > original author of the monster after all... Fine by me. Peter? Andreas
On 2019-Jan-16, Michael Paquier wrote: > Regarding the grammar, we tend for the last couple of years to avoid > complicating the main grammar and move on to parenthesized grammars > (see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that > it would make sense to only support CONCURRENTLY within parenthesis > and just plugin that with the VERBOSE option. That's my opinion too, but I was outvoted in another subthread -- see https://postgr.es/m/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql Stephen Frost, Andrew Gierth and Andres Freund all voted to put CONCURRENTLY outside the parens. It seems we now have three votes to put it *in* the parens (you, Peter Eisentraut, me). I guess more votes are needed to settle this issue. My opinion is that if we had had parenthesized options lists back when CREATE INDEX CONCURRENTLY was invented, we would have put it there. But we were young and needed the money ... -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 16, 2019 at 02:59:31PM -0300, Alvaro Herrera wrote: > That's my opinion too, but I was outvoted in another subthread -- see > https://postgr.es/m/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql > Stephen Frost, Andrew Gierth and Andres Freund all voted to put > CONCURRENTLY outside the parens. It seems we now have three votes to > put it *in* the parens (you, Peter Eisentraut, me). I guess more votes > are needed to settle this issue. Sure, let's see. I would have been in the crowd of not using parenthesized grammar five years ago, but the recent experience with other commands worries me, as we would repeat the same errors. > My opinion is that if we had had parenthesized options lists back when > CREATE INDEX CONCURRENTLY was invented, we would have put it there. > But we were young and needed the money ... :) -- Michael
On Wed, Jan 16, 2019 at 05:56:15PM +0100, Andreas Karlsson wrote: > On 1/16/19 9:27 AM, Michael Paquier wrote: >> Does somebody mind if I jump into the ship after so long? I was the >> original author of the monster after all... > > Fine by me. Peter? Okay, I have begun digging into the patch, and extracted for now two things which can be refactored first, giving a total of three patches: - 0001, which adds WaitForOlderSnapshots to snapmgr.c. I think that this can be useful for external extensions that need a process to wait for snapshots older than a minimum threshold held by other transactions. - 0002, which moves the concurrent index build into its own routine, index_build_concurrent(). At the same time, index_build() has an isprimary argument which is not used, so let's remove it. This simplifies the refactoring a bit as well. - 0003 is the core patch, realigned with the rest, fixing some typos I found on the way. Here are also some notes for things I am planning to look at in a second pass: - The concurrent drop (phase 5) part still shares a lot with DROP INDEX CONCURRENTLY, and I think that we had better refactor the code further so that REINDEX CONCURRENTLY shares more with DROP INDEX. One thing which I think is incorrect is that we do not clear the invalid flag of the dropped index before marking it as dead. This looks like a missing piece from another concurrent-related bug fix lost over the rebases this patch had. - set_dead could be refactored so that it is able to handle multiple indexes as input, using WaitForLockersMultiple(). This way CREATE INDEX CONCURRENTLY could also use it. - There are no regression tests for partitioned tables. - The NOTICE messages showing up when a table has no indexes should be removed. - index_create() does not really need a TupleDesc argument, as long as the caller is able to provide a list of column names.
- At the end of the day, I think that it would be nice to reach a state where we have a set of low-level routines like index_build_concurrent, index_set_dead_concurrent which are used by both paths of CONCURRENTLY and can be called for each phase within a given transaction. Those pieces can also be helpful to implement for example an extension able to do concurrent reindexing out of core. I think that the refactorings in 0001 and 0002 are committable as-is, and this shaves some code from the core patch. Thoughts? -- Michael
On 16/01/2019 18:59, Alvaro Herrera wrote: > On 2019-Jan-16, Michael Paquier wrote: > >> Regarding the grammar, we tend for the last couple of years to avoid >> complicating the main grammar and move on to parenthesized grammars >> (see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that >> it would make sense to only support CONCURRENTLY within parenthesis >> and just plugin that with the VERBOSE option. > > That's my opinion too, but I was outvoted in another subthread -- see > https://postgr.es/m/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql > Stephen Frost, Andrew Gierth and Andres Freund all voted to put > CONCURRENTLY outside the parens. It seems we now have three votes to > put it *in* the parens (you, Peter Eisentraut, me). I guess more votes > are needed to settle this issue. My vote is to have homogeneous syntax for all of this, and so put it in parentheses, but we should also allow CREATE INDEX and DROP INDEX to use parentheses for it, too. I suppose we'll keep what would then be the legacy syntax for a few decades or more. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote: > My vote is to have homogeneous syntax for all of this, and so put it in > parentheses, but we should also allow CREATE INDEX and DROP INDEX to use > parentheses for it, too. That would be a new thing as these variants don't exist yet, and WITH is for storage parameters. In my opinion, the long-term take on doing such things is that we are then able to reduce the number of reserved keywords in the grammar. Even if for the case of CONCURRENTLY we may see humans on Mars before this actually happens, this does not mean that we should not do it moving forward for other keywords in the grammar. -- Michael
On 19/01/2019 02:33, Michael Paquier wrote: > On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote: >> My vote is to have homogeneous syntax for all of this, and so put it in >> parentheses, but we should also allow CREATE INDEX and DROP INDEX to use >> parentheses for it, too. > > That would be a new thing as these variants don't exist yet, and WITH > is for storage parameters. In my opinion, the long-term take on doing > such things is that we are then able to reduce the number of reserved > keywords in the grammar. Even if for the case of CONCURRENTLY we may > see humans on Mars before this actually happens, this does not mean > that we should not do it moving forward for other keywords in the > grammar. I'm not sure I understand your point. I don't want a situation like this: CREATE INDEX CONCURRENTLY ... DROP INDEX CONCURRENTLY ... REINDEX INDEX (CONCURRENTLY) ... All three should be the same, and my suggestion is to add the parenthesized version to CREATE and DROP and not add the unparenthesized version to REINDEX. I never said anything about WITH. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Jan 17, 2019 at 02:11:01PM +0900, Michael Paquier wrote: > Okay, I have begun digging into the patch, and extracted for now two > things which can be refactored first, giving a total of three patches: > - 0001, which creates WaitForOlderSnapshots to snapmgr.c. I think > that this can be useful for external extensions to have a process wait > for snapshots older than a minimum threshold hold by other > transactions. > - 0002, which moves the concurrent index build into its own routine, > index_build_concurrent(). At the same time, index_build() has a > isprimary argument which is not used, so let's remove it. This > simplifies a bit the refactoring as well. > - 0003 is the core patch, realigned with the rest, fixing some typos I > found on the way. Are there any objections if I commit 0001? Introducing WaitForOlderSnapshots() is quite independent from the rest, and the refactoring is obvious. For 0002, I am still not 100% sure if index_build_concurrent() is the best interface but I am planning to look more at this stuff next week, particularly the drop portion which needs more work. -- Michael
Hello > I don't want a situation like this: > CREATE INDEX CONCURRENTLY ... > DROP INDEX CONCURRENTLY ... > REINDEX INDEX (CONCURRENTLY) ... > > All three should be the same, and my suggestion is to add the > parenthesized version to CREATE and DROP and not add the unparenthesized > version to REINDEX. We already have a parenthesized VERBOSE option for REINDEX. So the proposed syntax was: REINDEX (CONCURRENTLY) INDEX ... REINDEX (VERBOSE, CONCURRENTLY) INDEX ... Like the parameters for EXPLAIN and VACUUM, and completely unlike CREATE/DROP INDEX. So consistent syntax for CREATE/DROP would be: CREATE (CONCURRENTLY) INDEX ... CREATE (UNIQUE, CONCURRENTLY) INDEX ... # or do we want parenthesized CONCURRENTLY, but not UNIQUE? CREATE UNIQUE (CONCURRENTLY) INDEX? DROP (CONCURRENTLY) INDEX ... What about REFRESH MATERIALIZED VIEW? Do we leave it unchanged? regards, Sergei
Greetings, * Vik Fearing (vik.fearing@2ndquadrant.com) wrote: > On 16/01/2019 18:59, Alvaro Herrera wrote: > > On 2019-Jan-16, Michael Paquier wrote: > > > >> Regarding the grammar, we tend for the last couple of years to avoid > >> complicating the main grammar and move on to parenthesized grammars > >> (see VACUUM, ANALYZE, EXPLAIN, etc). So in the same vein I think that > >> it would make sense to only support CONCURRENTLY within parenthesis > >> and just plugin that with the VERBOSE option. > > > > That's my opinion too, but I was outvoted in another subthread -- see > > https://postgr.es/m/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql > > Stephen Frost, Andrew Gierth and Andres Freund all voted to put > > CONCURRENTLY outside the parens. It seems we now have three votes to > > put it *in* the parens (you, Peter Eisentraut, me). I guess more votes > > are needed to settle this issue. > > My vote is to have homogeneous syntax for all of this, and so put it in > parentheses, but we should also allow CREATE INDEX and DROP INDEX to use > parentheses for it, too. > > I supposed we'll keep what would then be the legacy syntax for a few > decades or more. I'm still of the opinion that we should have CONCURRENTLY allowed without the parentheses. I could see allowing it with them, as well, but I do feel that we should be using the parentheses-based approach more as a last-resort kind of thing instead of just baking in everything to require them. We have said before that we don't want to have things implemented in a purely functional way (see the discussions around pglogical and such) and while this isn't quite the same, I do think it heads in that direction. It's certainly harder to have to think about how to structure these commands so that they look like they belong in SQL but I think it has benefits too. Thanks! Stephen
On January 19, 2019 7:32:55 AM PST, Stephen Frost <sfrost@snowman.net> wrote: > Greetings, > > * Vik Fearing (vik.fearing@2ndquadrant.com) wrote: >> My vote is to have homogeneous syntax for all of this, and so put it in >> parentheses, but we should also allow CREATE INDEX and DROP INDEX to use >> parentheses for it, too. >> >> I suppose we'll keep what would then be the legacy syntax for a few >> decades or more. > I'm still of the opinion that we should have CONCURRENTLY allowed > without the parentheses. I could see allowing it with them, as well, > but I do feel that we should be using the parentheses-based approach > more as a last-resort kind of thing instead of just baking in everything > to require them. +1 Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Sat, Jan 19, 2019 at 03:01:07AM +0100, Vik Fearing wrote: > On 19/01/2019 02:33, Michael Paquier wrote: >> On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote: >>> My vote is to have homogeneous syntax for all of this, and so put it in >>> parentheses, but we should also allow CREATE INDEX and DROP INDEX to use >>> parentheses for it, too. >> >> That would be a new thing as these variants don't exist yet, and WITH >> is for storage parameters. In my opinion, the long-term take on doing >> such things is that we are then able to reduce the number of reserved >> keywords in the grammar. Even if for the case of CONCURRENTLY we may >> see humans on Mars before this actually happens, this does not mean >> that we should not do it moving forward for other keywords in the >> grammar. > > I'm not sure I understand your point. > > I don't want a situation like this: > CREATE INDEX CONCURRENTLY ... > DROP INDEX CONCURRENTLY ... > REINDEX INDEX (CONCURRENTLY) ... > > All three should be the same, and my suggestion is to add the > parenthesized version to CREATE and DROP and not add the unparenthesized > version to REINDEX. I am not sure what actual reason would force us to decide that all three queries should have the same grammar, or why this has anything to do with a thread about REINDEX. REINDEX can work on many more object types than an index, so its scope is much larger, contrary to CREATE/DROP INDEX. An advantage of using the parenthesized grammar and prioritizing it is that you don't have to add the keyword to the list of reserved keywords, and the parser can rely on IDENT for its work. I personally prefer the parenthesized grammar for that reason. If the crowd votes in majority for the other option, that's of course fine with me too. > I never said anything about WITH. Perhaps I have not explained my thoughts clearly here.
My point was that if some day we decide to drop the non-parenthesized grammar of CREATE/DROP INDEX, one possibility would be to have a "concurrent" option as part of WITH, even if that's used only now for storage parameters. That's the only actual part of the grammar which is extensible. -- Michael
On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote: > I don't want a situation like this: > CREATE INDEX CONCURRENTLY ... > DROP INDEX CONCURRENTLY ... > REINDEX INDEX (CONCURRENTLY) ... > > All three should be the same, and my suggestion is to add the > parenthesized version to CREATE and DROP and not add the unparenthesized > version to REINDEX. +1 for all three being the same. I could see allowing only the unparenthesized format for all three, or allowing both forms for all three, but I think having only one form for each and having them not agree will be too confusing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jan 23, 2019 at 19:17 Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
> > I don't want a situation like this:
> > CREATE INDEX CONCURRENTLY ...
> > DROP INDEX CONCURRENTLY ...
> > REINDEX INDEX (CONCURRENTLY) ...
> >
> > All three should be the same, and my suggestion is to add the
> > parenthesized version to CREATE and DROP and not add the unparenthesized
> > version to REINDEX.
>
> +1 for all three being the same. I could see allowing only the
> unparenthesized format for all three, or allowing both forms for all
> three, but I think having only one form for each and having them not
> agree will be too confusing.

+1

Pavel

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
Hi, On 2019-01-23 13:17:26 -0500, Robert Haas wrote: > On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote: > > I don't want a situation like this: > > CREATE INDEX CONCURRENTLY ... > > DROP INDEX CONCURRENTLY ... > > REINDEX INDEX (CONCURRENTLY) ... > > > > All three should be the same, and my suggestion is to add the > > parenthesized version to CREATE and DROP and not add the unparenthesized > > version to REINDEX. > > +1 for all three being the same. I could see allowing only the > unparenthesized format for all three, or allowing both forms for all > three, but I think having only one form for each and having them not > agree will be too confusing. It seems quite unnecessarily confusing to me to require parens for REINDEX CONCURRENTLY when we've historically not required that for CREATE/DROP INDEX CONCURRENTLY. Besides that, training people that it's the correct form to use parens for CIC/DIC creates an unnecessary version dependency. I think it actually makes sense to see the CONCURRENTLY versions as somewhat separate types of statements from the non-concurrent versions. They have significantly different transactional behaviour (like not being able to be run within one, and leaving gunk behind in case of error). For me it semantically makes sense to have that denoted at the toplevel; it's a related but different type of DDL statement. Greetings, Andres Freund
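As an illustration of that transactional difference, the existing CREATE INDEX CONCURRENTLY already refuses to run inside a transaction block (the object names below are made up):

```sql
BEGIN;
-- Rejected with: ERROR: CREATE INDEX CONCURRENTLY cannot run inside a transaction block
CREATE INDEX CONCURRENTLY example_idx ON example_table (id);
ROLLBACK;
```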
Here is an updated patch, which addresses some of your issues below as well as the earlier reported issue that comments were lost during REINDEX CONCURRENTLY. On 16/01/2019 09:27, Michael Paquier wrote: > On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote: >> NOTICE seems unnecessary here. >> >> Unfortunally concurrently reindex loses comments, reproducer: > > Yes, the NOTICE message makes little sense. This is existing behavior of reindex-not-concurrently. > I am getting back in touch with this stuff. It has been some time but > the core of the patch has not actually changed in its base concept, so > I am still very familiar with it as the original author. There are > even typos I may have introduced a couple of years back, like > "contraint". I have not yet spent much time on that, but there are at > quick glance a bunch of things that could be retouched to get pieces > of that committable. > > + The concurrent index created during the processing has a name ending in > + the suffix ccnew, or ccold if it is an old index definiton which we failed > + to drop. Invalid indexes can be dropped using <literal>DROP INDEX</literal>, > + including invalid toast indexes. > This needs <literal> markups for "ccnew" and "ccold". "definiton" is > not correct. Fixed those. > index_create does not actually need its extra argument with the tuple > descriptor. I think that we had better grab the column name list from > indexInfo and just pass that down to index_create() (patched on my > local branch), so it is an overkill to take a full copy of the index's > TupleDesc. Please send a fixup patch. > The patch, standing as-is, is close to 2k lines long, so let's cut > that first into more pieces refactoring the concurrent build code. > Here are some preliminary notes: > - WaitForOlderSnapshots() could be in its own patch. > - index_concurrently_build() and index_concurrently_set_dead() can be > in an independent patch. 
set_dead() had better be a wrapper on top of > index_set_state_flags actually which is able to set any kind of > flags. > - A couple of pieces in index_create() could be cut as well. I'm not a fan of that. I had already considered all the ways in which subparts of this patch could get committed, and some of it was committed, so what's left now is what I thought should stay together. The patch isn't really that big and most of it is moving code around. I would also avoid chopping around in this patch now and focus on getting it finished instead. The functionality seems solid, so if it's good, let's commit it, if it's not, let's get it fixed up. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
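For readers following the thread, the suffix behavior described in the quoted documentation hunk can be sketched roughly as follows (a hypothetical session; the table and index names are invented, not taken from the patch):

```sql
CREATE TABLE t (a int);
CREATE INDEX t_a_idx ON t (a);
-- If a later REINDEX TABLE CONCURRENTLY t is interrupted mid-way, the
-- partially built copy may be left behind as an invalid index, e.g.:
--     "t_a_idx" btree (a)
--     "t_a_idx_ccnew" btree (a) INVALID
-- As the documentation hunk says, such leftovers can be removed with a
-- plain DROP INDEX:
DROP INDEX t_a_idx_ccnew;
```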
On Tue, Jan 29, 2019 at 09:51:35PM +0100, Peter Eisentraut wrote: > On 16/01/2019 09:27, Michael Paquier wrote: >> index_create does not actually need its extra argument with the tuple >> descriptor. I think that we had better grab the column name list from >> indexInfo and just pass that down to index_create() (patched on my >> local branch), so it is an overkill to take a full copy of the index's >> TupleDesc. > > Please send a fixup patch. Sure. Attached is a patch which can be applied on top of what you sent last, based on what I noticed at review, here and there. You also forgot to switch two heap_open() to table_open() in pg_depend.c. >> The patch, standing as-is, is close to 2k lines long, so let's cut >> that first into more pieces refactoring the concurrent build code. >> Here are some preliminary notes: >> - WaitForOlderSnapshots() could be in its own patch. >> - index_concurrently_build() and index_concurrently_set_dead() can be >> in an independent patch. set_dead() had better be a wrapper on top of >> index_set_state_flags actually which is able to set any kind of >> flags. >> - A couple of pieces in index_create() could be cut as well. > > I'm not a fan of that. I had already considered all the ways in which > subparts of this patch could get committed, and some of it was > committed, so what's left now is what I thought should stay together. > The patch isn't really that big and most of it is moving code around. I > would also avoid chopping around in this patch now and focus on getting > it finished instead. The functionality seems solid, so if it's good, > let's commit it, if it's not, let's get it fixed up. Well, the feature looks solid to me, and not much of its code has actually changed over the years FWIW. Committing large and complex patches is something you have more experience with than myself, and I find the exercise very difficult. So if you feel it's adapted to keep things grouped together, it is not an issue to me. I'll follow the lead. 
-- Michael
Attachment
On 30/01/2019 06:16, Michael Paquier wrote: > On Tue, Jan 29, 2019 at 09:51:35PM +0100, Peter Eisentraut wrote: >> On 16/01/2019 09:27, Michael Paquier wrote: >>> index_create does not actually need its extra argument with the tuple >>> descriptor. I think that we had better grab the column name list from >>> indexInfo and just pass that down to index_create() (patched on my >>> local branch), so it is an overkill to take a full copy of the index's >>> TupleDesc. >> >> Please send a fixup patch. > > Sure. Attached is a patch which can be applied on top of what you > sent last, based on what I noticed at review, here and there. You > also forgot to switch two heap_open() to table_open() in pg_depend.c. OK, applied most of that. I didn't take your function renaming. I had deliberately named the functions index_concurrently_${task}, so their relationship is more easily visible. Anyway, that's all cosmetics. Are there any more functional or correctness issues to be addressed? Another thing I was thinking of: We need some database-global tests. For example, at some point during development, I had broken some variant of REINDEX DATABASE. Where could we put those tests? Maybe with reindexdb? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Thu, Feb 07, 2019 at 12:49:43PM +0100, Peter Eisentraut wrote: > Anyway, that's all cosmetics. Are there any more functional or > correctness issues to be addressed? Not that I can think of. At least this evening. > Another thing I was thinking of: We need some database-global tests. > For example, at some point during development, I had broken some variant > of REINDEX DATABASE. Where could we put those tests? Maybe with > reindexdb? Having some coverage in the TAP tests of reindexdb is a good idea. -- Michael
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: tested, passed Hello Sorry for the late response. I reviewed the latest patch version. I didn't find any new behavior bugs. reindex concurrently can be in deadlock with another reindex concurrently (or a manual vacuum (maybe with a wraparound autovacuum) or create index concurrently on the same table). But I think this is not an issue for this feature; two create index concurrently can be in deadlock too. Just one small note for the documentation: > +Indexes: > + "idx" btree (col) > + "idx_cct" btree (col) INVALID Second index should be idx_ccnew (or idx_ccold), right? Code looks good to me. regards, Sergei
On 2019-Feb-07, Peter Eisentraut wrote: > Another thing I was thinking of: We need some database-global tests. > For example, at some point during development, I had broken some variant > of REINDEX DATABASE. Where could we put those tests? Maybe with reindexdb? What's wrong with a new reindex.sql in regress? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Feb 07, 2019 at 12:07:01PM -0300, Alvaro Herrera wrote: > On 2019-Feb-07, Peter Eisentraut wrote: >> Another thing I was thinking of: We need some database-global tests. >> For example, at some point during development, I had broken some variant >> of REINDEX DATABASE. Where could we put those tests? Maybe with reindexdb? > > What's wrong with a new reindex.sql in regress? Depending on the number of objects created and remaining around before the script is run in the main suite, this would be costly. I think that this approach would not scale well in the long term. Having a TAP test with reindexdb gives you access to a full instance with its contents always fully known at test time. -- Michael
Attachment
Hello The patch is marked as target version 12, but has been inactive for a few weeks. I think many users want this feature and the patch is in good shape. Do we have open questions on this thread? The latest patch still can be applied cleanly; it builds and passes tests. regards, Sergei
On 2019-03-13 15:13, Sergei Kornilov wrote: > The patch is marked as target version 12, but has been inactive for a few weeks. I think many users want this feature and the patch is in good shape. Do we have open questions on this thread? > > The latest patch still can be applied cleanly; it builds and passes tests. Here is an updated patch. It includes the typo fix in the documentation from you, some small bug fixes, a new reindexdb --concurrently option, and based on that some whole-database tests, as discussed recently. I think this addresses all open issues. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Hi, Am Mittwoch, den 13.03.2019, 23:10 +0100 schrieb Peter Eisentraut: > Here is an updated patch. I had a quick look at some of the comments and noticed some possible nitpicky-level problems: > +/* > + * index_concurrently_set_dead > + * > + * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX > + * CONCURRENTLY before actually dropping the index. After calling this > + * function the index is seen by all the backends as dead. Low-level locks > + * taken here are kept until the end of the transaction doing calling this > + * function. > + */ "the transaction doing calling this function." sounds wrong to me. > + * Extract the list of indexes that are going to be rebuilt based on the > + * list of relation Oids given by caller. For each element in given list, > + * if the relkind of given relation Oid is a table, all its valid indexes > + * will be rebuilt, including its associated toast table indexes. If > + * relkind is an index, this index itself will be rebuilt. The locks taken > + * on parent relations and involved indexes are kept until this > + * transaction is committed to protect against schema changes that might > + * occur until the session lock is taken on each relation, session lock > + * used to similarly protect from any schema change that could happen > + * within the multiple transactions that are used during this process. > + */ I think the last sentence in the above should be split up into several sentences, maybe at "session lock used..."? Or maybe it should just say "a session lock is used" instead? > + else > + { > + /* > + * Save the list of relation OIDs in private > + * context > + */ Missing full stop at end of comment. > + /* Definetely no indexes, so leave */ s/Definetely/Definitely/. 
Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.banck@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
On 2019-03-15 22:32, Michael Banck wrote: > I had a quick look at some of the comments and noticed some possible > nitpicky-level problems: Thanks, I've integrated these changes into my local branch. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Yet another review of this patch from me... > An index build with the <literal>CONCURRENTLY</literal> option failed, leaving > an <quote>invalid</quote> index. Such indexes are useless but it can be > - convenient to use <command>REINDEX</command> to rebuild them. Note that > - <command>REINDEX</command> will not perform a concurrent build. To build the > - index without interfering with production you should drop the index and > - reissue the <command>CREATE INDEX CONCURRENTLY</command> command. > + convenient to use <command>REINDEX</command> to rebuild them. Not sure we can just say "use REINDEX" since only a non-concurrent reindex can rebuild such an index. I propose to not change this part. > + The following steps occur in a concurrent index build, each in a separate > + transaction except when the new index definitions are created > > + All the constraints and foreign keys which refer to the index are swapped... > + ... This step is done within a single transaction > + for each temporary entry. > > + Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote> > + to prevent any new tuple insertions after waiting for running queries which > + may reference the old index to complete. This step is done within a single > + transaction for each temporary entry. According to the code, index_concurrently_swap is called in a loop inside one transaction for all processed indexes of the table. Same for the index_concurrently_set_dead and index_concurrently_drop calls. So this part of the documentation seems incorrect. And a few questions: - reindexdb has concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot reindex the system catalogs. Is this expected? - should reindexdb check the server version? For example, a binary from patched HEAD can reindex a v11 cluster and obviously fail if --concurrently was requested. - psql/tab-complete.c vs old releases?
Seems we need to suggest the CONCURRENTLY keyword only for releases with concurrently support. Well, I still have no new questions about the backend logic. Maybe we need to mark the patch as "Ready for Committer" in order to get more attention from other committers? regards, Sergei
On 2019-03-23 20:04, Sergei Kornilov wrote: >> An index build with the <literal>CONCURRENTLY</literal> option failed, leaving >> an <quote>invalid</quote> index. Such indexes are useless but it can be >> - convenient to use <command>REINDEX</command> to rebuild them. Note that >> - <command>REINDEX</command> will not perform a concurrent build. To build the >> - index without interfering with production you should drop the index and >> - reissue the <command>CREATE INDEX CONCURRENTLY</command> command. >> + convenient to use <command>REINDEX</command> to rebuild them. > > Not sure we can just say "use REINDEX" since only a non-concurrent reindex can rebuild such an index. I propose to not change this part. Yeah, I reverted that and adjusted the wording a bit. >> + Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote> >> + to prevent any new tuple insertions after waiting for running queries which >> + may reference the old index to complete. This step is done within a single >> + transaction for each temporary entry. > > According to the code, index_concurrently_swap is called in a loop inside one transaction for all processed indexes of the table. Same for the index_concurrently_set_dead and index_concurrently_drop calls. So this part of the documentation seems incorrect. I rewrote that whole procedure to make it a bit simpler. > And a few questions: > - reindexdb has concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot reindex the system catalogs. Is this expected? If support is ever added, then reindexdb supports it automatically. It seems simpler to not have to repeat the same checks in two places. > - should reindexdb check the server version? For example, a binary from patched HEAD can reindex a v11 cluster and obviously fail if --concurrently was requested. Added. > - psql/tab-complete.c vs old releases? Seems we need to suggest the CONCURRENTLY keyword only for releases with concurrently support.
It seems we don't do version checks for tab completion of keywords. > Well, I still have no new questions about the backend logic. Maybe we need to mark the patch as "Ready for Committer" in order to get more attention from other committers? Let's do it. :-) I've gone over this patch a few more times. I've read all the discussion since 2012 again and made sure all the issues were addressed. I made particularly sure that during the refactoring nothing in CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY was inadvertently changed. I checked all the steps again. I'm happy with it. One more change I made was in the drop phase. I had to hack it up a bit so that we can call index_drop() with a concurrent lock without actually doing the concurrent processing (that would be a bit recursive). The previous patch was actually taking too strong a lock here. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Mon, Mar 25, 2019 at 04:23:34PM +0100, Peter Eisentraut wrote: > Let's do it. :-) I am pretty sure that this has been said at least once since 2012. > I've gone over this patch a few more times. I've read all the > discussion since 2012 again and made sure all the issues were addressed. > I made particularly sure that during the refactoring nothing in CREATE > INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY was inadvertently > changed. I checked all the steps again. I'm happy with it. From my side, I would be happy to look at this patch. Unfortunately I won't have the room to look at it this week I think :( But if you are happy with it that's fine by me, at least I can fix anything which is broken :) -- Michael
Attachment
Hi Unfortunately the patch does not apply due to recent commits. Any chance this can be fixed (and even committed in pg12)? >> And a few questions: >> - reindexdb has concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot reindex the system catalogs. Is this expected? > > If support is ever added, then reindexdb supports it automatically. It > seems simpler to not have to repeat the same checks in two places. ok, reasonable for me >> - psql/tab-complete.c vs old releases? Seems we need to suggest the CONCURRENTLY keyword only for releases with concurrently support. > > It seems we don't do version checks for tab completion of keywords. Hmm, yes, I found only a few checks for "create trigger" syntax, about the "EXECUTE FUNCTION"/"EXECUTE PROCEDURE" difference in 11 regards, Sergei
On 2019-03-28 09:07, Sergei Kornilov wrote: > Unfortunately the patch does not apply due to recent commits. Any chance this can be fixed (and even committed in pg12)? Committed :) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>> Unfortunately the patch does not apply due to recent commits. Any chance this can be fixed (and even committed in pg12)? > > Committed :) wow! Congratulations! This was a very long way; my favorite pg12 feature regards, Sergei
On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote: > wow! Congratulations! This was very long way > > my favorite pg12 feature So this has been committed, nice! Thanks a lot to all for keeping alive this patch over the ages, with particular thanks to Andreas and Peter. -- Michael
Attachment
On 2019-03-29 09:04, Michael Paquier wrote: > On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote: >> wow! Congratulations! This was very long way >> >> my favorite pg12 feature > > So this has been committed, nice! Thanks a lot to all for keeping > alive this patch over the ages, with particular thanks to Andreas and > Peter. So, we're getting buildfarm failures, only with clang. I can reproduce those (with clang). It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX CONCURRENTLY". More eyes welcome. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-03-29 09:13, Peter Eisentraut wrote: > On 2019-03-29 09:04, Michael Paquier wrote: >> On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote: >>> wow! Congratulations! This was very long way >>> >>> my favorite pg12 feature >> >> So this has been committed, nice! Thanks a lot to all for keeping >> alive this patch over the ages, with particular thanks to Andreas and >> Peter. > > So, we're getting buildfarm failures, only with clang. I can reproduce > those (with clang). > > It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX > CONCURRENTLY". More eyes welcome. I think I found a fix. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 29, 2019 at 09:13:35AM +0100, Peter Eisentraut wrote: > So, we're getting buildfarm failures, only with clang. I can reproduce > those (with clang). Indeed, I can reproduce the failures using -O2 with clang. I am wondering if we are not missing a volatile flag somewhere and that some code reordering is the cause here. > It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX > CONCURRENTLY". More eyes welcome. Here is a short reproducer:

create materialized view aam as select 1 AS a;
create index aai on aam(a);
reindex table CONCURRENTLY aam;

-- Michael
Attachment
Hi hackers, I tried this great feature for partition indexes. The first time the REINDEX TABLE CONCURRENTLY statement is executed on the partition, an error occurs. The second run succeeds but leaves an index with an INVALID status. I think this is not the desired behaviour.

# TEST
postgres=> CREATE TABLE part1(c1 INT) PARTITION BY RANGE(c1);
CREATE TABLE
postgres=> CREATE TABLE part1v1 PARTITION OF part1 FOR VALUES FROM (0) TO (100);
CREATE TABLE
postgres=> CREATE INDEX idx1_part1 ON part1(c1);
CREATE INDEX
postgres=> REINDEX TABLE CONCURRENTLY part1v1;
ERROR:  cannot drop index part1v1_c1_idx_ccold because index idx1_part1 requires it
HINT:  You can drop index idx1_part1 instead.
postgres=> \d+ part1v1
                                   Table "public.part1v1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 c1     | integer |           |          |         | plain   |              |
Partition of: part1 FOR VALUES FROM (0) TO (100)
Partition constraint: ((c1 IS NOT NULL) AND (c1 >= 0) AND (c1 < 100))
Indexes:
    "part1v1_c1_idx" btree (c1)
    "part1v1_c1_idx_ccold" btree (c1) INVALID
Access method: heap

postgres=> REINDEX TABLE CONCURRENTLY part1v1;
REINDEX
postgres=> \d+ part1v1
                                   Table "public.part1v1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 c1     | integer |           |          |         | plain   |              |
Partition of: part1 FOR VALUES FROM (0) TO (100)
Partition constraint: ((c1 IS NOT NULL) AND (c1 >= 0) AND (c1 < 100))
Indexes:
    "part1v1_c1_idx" btree (c1)
    "part1v1_c1_idx_ccold" btree (c1) INVALID
Access method: heap

Regards, Noriyoshi Shinoda

-----Original Message-----
From: Michael Paquier [mailto:michael@paquier.xyz]
Sent: Friday, March 29, 2019 6:21 PM
To: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Cc: Sergei Kornilov <sk@zsrv.org>; pgsql-hackers@lists.postgresql.org
Subject: Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 09:13:35AM +0100, Peter Eisentraut wrote: > So, we're getting buildfarm failures, only with clang. I can > reproduce those (with clang). Indeed, I can reproduce the failures using -O2 with clang. I am wondering if we are not missing a volatile flag somewhere and that some code reordering is the cause here. > It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX > CONCURRENTLY". More eyes welcome. Here is a short reproducer:

create materialized view aam as select 1 AS a;
create index aai on aam(a);
reindex table CONCURRENTLY aam;

-- Michael
On Fri, Mar 29, 2019 at 3:28 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > On 2019-03-28 09:07, Sergei Kornilov wrote: > > Unfortunately the patch does not apply due to recent commits. Any chance this can be fixed (and even committed in pg12)? > > Committed :) > Given this has been committed I've probably missed the window, but philosophically speaking, is there any reason not to make the "concurrently" behavior the default behavior, and require a keyword for the more heavy-weight old behavior? In most production scenarios you probably want to avoid exclusive locking, and in the cases where that isn't an issue, 'concurrently' isn't so much slower that most users would object to it. I would perhaps give a nod to historical syntax concerns, but this would more closely align with the behavior in vacuum vs vacuum full, and we've done behavior-modifying changes such as the recent WITH ... MATERIALIZED change. Thoughts? Robert Treat https://xzilla.net
Hi, On 2019-03-29 11:47:10 -0400, Robert Treat wrote: > On Fri, Mar 29, 2019 at 3:28 AM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: > > > > On 2019-03-28 09:07, Sergei Kornilov wrote: > > > Unfortunately patch does not apply due recent commits. Any chance this can be fixed (and even committed in pg12)? > > > > Committed :) > > > > Given this has been committed I've probably missed the window, but > philosophically speaking, is there any reason not to make the > "concurrently" behavior the default behavior, and require a keyword > for the more heavy-weight old behavior? Yes, it increases the total runtime quite considerably. And it adds new failure modes with partially built invalid indexes hanging around that need to be dropped manually. > In most production scenarios > you probably want to avoid exclusive locking, and in the cases where > that isn't an issue, 'concurrently' isn't that much slower that most > users would object to it. It does at *least* twice as much IO. Greetings, Andres Freund
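As a rough sketch of the trade-off described above (the table and index names are invented; the lock levels stated are the generally documented PostgreSQL behavior, not something verified against this patch):

```sql
-- Plain REINDEX locks out concurrent use of the index for its whole duration:
REINDEX INDEX t_a_idx;               -- ACCESS EXCLUSIVE lock on the index
-- The concurrent variant keeps the table available for reads and writes,
-- but builds and validates a complete second copy of the index first,
-- hence the extra I/O and longer runtime:
REINDEX INDEX CONCURRENTLY t_a_idx;  -- SHARE UPDATE EXCLUSIVE locks
```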
I noticed a very small typo in the documentation for this feature.

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index ccabb330cb..e45bf86c8d 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -349,7 +349,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
     <listitem>
      <para>
       The old indexes are dropped. The <literal>SHARE UPDATE
-      EXCLUSIVE</literal> session locks for the indexes and the table ar
+      EXCLUSIVE</literal> session locks for the indexes and the table are
       released.
      </para>
     </listitem>

Nathan
On Fri, Mar 29, 2019 at 03:53:05PM +0000, Bossart, Nathan wrote: > I noticed a very small typo in the documentation for this feature. I submit a bunch more changes for consideration, attached.
Attachment
On 2019-03-29 16:53, Bossart, Nathan wrote: > I noticed a very small typo in the documentation for this feature. fixed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-03-29 17:01, Justin Pryzby wrote: > On Fri, Mar 29, 2019 at 03:53:05PM +0000, Bossart, Nathan wrote: >> I noticed a very small typo in the documentation for this feature. > > I submit a bunch more changes for consideration, attached. fixed, thanks -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 29, 2019 at 03:10:23PM +0000, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote: > I tried this great feature for partition index. > The first time the REINDEX TABLE CONCURRENTLY statement is executed > to the partition, then an error occurs. Yes, that's a problem. I am adding an open item. > The second run succeeds but leaves an index with an INVALID status. > I think this is not the desired behaviour. This one is partially expected actually. Invalid indexes are ignored when processing, as rebuilding them would cause at least a table-level reindex to double its amount of processing. However, it is not possible either to detach an index from a partition tree, hence the index cannot be dropped directly either. It seems to me that the root of the problem is that the partition indexes created as copycats of the original ones should never be in a state where they are attached to the index tree. -- Michael
Attachment
On Fri, Mar 29, 2019 at 08:48:03AM -0700, Andres Freund wrote: > Yes, it increases the total runtime quite considerably. And it adds new > failure modes with partially built invalid indexes hanging around that > need to be dropped manually. On top of that CONCURRENTLY needs multiple transactions to perform its different phases for each transaction: build, validation, swap and cleanup. So it cannot run in a transaction block. Having a separate option makes the most sense. > It does at *least* twice as much IO. Yeah, I can guarantee you that it is much slower, at the advantage of being lock-free. -- Michael
Attachment
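The multi-transaction restriction mentioned above can be seen directly from psql (a sketch with an invented table name; the exact error wording is an assumption):

```sql
BEGIN;
REINDEX TABLE CONCURRENTLY my_table;
-- ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
ROLLBACK;
-- Outside a transaction block the command is fine:
REINDEX TABLE CONCURRENTLY my_table;
```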
On 2019-03-29 16:10, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote: > postgres=> CREATE TABLE part1(c1 INT) PARTITION BY RANGE(c1); > CREATE TABLE > postgres=> CREATE TABLE part1v1 PARTITION OF part1 FOR VALUES FROM (0) TO (100); > CREATE TABLE > postgres=> CREATE INDEX idx1_part1 ON part1(c1); > CREATE INDEX > postgres=> REINDEX TABLE CONCURRENTLY part1v1; > ERROR: cannot drop index part1v1_c1_idx_ccold because index idx1_part1 requires it > HINT: You can drop index idx1_part1 instead. The attached patch fixes this. The issue was that we didn't move all dependencies from the index (only in the other direction). Maybe that was sufficient when the patch was originally written, before partitioned indexes. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Sat, Mar 30, 2019 at 11:56:27AM +0100, Peter Eisentraut wrote: > The attached patch fixes this. The issue was that we didn't move all > dependencies from the index (only in the other direction). Maybe that > was sufficient when the patch was originally written, before partitioned > indexes. Hm. I don't think that it is quite right either, because the new index is missing from the partition tree after the reindex. Taking the example from your patch I see that:

=# CREATE TABLE concur_reindex_part1 (c1 int) PARTITION BY RANGE (c1);
CREATE TABLE
=# CREATE TABLE concur_reindex_part1v1 PARTITION OF concur_reindex_part1 FOR VALUES FROM (0) TO (100);
CREATE TABLE
=# CREATE INDEX concur_reindex_idx1_part1 ON concur_reindex_part1 (c1);
CREATE INDEX
=# SELECT relid, level FROM pg_partition_tree('concur_reindex_idx1_part1');
             relid             | level
-------------------------------+-------
 concur_reindex_idx1_part1     | 0
 concur_reindex_part1v1_c1_idx | 1
(2 rows)
=# REINDEX TABLE CONCURRENTLY concur_reindex_part1v1;
REINDEX
=# SELECT relid, level FROM pg_partition_tree('concur_reindex_idx1_part1');
           relid           | level
---------------------------+-------
 concur_reindex_idx1_part1 | 0
(1 row)

And I would have expected concur_reindex_part1v1_c1_idx to still be part of the partition tree. I think that the issue is in index_concurrently_create_copy() where we create the new index with index_create() without setting parentIndexRelid, causing the dependency to be lost. This parameter ought to be set to the OID of the parent index, so I think that we need to look at the ancestors of the index if relispartition is set, and use get_partition_ancestors() for that purpose. -- Michael
Attachment
On Mon, Apr 01, 2019 at 03:43:43PM +0900, Michael Paquier wrote: > And I would have expected concur_reindex_part1v1_c1_idx to still be > part of the partition tree. I think that the issue is in > index_concurrently_create_copy() where we create the new index with > index_create() without setting parentIndexRelid, causing the > dependency to be lost. This parameter ought to be set to the OID of > the parent index so I think that we need to look at the ancestors of > the index if relispartition is set, and use get_partition_ancestors() > for that purpose. And here is the patch to address this issue. It happens that a bit more than the dependency switch was lacking here: - At swap time, we need to have the new index definition track relispartition from the old index. - Again at swap time, the inheritance link needs to be updated between the old/new index and its parent when reindexing a partition index. Tracking the OID of the parent via index_concurrently_create_copy() is not a bright idea as we would finish with the impossibility to drop invalid indexes if the REINDEX CONCURRENTLY failed in the middle (just added some manual elog(ERROR) to test that). I have added a comment before making the index duplica. I have also expanded the regression tests so as we have more coverage for all that, finishing with the attached which keeps partition trees consistent across the operations. Thoughts? -- Michael
Attachment
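The invariants that the swap now maintains can also be checked from SQL after a reindex, with a catalog query along these lines (a sketch; the index name is taken from the example earlier in the thread):

```sql
-- After REINDEX TABLE CONCURRENTLY on a partition, the rebuilt index should
-- still be flagged as a partition and keep its pg_inherits link to the
-- parent partitioned index:
SELECT c.relname,
       c.relispartition,
       i.inhparent::regclass AS parent_index
FROM pg_class c
LEFT JOIN pg_inherits i ON i.inhrelid = c.oid
WHERE c.relname = 'concur_reindex_part1v1_c1_idx';
```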
On Tue, Apr 09, 2019 at 03:50:27PM +0900, Michael Paquier wrote: > And here is the patch to address this issue. It happens that a bit > more than the dependency switch was lacking here: > - At swap time, we need to have the new index definition track > relispartition from the old index. > - Again at swap time, the inheritance link needs to be updated between > the old/new index and its parent when reindexing a partition index. Peter, this is an open item, and I think as the committer of the feature you are its owner. Well, in this case, I don't mind taking the ownership as need be as I know this stuff. Anyway, could you have a look at the patch proposed and see if you have any issues with it? -- Michael
Attachment
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 2019-03-28 09:07, Sergei Kornilov wrote: >> Unfortunately patch does not apply due recent commits. Any chance >> this can be fixed (and even committed in pg12)? > > Committed :) I noticed that the docs for how to recover from a failed CREATE INDEX CONCURRENTLY say that «REINDEX does not support concurrent builds», which is no longer true. I was going to just remove the caveat, but then I discovered that REINDEX CONCURRENTLY doesn't work on INVALID indexes (why?). Attached is a patch that instead adjusts the claim to say that REINDEX does not support rebuilding invalid indexes concurrently. - ilmari -- "A disappointingly low fraction of the human race is, at any given time, on fire." - Stig Sandbeck Mathisen From e2de72b348f8a96e24128fc4188bd542eb676610 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org> Date: Thu, 11 Apr 2019 10:58:47 +0100 Subject: [PATCH] Correct claim about REINDEX CONCURRENTLY in CREATE INDEX CONCURRENTLY docs REINDEX CONCURRENTLY exists, but cannot reindex invalid indexes. --- doc/src/sgml/ref/create_index.sgml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index d9d95b20e3..c458f54ef1 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -557,8 +557,8 @@ method in such cases is to drop the index and try again to perform <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is to rebuild the index with <command>REINDEX</command>. However, since <command>REINDEX</command> - does not support concurrent builds, this option is unlikely to seem - attractive.) + does not support reindexing invalid indexes concurrently, this option is + unlikely to seem attractive.) </para> <para> -- 2.20.1
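For context, the recovery procedure that doc paragraph describes looks roughly like this (table and index names are hypothetical):

```sql
-- Suppose a CREATE INDEX CONCURRENTLY failed midway, leaving
-- tab_c1_idx behind marked INVALID.

-- Recommended path: drop the invalid index and retry the
-- concurrent build.
DROP INDEX tab_c1_idx;
CREATE INDEX CONCURRENTLY tab_c1_idx ON tab (c1);

-- Alternative: plain REINDEX also rebuilds an invalid index, but
-- takes locks that block writes, which is why the docs call it
-- unattractive.
REINDEX INDEX tab_c1_idx;
```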
On Thu, Apr 11, 2019 at 11:21:29AM +0100, Dagfinn Ilmari Mannsåker wrote: > I noticed that the docs for how to recover from a failed CREATE INDEX > CONCURRENTLY say that «REINDEX does not support concurrent builds», > which is no longer true. Good catch. I'll apply that in a couple of hours. > I was going to just remove the caveat, but then I discovered that > REINDEX CONCURRENTLY doesn't work on INVALID indexes (why?). This is a deliberate choice. The index built in parallel with the existing one during a concurrent reindex is marked invalid during most of the operation. Hence, if the reindex is interrupted or fails, you finish with potentially twice the number of original indexes, half being invalid and the other half being the ones in use. If the user decides to rinse and repeat the concurrent reindex, and if we were to also select invalid indexes for the operation, then you could finish by doing twice the amount of work on a relation, half of it for nothing. -- Michael
Attachment
On 2019-Apr-11, Michael Paquier wrote: > > I was going to just remove the caveat, but then I discovered that > > REINDEX CONCURRENTLY doesn't work on INVALID indexes (why?). > > This is a wanted choice. The index built in parallel of the existing > one during a concurrent reindex is marked invalid during most of the > operation. Hence, if the reindex is interrupted or fails, you finish > with potentially twice the number of original indexes, half being > invalid and the other half being the ones in use. If the user decides > to rinse and repeat the concurrent reindex, and if we were to also > select invalid indexes for the operation, then you would finish by > potentially doing twice the amount of work when working on a relation, > half of it for nothing. Hmm, I suppose that makes sense for REINDEX TABLE or anything that reindexes more than one index, but if you do REINDEX INDEX surely it is reasonable to allow the case? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Apr 11, 2019 at 09:49:47AM -0400, Alvaro Herrera wrote: > Hmm, I suppose that makes sense for REINDEX TABLE or anything that > reindexes more than one index, but if you do REINDEX INDEX surely it > is reasonable to allow the case? Yes, we could revisit the REINDEX INDEX portion of the decision, and after sleeping on it my previous argument makes limited sense for REINDEX processes using only one index. One could note that the header comment of ReindexRelationConcurrently() kind of implies the same conclusion as you do, but that's perhaps just an accident. So... I am coming up with the patch attached. I have introduced some tests using a trick with CIC to have an invalid index to work on. -- Michael
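For reference, the trick in question relies on a CREATE INDEX CONCURRENTLY failure (e.g. a unique violation) leaving an invalid index behind, which the patched REINDEX INDEX CONCURRENTLY can then be pointed at. A rough sketch with made-up names:

```sql
CREATE TABLE inval_tab (c1 int);
INSERT INTO inval_tab VALUES (1), (1), (2);

-- Fails on the duplicate values, leaving the index behind
-- marked INVALID.
CREATE UNIQUE INDEX CONCURRENTLY inval_idx ON inval_tab (c1);

-- Confirm the leftover index is invalid.
SELECT indisvalid FROM pg_index
  WHERE indexrelid = 'inval_idx'::regclass;

-- Remove the duplicates, after which the concurrent rebuild can
-- succeed and mark the index valid again.
DELETE FROM inval_tab WHERE c1 = 1;
INSERT INTO inval_tab VALUES (1);
REINDEX INDEX CONCURRENTLY inval_idx;
```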
Attachment
On 2019-04-11 05:59, Michael Paquier wrote: > On Tue, Apr 09, 2019 at 03:50:27PM +0900, Michael Paquier wrote: >> And here is the patch to address this issue. It happens that a bit >> more than the dependency switch was lacking here: >> - At swap time, we need to have the new index definition track >> relispartition from the old index. >> - Again at swap time, the inheritance link needs to be updated between >> the old/new index and its parent when reindexing a partition index. > > Peter, this is an open item, and I think as the committer of the > feature you are its owner. Well, in this case, I don't mind taking > the ownership as need be as I know this stuff. Anyway, could you have > a look at the patch proposed and see if you have any issues with it? Looks good, committed. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Apr 12, 2019 at 08:44:18AM +0200, Peter Eisentraut wrote: > Looks good, committed. Thanks for committing! -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > So... I am coming up with the patch attached. I have introduced some > tests using a trick with CIC to have an invalid index to work on. I don't have any comments on the code (but the test looks sensible, it's the same trick I used to discover the issue in the first place). However, the doc patch lost the trailing paren: > The recommended recovery > method in such cases is to drop the index and try again to perform > - <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is to rebuild > - the index with <command>REINDEX</command>. However, since <command>REINDEX</command> > - does not support concurrent builds, this option is unlikely to seem > - attractive.) > + <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is > + to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>. > </para> - ilmari -- - Twitter seems more influential [than blogs] in the 'gets reported in the mainstream press' sense at least. - Matt McLeod - That'd be because the content of a tweet is easier to condense down to a mainstream media article. - Calle Dybedahl
On Fri, Apr 12, 2019 at 12:11:12PM +0100, Dagfinn Ilmari Mannsåker wrote: > I don't have any comments on the code (but the test looks sensible, it's > the same trick I used to discover the issue in the first place). After thinking some more on it, this behavior looks rather sensible to me. Are there any objections? > However, the doc patch lost the trailing paren: Fixed on my branch, thanks. -- Michael
Attachment
On 2019-04-16 08:19, Michael Paquier wrote: > On Fri, Apr 12, 2019 at 12:11:12PM +0100, Dagfinn Ilmari Mannsåker wrote: >> I don't have any comments on the code (but the test looks sensible, it's >> the same trick I used to discover the issue in the first place). > > After thinking some more on it, this behavior looks rather sensible to > me. Are there any objections? Looks good to me. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Apr 16, 2019 at 08:50:31AM +0200, Peter Eisentraut wrote: > Looks good to me. Thanks, committed. If there are additional discussions on various points of the feature, let's move to a new thread please. This one has been already extensively used ;) -- Michael