Thread: REINDEX CONCURRENTLY 2.0

REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
Hi all,

Please find attached updated patches for the support of REINDEX
CONCURRENTLY, renamed 2.0 for the occasion:
- 20131114_1_index_drop_comments.patch, patch that updates some
comments in index_drop. This updates only a couple of comments in
index_drop but has not been committed yet. It should be IMO...
- 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch
providing a single API that can be used to wait for old snapshots
- 20131114_3_reindex_concurrently.patch, providing the core feature.
Patch 3 needs to have patch 2 applied first. Regression tests,
isolation tests and documentation are included with the patch.

This is the continuation of the previous thread that finished here:
http://www.postgresql.org/message-id/CAB7nPqS+WYN021oQHd9GPe_5dSVcVXMvEBW_E2AV9OOEwggMHw@mail.gmail.com

This patch has been added for this commit fest.
Regards,
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 11/14/13, 9:40 PM, Michael Paquier wrote:
> Hi all,
> 
> Please find attached updated patches for the support of REINDEX
> CONCURRENTLY, renamed 2.0 for the occasion:
> - 20131114_1_index_drop_comments.patch, patch that updates some
> comments in index_drop. This updates only a couple of comments in
> index_drop but has not been committed yet. It should be IMO...
> - 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch
> providing a single API that can be used to wait for old snapshots
> - 20131114_3_reindex_concurrently.patch, providing the core feature.
> Patch 3 needs to have patch 2 applied first. Regression tests,
> isolation tests and documentation are included with the patch.

The third patch needs to be rebased, because of a conflict in index.c.




Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
Hi,

On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:
> - 20131114_3_reindex_concurrently.patch, providing the core feature.
> Patch 3 needs to have patch 2 applied first. Regression tests,
> isolation tests and documentation are included with the patch.

Have you addressed my concurrency concerns from the last version?

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Sat, Nov 16, 2013 at 5:09 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:
>> - 20131114_3_reindex_concurrently.patch, providing the core feature.
>> Patch 3 needs to have patch 2 applied first. Regression tests,
>> isolation tests and documentation are included with the patch.
>
> Have you addressed my concurrency concerns from the last version?
I have added documentation in the patch with a better explanation
about why those choices of implementation are made.
Thanks,
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2013-11-18 19:52:29 +0900, Michael Paquier wrote:
> On Sat, Nov 16, 2013 at 5:09 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:
> >> - 20131114_3_reindex_concurrently.patch, providing the core feature.
> >> Patch 3 needs to have patch 2 applied first. Regression tests,
> >> isolation tests and documentation are included with the patch.
> >
> > Have you addressed my concurrency concerns from the last version?
> I have added documentation in the patch with a better explanation
> about why those choices of implementation are made.

The dropping still isn't safe:
After phase 4 we are in the state:
old index: valid, live, !isdead
new index: !valid, live, !isdead
Then you do a index_concurrent_set_dead() from that state on in phase 5.
There you do WaitForLockers(locktag, AccessExclusiveLock) before
index_set_state_flags(INDEX_DROP_SET_DEAD).
That's not sufficient.

Consider what happens with the following sequence:
1) WaitForLockers(locktag, AccessExclusiveLock)
   -> GetLockConflicts() => virtualxact 1
   -> VirtualXactLock(1)
2) virtualxact 2 starts, opens the *old* index since it's currently the
   only valid one.
3) virtualxact 1 finishes
4) index_concurrent_set_dead() does index_set_state_flags(DROP_SET_DEAD)
5) another transaction (vxid 3) starts inserting data into the relation,
   updates only the new index, the old index is dead
6) vxid 2 inserts data, updates only the old index. Since it had the
   index already open the cache invalidations won't be processed.

Now the indexes are out of sync. There's entries only in the old index
and there's entries only in the new index. Not good.
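The interleaving above can be made concrete with a toy simulation. To be clear, nothing below is PostgreSQL code; the dict-based "indexes" and the vxid names are invented bookkeeping, just to show how the two indexes end up diverging:

```python
# Toy model of the unsafe ordering: the old index is marked dead before
# waiting out transactions that may still have it open.

def simulate_unsafe_drop():
    old_idx = {"dead": False, "rows": set()}
    new_idx = {"dead": False, "rows": set()}

    # 1) WaitForLockers() snapshots the conflict list: only vxid 1.
    # 2) vxid 2 starts *after* that snapshot and opens the old index,
    #    the only valid one; the choice stays cached while it is open.
    vxid2_target = old_idx

    # 3) vxid 1 finishes, so the wait returns.
    # 4) index_concurrent_set_dead() flags the old index dead.
    old_idx["dead"] = True

    # 5) vxid 3 starts now, sees old as dead, updates only the new index.
    new_idx["rows"].add("row-from-vxid3")

    # 6) vxid 2 inserts through its stale reference: invalidations are
    #    not processed for an index it already has open.
    vxid2_target["rows"].add("row-from-vxid2")

    return old_idx["rows"], new_idx["rows"]

old_rows, new_rows = simulate_unsafe_drop()
print(old_rows, new_rows)  # each index holds a row the other misses
```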

I hate to repeat myself, but you really need to follow the current
protocol for concurrently dropping indexes, which involves *first*
marking the index as invalid so it won't be used for querying anymore,
then waiting for everyone possibly still seeing that entry to finish, and
only *after* that marking the index as dead. You cannot argue away
correctness concerns with potential deadlocks.

c.f. http://www.postgresql.org/message-id/20130926103400.GA2471420@alap2.anarazel.de
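For contrast, the ordering described above (invalid first, then wait, then dead) can be sketched with the same kind of toy bookkeeping. Again, this is not PostgreSQL code, and it deliberately ignores that a real invalid index keeps receiving writes until it is dead; it only models which index a newly starting transaction picks:

```python
def simulate_safe_drop():
    old_idx = {"valid": True, "dead": False, "rows": set()}
    new_idx = {"valid": True, "dead": False, "rows": set()}

    def open_index_for_query():
        # A starting transaction only picks indexes still marked valid.
        return old_idx if old_idx["valid"] else new_idx

    # Step 1: mark the old index invalid *before* any waiting.
    old_idx["valid"] = False

    # A transaction starting in the window before the wait completes can
    # no longer pick the old index: it opens the new one.
    vxid2_target = open_index_for_query()

    # Step 2: wait out everyone who might still see the old entry.
    # Step 3: only now mark the old index dead.
    old_idx["dead"] = True

    vxid2_target["rows"].add("row-from-vxid2")
    open_index_for_query()["rows"].add("row-from-vxid3")

    # All rows land in the new index; nothing is stranded in the old one.
    return old_idx["rows"], new_idx["rows"]
```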


I am also still unconvinced that the logic in index_concurrent_swap() is
correct. It very much needs to explain why no backend can see values
that are inconsistent. E.g. what prevents a backend thinking the old and
new indexes have the same relfilenode? MVCC snapshots don't seem to
protect you against that.
I am not sure there's a problem, but there certainly need to be more
comments explaining why there are none.

Something like the following might be possible:

Backend 1: start reindex concurrently, till phase 4
Backend 2: ExecOpenIndices()
           -> RelationGetIndexList (that list is consistent due to mvcc snapshots!)
Backend 2: -> index_open(old_index) (old relfilenode)
Backend 1: index_concurrent_swap()
           -> CommitTransaction()
           -> ProcArrayEndTransaction() (changes visible to others henceforth!)
Backend 2: -> index_open(new_index)
           => no cache invalidations yet, gets the old relfilenode
Backend 2: ExecInsertIndexTuples()
           => updates the same relation twice, corrupt!
Backend 1: still in CommitTransaction()
           -> AtEOXact_Inval() sends out invalidations
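A toy model of that window (the commit becomes visible before the invalidations go out) looks like this; the relcache dict and the filenode numbers are invented for illustration, not actual relcache code:

```python
def simulate_swap_race():
    # Committed, visible catalog state: index name -> relfilenode.
    catalog = {"idx_old": 100, "idx_new": 200}
    relcache_b2 = {}  # backend 2's private relcache

    def b2_index_open(name):
        # Trusts a cached entry; only an invalidation would evict it.
        if name not in relcache_b2:
            relcache_b2[name] = catalog[name]
        return relcache_b2[name]

    # Backend 2 opens the old index before the swap: filenode 100.
    fn_old = b2_index_open("idx_old")

    # Backend 1: index_concurrent_swap(), then commit.
    # ProcArrayEndTransaction() makes the swapped rows visible ...
    catalog["idx_old"], catalog["idx_new"] = catalog["idx_new"], catalog["idx_old"]
    # ... but AtEOXact_Inval() has not run yet, so backend 2's cached
    # entry for idx_old is not flushed.

    # Backend 2 opens the new index: the fresh catalog read also says 100.
    fn_new = b2_index_open("idx_new")

    # Both opens resolve to the same storage; filenode 200 gets no tuples.
    return fn_old, fn_new

print(simulate_swap_race())  # (100, 100)
```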

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
Michael Paquier escribió:
> Hi all,
> 
> Please find attached updated patches for the support of REINDEX
> CONCURRENTLY, renamed 2.0 for the occasion:
> - 20131114_1_index_drop_comments.patch, patch that updates some
> comments in index_drop. This updates only a couple of comments in
> index_drop but has not been committed yet. It should be IMO...

Pushed this one, thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Jim Nasby
Date:
Sorry for the lateness of this...

On 11/14/13, 8:40 PM, Michael Paquier wrote:
> +    /*
> +     * Phase 4 of REINDEX CONCURRENTLY
> +     *
> +     * Now that the concurrent indexes have been validated could be used,
> +     * we need to swap each concurrent index with its corresponding old index.
> +     * Note that the concurrent index used for swaping is not marked as valid
> +     * because we need to keep the former index and the concurrent index with
> +     * a different valid status to avoid an implosion in the number of indexes
> +     * a parent relation could have if this operation fails multiple times in
> +     * a row due to a reason or another. Note that we already know thanks to
> +     * validation step that
> +     */
> +

Was there supposed to be more to that comment?

In the loop right below it...

+    /* Swap the indexes and mark the indexes that have the old data as invalid */
+    forboth(lc, indexIds, lc2, concurrentIndexIds)
...
+        CacheInvalidateRelcacheByRelid(relOid);

Do we actually need to invalidate the cache in each case? Is it because
we're grabbing a new transaction each time through?
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
Hi,

Thanks for your comments.

On Fri, Jan 10, 2014 at 9:59 AM, Jim Nasby <jim@nasby.net> wrote:
> Sorry for the lateness of this...
>
> On 11/14/13, 8:40 PM, Michael Paquier wrote:
>>
>> +       /*
>> +        * Phase 4 of REINDEX CONCURRENTLY
>> +        *
>> +        * Now that the concurrent indexes have been validated could be
>> used,
>> +        * we need to swap each concurrent index with its corresponding
>> old index.
>> +        * Note that the concurrent index used for swaping is not marked
>> as valid
>> +        * because we need to keep the former index and the concurrent
>> index with
>> +        * a different valid status to avoid an implosion in the number of
>> indexes
>> +        * a parent relation could have if this operation fails multiple
>> times in
>> +        * a row due to a reason or another. Note that we already know
>> thanks to
>> +        * validation step that
>> +        */
>> +
>
>
> Was there supposed to be more to that comment?
Not really, it seems that this chunk came out after writing multiple
successive versions of this patch.

> In the loop right below it...
>
> +       /* Swap the indexes and mark the indexes that have the old data as
> invalid */
> +       forboth(lc, indexIds, lc2, concurrentIndexIds)
> ...
> +               CacheInvalidateRelcacheByRelid(relOid);
>
> Do we actually need to invalidate the cache on each case? Is it because
> we're grabbing a new transaction each time through?
This is to force a refresh of the cached plans that have been using
the old index before transaction of step 4 began.

I have realigned this patch with latest head (d2458e3)... In case
someone is interested at some point...

Regards,
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Jan 21, 2014 at 10:12 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> I have realigned this patch with latest head (d2458e3)... In case
> someone is interested at some point...
Attached is a patch for REINDEX CONCURRENTLY rebased on HEAD
(d7938a4), as some people are showing interest in it by reading recent
discussions. Patch compiles and passes regression as well as isolation
tests..
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2014-08-27 11:00:56 +0900, Michael Paquier wrote:
> On Tue, Jan 21, 2014 at 10:12 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > I have realigned this patch with latest head (d2458e3)... In case
> > someone is interested at some point...
> Attached is a patch for REINDEX CONCURRENTLY rebased on HEAD
> (d7938a4), as some people are showing interest in it by reading recent
> discussions. Patch compiles and passes regression as well as isolation
> tests..

Can you add it to the next CF? I'll try to look earlier, but can't
promise anything.

I very much would like this to get committed in some form or another.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Can you add it to the next CF? I'll try to look earlier, but can't
> promise anything.
>
> I very much would like this to get committed in some form or another.
Added it here to keep track of it:
https://commitfest.postgresql.org/action/patch_view?id=1563
Regards,
-- 
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Wed, Aug 27, 2014 at 3:53 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Can you add it to the next CF? I'll try to look earlier, but can't
> promise anything.
>
> I very much would like this to get committed in some form or another.
Added it here to keep track of it:
https://commitfest.postgresql.org/action/patch_view?id=1563
Attached is a fairly-refreshed patch that should be used as a base for the next commit fest. The following changes should be noticed:
- Use of AccessExclusiveLock when swapping relfilenodes of an index and its concurrent entry instead of ShareUpdateExclusiveLock for safety. At the limit of my understanding, that's the consensus reached until now.
- Cleanup of many comments and refresh of the documentation that was somewhat wrongly-formulated or shaped at some places
- Addition of support for autocommit off in psql for REINDEX [ TABLE | INDEX ] CONCURRENTLY
- Some more code cleanup..
I haven't been through the tab completion support for psql but looking at tab-completion.c this seems a bit tricky with the stuff related to CREATE INDEX CONCURRENTLY already present. Nothing huge though.
Regards,
--
Michael
Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Jim Nasby
Date:
On 10/1/14, 2:00 AM, Michael Paquier wrote:
> On Wed, Aug 27, 2014 at 3:53 PM, Michael Paquier <michael.paquier@gmail.com <mailto:michael.paquier@gmail.com>>
wrote:
>
>     On Wed, Aug 27, 2014 at 3:41 PM, Andres Freund <andres@2ndquadrant.com <mailto:andres@2ndquadrant.com>> wrote:
>     > Can you add it to the next CF? I'll try to look earlier, but can't
>     > promise anything.
>     >
>     > I very much would like this to get committed in some form or another.
>     Added it here to keep track of it:
>     https://commitfest.postgresql.org/action/patch_view?id=1563
>
> Attached is a fairly-refreshed patch that should be used as a base for the next commit fest. The following changes
> should be noticed:
> - Use of AccessExclusiveLock when swapping relfilenodes of an index and its concurrent entry instead of
> ShareUpdateExclusiveLock for safety. At the limit of my understanding, that's the consensus reached until now.
> - Cleanup of many comments and refresh of the documentation that was somewhat wrongly-formulated or shaped at some
> places
> - Addition of support for autocommit off in psql for REINDEX [ TABLE | INDEX ] CONCURRENTLY
> - Some more code cleanup..
> I haven't been through the tab completion support for psql but looking at tab-completion.c this seems a bit tricky
> with the stuff related to CREATE INDEX CONCURRENTLY already present. Nothing huge though.

Patch applies against current HEAD and builds, but I'm getting 37 failed tests (mostly parallel, but also misc and
WITH; results attached). Is that expected?

+   <para>
+    In a concurrent index build, a new index whose storage will replace the one
+    to be rebuild is actually entered into the system catalogs in one
+    transaction, then two table scans occur in two more transactions. Once this
+    is performed, the old and fresh indexes are swapped by taking a lock
+    <literal>ACCESS EXCLUSIVE</>. Finally two additional transactions
+    are used to mark the concurrent index as not ready and then drop it.
+   </para>

The "mark the concurrent index" bit is rather confusing; it sounds like it's referring to the new index instead of the
old. Now that I've read the code I understand what's going on here between the concurrent index *entry* and the
filenode swap, but I don't think the docs make this sufficiently clear to users.

How about something like this:

The following steps occur in a concurrent index build, each in a separate transaction. Note that if there are multiple
indexes to be rebuilt then each step loops through all the indexes we're rebuilding, using a separate transaction for
each one.

1. A new "temporary" index definition is entered into the catalog. This definition is only used to build the new index,
andwill be removed at the completion of the process. 
2. A first pass index build is done.
3. A second pass is performed to add tuples that were added while the first pass build was running.
4. pg_class.relfilenode for the existing index definition and the "temporary" definition are swapped. This means that
the existing index definition now uses the index data that we stored during the build, and the "temporary" definition
is using the old index data.
5. The "temporary" index definition is marked as dead.
6. The "temporary" index definition and its data (which is now the data for the old index) are dropped.
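The bookkeeping in steps 1-6 above can be sketched as follows (toy Python with made-up field names; "filenode" stands in for pg_class.relfilenode and the `idx_cct` name is invented):

```python
def simulate_rebuild_phases():
    # Step 1: a "temporary" definition is entered into the catalog.
    existing = {"name": "idx", "filenode": "old-data", "dead": False}
    temp = {"name": "idx_cct", "filenode": "empty", "dead": False}

    # Steps 2-3: the two build passes fill the temporary index's storage.
    temp["filenode"] = "new-data"

    # Step 4: swap relfilenodes; the long-lived definition now owns the
    # freshly built storage, the temporary one owns the old storage.
    existing["filenode"], temp["filenode"] = temp["filenode"], existing["filenode"]

    # Step 5: mark the temporary definition dead.
    temp["dead"] = True

    # Step 6: drop the temporary definition, taking the old data with it.
    dropped = temp
    return existing, dropped

kept, dropped = simulate_rebuild_phases()
print(kept["filenode"], dropped["filenode"])  # new-data old-data
```

The point of the swap in step 4 is that the index OID users see never changes; only its storage does, which is why the temporary entry can be discarded wholesale at the end.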


+ * index_concurrent_create
+ *
+ * Create an index based on the given one that will be used for concurrent
+ * operations. The index is inserted into catalogs and needs to be built later
+ * on. This is called during concurrent index processing. The heap relation
+ * on which is based the index needs to be closed by the caller.

Last bit presumably should be "on which the index is based".


+    /* Build the list of column names, necessary for index_create */
Instead of all this work wouldn't it be easier to create a version of index_create/ConstructTupleDescriptor that will
use the IndexInfo for the old index? ISTM index_concurrent_create() is doing a heck of a lot of work to marshal data
into one form just to have it get marshaled yet again. Worst case, if we do have to play this game, there should be a
stand-alone function to get the columns/expressions for an existing index; you're duplicating a lot of code from
pg_get_indexdef_worker().

index_concurrent_swap(): Perhaps it'd be better to create index_concurrent_swap_setup() and
index_concurrent_swap_cleanup() and refactor the duplicated code out... the actual function would then become:


ReindexRelationConcurrently()

+ * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
+ * either an index or a table. If a table is specified, each step of REINDEX
+ * CONCURRENTLY is done in parallel with all the table's indexes as well as
+ * its dependent toast indexes.
This comment is a bit misleading; we're not actually doing anything in parallel, right? AFAICT index_concurrent_build
is going to block while each index is built the first time.

+     * relkind is an index, this index itself will be rebuilt. The locks taken
+     * parent relations and involved indexes are kept until this transaction
+     * is committed to protect against schema changes that might occur until
+     * the session lock is taken on each relation.

This comment is a bit unclear to me... at minimum I think it should be "* on parent relations" instead of "* parent
relations", but I think it needs to elaborate on why/when we're also taking session level locks.

I also wordsmithed this comment a bit...
     * Here begins the process for concurrently rebuilding the index entries.
     * We need first to create an index which is based on the same data
     * as the former index except that it will be only registered in catalogs
     * and will be built later. It is possible to perform all the operations
     * on all the indexes at the same time for a parent relation including
     * indexes for its toast relation.

and this one...
     * During this phase the concurrent indexes catch up with any new tuples that
     * were created during the previous phase.
     *
     * We once again wait until no transaction can have the table open with
     * the index marked as read-only for updates. Each index validation is done
     * in a separate transaction to minimize how long we hold an open transaction.

+     * a different valid status to avoid an implosion in the number of indexes
+     * a parent relation could have if this operation step fails multiple times
+     * in a row due to a reason or another.

I'd change that to "explosion in the number of indexes a parent relation could have if this operation fails."

Phase 4, 5 and 6 are rather confusing if you don't understand that each "concurrent index" entry is meant to be thrown
away. I think the Phase 4 comment should elaborate on that.

The comment in check_exclusion_constraint() is good; shouldn't the related comment on this line in index_create()
mention that check_exclusion_constraint() needs to be changed if we ever support concurrent builds of exclusion indexes?
    if (concurrent && is_exclusion && !is_reindex)


--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
Thanks for your input, Jim!

On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> Patch applies against current HEAD and builds, but I'm getting 37 failed
> tests (mostly parallel, but also misc and WITH; results attached). Is that
> expected?
This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.

> The "mark the concurrent index" bit is rather confusing; it sounds like it's
> referring to the new index instead of the old. Now that I've read the code I
> understand what's going on here between the concurrent index *entry* and the
> filenode swap, but I don't think the docs make this sufficiently clear to
> users.
>
> How about something like this:
>
> The following steps occur in a concurrent index build, each in a separate
> transaction. Note that if there are multiple indexes to be rebuilt then each
> step loops through all the indexes we're rebuilding, using a separate
> transaction for each one.
>
> 1. [blah]
Definitely a good idea! I took your text and made it more precise, listing the actions done for each step, the pg_index flags switched, using <orderedlist> to make the list of steps described in a separate paragraph more exhaustive for the user. At the same time I reworked the docs removing a part that was somewhat duplicated about dealing with the constraints having invalid index entries and how to drop them.

> + * index_concurrent_create
> + *
> + * Create an index based on the given one that will be used for concurrent
> + * operations. The index is inserted into catalogs and needs to be built
> later
> + * on. This is called during concurrent index processing. The heap relation
> + * on which is based the index needs to be closed by the caller.
>
> Last bit presumably should be "on which the index is based".
What about "Create a concurrent index based on the definition of the one provided by caller"?

> +       /* Build the list of column names, necessary for index_create */
> Instead of all this work wouldn't it be easier to create a version of
> index_create/ConstructTupleDescriptor that will use the IndexInfo for the
> old index? ISTM index_concurrent_create() is doing a heck of a lot of work
> to marshal data into one form just to have it get marshaled yet again. Worst
> case, if we do have to play this game, there should be a stand-alone
> function to get the columns/expressions for an existing index; you're
> duplicating a lot of code from pg_get_indexdef_worker().
Yes, this definitely sucks and the approach creating a function to get all the column names is not productive as well. Then let's define an additional argument in index_create to pass a potential TupleDesc instead of this whole wart. I noticed as well that we need to actually reset attcacheoff to be able to use a fresh version of the tuple descriptor of the old index. I added a small API for this purpose in tupdesc.h called ResetTupleDescCache. Would it make sense instead to extend CreateTupleDescCopyConstr or CreateTupleDescCopy with a boolean flag?

> index_concurrent_swap(): Perhaps it'd be better to create
> index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
> refactor the duplicated code out... the actual function would then become:
This sentence is not finished :) IMO, index_concurrent_swap looks good as is, taking as arguments the index and its concurrent entry, and swapping their relfilenodes after taking an AccessExclusiveLock that will be held until the end of this transaction.

> ReindexRelationConcurrently()
>
> + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
> + * either an index or a table. If a table is specified, each step of
> REINDEX
> + * CONCURRENTLY is done in parallel with all the table's indexes as well as
> + * its dependent toast indexes.
> This comment is a bit misleading; we're not actually doing anything in
> parallel, right? AFAICT index_concurrent_build is going to block while each
> index is built the first time.
Yes, parallel may be misleading. What is meant here is that each step of the process is done one by one on all the valid indexes a table may have.

> +        * relkind is an index, this index itself will be rebuilt. The locks
> taken
> +        * parent relations and involved indexes are kept until this
> transaction
> +        * is committed to protect against schema changes that might occur
> until
> +        * the session lock is taken on each relation.
>
> This comment is a bit unclear to me... at minimum I think it should be "* on
> parent relations" instead of "* parent relations", but I think it needs to
> elaborate on why/when we're also taking session level locks.
Hum, done as follows:
@@ -896,9 +896,11 @@ ReindexRelationConcurrently(Oid relationOid)
         * If the relkind of given relation Oid is a table, all its valid indexes
         * will be rebuilt, including its associated toast table indexes. If
         * relkind is an index, this index itself will be rebuilt. The locks taken
-        * parent relations and involved indexes are kept until this transaction
+        * on parent relations and involved indexes are kept until this transaction
         * is committed to protect against schema changes that might occur until
-        * the session lock is taken on each relation.
+        * the session lock is taken on each relation, session lock used to
+        * similarly protect from any schema change that could happen within the
+        * multiple transactions that are used during this process.
         */

> I also wordsmithed this comment a bit...
>          * Here begins the process for concurrently rebuilding the index
> and this one...
>          * During this phase the concurrent indexes catch up with any new
Slight differences indeed. Thanks and included.

> I'd change that to "explosion in the number of indexes a parent relation
> could have if this operation fails."
Well, implosion was more... I don't recall my state of mind when writing that. So changed the way you recommend.

> Phase 4, 5 and 6 are rather confusing if you don't understand that each
> "concurrent index" entry is meant to be thrown away. I think the Phase 4
> comment should elaborate on that.
OK, done.

> The comment in check_exclusion_constraint() is good; shouldn't the related
> comment on this line in index_create() mention that
> check_exclusion_constraint() needs to be changed if we ever support
> concurrent builds of exclusion indexes?
>
>         if (concurrent && is_exclusion && !is_reindex)
OK, what about that then:
        /*
-        * This case is currently not supported, but there's no way to ask for it
-        * in the grammar anyway, so it can't happen.
+        * This case is currently only supported during a concurrent index
+        * rebuild, but there is no way to ask for it in the grammar otherwise
+        * anyway. If support for exclusion constraints is added in the future,
+        * the check similar to this one in check_exclusion_constraint should as
+        * well be changed accordingly.

Updated patch is attached.
Thanks again.
Regards,
--
Michael
Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Jim Nasby
Date:
On 10/30/14, 3:19 AM, Michael Paquier wrote:
> Thanks for your input, Jim!
>
> On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com <mailto:Jim.Nasby@bluetreble.com>> wrote:
>  > Patch applies against current HEAD and builds, but I'm getting 37 failed
>  > tests (mostly parallel, but also misc and WITH; results attached). Is that
>  > expected?
> This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing
> inclusion of ruleutils.h in index.c.

>  > The "mark the concurrent index" bit is rather confusing; it sounds like it's
>  > referring to the new index instead of the old. Now that I've read the code I
>  > understand what's going on here between the concurrent index *entry* and the
>  > filenode swap, but I don't think the docs make this sufficiently clear to
>  > users.
>  >
>  > How about something like this:
>  >
>  > The following steps occur in a concurrent index build, each in a separate
>  > transaction. Note that if there are multiple indexes to be rebuilt then each
>  > step loops through all the indexes we're rebuilding, using a separate
>  > transaction for each one.
>  >
>  > 1. [blah]
> Definitely a good idea! I took your text and made it more precise, listing the actions done for each step, the
> pg_index flags switched, using <orderedlist> to make the list of steps described in a separate paragraph more
> exhaustive for the user. At the same time I reworked the docs removing a part that was somewhat duplicated about
> dealing with the constraints having invalid index entries and how to drop them.
 

Awesome! Just a few items here:

+       Then a second pass is performed to add tuples that were added while
+       the first pass build was running. One the validation of the index

s/One the/Once the/

>  > + * index_concurrent_create
>  > + *
>  > + * Create an index based on the given one that will be used for concurrent
>  > + * operations. The index is inserted into catalogs and needs to be built
>  > later
>  > + * on. This is called during concurrent index processing. The heap relation
>  > + * on which is based the index needs to be closed by the caller.
>  >
>  > Last bit presumably should be "on which the index is based".
> What about "Create a concurrent index based on the definition of the one provided by caller"?

That's good too, but my comment was on the last sentence, not the first.

>  > +       /* Build the list of column names, necessary for index_create */
>  > Instead of all this work wouldn't it be easier to create a version of
>  > index_create/ConstructTupleDescriptor that will use the IndexInfo for the
>  > old index? ISTM index_concurrent_create() is doing a heck of a lot of work
>  > to marshal data into one form just to have it get marshaled yet again. Worst
>  > case, if we do have to play this game, there should be a stand-alone
>  > function to get the columns/expressions for an existing index; you're
>  > duplicating a lot of code from pg_get_indexdef_worker().
> Yes, this definitely sucks and the approach creating a function to get all the column names is not productive as
> well. Then let's define an additional argument in index_create to pass a potential TupleDesc instead of this whole
> wart. I noticed as well that we need to actually reset attcacheoff to be able to use a fresh version of the tuple
> descriptor of the old index. I added a small API for this purpose in tupdesc.h called ResetTupleDescCache. Would it
> make sense instead to extend CreateTupleDescCopyConstr or CreateTupleDescCopy with a boolean flag?
 

Perhaps there'd be other places that would want to reset the stats, so I lean slightly that direction.

The comment at the beginning of index_create needs to be modified to mention tupdesc and especially that setting
tupdesc over-rides indexColNames.
 

>  > index_concurrent_swap(): Perhaps it'd be better to create
>  > index_concurrent_swap_setup() and index_concurrent_swap_cleanup() and
>  > refactor the duplicated code out... the actual function would then become:
> This sentence is not finished :) IMO, index_concurrent_swap looks good as is, taking as arguments the index and its
> concurrent entry, and swapping their relfilenodes after taking an AccessExclusiveLock that will be held until the end
> of this transaction.
 

Fair enough.

>  > ReindexRelationConcurrently()
>  >
>  > + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
>  > + * either an index or a table. If a table is specified, each step of
>  > REINDEX
>  > + * CONCURRENTLY is done in parallel with all the table's indexes as well as
>  > + * its dependent toast indexes.
>  > This comment is a bit misleading; we're not actually doing anything in
>  > parallel, right? AFAICT index_concurrent_build is going to block while each
>  > index is built the first time.
> Yes, parallel may be misleading. What is meant here is that each step of the process is done one by one on all the valid indexes a table may have.

New comment looks good.

>  > +        * relkind is an index, this index itself will be rebuilt. The locks
>  > taken
>  > +        * parent relations and involved indexes are kept until this
>  > transaction
>  > +        * is committed to protect against schema changes that might occur
>  > until
>  > +        * the session lock is taken on each relation.
>  >
>  > This comment is a bit unclear to me... at minimum I think it should be "* on
>  > parent relations" instead of "* parent relations", but I think it needs to
>  > elaborate on why/when we're also taking session level locks.
> Hum, done as follows:
> @@ -896,9 +896,11 @@ ReindexRelationConcurrently(Oid relationOid)
>           * If the relkind of given relation Oid is a table, all its valid indexes
>           * will be rebuilt, including its associated toast table indexes. If
>           * relkind is an index, this index itself will be rebuilt. The locks taken
> -        * parent relations and involved indexes are kept until this transaction
> +        * on parent relations and involved indexes are kept until this transaction
>           * is committed to protect against schema changes that might occur until
> -        * the session lock is taken on each relation.
> +        * the session lock is taken on each relation, session lock used to
> +        * similarly protect from any schema change that could happen within the
> +        * multiple transactions that are used during this process.
>           */

Cool.


> OK, what about that then:
>          /*
> -        * This case is currently not supported, but there's no way to ask for it
> -        * in the grammar anyway, so it can't happen.
> +        * This case is currently only supported during a concurrent index
> +        * rebuild, but there is no way to ask for it in the grammar otherwise
> +        * anyway. If support for exclusion constraints is added in the future,
> +        * the check similar to this one in check_exclusion_constraint should as
> +        * well be changed accordingly.
>
> Updated patch is attached.

Works for me.

Keep in mind I'm not super familiar with the guts of index creation, so it'd be good for someone else to look at that bit (especially index_concurrent_create and ReindexRelationConcurrently).
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: REINDEX CONCURRENTLY 2.0

From
Jim Nasby
Date:
On 10/30/14, 3:19 AM, Michael Paquier wrote:
> On Wed, Oct 29, 2014 at 7:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com <mailto:Jim.Nasby@bluetreble.com>> wrote:
>  > Patch applies against current HEAD and builds, but I'm getting 37 failed
>  > tests (mostly parallel, but also misc and WITH; results attached). Is that
>  > expected?
> This is caused by the recent commit 7b1c2a0 (that I actually participated in reviewing :p) because of a missing inclusion of ruleutils.h in index.c.

Sorry, forgot to mention patch now passes make check cleanly.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Updated patch is attached.
Please find attached an updated patch with the following things changed:
- Addition of tab completion in psql for all new commands
- Addition of a call to WaitForLockers in index_concurrent_swap to
ensure that there are no transactions still running on the parent table
before exclusive locks are taken on the index and its concurrent entry.
Previous patch versions created deadlocks because of that, an issue
spotted by the isolation tests integrated in the patch.
- Isolation tests for reindex concurrently are re-enabled by default.
Regards,
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 10/1/14 3:00 AM, Michael Paquier wrote:
> - Use of AccessExclusiveLock when swapping relfilenodes of an index and
> its concurrent entry instead of ShareUpdateExclusiveLock for safety. At
> the limit of my understanding, that's the consensus reached until now.

I'm very curious about this point.  I looked through all the previous
discussions, and the only time I saw this mentioned was at the very
beginning when it was said that we could review the patch while ignoring
this issue and fix it later with MVCC catalog access.  Then it got very
technical, but it was never explicitly concluded whether it was possible
to fix this or not.

Also, in the thread "Concurrently option for reindexdb" you pointed out
that requiring an exclusive lock isn't really concurrent and proposed an
option like --minimum-locks.

I will point out again that we specifically invented DROP INDEX
CONCURRENTLY because holding an exclusive lock even briefly isn't good
enough.

If REINDEX cannot work without an exclusive lock, we should invent some
other qualifier, like WITH FEWER LOCKS.  It's still useful, but we
shouldn't give people the idea that they can throw away their custom
CREATE INDEX CONCURRENTLY + DROP INDEX CONCURRENTLY scripts.




Re: REINDEX CONCURRENTLY 2.0

From
Jeff Janes
Date:
On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Updated patch is attached.
Please find attached an updated patch with the following things changed:
- Addition of tab completion in psql for all new commands
- Addition of a call to WaitForLockers in index_concurrent_swap to
ensure that there are no running transactions on the parent table
running before exclusive locks are taken on the index and its
concurrent entry. Previous patch versions created deadlocks because of
that, issue spotted by the isolation tests integrated in the patch.
- Isolation tests for reindex concurrently are re-enabled by default.
Regards,

It looks like this needs another rebase; I get failures on index.c, toasting.c, indexcmds.c, and index.h

Thanks,

Jeff

Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Nov 11, 2014 at 3:24 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
> On Wed, Nov 5, 2014 at 8:49 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
>>
>> On Thu, Oct 30, 2014 at 5:19 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>> > Updated patch is attached.
>> Please find attached an updated patch with the following things changed:
>> - Addition of tab completion in psql for all new commands
>> - Addition of a call to WaitForLockers in index_concurrent_swap to
>> ensure that there are no running transactions on the parent table
>> running before exclusive locks are taken on the index and its
>> concurrent entry. Previous patch versions created deadlocks because of
>> that, issue spotted by the isolation tests integrated in the patch.
>> - Isolation tests for reindex concurrently are re-enabled by default.
>> Regards,
>
>
> It looks like this needs another rebase, I get failures on index.c, toasting.c, indexcmds.c, and index.h

Indeed. There are some conflicts created by the recent modification of
index_create. Here is a rebased patch.
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> If REINDEX cannot work without an exclusive lock, we should invent some
> other qualifier, like WITH FEWER LOCKS.

What he said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> If REINDEX cannot work without an exclusive lock, we should invent some
>> other qualifier, like WITH FEWER LOCKS.
>
> What he said.

But more to the point .... why, precisely, can't this work without an
AccessExclusiveLock?  And can't we fix that instead of settling for
something clearly inferior?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> >> If REINDEX cannot work without an exclusive lock, we should invent some
> >> other qualifier, like WITH FEWER LOCKS.
> >
> > What he said.

I'm unconvinced. A *short* exclusive lock (just to update two pg_class
rows) still gives most of the benefits of CONCURRENTLY. Also, I do think
we can get rid of that period in the not too far away future.

> But more to the point .... why, precisely, can't this work without an
> AccessExclusiveLock?  And can't we fix that instead of settling for
> something clearly inferior?

It's nontrivial to fix, but I think we can fix it at some point. I just
think we should get the *major* part of the feature before investing
lots of time making it even better. There's *very* frequent questions
about having this. And people do really dangerous stuff (like manually
updating pg_class.relfilenode and such) to cope.

The problem is that it's very hard to avoid the wrong index's
relfilenode being used when swapping the relfilenodes between two
indexes.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Wed, Nov 12, 2014 at 4:39 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
>> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> >> If REINDEX cannot work without an exclusive lock, we should invent some
>> >> other qualifier, like WITH FEWER LOCKS.
>> >
>> > What he said.
>
> I'm unconvinced. A *short* exclusive lock (just to update two pg_class
> row), still gives most of the benefits of CONCURRENTLY.

I am pretty doubtful about that.  It's still going to require you to
wait for all transactions to drain out of the table while new ones are
blocked from entering.  Which sucks.  Unless all of your transactions
are very short, but that's not necessarily typical.

> The problem is that it's very hard to avoid the wrong index's
> relfilenode being used when swapping the relfilenodes between two
> indexes.

How about storing both the old and new relfilenodes in the same pg_class entry?

1. Take a snapshot.
2. Index all the tuples in that snapshot.
3. Publish the new relfilenode to an additional pg_class column,
relnewfilenode or similar.
4. Wait until everyone can see step #3.
5. Rescan the table and add any missing tuples to the index.
6. Set some flag in pg_class to mark the relnewfilenode as active and
relfilenode as not to be used for queries.
7. Wait until everyone can see step #6.
8. Set some flag in pg_class to mark relfilenode as not even to be opened.
9. Wait until everyone can see step #8.
10. Drop old relfilenode.
11. Clean up by setting relfilenode = relnewfilenode, relnewfilenode = 0.

This is basically CREATE INDEX CONCURRENTLY (without the first step
where we out-wait people who might create now-invalid HOT chains,
because that can't arise with a REINDEX of an existing index) plus
DROP INDEX CONCURRENTLY.
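[Editorial illustration: the state machine in the steps above can be modeled as a toy simulation. This is not PostgreSQL code; the `relnewfilenode` field is Robert's proposal, but the state names and helper functions are all invented for the sketch, and the real "wait until everyone can see" barriers are only noted in comments.]

```python
from dataclasses import dataclass
from enum import Enum, auto

class IndexState(Enum):
    # Invented states corresponding to the steps in the sketch.
    NORMAL = auto()       # only relfilenode exists (steady state)
    BUILDING = auto()     # steps 3-5: new node published, old still queried
    NEW_ACTIVE = auto()   # step 6: new node queried, old still updated
    OLD_CLOSED = auto()   # step 8: old node must not even be opened

@dataclass
class PgClassEntry:
    relfilenode: int
    relnewfilenode: int = 0
    state: IndexState = IndexState.NORMAL

def filenodes_to_update(e: PgClassEntry) -> set[int]:
    """Which storages must receive new index tuples at this point."""
    if e.state is IndexState.NORMAL:
        return {e.relfilenode}
    if e.state in (IndexState.BUILDING, IndexState.NEW_ACTIVE):
        return {e.relfilenode, e.relnewfilenode}
    return {e.relnewfilenode}          # OLD_CLOSED

def filenode_for_queries(e: PgClassEntry) -> int:
    """Which storage scans should use."""
    if e.state in (IndexState.NORMAL, IndexState.BUILDING):
        return e.relfilenode
    return e.relnewfilenode

def reindex(e: PgClassEntry, new_node: int) -> None:
    """Walk the whole protocol; in the real proposal each transition is
    followed by a wait-for-everyone barrier (steps 4, 7, 9)."""
    e.relnewfilenode = new_node
    e.state = IndexState.BUILDING      # steps 3-5
    e.state = IndexState.NEW_ACTIVE    # step 6
    e.state = IndexState.OLD_CLOSED    # step 8
    # steps 10-11: drop old storage, fold the new one back in
    e.relfilenode, e.relnewfilenode = e.relnewfilenode, 0
    e.state = IndexState.NORMAL
```

The point of the simulation is only that every intermediate state is unambiguous about which set of filenodes gets writes and which single filenode serves reads.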

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2014-11-12 18:23:38 -0500, Robert Haas wrote:
> On Wed, Nov 12, 2014 at 4:39 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
> >> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> >> >> If REINDEX cannot work without an exclusive lock, we should invent some
> >> >> other qualifier, like WITH FEWER LOCKS.
> >> >
> >> > What he said.
> >
> > I'm unconvinced. A *short* exclusive lock (just to update two pg_class
> > row), still gives most of the benefits of CONCURRENTLY.
> 
> I am pretty doubtful about that.  It's still going to require you to
> wait for all transactions to drain out of the table while new ones are
> blocked from entering.  Which sucks.  Unless all of your transactions
> are very short, but that's not necessarily typical.

Yes, it sucks. But it beats not being able to reindex a relation with a
primary key (referenced by a fkey) without waiting several hours by a
couple magnitudes. And that's the current situation.

> > The problem is that it's very hard to avoid the wrong index's
> > relfilenode being used when swapping the relfilenodes between two
> > indexes.
> 
> How about storing both the old and new relfilenodes in the same pg_class entry?

That's quite a cool idea

[think a bit]

But I think it won't work realistically. We have a *lot* of
infrastructure that refers to indexes using its primary key. I don't
think we want to touch all those places to also disambiguate on some
other factor. All the relevant APIs are either just passing around oids
or relcache entries.

There's also the problem that we'd really need two different pg_index
rows to make things work. Alternatively we can duplicate the three
relevant columns (indisvalid, indisready, indislive) in there for the
different filenodes. But that's not entirely pretty.

> 1. Take a snapshot.
> 2. Index all the tuples in that snapshot.
> 3. Publish the new relfilenode to an additional pg_class column,
> relnewfilenode or similar.
> 4. Wait until everyone can see step #3.

Here all backends need to update both indexes, right? And all the
indexing infrastructure can't deal with that without having separate
oids & relcache entries.

> 5. Rescan the table and add any missing tuples to the index.
> 6. Set some flag in pg_class to mark the relnewfilenode as active and
> relfilenode as not to be used for queries.
> 7. Wait until everyone can see step #6.
> 8. Set some flag in pg_class to mark relfilenode as not even to be opened.
> 9. Wait until everyone can see step #8.
> 10. Drop old relfilenode.
> 11. Clean up by setting relfilenode = relnewfilenode, relnewfilenode = 0.

Even that one isn't trivial - how do you deal with the fact that
somebody looking at updating relnewfilenode might, in the midst of
processing, see relnewfilenode = 0?


I've earlier come up with a couple possible solutions, but I
unfortunately found holes in all of them. And if I can find holes in
them, there surely are more :(.

I don't recall what the problem with just swapping the names was - but
I'm pretty sure there was one... Hm. The index relation oids are
referred to by constraints and dependencies. That's somewhat
solvable. But I think there was something else as well...

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
Andres Freund wrote:
> On 2014-11-12 18:23:38 -0500, Robert Haas wrote:

> > > The problem is that it's very hard to avoid the wrong index's
> > > relfilenode being used when swapping the relfilenodes between two
> > > indexes.
> > 
> > How about storing both the old and new relfilenodes in the same pg_class entry?
> 
> That's quite a cool idea
> 
> [think a bit]
> 
> But I think it won't work realistically. We have a *lot* of
> infrastructure that refers to indexes using its primary key.

Hmm, can we make the relmapper do this job instead of having another
pg_class column?  Essentially the same sketch Robert proposed, instead
we would initially set relfilenode=0 and have all onlookers use the
relmapper to obtain the correct relfilenode; switching to the new
relfilenode can be done atomically, and un-relmap the index once the
process is complete.

The difference from what Robert proposes is that the transient state is
known to cause failures for anyone not prepared to deal with it, so it
should be easy to spot what places need adjustment.
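[Editorial illustration: a toy model of that "fail loudly" property, not actual relmapper code; all names are invented. While the swap is in flight the pg_class row carries relfilenode = 0; mapper-aware readers fall back to the mapper, and anything unaware fails immediately rather than silently using stale storage.]

```python
# Invented mapper: index oid -> current filenode during a swap.
relmapper: dict[int, int] = {}

class NotMappedError(Exception):
    pass

def naive_filenode(pg_class_relfilenode: int) -> int:
    """An 'onlooker' that was never taught about the mapper: it cannot
    accidentally use a wrong filenode, it just errors out."""
    if pg_class_relfilenode == 0:
        raise NotMappedError("relfilenode is 0; caller must consult the relmapper")
    return pg_class_relfilenode

def mapped_filenode(oid: int, pg_class_relfilenode: int) -> int:
    """A mapper-aware lookup: fall back to the mapper during the swap."""
    if pg_class_relfilenode != 0:
        return pg_class_relfilenode
    return relmapper[oid]
```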

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Nov 13, 2014 at 9:31 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> I don't recall what the problem with just swapping the names was - but
> I'm pretty sure there was one... Hm. The index relation oids are
> referred to by constraints and dependencies. That's somewhat
> solvable. But I think there was something else as well...
The reason given 2 years ago for not using relname was the fact that
the oid of the index changes, and to it be refered by some pg_depend
entries:
http://www.postgresql.org/message-id/20121208133730.GA6422@awork2.anarazel.de
http://www.postgresql.org/message-id/12742.1354977643@sss.pgh.pa.us
Regards,
-- 
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Nov 13, 2014 at 10:26 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Nov 13, 2014 at 9:31 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> I don't recall what the problem with just swapping the names was - but
>> I'm pretty sure there was one... Hm. The index relation oids are
>> referred to by constraints and dependencies. That's somewhat
>> solvable. But I think there was something else as well...
> The reason given 2 years ago for not using relname was the fast that
> the oid of the index changes, and to it be refered by some pg_depend
> entries:
Feel free to correct: "and that it could be referred".
-- 
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> But I think it won't work realistically. We have a *lot* of
> infrastructure that refers to indexes using its primary key. I don't
> think we want to touch all those places to also disambiguate on some
> other factor. All the relevant APIs are either just passing around oids
> or relcache entries.

I'm not quite following this.  The whole point is to AVOID having two
indexes.  You have one index which may at times have two sets of
physical storage.

> There's also the problem that we'd really need two different pg_index
> rows to make things work. Alternatively we can duplicate the three
> relevant columns (indisready, indislive, indislive) in there for the
> different filenodes. But that's not entirely pretty.

I think what you would probably end up with is a single "char" or int2
column that defines the state of the index.  Certain states would be
valid only when relnewfilenode != 0.
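[Editorial illustration: a minimal sketch of that single-state-column idea, with invented state codes, just to show the validity rule tying the transient states to relnewfilenode.]

```python
# Invented one-character state codes: '-' = steady state, and three
# transient codes that only make sense while a second storage exists.
STATES_NEEDING_NEW_NODE = {"b", "n", "c"}   # building / new-active / old-closed
VALID_STATES = {"-"} | STATES_NEEDING_NEW_NODE

def check_index_state(state: str, relnewfilenode: int) -> bool:
    """Sanity check a (state, relnewfilenode) pair as a catalog invariant."""
    if state not in VALID_STATES:
        return False
    if state in STATES_NEEDING_NEW_NODE:
        return relnewfilenode != 0       # transient states need a second node
    return relnewfilenode == 0           # steady state carries no second node
```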

>> 1. Take a snapshot.
>> 2. Index all the tuples in that snapshot.
>> 3. Publish the new relfilenode to an additional pg_class column,
>> relnewfilenode or similar.
>> 4. Wait until everyone can see step #3.
>
> Here all backends need to update both indexes, right?

Yes.

> And all the
> indexing infrastructure can't deal with that without having separate
> oids & relcache entries.

Why not?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 11/12/14 7:31 PM, Andres Freund wrote:
> Yes, it sucks. But it beats not being able to reindex a relation with a
> primary key (referenced by a fkey) without waiting several hours by a
> couple magnitudes. And that's the current situation.

That's fine, but we have, for better or worse, defined CONCURRENTLY :=
does not take exclusive locks.  Use a different adverb for an in-between
facility.




Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote:
>On 11/12/14 7:31 PM, Andres Freund wrote:
>> Yes, it sucks. But it beats not being able to reindex a relation with
>a
>> primary key (referenced by a fkey) without waiting several hours by a
>> couple magnitudes. And that's the current situation.
>
>That's fine, but we have, for better or worse, defined CONCURRENTLY :=
>does not take exclusive locks.  Use a different adverb for an
>in-between
>facility.

I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions.

I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that.

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Jim Nasby
Date:
On 11/13/14, 3:50 PM, Andres Freund wrote:
> On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 11/12/14 7:31 PM, Andres Freund wrote:
>>> Yes, it sucks. But it beats not being able to reindex a relation with
>> a
>>> primary key (referenced by a fkey) without waiting several hours by a
>>> couple magnitudes. And that's the current situation.
>>
>> That's fine, but we have, for better or worse, defined CONCURRENTLY :=
>> does not take exclusive locks.  Use a different adverb for an
>> in-between
>> facility.
>
> I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions.

It also means that if we ever found a way to get rid of the exclusive lock we'd then have an inconsistency anyway. Or we'd also create REINDEX CONCURRENT at that time, and then have 2 command syntaxes to support.

> I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that.

Having been responsible for a site where downtime was a 6 figure dollar amount per hour, I've spent a LOT of time worrying about lock problems. The really big issue here isn't grabbing an exclusive lock; it's grabbing one at some random time when no one is there to actively monitor what's happening. (If you can't handle *any* exclusive locks, that also means you can never do an ALTER TABLE ADD COLUMN either.) With that in mind, would it be possible to set this up so that the time-consuming process of building the new index file happens first, and then (optionally) some sort of DBA action is required to actually do the relfilenode swap? I realize that's not the most elegant solution, but it's WAY better than this feature not hitting 9.5 and people having to hand-code a solution.

Possible syntax:
REINDEX CONCURRENTLY -- Does what current patch does
REINDEX CONCURRENT BUILD -- Builds new files
REINDEX CONCURRENT SWAP -- Swaps new files in

This suffers from the syntax problems I mentioned above, but at least this way it's all limited to one command, and it probably allows a lot more people to use it.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2014-11-14 02:04:00 -0600, Jim Nasby wrote:
> On 11/13/14, 3:50 PM, Andres Freund wrote:
> Having been responsible for a site where downtime was a 6 figure
> dollar amount per hour, I've spent a LOT of time worrying about lock
> problems. The really big issue here isn't grabbing an exclusive lock;
> it's grabbing one at some random time when no one is there to actively
> monitor what's happening. (If you can't handle *any* exclusive locks,
> that also means you can never do an ALTER TABLE ADD COLUMN either.)

> With that in mind, would it be possible to set this up so that the
> time-consuming process of building the new index file happens first,
> and then (optionally) some sort of DBA action is required to actually
> do the relfilenode swap? I realize that's not the most elegant
> solution, but it's WAY better than this feature not hitting 9.5 and
> people having to hand-code a solution.

I don't think having a multi step version of the feature and it not
making into 9.5 are synonymous. And I really don't want to make it even
more complex before we have the basic version in.

I think a split like your:

> Possible syntax:
> REINDEX CONCURRENTLY -- Does what current patch does
> REINDEX CONCURRENT BUILD -- Builds new files
> REINDEX CONCURRENT SWAP -- Swaps new files in

could make sense, but it's really an additional feature ontop.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2014-11-13 11:41:18 -0500, Robert Haas wrote:
> On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > But I think it won't work realistically. We have a *lot* of
> > infrastructure that refers to indexes using its primary key. I don't
> > think we want to touch all those places to also disambiguate on some
> > other factor. All the relevant APIs are either just passing around oids
> > or relcache entries.
> 
> I'm not quite following this.  The whole point is to AVOID having two
> indexes.  You have one index which may at times have two sets of
> physical storage.

Right. But how are we going to refer to these different relfilenodes?
All the indexing infrastructure just uses oids and/or Relation pointers
to refer to the index. How would you hand down the knowledge which of
the relfilenodes is supposed to be used in some callchain?

There are ugly solutions like having a flag like 'bool
rd_useotherfilenode' inside struct RelationData, but even ignoring the
ugliness I don't think that'd work well - what if some function called
inside the index code again starts a index lookup?

I think I might just not getting your idea here?

> > And all the
> > indexing infrastructure can't deal with that without having separate
> > oids & relcache entries.

Hopefully explained above?

Greetings,

Andres Freund



Re: REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Fri, Nov 14, 2014 at 11:47 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-11-13 11:41:18 -0500, Robert Haas wrote:
>> On Wed, Nov 12, 2014 at 7:31 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > But I think it won't work realistically. We have a *lot* of
>> > infrastructure that refers to indexes using its primary key. I don't
>> > think we want to touch all those places to also disambiguate on some
>> > other factor. All the relevant APIs are either just passing around oids
>> > or relcache entries.
>>
>> I'm not quite following this.  The whole point is to AVOID having two
>> indexes.  You have one index which may at times have two sets of
>> physical storage.
>
> Right. But how are we going to refer to these different relfilenodes?
> All the indexing infrastructure just uses oids and/or Relation pointers
> to refer to the index. How would you hand down the knowledge which of
> the relfilenodes is supposed to be used in some callchain?

If you've got a Relation, you don't need someone to tell you which
physical storage to use; you can figure that out for yourself by
looking at the Relation.  If you've got an OID, you're probably going
to go conjure up a Relation, and then you can do the same thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Nov 13, 2014 at 10:25 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Andres Freund wrote:
>> On 2014-11-12 18:23:38 -0500, Robert Haas wrote:
>
>> > > The problem is that it's very hard to avoid the wrong index's
>> > > relfilenode being used when swapping the relfilenodes between two
>> > > indexes.
>> >
>> > How about storing both the old and new relfilenodes in the same pg_class entry?
>>
>> That's quite a cool idea
>>
>> [think a bit]
>>
>> But I think it won't work realistically. We have a *lot* of
>> infrastructure that refers to indexes using it's primary key.
>
> Hmm, can we make the relmapper do this job instead of having another
> pg_class column?  Essentially the same sketch Robert proposed, instead
> we would initially set relfilenode=0 and have all onlookers use the
> relmapper to obtain the correct relfilenode; switching to the new
> relfilenode can be done atomically, and un-relmap the index once the
> process is complete.
> The difference from what Robert proposes is that the transient state is
> known to cause failures for anyone not prepared to deal with it, so it
> should be easy to spot what places need adjustment.

How would the failure handling actually work? Would we need some extra
process to remove the extra relfilenodes? Note that in the current
patch the temporary concurrent entries are kept as INVALID all the time,
giving the user a path to remove them with DROP INDEX, even in the case
of invalid toast indexes in the pg_toast catalog.

Note that I am on the side of using the exclusive lock when swapping
relfilenodes for now in any case; that's what pg_repack does by
renaming the indexes, and people use it.
-- 
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Oskari Saarenmaa
Date:
13.11.2014, 23:50, Andres Freund kirjoitti:
> On November 13, 2014 10:23:41 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 11/12/14 7:31 PM, Andres Freund wrote:
>>> Yes, it sucks. But it beats not being able to reindex a relation with
>> a
>>> primary key (referenced by a fkey) without waiting several hours by a
>>> couple magnitudes. And that's the current situation.
>>
>> That's fine, but we have, for better or worse, defined CONCURRENTLY :=
>> does not take exclusive locks.  Use a different adverb for an
>> in-between
>> facility.
> 
> I think that's not actually a service to our users. They'll have to adapt their scripts and knowledge when we get around to the more concurrent version. What exactly CONCURRENTLY means is already not strictly defined and differs between the actions.
> 
> I'll note that DROP INDEX CONCURRENTLY actually already internally acquires an AEL lock. Although it's a bit harder to see the consequences of that.

If the short-lived lock is the only blocker for this feature at the
moment could we just require an additional qualifier for CONCURRENTLY
(FORCE?) until the lock can be removed, something like:

tmp# REINDEX INDEX CONCURRENTLY tmp_pkey;
ERROR:  REINDEX INDEX CONCURRENTLY is not fully concurrent; use REINDEX
INDEX CONCURRENTLY FORCE to perform reindex with a short-lived lock.

tmp=# REINDEX INDEX CONCURRENTLY FORCE tmp_pkey;
REINDEX

It's not optimal, but currently there's no way to reindex a primary key
anywhere close to concurrently and a short lock would be a huge
improvement over the current situation.

/ Oskari




Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Dec 23, 2014 at 5:54 PM, Oskari Saarenmaa <os@ohmu.fi> wrote:
>
> If the short-lived lock is the only blocker for this feature at the
> moment could we just require an additional qualifier for CONCURRENTLY
> (FORCE?) until the lock can be removed, something like:
> =# [blah]

FWIW, I'd just keep only CONCURRENTLY with no fancy additional
keywords even if we cheat on it, as long as the documentation makes it
clear that an exclusive lock is taken for a very short time, much
shorter than what a normal REINDEX would take btw.

> It's not optimal, but currently there's no way to reindex a primary key
> anywhere close to concurrently and a short lock would be a huge
> improvement over the current situation.
Yep.
-- 
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> >> If REINDEX cannot work without an exclusive lock, we should invent some
> >> other qualifier, like WITH FEWER LOCKS.
> >
> > What he said.
> 
> But more to the point .... why, precisely, can't this work without an
AccessExclusiveLock?  And can't we fix that instead of settling for
something clearly inferior?

So, here's an alternative approach of how to get rid of the AEL
locks. They're required because we want to switch the relfilenodes
around. I've pretty much no confidence in any of the schemes anybody has
come up with to avoid that.

So, let's not switch relfilenodes around.

I think we should instead just use the new index, repoint the
dependencies onto the new oid, and then afterwards, when dropping,
rename the new index onto the old one. That means the oid of the
index will change and some less than pretty grovelling around
dependencies, but it still seems preferable to what we're discussing
here otherwise.

Does anybody see a fundamental problem with that approach?
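
For a plain (non-constraint) index, this is roughly the dance users
already script by hand today (index and column names here are
illustrative):

```sql
-- Hypothetical manual equivalent of the proposed approach:
CREATE INDEX CONCURRENTLY idx_ccnew ON tab (col);  -- build under a new oid
DROP INDEX CONCURRENTLY idx;                       -- the old oid goes away
ALTER INDEX idx_ccnew RENAME TO idx;               -- same name, new oid
```

The dependency repointing is what would let the same trick cover indexes
backing constraints, which the manual version cannot handle.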

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Mon, Feb 2, 2015 at 9:10 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-11-12 16:11:58 -0500, Robert Haas wrote:
>> On Wed, Nov 12, 2014 at 4:10 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> > On Thu, Nov 6, 2014 at 9:50 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> >> If REINDEX cannot work without an exclusive lock, we should invent some
>> >> other qualifier, like WITH FEWER LOCKS.
>> >
>> > What he said.
>>
>> But more to the point .... why, precisely, can't this work without an
>> AccessExclusiveLock?  And can't we fix that instead of settling for
>> something clearly inferior?
>
> So, here's an alternative approach of how to get rid of the AEL
> locks. They're required because we want to switch the relfilenodes
> around. I've pretty much no confidence in any of the schemes anybody has
> come up with to avoid that.
>
> So, let's not switch relfilenodes around.
>
> I think we should instead just use the new index, repoint the
> dependencies onto the new oid, and then afterwards, when dropping,
> rename the new index onto the old one. That means the oid of the
> index will change and some less than pretty grovelling around
> dependencies, but it still seems preferable to what we're discussing
> here otherwise.
>
> Does anybody see a fundamental problem with that approach?

I'm not sure whether that will work out, but it seems worth a try.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 02/02/2015 03:10 PM, Andres Freund wrote:
> I think we should instead just use the new index, repoint the
> dependencies onto the new oid, and then afterwards, when dropping,
> rename the new index onto the old one. That means the oid of the
> index will change and some less than pretty grovelling around
> dependencies, but it still seems preferable to what we're discussing
> here otherwise.

I think that sounds like a good plan. The oid change does not seem like 
too big a deal to me, especially since that is what users who do this 
manually get today. Do you still think this is the right way to solve this?

I have attached my work in progress patch, which implements this approach 
and is heavily based on Michael's previous work. There are some things left 
to do, but I think I should have a patch ready for the next commitfest if 
people still like this type of solution.

I also changed index_set_state_flags() to be transactional since I 
wanted the old index to become invalid at exactly the same time as the 
new becomes valid. From reviewing the code that seems like a safe change.

A couple of bike shedding questions:

- Is the syntax "REINDEX <type> CONCURRENTLY <object>" ok?

- What should we do with REINDEX DATABASE CONCURRENTLY and the system 
catalog? I do not think we can safely reindex the system catalogs 
concurrently, so what should REINDEX DATABASE do with the catalog indexes? 
Skip them, reindex them while taking locks, or just error out?

- What level of information should be output in VERBOSE mode?

What remains to be implemented:

- Support for exclusion constraints
- Look more into how I handle constraints (currently the temporary index 
also seems to have the PRIMARY KEY flag)
- Support for the VERBOSE flag
- More testing to catch bugs

Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Sun, Feb 12, 2017 at 6:44 AM, Andreas Karlsson <andreas@proxel.se> wrote:
> On 02/02/2015 03:10 PM, Andres Freund wrote:
>> I think we should instead just use the new index, repoint the
>> dependencies onto the new oid, and then afterwards, when dropping,
>> rename the new index onto the old one. That means the oid of the
>> index will change and some less than pretty grovelling around
>> dependencies, but it still seems preferable to what we're discussing
>> here otherwise.
>
> I think that sounds like a good plan. The oid change does not seem like a
> too big deal to me, especially since that is what users will get now too. Do
> you still think this is the right way to solve this?

That mainly hurts system indexes. Perhaps users with broken system
indexes are not going to care about concurrency anyway. Thinking about
it now I don't see why that would not work, but I have not thought
deeply about this problem lately.

> I have attached my work in progress patch which implements and is very
> heavily based on Michael's previous work. There are some things left to do
> but I think I should have a patch ready for the next commitfest if people
> still like this type of solution.

Cool to see a rebase of this patch. It's been a long time...

> I also changed index_set_state_flags() to be transactional since I wanted
> the old index to become invalid at exactly the same time as the new becomes
> valid. From reviewing the code that seems like a safe change.
>
> A couple of bike shedding questions:
>
> - Is the syntax "REINDEX <type> CONCURRENTLY <object>" ok?

Yeah, that's fine. At least that's what has been concluded in previous threads.

> - What should we do with REINDEX DATABASE CONCURRENTLY and the system
> catalog? I do not think we can reindex the system catalog concurrently
> safely, so what should REINDEX DATABASE do with the catalog indexes? Skip
> them, reindex them while taking locks, or just error out?

System indexes cannot have their OIDs changed as they are used in
syscache lookups. So just logging a warning looks fine to me, and the
price to pay to avoid taking an exclusive lock even for a short amount
of time.

> - What level of information should be output in VERBOSE mode?

Er, something like that as well, no?
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

> What remains to be implemented:
> - Support for exclusion constraints
> - Look more into how I handle constraints (currently the temporary index too
> seems to have the PRIMARY KEY flag)
> - Support for the VERBOSE flag
> - More testing to catch bugs

This is a crasher:
create table aa (a int primary key);
reindex (verbose) schema concurrently public ;

Sometimes snapshots are still active for invalid indexes (for example
after triggering the previous crash):
=# reindex (verbose) table concurrently aa;
WARNING:  XX002: cannot reindex concurrently invalid index
"public.aa_pkey_cct", skipping
LOCATION:  ReindexRelationConcurrently, indexcmds.c:2119
WARNING:  01000: snapshot 0x7fde12003038 still active
-- 
Michael



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 02/13/2017 06:31 AM, Michael Paquier wrote:
>> - What should we do with REINDEX DATABASE CONCURRENTLY and the system
>> catalog? I do not think we can reindex the system catalog concurrently
>> safely, so what should REINDEX DATABASE do with the catalog indexes? Skip
>> them, reindex them while taking locks, or just error out?
>
> System indexes cannot have their OIDs changed as they are used in
> syscache lookups. So just logging a warning looks fine to me, and the
> price to pay to avoid taking an exclusive lock even for a short amount
> of time.

Good idea, I think I will add one line of warning if it finds any system 
index in the schema.

>> - What level of information should be output in VERBOSE mode?
>
> Er, something like that as well, no?
> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

REINDEX (VERBOSE) currently prints one such line per index, which does 
not really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all 
indexes on a relation at the same time. It is not immediately obvious 
how this should work. Maybe one such detail line per table?

> This is a crasher:
> create table aa (a int primary key);
> reindex (verbose) schema concurrently public ;
>
> For invalid indexes sometimes snapshots are still active (after
> issuing the previous crash for example):
> =# reindex (verbose) table concurrently aa;
> WARNING:  XX002: cannot reindex concurrently invalid index
> "public.aa_pkey_cct", skipping
> LOCATION:  ReindexRelationConcurrently, indexcmds.c:2119
> WARNING:  01000: snapshot 0x7fde12003038 still active

Thanks for testing the patch! The crash was caused by things being 
allocated in the wrong memory context when reindexing multiple tables 
and therefore freed on the first intermediate commit. I have created a 
new memory context to handle this, in which I only allocate the lists 
which need to survive between transactions.

Hm, when writing the above I just realized why ReindexTable/ReindexIndex 
did not suffer from the same bug. It is because the first transaction 
there allocated in the PortalHeapMemory context which survives commit. I 
really need to look at if there is a clean way to handle memory contexts 
in my patch.

I also found the snapshot still active bug, it seems to have been caused 
by REINDEX TABLE CONCURRENTLY leaving an open snapshot which cannot be 
popped by PortalRunUtility().

Thanks again!
Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andreas@proxel.se> wrote:
> On 02/13/2017 06:31 AM, Michael Paquier wrote:
>> Er, something like that as well, no?
>> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
>
> REINDEX (VERBOSE) currently prints one such line per index, which does not
> really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes
> on a relation at the same time. It is not immediately obvious how this
> should work. Maybe one such detail line per table?

It is hard to recall the details after all this time, given that a
relation is reindexed by processing all the indexes once at each step.
such a way that it processes all the steps for each index
individually? This way you can monitor the time it takes to build
completely each index, including its . This operation would consume
more transactions but in the event of a failure the amount of things
to clean up is greatly reduced, particularly for relations with many
indexes. This would also reduce VERBOSE to printing one line per index
rebuilt.
-- 
Michael



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Feb 14, 2017 at 12:56 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> This way you can monitor the time it takes to build
> completely each index, including its .

You can ignore the terms "including its" here.
-- 
Michael



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 02/14/2017 04:56 AM, Michael Paquier wrote:
> On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andreas@proxel.se> wrote:
>> On 02/13/2017 06:31 AM, Michael Paquier wrote:
>>> Er, something like that as well, no?
>>> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
>>
>> REINDEX (VERBOSE) currently prints one such line per index, which does not
>> really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes
>> on a relation at the same time. It is not immediately obvious how this
>> should work. Maybe one such detail line per table?
>
> Hard to recall this thing in details with the time and the fact that a
> relation is reindexed by processing all the indexes once at each step.
> Hm... What if ReindexRelationConcurrently() actually is refactored in
> such a way that it processes all the steps for each index
> individually? This way you can monitor the time it takes to build
> completely each index, including its . This operation would consume
> more transactions but in the event of a failure the amount of things
> to clean up is really reduced particularly for relations with many
> indexes. This would as well reduce VERBOSE to print one line per index
> rebuilt.

I am actually thinking about going the opposite direction (by reducing 
the number of times we call WaitForLockers), because it is not just 
about consuming transaction IDs, we also do not want to wait too many 
times for transactions to commit. I am leaning towards only calling 
WaitForLockersMultiple three times per table.

1. Between building and validating the new indexes.
2. Between setting the old indexes to invalid and setting them to dead
3. Between setting the old indexes to dead and dropping them

Right now my patch loops over the indexes in step 2 and 3 and waits for 
lockers once per index. This seems rather wasteful.

I have thought about whether the code might be cleaner if we just looped 
over all indexes (and as a bonus the VERBOSE output would be more 
obvious), but I do not think it is worth waiting for lockers all those 
extra times.

Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 02/17/2017 01:53 PM, Andreas Karlsson wrote:
> I am actually thinking about going the opposite direction (by reducing
> the number of times we call WaitForLockers), because it is not just
> about consuming transaction IDs, we also do not want to wait too many
> times for transactions to commit. I am leaning towards only calling
> WaitForLockersMultiple three times per table.
>
> 1. Between building and validating the new indexes.
> 2. Between setting the old indexes to invalid and setting them to dead
> 3. Between setting the old indexes to dead and dropping them
>
> Right now my patch loops over the indexes in step 2 and 3 and waits for
> lockers once per index. This seems rather wasteful.
>
> I have thought about that the code might be cleaner if we just looped
> over all indexes (and as a bonus the VERBOSE output would be more
> obvious), but I do not think it is worth waiting for lockers all those
> extra times.

Thinking about this makes me wonder why you decided to use a 
transaction per index in many of the steps rather than a transaction per 
step. Most steps should be quick. The only steps where I think it makes 
sense to have a transaction per index are:

1) When building indexes to avoid long running transactions.

2) When validating the new indexes, also to avoid long running transactions.

But when swapping the indexes or when dropping the old indexes I do not 
see any reason to not just use one transaction per step since we do not 
even have to wait for any locks (other than WaitForLockers which we just 
want to call once anyway since all indexes relate to the same table).

Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote:
> Thinking about this makes me wonder why you decided to use a
> transaction per index in many of the steps rather than a transaction per
> step. Most steps should be quick. The only steps where I think it makes
> sense to have a transaction per index are:

I don't recall all the details to be honest :)

> 1) When building indexes to avoid long running transactions.
> 2) When validating the new indexes, also to avoid long running transactions.
>
> But when swapping the indexes or when dropping the old indexes I do not see
> any reason to not just use one transaction per step since we do not even
> have to wait for any locks (other than WaitForLockers which we just want to
> call once anyway since all indexes relate to the same table).

Perhaps, this really needs a careful lookup.

By the way, as this patch is showing up for the first time in this
development cycle, would it be allowed in the last commit fest? That's
not a patch in the easy category, far from that, but it does not
present a new concept.
-- 
Michael



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Bruce Momjian
Date:
On Fri, Feb 17, 2017 at 11:05:31PM +0900, Michael Paquier wrote:
> On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote:
> > Thinking about this makes me wonder why you decided to use a
> > transaction per index in many of the steps rather than a transaction per
> > step. Most steps should be quick. The only steps where I think it makes
> > sense to have a transaction per index are:
> 
> I don't recall all the details to be honest :)
> 
> > 1) When building indexes to avoid long running transactions.
> > 2) When validating the new indexes, also to avoid long running transactions.
> >
> > But when swapping the indexes or when dropping the old indexes I do not see
> > any reason to not just use one transaction per step since we do not even
> > have to wait for any locks (other than WaitForLockers which we just want to
> > call once anyway since all indexes relate to the same table).
> 
> Perhaps, this really needs a careful lookup.
> 
> By the way, as this patch is showing up for the first time in this
> development cycle, would it be allowed in the last commit fest? That's
> not a patch in the easy category, far from that, but it does not
> present a new concept.

FYI, the thread started on 2013-11-15.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Feb 28, 2017 at 5:29 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Fri, Feb 17, 2017 at 11:05:31PM +0900, Michael Paquier wrote:
>> On Fri, Feb 17, 2017 at 10:43 PM, Andreas Karlsson <andreas@proxel.se> wrote:
>> > Thinking about this makes me wonder why you decided to use a
>> > transaction per index in many of the steps rather than a transaction per
>> > step. Most steps should be quick. The only steps where I think it makes
>> > sense to have a transaction per index are:
>>
>> I don't recall all the details to be honest :)
>>
>> > 1) When building indexes to avoid long running transactions.
>> > 2) When validating the new indexes, also to avoid long running transactions.
>> >
>> > But when swapping the indexes or when dropping the old indexes I do not see
>> > any reason to not just use one transaction per step since we do not even
>> > have to wait for any locks (other than WaitForLockers which we just want to
>> > call once anyway since all indexes relate to the same table).
>>
>> Perhaps, this really needs a careful lookup.
>>
>> By the way, as this patch is showing up for the first time in this
>> development cycle, would it be allowed in the last commit fest? That's
>> not a patch in the easy category, far from that, but it does not
>> present a new concept.
>
> FYI, the thread started on 2013-11-15.

I don't object to the addition of this patch to the next CF as it
presents no new concept. Still, because it is a complicated patch, I
was wondering if people are fine with including it in this last CF.
-- 
Michael



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes:
> I don't object to the addition of this patch to the next CF as it
> presents no new concept. Still, because it is a complicated patch, I
> was wondering if people are fine with including it in this last CF.

The March CF is already looking pretty daunting.  We can try to include
this but I won't be too surprised if it gets punted to a future CF.
        regards, tom lane



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Bruce Momjian
Date:
On Mon, Feb 27, 2017 at 05:31:21PM -0500, Tom Lane wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
> > I don't object to the addition of this patch to the next CF as it
> > presents no new concept. Still, because it is a complicated patch, I
> > was wondering if people are fine with including it in this last CF.
> 
> The March CF is already looking pretty daunting.  We can try to include
> this but I won't be too surprised if it gets punted to a future CF.

Yeah, that was my reaction too.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
Hi,

Here is a third take on this feature, heavily based on Michael Paquier's 
2.0 patch. This time the patch does not attempt to preserve the index 
oids, but instead creates new indexes and moves all dependencies from 
the old indexes to the new ones before dropping the old ones. The only 
downside I can see to this approach is that we will no longer be able to 
reindex catalog tables concurrently, but in return it should be easier 
to confirm that this approach can be made to work.

This patch relies on that we can change the indisvalid flag of indexes 
transactionally, and as far as I can tell this is the case now that we 
have MVCC for the catalog updates.

The code does some extra intermediate commits when building the indexes 
to avoid long running transactions.

How REINDEX CONCURRENTLY operates:

For each table:

1. Create new indexes without populating them, and lock the tables and 
indexes for the session.

2. After waiting for all running transactions populate each index in a 
separate transaction and set them to ready.

3. After waiting again for all running transactions validate each index 
in a separate transaction (but not setting them to valid just yet).

4. Swap all dependencies over from each old index to the new index and 
rename the old and new indexes (from <name> to <name>_ccold and from 
<name>_new to <name>), and set the isprimary and isexclusion flags. Here we 
also mark the new indexes as valid and the old indexes as invalid.

5. After waiting for all running transactions we change each index from 
invalid to dead.

6. After waiting for all running transactions we drop each index.

7. Drop all session locks.
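
While a table is being processed, the intermediate index states should be 
visible in the catalog. A rough sketch of what one might see during steps 
2-5 (the table name and the _ccnew suffix here are illustrative, not 
output from the patch):

```sql
-- Old and new index side by side while the rebuild is in flight:
SELECT indexrelid::regclass, indisready, indisvalid
FROM pg_index
WHERE indrelid = 'tab'::regclass;
--  tab_pkey       | t | t    <- old index, still serving queries
--  tab_pkey_ccnew | t | f    <- new index, receiving inserts, not yet valid
```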

Andreas


Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Jim Nasby
Date:
On 2/28/17 11:21 AM, Andreas Karlsson wrote:
> The only downside I can see to this approach is that we will no longer be
> able to reindex catalog tables concurrently, but in return it should be
> easier to confirm that this approach can be made to work.

Another downside is that any stored regclass fields will become invalid. 
Admittedly that's a pretty unusual use case, but it'd be nice if there 
was at least a way to let users fix things during the rename phase 
(perhaps via an event trigger).
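
A minimal illustration of the problem (hypothetical names; a regclass 
value is stored as a bare oid, so it dangles once the index is dropped):

```sql
CREATE TABLE idx_registry (idx regclass);
INSERT INTO idx_registry VALUES ('tab_pkey'::regclass);
-- After a REINDEX CONCURRENTLY that replaces tab_pkey under a new oid:
SELECT idx FROM idx_registry;
-- prints the old numeric oid instead of "tab_pkey", since no relation
-- carries that oid anymore
```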
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andreas@proxel.se> wrote:
> For each table:
>
> 1. Create new indexes without populating them, and lock the tables and
> indexes for the session.

+    /*
+     * Copy constraint flags from the old index. This is safe because the
+     * old index guaranteed uniqueness.
+     */
+    newIndexForm->indisprimary = oldIndexForm->indisprimary;
+    oldIndexForm->indisprimary = false;
+    newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
+    oldIndexForm->indisexclusion = false;
[...]
+        deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
+                                        RelationRelationId, DEPENDENCY_AUTO);
+        deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
+                                        ConstraintRelationId,
DEPENDENCY_INTERNAL);
+
+        // TODO: pg_depend for old index?
There is a lot of mumbo-jumbo in the patch to create the exact same
index definition as the original one being reindexed, and that's a
huge maintenance burden for the future. You can blame me for that in
the current patch. I am wondering if it would not just be better to
generate a CREATE INDEX query string and then use the SPI to create
the index, and also do the following extensions at SQL level:
- Add a sort of WITH NO DATA clause where the index is created, so the
index is created empty, and is marked invalid and not ready.
- Extend pg_get_indexdef_string() with an optional parameter to
enforce the index name to something else, most likely it should be
extended with the WITH NO DATA/INVALID clause, which should just be a
storage parameter by the way.
With something like that, the only thing the REINDEX CONCURRENTLY code
path needs to be careful about is choosing an index name that avoids any
conflicts.
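
For reference, pg_get_indexdef() already produces a definition that can
be replayed as-is, so the SQL-level route would mostly be a matter of
splicing in a different name plus the new clause (index and table names
here are illustrative):

```sql
SELECT pg_get_indexdef('tab_pkey'::regclass);
-- CREATE UNIQUE INDEX tab_pkey ON public.tab USING btree (a)
```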
-- 
Michael



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2017-03-01 19:25:23 -0600, Jim Nasby wrote:
> On 2/28/17 11:21 AM, Andreas Karlsson wrote:
> > The only downside I can see to this approach is that we will no longer be
> > able to reindex catalog tables concurrently, but in return it should be
> > easier to confirm that this approach can be made to work.
> 
> Another downside is any stored regclass fields will become invalid.
> Admittedly that's a pretty unusual use case, but it'd be nice if there was
> at least a way to let users fix things during the rename phase (perhaps via
> an event trigger).

I'm fairly confident that we don't want to invoke event triggers inside
the CIC code...  I'm also fairly confident that between index oids
stored somewhere being invalidated - what'd be a realistic use case of
that - and not having reindex concurrently, just about everyone will
choose the former.

Regards,

Andres



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 03/02/2017 02:25 AM, Jim Nasby wrote:
> On 2/28/17 11:21 AM, Andreas Karlsson wrote:
>> The only downside I can see to this approach is that we will no longer be
>> able to reindex catalog tables concurrently, but in return it should be
>> easier to confirm that this approach can be made to work.
>
> Another downside is any stored regclass fields will become invalid.
> Admittedly that's a pretty unusual use case, but it'd be nice if there
> was at least a way to let users fix things during the rename phase
> (perhaps via an event trigger).

Good point, but I agree with Andres here. Having REINDEX CONCURRENTLY 
issue event triggers seems strange to me. While it does create and drop 
indexes as part of its implementation, it is actually just an index 
maintenance job.

Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Thu, Mar 2, 2017 at 11:48 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-03-01 19:25:23 -0600, Jim Nasby wrote:
>> On 2/28/17 11:21 AM, Andreas Karlsson wrote:
>> > The only downside I can see to this approach is that we will no longer be
>> > able to reindex catalog tables concurrently, but in return it should be
>> > easier to confirm that this approach can be made to work.
>>
>> Another downside is any stored regclass fields will become invalid.
>> Admittedly that's a pretty unusual use case, but it'd be nice if there was
>> at least a way to let users fix things during the rename phase (perhaps via
>> an event trigger).
>
> I'm fairly confident that we don't want to invoke event triggers inside
> the CIC code...  I'm also fairly confident that between index oids
> stored somewhere being invalidated - what'd be a realistic use case of
> that - and not having reindex concurrently, just about everyone will
> choose the former.

Maybe.  But it looks to me like this patch is going to have
considerably more than its share of user-visible warts, and I'm not
very excited about that.  I feel like what we ought to be doing is
keeping the index OID the same and changing the relfilenode to point
to a newly-created one, and I attribute our failure to make that
design work thus far to insufficiently aggressive hacking.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:

On March 4, 2017 1:16:56 AM PST, Robert Haas <robertmhaas@gmail.com> wrote:
>Maybe.  But it looks to me like this patch is going to have
>considerably more than its share of user-visible warts, and I'm not
>very excited about that.  I feel like what we ought to be doing is
>keeping the index OID the same and changing the relfilenode to point
>to a newly-created one, and I attribute our failure to make that
>design work thus far to insufficiently aggressive hacking.

We literally spent years and a lot of emails waiting for that to happen. Users now hack up solutions like this in
userspace, obviously a bad solution.

I agree it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund <andres@anarazel.de> wrote:
> I agree it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart.

I, again, give that a firm "maybe".  If the warts end up annoying 1%
of the users who try to use this feature, then you're right.  If they
end up making a substantial percentage of people who try to use this
feature give up on it, then we've added a bunch of complexity and
future code maintenance for little real gain.  I'm not ruling out the
possibility that you're 100% correct, but I'm not nearly as convinced
of that as you are.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Geoghegan
Date:
On Sat, Mar 4, 2017 at 9:34 AM, Andres Freund <andres@anarazel.de> wrote:
> On March 4, 2017 1:16:56 AM PST, Robert Haas <robertmhaas@gmail.com> wrote:
>>Maybe.  But it looks to me like this patch is going to have
>>considerably more than its share of user-visible warts, and I'm not
>>very excited about that.  I feel like what we ought to be doing is
>>keeping the index OID the same and changing the relfilenode to point
>>to a newly-created one, and I attribute our failure to make that
>>design work thus far to insufficiently aggressive hacking.
>
> We literally spent years and a lot of emails waiting for that to happen. Users now hack up solutions like this
> in userspace, obviously a bad solution.
>
> I agree that'd it be nicer not to have this, but not having the feature at all is a lot worse than this wart.

IMHO, REINDEX itself is implemented in a way that is conceptually
pure, and yet quite user hostile.

I tend to tell colleagues that ask about REINDEX something along the
lines of: Just assume that REINDEX is going to block out even SELECT
statements referencing the underlying table. It might not be that bad
for you in practice, but the details are arcane such that it might as
well be that simple most of the time. Even if you have time to listen
to me explain it all, which you clearly don't, you're still probably
not going to be able to apply what you've learned in a way that helps
you.

-- 
Peter Geoghegan



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 03/05/2017 07:56 PM, Robert Haas wrote:
> On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund <andres@anarazel.de> wrote:
>> I agree that'd it be nicer not to have this, but not having the feature at all is a lot worse than this wart.
>
> I, again, give that a firm "maybe".  If the warts end up annoying 1%
> of the users who try to use this feature, then you're right.  If they
> end up making a substantial percentage of people who try to use this
> feature give up on it, then we've added a bunch of complexity and
> future code maintenance for little real gain.  I'm not ruling out the
> possibility that you're 100% correct, but I'm not nearly as convinced
> of that as you are.

I agree that these warts are annoying, but I also think the limitations
can be explained pretty easily to users (e.g. by explaining in the 
manual how REINDEX CONCURRENTLY creates a new index to replace the old 
one), and I think that is the important question for whether a useful 
feature with warts should be merged or not. Does it make things 
substantially harder for the average user to grok?

And I would argue that this feature is useful for quite many users, based on my 
experience running a semi-large database. Index bloat happens and 
without REINDEX CONCURRENTLY it can be really annoying to solve, 
especially for primary keys. Certainly more people have problems with 
index bloat than the number of people who store index oids in their 
database.
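
The kind of manual fix being referred to for a bloated primary key looks
roughly like this (a hedged sketch with invented table/constraint names; the
constraint swap still takes a brief ACCESS EXCLUSIVE lock, and it fails
outright if foreign keys reference the primary key):

```sql
-- Build a replacement unique index without blocking writers.
CREATE UNIQUE INDEX CONCURRENTLY tab_pkey_new ON tab (id);

-- Swap it in as the primary key; this takes a short ACCESS EXCLUSIVE
-- lock, and the new index is renamed to the constraint name.
BEGIN;
ALTER TABLE tab DROP CONSTRAINT tab_pkey;
ALTER TABLE tab
    ADD CONSTRAINT tab_pkey PRIMARY KEY USING INDEX tab_pkey_new;
COMMIT;
```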

Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote:
> And I would argue that his feature is useful for quite many, based on my
> experience running a semi-large database. Index bloat happens and without
> REINDEX CONCURRENTLY it can be really annoying to solve, especially for
> primary keys. Certainly more people have problems with index bloat than the
> number of people who store index oids in their database.

Yeah, but that's not the only wart, I think.  For example, I believe
(haven't looked at this patch series in a while) that the patch takes
a lock and later escalates the lock level.  If so, that could lead to
doing a lot of work to build the index and then getting killed by the
deadlock detector.  Also, if by any chance you think (or use any
software that thinks) that OIDs for system objects are a stable
identifier, this will be the first case where that ceases to be true.
If the system is shut down or crashes or the session is killed, you'll
be left with stray objects with names that you've never typed into the
system.  I'm sure you're going to say "don't worry, none of that is
any big deal" and maybe you're right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2017-03-07 21:48:23 -0500, Robert Haas wrote:
> On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote:
> > And I would argue that his feature is useful for quite many, based on my
> > experience running a semi-large database. Index bloat happens and without
> > REINDEX CONCURRENTLY it can be really annoying to solve, especially for
> > primary keys. Certainly more people have problems with index bloat than the
> > number of people who store index oids in their database.
> 
> Yeah, but that's not the only wart, I think.

I don't really see any other warts that don't correspond to CREATE/DROP
INDEX CONCURRENTLY.


> For example, I believe (haven't looked at this patch series in a
> while) that the patch takes a lock and later escalates the lock level.

It shouldn't* - that was required precisely because we had to switch the
relfilenodes when the oid stayed the same.  Otherwise in-progress index
lookups could end up using the wrong relfilenodes and/or switch in the
middle of a lookup.

* excepting the exclusive lock DROP INDEX CONCURRENTLY style dropping uses after marking the index as dead - but that
shouldn't be much of a concern?


> Also, if by any chance you think (or use any software that thinks)
> that OIDs for system objects are a stable identifier, this will be the
> first case where that ceases to be true.

Can you come up with a halfway realistic scenario why an index oid, not
a table, constraint, or sequence oid, would be relied upon?


> If the system is shut down or crashes or the session is killed, you'll
> be left with stray objects with names that you've never typed into the
> system.

Given how relatively few complaints we have about CIC's possibility of
ending up with invalid indexes - not that there are none - and its
widespread usage, I'm not too concerned about this.

Greetings,

Andres Freund



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 03/08/2017 03:48 AM, Robert Haas wrote:
> On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote:
>> And I would argue that his feature is useful for quite many, based on my
>> experience running a semi-large database. Index bloat happens and without
>> REINDEX CONCURRENTLY it can be really annoying to solve, especially for
>> primary keys. Certainly more people have problems with index bloat than the
>> number of people who store index oids in their database.
>
> Yeah, but that's not the only wart, I think.

The only two potential issues I see with the patch are:

1) That the index oid changes visibly to external users.

2) That the code for moving the dependencies will need to be updated 
when adding new things which refer to an index oid.

Given how useful I find REINDEX CONCURRENTLY I think these warts are 
worth it given that the impact is quite limited. I am of course biased 
since if I did not believe this I would not pursue this solution in the 
first place.

> For example, I believe
> (haven't looked at this patch series in a while) that the patch takes
> a lock and later escalates the lock level.  If so, that could lead to
> doing a lot of work to build the index and then getting killed by the
> deadlock detector.

This version of the patch no longer does that. For my use case 
escalating the lock would make this patch much less interesting. The 
highest lock level taken is the same as the initial one (SHARE 
UPDATE EXCLUSIVE). At a high level, the current patch does (very 
simplified) this:

1. CREATE INDEX CONCURRENTLY ind_new;
2. Atomically move all dependencies from ind to ind_new, rename ind to 
ind_old, and rename ind_new to ind.
3. DROP INDEX CONCURRENTLY ind_old;

The actual implementation is a bit more complicated, but no 
part escalates the lock level beyond what is required by the steps 
for creating and dropping indexes concurrently.
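
In plain SQL, the nearest userspace equivalent of those three steps looks
like this (a simplified sketch with invented names, not the patch's actual
code path; unlike the patch, the renames take an ACCESS EXCLUSIVE lock and
nothing moves constraints or other dependencies over):

```sql
-- Phase 1: build a replacement index concurrently.
CREATE INDEX CONCURRENTLY ind_new ON tab (col);

-- Phase 2: swap the names. In the patch this step also atomically
-- moves constraints, comments and other dependencies to the new index.
BEGIN;
ALTER INDEX ind RENAME TO ind_old;
ALTER INDEX ind_new RENAME TO ind;
COMMIT;

-- Phase 3: get rid of the bloated old index without blocking readers.
DROP INDEX CONCURRENTLY ind_old;
```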

> Also, if by any chance you think (or use any
> software that thinks) that OIDs for system objects are a stable
> identifier, this will be the first case where that ceases to be true.
> If the system is shut down or crashes or the session is killed, you'll
> be left with stray objects with names that you've never typed into the
> system.  I'm sure you're going to say "don't worry, none of that is
> any big deal" and maybe you're right.

Hm, I cannot think of any real life scenario where this will be an issue 
based on my personal experience with PostgreSQL, but if you can think of 
one please provide it. I will try to ponder some more on this myself.

Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Jim Nasby
Date:
On 3/8/17 9:34 AM, Andreas Karlsson wrote:
>> Also, if by any chance you think (or use any
>> software that thinks) that OIDs for system objects are a stable
>> identifier, this will be the first case where that ceases to be true.
>> If the system is shut down or crashes or the session is killed, you'll
>> be left with stray objects with names that you've never typed into the
>> system.  I'm sure you're going to say "don't worry, none of that is
>> any big deal" and maybe you're right.
>
> Hm, I cannot think of any real life scenario where this will be an issue
> based on my personal experience with PostgreSQL, but if you can think of
> one please provide it. I will try to ponder some more on this myself.

The case I currently have is to allow tracking database objects similar 
to (but not the same as) how we track the objects that belong to an 
extension[1]. That currently depends on event triggers to keep names 
updated if they're changed, as well as making use of the reg* types. If 
an event trigger fired as part of the index rename (essentially treating 
it like an ALTER INDEX) then I should be able to work around that.

The ultimate reason for doing this is to provide something similar to 
extensions (create a bunch of database objects that are all bound 
together), but also similar to classes in OO languages (so you can have 
multiple instances).[2]

Admittedly, this is pretty off the beaten path and I certainly wouldn't 
hold up the patch because of it. I am hoping that it'd be fairly easy to 
fire an event trigger as if someone had just renamed the index.

1: https://github.com/decibel/object_reference
2: https://github.com/decibel/pg_classy
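
A minimal sketch of what that could look like, assuming the swap fired a
ddl_command_end event the same way ALTER INDEX ... RENAME does (the function
and trigger names here are invented):

```sql
CREATE OR REPLACE FUNCTION on_index_rename() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    cmd record;
BEGIN
    -- pg_event_trigger_ddl_commands() reports the objects touched by
    -- the DDL command that just completed.
    FOR cmd IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        RAISE NOTICE 'index % (oid %) was altered',
            cmd.objid::regclass, cmd.objid;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER track_index_renames
    ON ddl_command_end
    WHEN TAG IN ('ALTER INDEX')
    EXECUTE PROCEDURE on_index_rename();
```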
-- 
Jim Nasby, Chief Data Architect, OpenSCG
http://OpenSCG.com



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Thomas Munro
Date:
On Wed, Mar 8, 2017 at 4:12 PM, Andres Freund <andres@anarazel.de> wrote:
> Can you come up with an halfway realistic scenario why an index oid, not
> a table, constraint, sequence oid, would be relied upon?

Is there an implication for SIREAD locks?  Predicate locks on index
pages include the index OID in the tag.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Thomas Munro
Date:
On Fri, Mar 10, 2017 at 9:36 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Mar 8, 2017 at 4:12 PM, Andres Freund <andres@anarazel.de> wrote:
>> Can you come up with an halfway realistic scenario why an index oid, not
>> a table, constraint, sequence oid, would be relied upon?
>
> Is there an implication for SIREAD locks?  Predicate locks on index
> pages include the index OID in the tag.

Ah, yes, but that is covered by a call to
TransferPredicateLocksToHeapRelation() in index_concurrent_set_dead().

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 03/02/2017 03:10 AM, Michael Paquier wrote:
> On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andreas@proxel.se> wrote:
> +    /*
> +     * Copy contraint flags for old index. This is safe because the old index
> +     * guaranteed uniquness.
> +     */
> +    newIndexForm->indisprimary = oldIndexForm->indisprimary;
> +    oldIndexForm->indisprimary = false;
> +    newIndexForm->indisexclusion = oldIndexForm->indisexclusion;
> +    oldIndexForm->indisexclusion = false;
> [...]
> +        deleteDependencyRecordsForClass(RelationRelationId, newIndexOid,
> +                                        RelationRelationId, DEPENDENCY_AUTO);
> +        deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid,
> +                                        ConstraintRelationId,
> DEPENDENCY_INTERNAL);
> +
> +        // TODO: pg_depend for old index?

Spotted one of my TODO comments there so I have attached a patch where I 
have cleaned up that function. I also fixed the the code to properly 
support triggers.

> There is a lot of mumbo-jumbo in the patch to create the exact same
> index definition as the original one being reindexed, and that's a
> huge maintenance burden for the future. You can blame me for that in
> the current patch. I am wondering if it would not just be better to
> generate a CREATE INDEX query string and then use the SPI to create
> the index, and also do the following extensions at SQL level:
> - Add a sort of WITH NO DATA clause where the index is created, so the
> index is created empty, and is marked invalid and not ready.
> - Extend pg_get_indexdef_string() with an optional parameter to
> enforce the index name to something else, most likely it should be
> extended with the WITH NO DATA/INVALID clause, which should just be a
> storage parameter by the way.
> By doing something like that what the REINDEX CONCURRENTLY code path
> should just be careful about is that it chooses an index name that
> avoids any conflicts.

Hm, I am not sure how much that would help since a lot of the mumbo-jumbo 
is by necessity in the step where we move the constraints over from the 
old index to the new.

Andreas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 03/13/2017 03:11 AM, Andreas Karlsson wrote:
> I also fixed the the code to properly support triggers.

And by "support triggers" I actually meant fixing the support for moving 
the foreign keys to the new index.

Andreas



Re: REINDEX CONCURRENTLY 2.0

From
Michael Banck
Date:
Hi,

I had a look at this.

On Mon, Mar 13, 2017 at 03:11:50AM +0100, Andreas Karlsson wrote:
> Spotted one of my TODO comments there so I have attached a patch where I
> have cleaned up that function. I also fixed the the code to properly support
> triggers.

The patch applies with quite a few offsets on top of current (2fd8685)
master, I have not verified that those are all ok.

Regression tests pass, also the included isolation tests.

I hope that Michael will post a full review as he worked on the code
extensively, but here are some code comments, mostly on the
comments (note that I'm not a native speaker, so I might be wrong on
some of them as well):

> diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
> index 3908ade37b..3449c0af73 100644
> --- a/doc/src/sgml/ref/reindex.sgml
> +++ b/doc/src/sgml/ref/reindex.sgml
> @@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replacea
>        An index build with the <literal>CONCURRENTLY</> option failed, leaving
>        an <quote>invalid</> index. Such indexes are useless but it can be
>        convenient to use <command>REINDEX</> to rebuild them. Note that
> -      <command>REINDEX</> will not perform a concurrent build. To build the
> -      index without interfering with production you should drop the index and
> -      reissue the <command>CREATE INDEX CONCURRENTLY</> command.
> +      <command>REINDEX</> will perform a concurrent build if <literal>
> +      CONCURRENTLY</> is specified. To build the index without interfering
> +      with production you should drop the index and reissue either the
> +      <command>CREATE INDEX CONCURRENTLY</> or <command>REINDEX CONCURRENTLY</>
> +      command. Indexes of toast relations can be rebuilt with <command>REINDEX
> +      CONCURRENTLY</>.

I think the "To build the index[...]" part should be rephrased; the
current diff makes it sound like you should drop the index first even if
you reindex concurrently. What about "Note that <command>REINDEX</> will
only perform a concurrent build if <literal> CONCURRENTLY</> is
specified"?

Anyway, this part is only about reindexing invalid indexes, so
mentioning that reindex is not concurrently or the part about create-
index-concurrently-then-rename only for this case is a bit weird, but
this is a pre-existing condition.

> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
> index 8d42a347ea..c40ac0b154 100644
> --- a/src/backend/catalog/index.c
> +++ b/src/backend/catalog/index.c
>  /*
> + * index_concurrent_create_copy
> + *
> + * Create a concurrent index based on the definition of the one provided by
> + * caller that will be used for concurrent operations. The index is inserted
> + * into catalogs and needs to be built later on. This is called during
> + * concurrent reindex processing. The heap relation on which is based the index
> + * needs to be closed by the caller.
> + */

That should be "The heap relation on which the index is based ..." I
think.

> +    /*
> +     * We have to re-build the IndexInfo struct, since it was lost in
> +     * commit of transaction where this concurrent index was created
> +     * at the catalog level.
> +     */
> +    indexInfo = BuildIndexInfo(indexRelation);

Looks like indexInfo starts with lowercase, but the comment above has
upper case `IndexInfo'.

> +/*
> + * index_concurrent_swap
> + *
> + * Swap name, dependencies and constraints of the old index over to the new
> + * index, while marking the old index as invalid and the new as valid.
> + */

The `while' looks slightly odd to me, ISTM this is just another
operation this function performs, whereas "while" makes it sound like
the marking happens concurrently; so maybe ". Also, mark the old index
as invalid[...]"?

> +void
> +index_concurrent_swap(Oid newIndexOid, Oid oldIndexOid, const char *oldName)
> +{

[...]

> +    /*
> +     * Copy contraint flags for old index. This is safe because the old index
> +     * guaranteed uniquness.
> +     */

"uniqueness".

> +    /* Mark old index as valid and new is invalid as index_set_state_flags */

"new as invalid". Also, this comment style is different to this one:

> +    /*
> +     * Move contstraints and triggers over to the new index
> +     */

I guess the latter could be changed to a one-line comment as the former,
but maybe there is a deeper sense (locality of comment?) in this.

> +            /* make a modifiable copy */

I think comments should start capitalized?


> +/*
> + * index_concurrent_drop
> + *
> + * Drop a single index concurrently as the last step of an index concurrent
> + * process. Deletion is done through performDeletion or dependencies of the
> + * index would not get dropped. At this point all the indexes are already
> + * considered as invalid and dead so they can be dropped without using any
> + * concurrent options as it is sure that they will not interact with other
> + * server sessions.
> + */

I'd write "as it is certain" instead of "as it is sure", but I can't
explain why. Maybe persons are sure, but situations are certain?

> @@ -1483,41 +1939,13 @@ index_drop(Oid indexId, bool concurrent)
>           * Note: the reason we use actual lock acquisition here, rather than
>           * just checking the ProcArray and sleeping, is that deadlock is
>           * possible if one of the transactions in question is blocked trying
> -         * to acquire an exclusive lock on our table.  The lock code will
> +         * to acquire an exclusive lock on our table. The lock code will

Gratuitous whitespace change, seeing that other comments added in this
patch have the extra whitespace after full stops as well.

> diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
> index d0ee851215..9dce6420df 100644
> --- a/src/backend/catalog/pg_depend.c
> +++ b/src/backend/catalog/pg_depend.c
> @@ -377,6 +377,94 @@ changeDependencyFor(Oid classId, Oid objectId,
>  }
>  
>  /*
> + * Adjust all dependency records to point to a different object of the same type
> + *
> + * refClassId/oldRefObjectId specify the old referenced object.
> + * newRefObjectId is the new referenced object (must be of class refClassId).
> + *
> + * Returns the number of records updated.
> + */
> +long
> +changeDependencyForAll(Oid refClassId, Oid oldRefObjectId,
> +                       Oid newRefObjectId)

This one is mostly a copy-paste of changeDependencyFor(), did you
consider refactoring that into handling the All case as well?

> @@ -722,3 +810,58 @@ get_index_constraint(Oid indexId)
>  
>      return constraintId;
>  }
> +
> +/*
> + * get_index_ref_constraints
> + *        Given the OID of an index, return the OID of all foreign key
> + *        constraints which reference the index.
> + */
> +List *
> +get_index_ref_constraints(Oid indexId)

Same for this one, but there's two similar functions
(get_constraint_index() and get_index_constraint()) already so I guess
it's fine?

> diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
> index 72bb06c760..7a51c25d98 100644
> --- a/src/backend/commands/indexcmds.c
> +++ b/src/backend/commands/indexcmds.c
> @@ -283,6 +285,87 @@ CheckIndexCompatible(Oid oldId,
>      return ret;
>  }
>  
> +
> +/*
> + * WaitForOlderSnapshots
> + *
> + * Wait for transactions that might have older snapshot than the given xmin

"an older snapshot" maybe?

> + * limit, because it might not contain tuples deleted just before it has
> + * been taken. Obtain a list of VXIDs of such transactions, and wait for them
> + * individually.
> + *
> + * We can exclude any running transactions that have xmin > the xmin given;
> + * their oldest snapshot must be newer than our xmin limit.
> + * We can also exclude any transactions that have xmin = zero, since they

Probably an empty line between the two paragraphs is in order, or just
keep it one paragraph.

> + * evidently have no live snapshot at all (and any one they might be in
> + * process of taking is certainly newer than ours).  

Please rephrase what's in the parenthesis; I am not quite sure what
it means, maybe "(and any they are currently taking is..."?

> + * We can also exclude autovacuum processes and processes running manual
> + * lazy VACUUMs, because they won't be fazed by missing index entries
> + * either. (Manual ANALYZEs, however, can't be excluded because they
> + * might be within transactions that are going to do arbitrary operations
> + * later.)

Weird punctuation, I think the full stop before the parenthesis should
be put after it, the full stop at the end of the parenthesis be dropped
and the beginning of the parenthesis should not be capitalized.

Oh hrm, I now see that the patch just moves the comments, so maybe don't
bother.

> @@ -1739,7 +1735,7 @@ ChooseIndexColumnNames(List *indexElems)
>   *        Recreate a specific index.
>   */
>  Oid
> -ReindexIndex(RangeVar *indexRelation, int options)
> +ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
>  {
>      Oid            indOid;
>      Oid            heapOid = InvalidOid;
> @@ -1751,8 +1747,9 @@ ReindexIndex(RangeVar *indexRelation, int options)
>       * obtain lock on table first, to avoid deadlock hazard.  The lock level
>       * used here must match the index lock obtained in reindex_index().
>       */
> -    indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
> -                                      false, false,
> +    indOid = RangeVarGetRelidExtended(indexRelation,
> +                                      concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
> +                                      concurrent, concurrent,

I find the way the bool is passed for the third and fourth argument a
bit weird, but ok.  I would still suggest explaining in the comment
above why the two other arguments to RangeVarGetRelidExtended()
(`missing_ok' and `nowait') are dependent on concurrent reindexing; it's
not super-obvious to me.

>  
>      /* The lock level used here should match reindex_relation(). */
> -    heapOid = RangeVarGetRelidExtended(relation, ShareLock, false, false,
> +    heapOid = RangeVarGetRelidExtended(relation,
> +                                       concurrent ? ShareUpdateExclusiveLock : ShareLock,
> +                                       concurrent, concurrent,
>                                         RangeVarCallbackOwnsTable, NULL);

Same here.

> +    if (concurrent)
> +        result = ReindexRelationConcurrently(heapOid, options);
> +    else
> +        result = reindex_relation(heapOid,
> +                                  REINDEX_REL_PROCESS_TOAST |
> +                                  REINDEX_REL_CHECK_CONSTRAINTS,
> +                                  options);

This somewhat begs the question why those two cases are so different
(one is implemented in src/backend/catalog/index.c, the other in
src/backend/commands/indexcmds.c, and their naming scheme is different).
I guess that's ok, but it might also be a hint that
ReindexRelationConcurrently() is implemented at the wrong level.

> @@ -2011,3 +2040,597 @@ ReindexMultipleTables(const char *objectName, ReindexObjectType objectKind,
>  
>      MemoryContextDelete(private_context);
>  }
> +
> +
> +/*
> + * ReindexRelationConcurrently
> + *
> + * Process REINDEX CONCURRENTLY for given relation Oid. The relation can be
> + * either an index or a table. If a table is specified, each phase is processed
> + * one by done for each table's indexes as well as its dependent toast indexes

"one by one".

> +static bool
> +ReindexRelationConcurrently(Oid relationOid, int options)
> +{

[...]

> +    /*
> +     * Extract the list of indexes that are going to be rebuilt based on the
> +     * list of relation Oids given by caller. For each element in given list,
> +     * If the relkind of given relation Oid is a table, all its valid indexes

capitalization, "If" is in the middle of a sentence.

> +     * will be rebuilt, including its associated toast table indexes. If
> +     * relkind is an index, this index itself will be rebuilt. 

Maybe mention here that system catalogs and shared relations cannot be
reindexed concurrently?

> +        /* Create concurrent index based on given index */
> +        concurrentOid = index_concurrent_create_copy(indexParentRel,
> +                                                     indOid,
> +                                                     concurrentName);

AIUI, this creates/copies some meta-data for the concurrent index, but
does not yet create the index itself, right?  If so, the comment is
somewhat misleading.

> +        /*
> +         * Now open the relation of concurrent index, a lock is also needed on
> +         * it
> +         */

Multi-line comments should end with a full-stop I think?

> +    /*
> +     * Phase 3 of REINDEX CONCURRENTLY

[...]

> +        /*
> +         * This concurrent index is now valid as they contain all the tuples
> +         * necessary. However, it might not have taken into account deleted tuples

"as they contain" should be "as it contains" I guess, since the rest of
the comment is talking about a singular index.

> +    /*
> +     * Phase 4 of REINDEX CONCURRENTLY
> +     *
> +     * Now that the concurrent indexes have been validated, it is necessary
> +     * to swap each concurrent index with its corresponding old index.
> +     *
> +     * We mark the new indexes as valid and the old indexes dead at the same
> +     * time to make sure we get only get constraint violations from the

"we only get"

> +    /*
> +     * Phase 5 of REINDEX CONCURRENTLY
> +     *
> +     * The indexes hold now a fresh relfilenode of their respective concurrent

I'd write "now hold" instead of "hold now".

> +     * entries indexes. It is time to mark the now-useless concurrent entries
> +     * as not ready so as they can be safely discarded from write operations
> +     * that may occur on them.

So the "concurrent entries" is the original index, as that one should be
now-useless?  If so, that's a bit confusing terminology to me and it was
called "old index" in the previous phases.

> +    /*
> +     * Phase 6 of REINDEX CONCURRENTLY
> +     *
> +     * Drop the concurrent indexes, with actually the same code path as

Again, I'd have written "Drop the old indexes". Also, "with actually the
same" sounds a bit awkward, maybe "actually using the same" would be
better.

> +    /*
> +     * Last thing to do is to release the session-level lock on the parent table
> +     * and the indexes of table.

"and on the indexes of the table"?  Or what exactly is meant with the
last bit?

> diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
> index 20b5273405..c76dacc44a 100644
> --- a/src/backend/tcop/utility.c
> +++ b/src/backend/tcop/utility.c
> @@ -773,16 +773,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
> -                        ReindexIndex(stmt->relation, stmt->options);
> +                        ReindexIndex(stmt->relation, stmt->options, stmt->concurrent);
> -                        ReindexTable(stmt->relation, stmt->options);
> +                        ReindexTable(stmt->relation, stmt->options, stmt->concurrent);
> -                        ReindexMultipleTables(stmt->name, stmt->kind, stmt->options);
> +                        ReindexMultipleTables(stmt->name, stmt->kind, stmt->options, stmt->concurrent);

Those lines are now in excess of 80 chars.


Cheers,

Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Mon, Mar 13, 2017 at 11:11 AM, Andreas Karlsson <andreas@proxel.se> wrote:
> On 03/02/2017 03:10 AM, Michael Paquier wrote:
>> There is a lot of mumbo-jumbo in the patch to create the exact same
>> index definition as the original one being reindexed, and that's a
>> huge maintenance burden for the future. You can blame me for that in
>> the current patch. I am wondering if it would not just be better to
>> generate a CREATE INDEX query string and then use the SPI to create
>> the index, and also do the following extensions at SQL level:
>> - Add a sort of WITH NO DATA clause where the index is created, so the
>> index is created empty, and is marked invalid and not ready.
>> - Extend pg_get_indexdef_string() with an optional parameter to
>> enforce the index name to something else, most likely it should be
>> extended with the WITH NO DATA/INVALID clause, which should just be a
>> storage parameter by the way.
>> By doing something like that what the REINDEX CONCURRENTLY code path
>> should just be careful about is that it chooses an index name that
>> avoids any conflicts.
>
>
> Hm, I am not sure how much that would help since a lot of the mumb-jumbo is
> by necessity in the step where we move the constraints over from the old
> index to the new.

Well, the idea is really to get rid of that as there are already
facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
TABLE when rewriting a relation. It is not really attractive to have a
3rd method in the backend code to do the same kind of things, for a
method that is even harder to maintain than the other two.
-- 
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Mar 30, 2017 at 5:13 AM, Michael Banck
<michael.banck@credativ.de> wrote:
> On Mon, Mar 13, 2017 at 03:11:50AM +0100, Andreas Karlsson wrote:
>> Spotted one of my TODO comments there so I have attached a patch where I
> have cleaned up that function. I also fixed the code to properly support
>> triggers.
>
> I hope that Michael will post a full review as he worked on the code
> extensively, but here are some code comments, mostly on the
> comments (note that I'm not a native speaker, so I might be wrong on
> some of them as well):

Thanks, Michael. I have done a pass on it.

> [review comments]

Here are more comments:

+   <para>
+    <command>REINDEX SYSTEM</command> does not support
+    <command>CONCURRENTLY</command>.
+   </para>
It would be nice to mention that REINDEX SCHEMA pg_catalog is not
supported, or simply state that concurrent reindexing of any system
catalog indexes is not supported.

When running REINDEX SCHEMA CONCURRENTLY public on the regression
database I am bumping into a bunch of these warnings:
WARNING:  01000: snapshot 0x7fa5e6000040 still active
LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123
WARNING:  01000: snapshot 0x7fa5e6000040 still active
LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123

+ * Reset attcacheoff for a TupleDesc
+ */
+void
+ResetTupleDescCache(TupleDesc tupdesc)
+{
+   int i;
+
+   for (i = 0; i < tupdesc->natts; i++)
+       tupdesc->attrs[i]->attcacheoff = -1;
+}
I think that it would be better to merge that with TupleDescInitEntry
to be sure that the initialization of a TupleDesc's attribute goes
through only one code path.

+   /*
+    * Copy contraint flags for old index. This is safe because the old index
+    * guaranteed uniquness.
+    */
s/uniquness/uniqueness/ and s/contraint/constraint/.

+   /*
+    * Move contstraints and triggers over to the new index
+    */
s/contstraints/constraints/.

-REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable>
+REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
I am taking the war path with such a sentence... But what about adding
CONCURRENTLY to the list of options in parenthesis instead?

With this patch, we are reaching the 9th boolean argument for
create_index(). Any opinions about refactoring that into a set of
bitwise flags? Fairly unrelated to this patch.

+   /*
+    * Move all dependencies on the old index to the new
+    */
Sentence unfinished.

It has been years since I looked at this code (I wrote it in
majority), but here is what I would explore if I were to work on that
for the next release cycle:
- Explore the use of SQL-level interfaces to mark an index as inactive
at creation.
- Remove the work done in changeDependencyForAll, and replace it by
something similar to what tablecmds.c does. There is, I think, some
room for refactoring here, if not with CREATE TABLE LIKE. This
requires the same work of creating, renaming and dropping the old
triggers and constraints.
- Do a per-index rebuild and not a per-relation rebuild for concurrent
indexing. Doing a per-relation reindex has the disadvantage that many
objects need to be created at the same time, and in the case of
REINDEX CONCURRENTLY time of the operation is not what matters, it is
how intrusive the operation is. Relations with many indexes would also
result in many object locks taken at each step.
The first and second points require a bit of thought for sure, but in
the long term that would pay off in maintenance if we don't reinvent
the wheel, or at least try not to.
-- 
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
Thanks for the feedback. I will look at it when I get the time.

On 03/31/2017 08:27 AM, Michael Paquier wrote:
> - Do a per-index rebuild and not a per-relation rebuild for concurrent
> indexing. Doing a per-relation reindex has the disadvantage that many
> objects need to be created at the same time, and in the case of
> REINDEX CONCURRENTLY time of the operation is not what matters, it is
> how intrusive the operation is. Relations with many indexes would also
> result in many object locks taken at each step.

I am personally worried about the amount of time spent waiting for long 
running transactions if you reindex per index rather than per relation. 
While we wait on long running transactions for one index, nothing 
prevents new long transactions from starting, which we will then have 
to wait for while reindexing the next index. If your database has many 
long running transactions, more time will be spent waiting than working.

Is the number of locks really a big deal compared to the other costs 
involved here? REINDEX does a lot of expensive things like starting 
transactions, taking snapshots, scanning large tables, building a new 
index, etc. The trade-off I see is between temporary disk usage and time 
spent waiting for transactions, and doing the REINDEX per relation 
allows for flexibility since people can still explicitly reindex per 
index if they want to.

Andreas



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andreas@proxel.se> wrote:
> Thanks for the feedback. I will look at it when I get the time.
>
> On 03/31/2017 08:27 AM, Michael Paquier wrote:
>>
>> - Do a per-index rebuild and not a per-relation rebuild for concurrent
>> indexing. Doing a per-relation reindex has the disadvantage that many
>> objects need to be created at the same time, and in the case of
>> REINDEX CONCURRENTLY time of the operation is not what matters, it is
>> how intrusive the operation is. Relations with many indexes would also
>> result in many object locks taken at each step.
>
>
> I am personally worried about the amount of time spent waiting for long running
> transactions if you reindex per index rather than per relation. Because when
> you for one index wait on long running transactions nothing prevents new
> long transactions from starting, which we will have to wait for while
> reindexing the next index. If your database has many long running
> transactions more time will be spent waiting than the time spent working.

Yup, I am not saying that one approach or the other is bad; both are
worth considering. It's a trade-off between waiting and potential
manual cleanup in the event of a failure.

> and doing the REINDEX per relation allows for flexibility
> since people can still explicitly reindex per index if they want to.

You have a point here.

I am marking this patch as returned with feedback, this won't get in
PG10. If I am freed from the SCRAM-related open items I'll try to give
another shot at implementing this feature before the first CF of PG11.
-- 
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 04/03/2017 07:57 AM, Michael Paquier wrote:
> On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andreas@proxel.se> wrote:
>> On 03/31/2017 08:27 AM, Michael Paquier wrote:
>>> - Do a per-index rebuild and not a per-relation rebuild for concurrent
>>> indexing. Doing a per-relation reindex has the disadvantage that many
>>> objects need to be created at the same time, and in the case of
>>> REINDEX CONCURRENTLY time of the operation is not what matters, it is
>>> how intrusive the operation is. Relations with many indexes would also
>>> result in many object locks taken at each step.
>>
>> I am personally worried about the amount of time spent waiting for long running
>> transactions if you reindex per index rather than per relation. Because when
>> you for one index wait on long running transactions nothing prevents new
>> long transactions from starting, which we will have to wait for while
>> reindexing the next index. If your database has many long running
>> transactions more time will be spent waiting than the time spent working.
>
> Yup, I am not saying that one approach or the other are bad, both are
> worth considering. That's a deal between waiting and manual potential
> cleanup in the event of a failure.

Agreed, and which is worse probably depends heavily on your schema and 
workload.

> I am marking this patch as returned with feedback, this won't get in
> PG10. If I am freed from the SCRAM-related open items I'll try to give
> another shot at implementing this feature before the first CF of PG11.

Thanks! I also think I will have time to work on this before the first CF.

Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
I have attached a new, rebased version of the patch with most of Banck's 
and some of your feedback incorporated. Thanks for the good feedback!

On 03/31/2017 08:27 AM, Michael Paquier wrote:
> When running REINDEX SCHEMA CONCURRENTLY public on the regression
> database I am bumping into a bunch of these warnings:
> WARNING:  01000: snapshot 0x7fa5e6000040 still active
> LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123
> WARNING:  01000: snapshot 0x7fa5e6000040 still active
> LOCATION:  AtEOXact_Snapshot, snapmgr.c:1123

I failed to reproduce this. Do you have a reproducible test case?

> + * Reset attcacheoff for a TupleDesc
> + */
> +void
> +ResetTupleDescCache(TupleDesc tupdesc)
> +{
> +   int i;
> +
> +   for (i = 0; i < tupdesc->natts; i++)
> +       tupdesc->attrs[i]->attcacheoff = -1;
> +}
> I think that it would be better to merge that with TupleDescInitEntry
> to be sure that the initialization of a TupleDesc's attribute goes
> through only one code path.

Sorry, but I am not sure I understand your suggestion. I do not like the 
ResetTupleDescCache function so all suggestions are welcome.

> -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } <replaceable class="PARAMETER">name</replaceable>
> +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable>
> I am taking the war path with such a sentence... But what about adding
> CONCURRENTLY to the list of options in parenthesis instead?

I have thought some about this myself and I do not care strongly either way.

> - Explore the use of SQL-level interfaces to mark an index as inactive
> at creation.
> - Remove work done in changeDependencyForAll, and replace it by
> something similar to what tablecmds.c does. There is I think here some
> place for refactoring if that's not with CREATE TABLE LIKE. This
> requires the same work of creating, renaming and dropping the old
> triggers and constraints.

I am no fan of the current code duplication and how fragile it is, but I 
think these cases are sufficiently different to prevent meaningful code 
reuse. But it could just be me who is unfamiliar with that part of the code.

> - Do a per-index rebuild and not a per-relation rebuild for concurrent
> indexing. Doing a per-relation reindex has the disadvantage that many
> objects need to be created at the same time, and in the case of
> REINDEX CONCURRENTLY time of the operation is not what matters, it is
> how intrusive the operation is. Relations with many indexes would also
> result in many object locks taken at each step.

I am still leaning towards my current trade-off since waiting for all 
queries to stop using an index can take a lot of time, and if you only 
have to do that once per table it would be a huge benefit under some 
workloads. You can still reindex each index separately if you need to.

Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
Here is a rebased version of the patch.

Andreas


Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Wed, Nov 1, 2017 at 1:20 PM, Andreas Karlsson <andreas@proxel.se> wrote:
> Here is a rebased version of the patch.

The patch does not apply, and needs a rebase. I am moving it to next
CF with waiting on author as status.
-- 
Michael


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
Andreas Karlsson wrote:
> Here is a rebased version of the patch.

Is anybody working on rebasing this patch?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
Michael Paquier wrote:

> Well, the idea is really to get rid of that as there are already
> facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
> TABLE when rewriting a relation. It is not really attractive to have a
> 3rd method in the backend code to do the same kind of things, for a
> method that is even harder to maintain than the other two.

I dislike the backend code that uses SPI and manufactures nodes to
re-create indexes.  IMO we should get rid of it.  Let's not call it
"facilities", but rather "grotty hacks".

I think before suggesting to add even more code to perpetuate that idea,
we should think about going in the other direction.  I have not tried to
write the code, but it should be possible to have an intermediate
function called by ProcessUtility* which transforms the IndexStmt into
an internal representation, then calls DefineIndex.   This way, all this
code that wants to create indexes for backend-internal reasons can
create the internal representation directly then call DefineIndex,
instead of the horrible hacks they use today creating parse nodes by
hand.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> Michael Paquier wrote:
>> Well, the idea is really to get rid of that as there are already
>> facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
>> TABLE when rewriting a relation. It is not really attractive to have a
>> 3rd method in the backend code to do the same kind of things, for a
>> method that is even harder to maintain than the other two.
>
> I dislike the backend code that uses SPI and manufactures nodes to
> re-create indexes.  IMO we should get rid of it.  Let's not call it
> "facilities", but rather "grotty hacks".

Aha. You are making my day here ;)

> I think before suggesting to add even more code to perpetuate that idea,
> we should think about going in the other direction.  I have not tried to
> write the code, but it should be possible to have an intermediate
> function called by ProcessUtility* which transforms the IndexStmt into
> an internal representation, then calls DefineIndex.   This way, all this
> code that wants to create indexes for backend-internal reasons can
> create the internal representation directly then call DefineIndex,
> instead of the horrible hacks they use today creating parse nodes by
> hand.

Yeah, that would likely be possible. I am not volunteering for that in
the short term though..
-- 
Michael


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Craig Ringer
Date:
On 21 December 2017 at 11:31, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
> Michael Paquier wrote:
>> Well, the idea is really to get rid of that as there are already
>> facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
>> TABLE when rewriting a relation. It is not really attractive to have a
>> 3rd method in the backend code to do the same kind of things, for a
>> method that is even harder to maintain than the other two.
>
> I dislike the backend code that uses SPI and manufactures nodes to
> re-create indexes.  IMO we should get rid of it.  Let's not call it
> "facilities", but rather "grotty hacks".

Aha. You are making my day here ;)

> I think before suggesting to add even more code to perpetuate that idea,
> we should think about going in the other direction.  I have not tried to
> write the code, but it should be possible to have an intermediate
> function called by ProcessUtility* which transforms the IndexStmt into
> an internal representation, then calls DefineIndex.   This way, all this
> code that wants to create indexes for backend-internal reasons can
> create the internal representation directly then call DefineIndex,
> instead of the horrible hacks they use today creating parse nodes by
> hand.

Yeah, that would be likely possible. I am not volunteering for that in
the short term though..

It sounds like that'd make some of ALTER TABLE a bit less ... upsetting ... too. 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Stephen Frost
Date:
Craig, Michael, all,

* Craig Ringer (craig@2ndquadrant.com) wrote:
> On 21 December 2017 at 11:31, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>
> > On Thu, Dec 21, 2017 at 11:46 AM, Alvaro Herrera
> > <alvherre@alvh.no-ip.org> wrote:
> > > Michael Paquier wrote:
> > >> Well, the idea is really to get rid of that as there are already
> > >> facilities of this kind for CREATE TABLE LIKE in the parser and ALTER
> > >> TABLE when rewriting a relation. It is not really attractive to have a
> > >> 3rd method in the backend code to do the same kind of things, for a
> > >> method that is even harder to maintain than the other two.
> > >
> > > I dislike the backend code that uses SPI and manufactures nodes to
> > > re-create indexes.  IMO we should get rid of it.  Let's not call it
> > > "facilities", but rather "grotty hacks".
> >
> > Aha. You are making my day here ;)
> >
> > > I think before suggesting to add even more code to perpetuate that idea,
> > > we should think about going in the other direction.  I have not tried to
> > > write the code, but it should be possible to have an intermediate
> > > function called by ProcessUtility* which transforms the IndexStmt into
> > > an internal representation, then calls DefineIndex.   This way, all this
> > > code that wants to create indexes for backend-internal reasons can
> > > create the internal representation directly then call DefineIndex,
> > > instead of the horrible hacks they use today creating parse nodes by
> > > hand.
> >
> > Yeah, that would be likely possible. I am not volunteering for that in
> > the short term though..
>
> It sounds like that'd make some of ALTER TABLE a bit less ... upsetting ...
> too.

I'm a big fan of this patch but it doesn't appear to have made any
progress in quite a while.  Is there any chance we can get an updated
patch and perhaps get another review before the end of this CF...?

Refactoring this to have an internal representation between
ProcessUtility() and DefineIndex doesn't sound too terrible, and if it
means the ability to reuse that, it seems like it'd be awfully nice to
do so.

Thanks!

Stephen

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 01/26/2018 03:28 AM, Stephen Frost wrote:
> I'm a big fan of this patch but it doesn't appear to have made any
> progress in quite a while.  Is there any chance we can get an updated
> patch and perhaps get another review before the end of this CF...?

Sorry, as you may have guessed I do not have the time right now to get 
this updated during this commitfest.

> Refactoring this to have an internal representation between
> ProcessUtility() and DefineIndex doesn't sound too terrible and if it
> means the ability to reuse that, seems like it'd be awful nice to do
> so..

I too like the concept, but have not had the time to look into it.

Andreas


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Wed, Jan 31, 2018 at 01:48:00AM +0100, Andreas Karlsson wrote:
> I too like the concept, but have not had the time to look into it.

This may happen at some point, for now I am marking the patch as
returned with feedback.
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
Here is a revival of this patch.  This is Andreas Karlsson's v4 patch
(2017-11-01) with some updates for conflicts and changed APIs.

AFAICT from the discussions, there were no more conceptual concerns with
this approach.  Recall that with this patch REINDEX CONCURRENTLY creates
a new index (with a new OID) and then switches the names and dependencies.

I have done a review of this patch and it looks pretty solid to me.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
Hello

Thank you for working on this patch!

I performed some tests and I think the behavior with partitioned tables is slightly inconsistent.

postgres=# reindex table measurement;
WARNING:  REINDEX of partitioned tables is not yet implemented, skipping "measurement"
NOTICE:  table "measurement" has no indexes
REINDEX
postgres=# reindex table CONCURRENTLY measurement;
ERROR:  cannot reindex concurrently this type of relation

Maybe we need to report a warning and skip partitioned tables, similar to plain REINDEX?

This makes more sense for "reindex database" or "reindex schema": as far as I can see,
concurrent reindex will stop working after the first partitioned table in the list.

regards, Sergei


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 07/12/2018 17:40, Sergei Kornilov wrote:
> I performed some tests and I think the behavior with partitioned tables is slightly inconsistent.
> 
> postgres=# reindex table measurement;
> WARNING:  REINDEX of partitioned tables is not yet implemented, skipping "measurement"
> NOTICE:  table "measurement" has no indexes
> REINDEX
> postgres=# reindex table CONCURRENTLY measurement;
> ERROR:  cannot reindex concurrently this type of relation
> 
> Maybe we need to report a warning and skip partitioned tables, similar to plain REINDEX?

OK, that should be easy to fix.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
Hello

I reviewed the code and documentation and I have a few notes. Did you register this patch in the CF app?

I found one error in phase 4. Simple reproducer:

> create table test (i int);
> create index this_is_very_large_exactly_maxnamelen_index_name_wink_wink_wink on test (i);
> create index this_is_very_large_exactly_maxnamelen_index_name_wink_winkccold on test (i);
> reindex table CONCURRENTLY test;

This fails with error

> ERROR:  duplicate key value violates unique constraint "pg_class_relname_nsp_index"
> DETAIL:  Key (relname, relnamespace)=(this_is_very_large_exactly_maxnamelen_index_name_wink_win_ccold, 2200) already exists.

CommandCounterIncrement() in (or after) index_concurrently_swap will fix this issue.

> ReindexPartitionedIndex(Relation parentIdx)
>     ereport(ERROR,
>            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>             errmsg("REINDEX is not yet implemented for partitioned indexes")));

I think we need to add errhint("you can REINDEX each partition separately") or something similar.
Also, can we omit this warning for REINDEX DATABASE? All partitions must be in the same database,
and the warning in that case is useless: we emit the warning, but we still reindex each partition,
so the partitioned table is reindexed correctly.

Another behavior issue I found with REINDEX (VERBOSE) SCHEMA/DATABASE: the INFO ereport is printed twice for each table.

> INFO:  relation "measurement_y2006m02" was reindexed
> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.07 s.
> INFO:  table "public.measurement_y2006m02" was reindexed

One from ReindexMultipleTables and another (with pg_rusage_show) from ReindexRelationConcurrently.

> ReindexRelationConcurrently
> if (!indexRelation->rd_index->indisvalid)

Would it be better to use the IndexIsValid macro here? And the same question about the added
indexform->indisvalid check in src/backend/commands/tablecmds.c.

>       <para>
>       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
>       an <quote>invalid</quote> index. Such indexes are useless but it can be
>-      convenient to use <command>REINDEX</command> to rebuild them. Note that
>-      <command>REINDEX</command> will not perform a concurrent build. To build the
>-      index without interfering with production you should drop the index and
>-      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
>+      convenient to use <command>REINDEX</command> to rebuild them.
>      </para>

This documentation change seems wrong to me: REINDEX CONCURRENTLY does not rebuild invalid
indexes. To fix invalid indexes we still need a REINDEX that locks the table, or to recreate
the index concurrently.

> +       A first pass to build the index is done for each new index entry.
> +       Once the index is built, its flag <literal>pg_class.isready</literal> is
> +       switched to <quote>true</quote>
> +       At this point <literal>pg_class.indisvalid</literal> is switched to
> +       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
> +       Old indexes have <literal>pg_class.isready</literal> switched to <quote>false</quote>

Should be pg_index.indisvalid and pg_index.indisready, right?

regards, Sergei


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 09/12/2018 19:55, Sergei Kornilov wrote:
>> reindex table CONCURRENTLY test;

By the way, does this syntax make sense?  I haven't seen a discussion on
this anywhere in the various threads.  I keep thinking that

    reindex concurrently table test;

would make more sense.  How about in combination with (verbose)?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Stephen Frost
Date:
Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 09/12/2018 19:55, Sergei Kornilov wrote:
> >> reindex table CONCURRENTLY test;
>
> By the way, does this syntax make sense?  I haven't seen a discussion on
> this anywhere in the various threads.  I keep thinking that
>
>     reindex concurrently table test;
>
> would make more sense.  How about in combination with (verbose)?

I don't think it's a mistake that we have 'create index concurrently'
and it certainly would seem odd to me for 'create index' and 'reindex
table' to be different.

Certainly, from my recollection of english, you'd say "I am going to
reindex the table concurrently", you wouldn't say "I am going to
reindex concurrently the table."

Based on at least a quick look around, the actual grammar rule seems
to match my recollection[1]: adverbs should typically go AFTER the
verb + object, and the adverb shouldn't ever be placed between the verb
and the object.

Thanks!

Stephen

[1]: http://www.grammar.cl/Notes/Adverbs.htm

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote:
> Based on at least a quick look around, the actual grammar rule seems
> to match my recollection[1], adverbs should typically go AFTER the
> verb + object, and the adverb shouldn't ever be placed between the verb
> and the object.

This was already debated at length in 2012-2013 when I sent the
first iterations of the patch, and my memory on the matter is that
the grammar you are showing here matches the past agreement.
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 14/12/2018 01:14, Stephen Frost wrote:
>>>> reindex table CONCURRENTLY test;
>>
>> By the way, does this syntax make sense?  I haven't seen a discussion on
>> this anywhere in the various threads.  I keep thinking that
>>
>>     reindex concurrently table test;
>>
>> would make more sense.  How about in combination with (verbose)?
> 
> I don't think it's a mistake that we have 'create index concurrently'
> and it certainly would seem odd to me for 'create index' and 'reindex
> table' to be different.
> 
> Certainly, from my recollection of english, you'd say "I am going to
> reindex the table concurrently", you wouldn't say "I am going to
> reindex concurrently the table."
> 
> Based on at least a quick looking around, the actual grammar rule seems
> to match my recollection[1], adverbs should typically go AFTER the
> verb + object, and the adverb shouldn't ever be placed between the verb
> and the object.

So it would be grammatical to say

    reindex table test concurrently

or in a pinch

    reindex concurrently table test

but I don't see anything grammatical about

    reindex table concurrently test

(given that the object is "table test").

Where this gets really messy is stuff like this:

    reindex (verbose) database concurrently postgres

Why would "concurrently" not be part of the options next to "verbose"?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 14/12/2018 01:23, Michael Paquier wrote:
> On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote:
>> Based on at least a quick look around, the actual grammar rule seems
>> to match my recollection[1], adverbs should typically go AFTER the
>> verb + object, and the adverb shouldn't ever be placed between the verb
>> and the object.
> 
> This part has been a long debate already in 2012-2013 when I sent the
> first iterations of the patch, and my memories on the matter are that
> the grammar you are showing here matches with the past agreement.

Do you happen to have a link for that?  I didn't find anything.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
On 2018-Dec-14, Peter Eisentraut wrote:

> On 14/12/2018 01:23, Michael Paquier wrote:
> > On Thu, Dec 13, 2018 at 07:14:57PM -0500, Stephen Frost wrote:
> >> Based on at least a quick look around, the actual grammar rule seems
> >> to match my recollection[1], adverbs should typically go AFTER the
> >> verb + object, and the adverb shouldn't ever be placed between the verb
> >> and the object.
> > 
> > This part has been a long debate already in 2012-2013 when I sent the
> > first iterations of the patch, and my memories on the matter are that
> > the grammar you are showing here matches with the past agreement.
> 
> Do you happen to have a link for that?  I didn't find anything.

I think putting the CONCURRENTLY in the parenthesized list of options is
most sensible.

CREATE INDEX didn't have such an option list when we added this feature
there; see 
https://www.postgresql.org/message-id/flat/200608011143.k71Bh9c22067%40momjian.us#029e9a7ee8cb38beee494ef7891bec1d
for some discussion about that grammar.  Our options were not great ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Stephen Frost
Date:
Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 14/12/2018 01:14, Stephen Frost wrote:
> >>>> reindex table CONCURRENTLY test;
> >>
> >> By the way, does this syntax make sense?  I haven't seen a discussion on
> >> this anywhere in the various threads.  I keep thinking that
> >>
> >>     reindex concurrently table test;
> >>
> >> would make more sense.  How about in combination with (verbose)?
> >
> > I don't think it's a mistake that we have 'create index concurrently'
> > and it certainly would seem odd to me for 'create index' and 'reindex
> > table' to be different.
> >
> > Certainly, from my recollection of English, you'd say "I am going to
> > reindex the table concurrently", you wouldn't say "I am going to
> > reindex concurrently the table."
> >
> > Based on at least a quick looking around, the actual grammar rule seems
> > to match my recollection[1], adverbs should typically go AFTER the
> > verb + object, and the adverb shouldn't ever be placed between the verb
> > and the object.
>
> So it would be grammatical to say
>
>     reindex table test concurrently

Yes, though I'm not really a fan of it.

> or in a pinch
>
>     reindex concurrently table test

No, you can't put concurrently between reindex and table.

> but I don't see anything grammatical about
>
>     reindex table concurrently test

I disagree, this does look reasonable to me and it's certainly much
better than 'reindex concurrently table' which looks clearly incorrect.

> Where this gets really messy is stuff like this:
>
>     reindex (verbose) database concurrently postgres
>
> Why would "concurrently" not be part of the options next to "verbose"?

That wasn't what was asked and I don't think I see a problem with having
concurrently be allowed in the parentheses.  For comparison, it's not
like "explain analyze select ..." or "explain buffers select" is
terribly good grammatical form.

If you wanted to try to get to a better form for the spelled out
sentence, I would think:

concurrently reindex table test

would probably be the approach to use, though that's not what we use for
'create index' and it'd be rather out of character for us to start a
command with an adverb, making it ultimately a poor choice overall.

Going back to what we already have done and have in released versions,
we have 'create unique index concurrently test ...' and that's at least
reasonable (the adverb isn't showing up between the verb and the object,
and the adjective is between the verb and the object) and is what I
vote to go with, with the caveat that if we want to also allow it inside
the parentheses, I'm fine with that.

Thanks!

Stephen

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
On 2018-Dec-14, Stephen Frost wrote:

> That wasn't what was asked and I don't think I see a problem with having
> concurrently be allowed in the parentheses.  For comparison, it's not
> like "explain analyze select ..." or "explain buffers select" is
> terribly good grammatical form.

... and we don't allow EXPLAIN BUFFERS at all, and if we had had a
parenthesized option list in EXPLAIN when we invented EXPLAIN ANALYZE, I
bet we would have *not* made the ANALYZE keyword appear unadorned in
that command.

> If you wanted to try to get to a better form for the spelled out
> sentence, I would think:
> 
> concurrently reindex table test
> 
> would probably be the approach to use,

I think this is terrible from a command-completion perspective, and from
a documentation perspective (Certainly we wouldn't have a manpage about
the "concurrently" command, for starters).

My vote goes to put the keyword inside of and exclusively in the
parenthesized option list.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Stephen Frost
Date:
Greetings,

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> On 2018-Dec-14, Stephen Frost wrote:
>
> > That wasn't what was asked and I don't think I see a problem with having
> > concurrently be allowed in the parentheses.  For comparison, it's not
> > like "explain analyze select ..." or "explain buffers select" is
> > terribly good grammatical form.
>
> ... and we don't allow EXPLAIN BUFFERS at all, and if we had had a
> parenthesized option list in EXPLAIN when we invented EXPLAIN ANALYZE, I
> bet we would have *not* made the ANALYZE keyword appear unadorned in
> that command.

I'm not convinced of that- there is value in being able to write full
and useful commands without having to always use parentheses.

> > If you wanted to try to get to a better form for the spelled out
> > sentence, I would think:
> >
> > concurrently reindex table test
> >
> > would probably be the approach to use,
>
> I think this is terrible from a command-completion perspective, and from
> a documentation perspective (Certainly we wouldn't have a manpage about
> the "concurrently" command, for starters).

Right, I agreed that this had other downsides in the email you're
replying to here.  Glad we agree that it's not a good option.

> My vote goes to put the keyword inside of and exclusively in the
> parenthesized option list.

I disagree with the idea of exclusively having concurrently be in the
parentheses.  'explain buffers' is a much less frequently used option
(though that might, in part, be because it's a bit annoying to write out
explain (analyze, buffers) select...; I wonder if we could have a way to
say "if I'm running analyze, I always want buffers"...), but
concurrently reindexing a table (or index..) is going to almost
certainly be extremely common, perhaps even more common than *not*
reindexing concurrently.

Thanks!

Stephen

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Dec 14, 2018 at 09:00:58AM -0300, Alvaro Herrera wrote:
> On 2018-Dec-14, Peter Eisentraut wrote:
>> Do you happen to have a link for that?  I didn't find anything.

The message I was thinking about is close to here:
https://www.postgresql.org/message-id/20121210152856.GC16664@awork2.anarazel.de

> I think putting the CONCURRENTLY in the parenthesized list of options is
> most sensible.

For new options of VACUUM and ANALYZE we tend to prefer that as well,
and this simplifies the query parsing.
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 09/12/2018 19:55, Sergei Kornilov wrote:
>>       <para>
>>       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
>>       an <quote>invalid</quote> index. Such indexes are useless but it can be
>> -      convenient to use <command>REINDEX</command> to rebuild them. Note that
>> -      <command>REINDEX</command> will not perform a concurrent build. To build the
>> -      index without interfering with production you should drop the index and
>> -      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
>> +      convenient to use <command>REINDEX</command> to rebuild them.
>>      </para>
> This documentation change seems wrong for me: reindex concurrently does not rebuild invalid indexes. To fix invalid
> indexes we still need reindex with lock table or recreate this index concurrently.
> 

The current patch prevents REINDEX CONCURRENTLY of invalid indexes, but
I wonder why that is so.  Anyone remember?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Dec 27, 2018 at 11:04:09AM +0100, Peter Eisentraut wrote:
> The current patch prevents REINDEX CONCURRENTLY of invalid indexes, but
> I wonder why that is so.  Anyone remember?

It should be around this time:
https://www.postgresql.org/message-id/CAB7nPqRwVtQcHWErUf9o0hrRGFyQ9xArk7K7jCLxqKLy_6CXPQ@mail.gmail.com

And if I recall correctly the reason invalid entries cannot be
reindexed was that when working on a table, schema or database, if a
failure happens in the process, the reindexing would need to happen
for double the number of indexes when repeating the command.
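A minimal sketch of that failure scenario (the table and index names are illustrative; the "_ccnew" suffix follows the patch's naming for transient indexes):

```sql
-- Illustrative only: why invalid leftovers are skipped by the patch.
CREATE TABLE t (i int);
CREATE INDEX t_i_idx ON t (i);

-- Suppose REINDEX TABLE CONCURRENTLY t; fails mid-build.  The table is
-- left with:
--   "t_i_idx"        btree (i)            -- the original, still valid
--   "t_i_idx_ccnew"  btree (i) INVALID    -- the half-built transient copy
--
-- If invalid indexes were eligible for the concurrent rebuild, re-running
-- the command would process both entries and create a new transient copy
-- of each, so every failed retry would double the number of indexes to
-- rebuild.
```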
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
On 2018-Dec-14, Stephen Frost wrote:

> > My vote goes to put the keyword inside of and exclusively in the
> > parenthesized option list.
> 
> I disagree with the idea of exclusively having concurrently be in the
> parentheses.  'explain buffers' is a much less frequently used option
> (though that might, in part, be because it's a bit annoying to write out
> explain (analyze, buffers) select...; I wonder if we could have a way to
> say "if I'm running analyze, I always want buffers"...),

I'm skeptical.  I think EXPLAIN ANALYZE is more common because it has
more than one decade of advantage compared to the more detailed option
list.  Yes, it's also easier, but IMO it's a brain thing (muscle
memory), not a fingers thing.

> but concurrently reindexing a table (or index..) is going to almost
> certainly be extremely common, perhaps even more common than *not*
> reindexing concurrently.

Well, users can use the reindexdb utility and save some keystrokes.

Anyway we don't typically add redundant ways to express the same things.
Where we have them, it's just because the old way was there before, and
we added the extensible way later.  Adding two in the first appearance
of a new feature seems absurd to me.

After looking at the proposed grammar again today and in danger of
repeating myself, IMO allowing the concurrency keyword to appear outside
the parens would be a mistake.  Valid commands:

  REINDEX (VERBOSE, CONCURRENTLY) TABLE foo;
  REINDEX (CONCURRENTLY) INDEX bar;

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andrew Gierth
Date:
>>>>> "Alvaro" == Alvaro Herrera <alvherre@2ndquadrant.com> writes:

 Alvaro> After looking at the proposed grammar again today and in danger
 Alvaro> of repeating myself, IMO allowing the concurrency keyword to
 Alvaro> appear outside the parens would be a mistake. Valid commands:

 Alvaro>   REINDEX (VERBOSE, CONCURRENTLY) TABLE foo;
 Alvaro>   REINDEX (CONCURRENTLY) INDEX bar;

We burned that bridge with CREATE INDEX CONCURRENTLY; to make REINDEX
require different syntax would be too inconsistent.

If we didn't have all these existing uses of CONCURRENTLY without
parens, your argument might have more merit; but we do.

-- 
Andrew (irc:RhodiumToad)


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
On 2018-12-31 21:35:57 +0000, Andrew Gierth wrote:
> >>>>> "Alvaro" == Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> 
>  Alvaro> After looking at the proposed grammar again today and in danger
>  Alvaro> of repeating myself, IMO allowing the concurrency keyword to
>  Alvaro> appear outside the parens would be a mistake. Valid commands:
> 
>  Alvaro>   REINDEX (VERBOSE, CONCURRENTLY) TABLE foo;
>  Alvaro>   REINDEX (CONCURRENTLY) INDEX bar;
> 
> We burned that bridge with CREATE INDEX CONCURRENTLY; to make REINDEX
> require different syntax would be too inconsistent.
> 
> If we didn't have all these existing uses of CONCURRENTLY without
> parens, your argument might have more merit; but we do.

+1


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Stephen Frost
Date:
Greetings,

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> On 2018-Dec-14, Stephen Frost wrote:
>
> > > My vote goes to put the keyword inside of and exclusively in the
> > > parenthesized option list.
> >
> > I disagree with the idea of exclusively having concurrently be in the
> > parentheses.  'explain buffers' is a much less frequently used option
> > (though that might, in part, be because it's a bit annoying to write out
> > explain (analyze, buffers) select...; I wonder if we could have a way to
> > say "if I'm running analyze, I always want buffers"...),
>
> I'm skeptical.  I think EXPLAIN ANALYZE is more common because it has
> more than one decade of advantage compared to the more detailed option
> list.  Yes, it's also easier, but IMO it's a brain thing (muscle
> memory), not a fingers thing.

I would argue that it's both.

> > but concurrently reindexing a table (or index..) is going to almost
> > certainly be extremely common, perhaps even more common than *not*
> > reindexing concurrently.
>
> Well, users can use the reindexdb utility and save some keystrokes.

That's a really poor argument as those unix utilities are hardly ever
used, in my experience.

> Anyway we don't typically add redundant ways to express the same things.
> Where we have them, it's just because the old way was there before, and
> we added the extensible way later.  Adding two in the first appearance
> of a new feature seems absurd to me.

SQL allows many, many different ways to express the same thing.  I agree
that we haven't done that much in our utility commands, but I don't see
that as an argument against doing so, just that we haven't (previously)
really had the need- because most of the time we don't have a bunch of
different options where we want to have a list.

> After looking at the proposed grammar again today and in danger of
> repeating myself, IMO allowing the concurrency keyword to appear outside
> the parens would be a mistake.  Valid commands:
>
>   REINDEX (VERBOSE, CONCURRENTLY) TABLE foo;
>   REINDEX (CONCURRENTLY) INDEX bar;

This discussion hasn't changed my opinion, and, though I'm likely
repeating myself as well, I also agree with the down-thread comment that
this ship really has already sailed.

Thanks!

Stephen

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
Updated patch for some merge conflicts and addressing most of your
comments.  (I did not do anything about the syntax.)

On 09/12/2018 19:55, Sergei Kornilov wrote:
> I found one error in phase 4. Simple reproducer:
> 
>> create table test (i int);
>> create index this_is_very_large_exactly_maxnamelen_index_name_wink_wink_wink on test (i);
>> create index this_is_very_large_exactly_maxnamelen_index_name_wink_winkccold on test (i);
>> reindex table CONCURRENTLY test;
> 
> This fails with error
> 
>> ERROR:  duplicate key value violates unique constraint "pg_class_relname_nsp_index"
>> DETAIL:  Key (relname, relnamespace)=(this_is_very_large_exactly_maxnamelen_index_name_wink_win_ccold, 2200) already exists.
> 
> CommandCounterIncrement() in (or after) index_concurrently_swap will fix this issue.

fixed

>> ReindexPartitionedIndex(Relation parentIdx)
>>     ereport(ERROR,
>>             (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>              errmsg("REINDEX is not yet implemented for partitioned indexes")));
> 
> I think we need to add errhint("you can REINDEX each partition separately") or something similar.
> Also, can we omit this warning for reindex database? All partitions must be in the same database, so the warning in such
> a case is useless: we have the warning, but by reindexing each partition we reindex the partitioned table correctly.

fixed by skipping in ReindexRelationConcurrently(), same as other
unsupported relkinds

> Another behavior issue I found with reindex (verbose) schema/database: INFO ereport is printed twice for each table.
> 
>> INFO:  relation "measurement_y2006m02" was reindexed
>> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.07 s.
>> INFO:  table "public.measurement_y2006m02" was reindexed
> 
> One from ReindexMultipleTables and another (with pg_rusage_show) from ReindexRelationConcurrently.

fixed with some restructuring

>> ReindexRelationConcurrently
>> if (!indexRelation->rd_index->indisvalid)
> 
> Is it better to use the IndexIsValid macro here? And the same question about the added indexform->indisvalid in
> src/backend/commands/tablecmds.c

IndexIsValid() has been removed in the meantime.

>>       <para>
>>       An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
>>       an <quote>invalid</quote> index. Such indexes are useless but it can be
>> -      convenient to use <command>REINDEX</command> to rebuild them. Note that
>> -      <command>REINDEX</command> will not perform a concurrent build. To build the
>> -      index without interfering with production you should drop the index and
>> -      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
>> +      convenient to use <command>REINDEX</command> to rebuild them.
>>      </para>
> 
> This documentation change seems wrong for me: reindex concurrently does not rebuild invalid indexes. To fix invalid
> indexes we still need reindex with lock table or recreate this index concurrently.

still being discussed elsewhere in this thread

>> +       A first pass to build the index is done for each new index entry.
>> +       Once the index is built, its flag <literal>pg_class.isready</literal> is
>> +       switched to <quote>true</quote>
>> +       At this point <literal>pg_class.indisvalid</literal> is switched to
>> +       <quote>true</quote> for the new index and to <quote>false</quote> for the old, and
>> +       Old indexes have <literal>pg_class.isready</literal> switched to <quote>false</quote>
> 
> Should be pg_index.indisvalid and pg_index.indisready, right?

fixed

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
Hello
Thank you! I reviewed the new patch version. It applies, builds and passes tests. The code looks good, but I have some
notes on new behavior:

> postgres=# reindex (verbose) table CONCURRENTLY measurement ;
> WARNING:  REINDEX of partitioned tables is not yet implemented, skipping "measurement"
> NOTICE:  table "measurement" has no indexes
> REINDEX
> postgres=# \d measurement
>         Partitioned table "public.measurement"
>   Column   |  Type   | Collation | Nullable | Default 
> -----------+---------+-----------+----------+---------
>  city_id   | integer |           | not null | 
>  logdate   | date    |           | not null | 
>  peaktemp  | integer |           |          | 
>  unitsales | integer |           |          | 
> Partition key: RANGE (logdate)
> Indexes:
>     "measurement_logdate_idx" btree (logdate)
> Number of partitions: 0

NOTICE seems unnecessary here.

Unfortunately, concurrent reindex loses comments; reproducer:

> create table testcomment (i int);
> create index testcomment_idx1 on testcomment (i);
> comment on index testcomment_idx1 IS 'test comment';
> \di+ testcomment_idx1
> reindex table testcomment ;
> \di+ testcomment_idx1 # ok
> reindex table CONCURRENTLY testcomment ;
> \di+ testcomment_idx1 # we lose comment

Also I think we need to change REINDEX to "<command>REINDEX</command> (without <option>CONCURRENTLY</option>)" in the
ACCESS EXCLUSIVE section of the Table-level Lock Modes documentation (to be consistent with the REFRESH MATERIALIZED VIEW
and CREATE INDEX descriptions).

About reindexing invalid indexes - I found one good question in the archives [1]: how about toast indexes?
I checked it now: I am able to drop an invalid toast index, but I cannot drop a redundant valid index.
Reproduce:
session 1: begin; select from test_toast ... for update;
session 2: reindex table CONCURRENTLY test_toast ;
session 2: interrupt by ctrl+C
session 1: commit
session 2: reindex table test_toast ;
and now we have two toast indexes. DROP INDEX is able to remove only the invalid ones. The valid one gives "ERROR:
permission denied: "pg_toast_16426_index_ccnew" is a system catalog"
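Written out, the two-session reproducer looks roughly like this (the toast index name is taken from the error above; OIDs will differ on another system):

```sql
-- Session 1: pin a snapshot by locking a row in the table.
BEGIN;
SELECT * FROM test_toast LIMIT 1 FOR UPDATE;

-- Session 2: the concurrent reindex blocks waiting for session 1...
REINDEX TABLE CONCURRENTLY test_toast;
-- ...interrupt it here with Ctrl+C, leaving a transient toast index behind.

-- Session 1:
COMMIT;

-- Session 2: a plain reindex now succeeds, but the leftover toast index
-- ("pg_toast_16426_index_ccnew") survives.  DROP INDEX can remove it only
-- while it is marked invalid; once valid, dropping it fails with
--   ERROR: permission denied: "pg_toast_16426_index_ccnew" is a system catalog
REINDEX TABLE test_toast;
```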
 

About syntax: I vote for the current syntax "reindex table CONCURRENTLY tablename". This looks consistent with the
existing CREATE INDEX CONCURRENTLY and REFRESH MATERIALIZED VIEW CONCURRENTLY.

regards, Sergei

[1]: https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Pavel Stehule
Date:
About syntax: I vote for the current syntax "reindex table CONCURRENTLY tablename". This looks consistent with the existing CREATE INDEX CONCURRENTLY and REFRESH MATERIALIZED VIEW CONCURRENTLY.

+1

Pavel


regards, Sergei

[1]: https://www.postgresql.org/message-id/CAB7nPqT%2B6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw%40mail.gmail.com

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote:
> NOTICE seems unnecessary here.
>
> Unfortunately, concurrent reindex loses comments; reproducer:

Yes, the NOTICE message makes little sense.

I am getting back in touch with this stuff.  It has been some time but
the core of the patch has not actually changed in its base concept, so
I am still very familiar with it as the original author.  There are
even typos I may have introduced a couple of years back, like
"contraint".  I have not yet spent much time on that, but at a quick
glance there are a bunch of things that could be retouched to get
pieces of it committable.

+    The concurrent index created during the processing has a name ending in
+    the suffix ccnew, or ccold if it is an old index definiton which we failed
+    to drop. Invalid indexes can be dropped using <literal>DROP INDEX</literal>,
+    including invalid toast indexes.
This needs <literal> markups for "ccnew" and "ccold".  "definiton" is
not correct.

index_create does not actually need its extra argument with the tuple
descriptor.  I think that we had better grab the column name list from
indexInfo and just pass that down to index_create() (patched on my
local branch), so it is overkill to take a full copy of the index's
TupleDesc.

The patch, standing as-is, is close to 2k lines long, so let's first
cut it into smaller pieces, refactoring the concurrent build code.
Here are some preliminary notes:
- WaitForOlderSnapshots() could be in its own patch.
- index_concurrently_build() and index_concurrently_set_dead() can be
in an independent patch.  set_dead() had better be a wrapper on top of
index_set_state_flags actually which is able to set any kind of
flags.
- A couple of pieces in index_create() could be cut as well.

I can send patches for those things as first steps which could happen
in this commit fest, and commit them as needed.  This way, we reduce
the size of the main patch.  Even if the main portion does not get
into v12, we'd still have base pieces to move on next.

Regarding the grammar, we tend for the last couple of years to avoid
complicating the main grammar and move on to parenthesized grammars
(see VACUUM, ANALYZE, EXPLAIN, etc).  So in the same vein I think that
it would make sense to only support CONCURRENTLY within parentheses
and just plug it in alongside the VERBOSE option.

Does somebody mind if I jump into the ship after so long?  I was the
original author of the monster after all...
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andreas Karlsson
Date:
On 1/16/19 9:27 AM, Michael Paquier wrote:
> Regarding the grammar, we tend for the last couple of years to avoid
> complicating the main grammar and move on to parenthesized grammars
> (see VACUUM, ANALYZE, EXPLAIN, etc).  So in the same vein I think that
> it would make sense to only support CONCURRENTLY within parentheses
> and just plug it in alongside the VERBOSE option.

Personally I do not care, but there have been a lot of voices for 
keeping REINDEX CONCURRENTLY consistent with CREATE INDEX CONCURRENTLY 
and DROP INDEX CONCURRENTLY.

> Does somebody mind if I jump into the ship after so long?  I was the
> original author of the monster after all...

Fine by me. Peter?

Andreas



Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
On 2019-Jan-16, Michael Paquier wrote:

> Regarding the grammar, we tend for the last couple of years to avoid
> complicating the main grammar and move on to parenthesized grammars
> (see VACUUM, ANALYZE, EXPLAIN, etc).  So in the same vein I think that
> it would make sense to only support CONCURRENTLY within parentheses
> and just plug it in alongside the VERBOSE option.

That's my opinion too, but I was outvoted in another subthread -- see
https://postgr.es/m/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql
Stephen Frost, Andrew Gierth and Andres Freund all voted to put
CONCURRENTLY outside the parens.  It seems we now have three votes to
put it *in* the parens (you, Peter Eisentraut, me).  I guess more votes
are needed to settle this issue.

My opinion is that if we had had parenthesized options lists back when
CREATE INDEX CONCURRENTLY was invented, we would have put it there.
But we were young and needed the money ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Wed, Jan 16, 2019 at 02:59:31PM -0300, Alvaro Herrera wrote:
> That's my opinion too, but I was outvoted in another subthread -- see
> https://postgr.es/m/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql
> Stephen Frost, Andrew Gierth and Andres Freund all voted to put
> CONCURRENTLY outside the parens.  It seems we now have three votes to
> put it *in* the parens (you, Peter Eisentraut, me).  I guess more votes
> are needed to settle this issue.

Sure, let's see.  I would have been in the crowd of not using the
parenthesized grammar five years ago, but the recent changes to other
commands worry me, as we would repeat the same mistakes.

> My opinion is that if we had had parenthesized options lists back when
> CREATE INDEX CONCURRENTLY was invented, we would have put it there.
> But we were young and needed the money ...

:)
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Wed, Jan 16, 2019 at 05:56:15PM +0100, Andreas Karlsson wrote:
> On 1/16/19 9:27 AM, Michael Paquier wrote:
>> Does somebody mind if I jump into the ship after so long?  I was the
>> original author of the monster after all...
>
> Fine by me. Peter?

Okay, I have begun digging into the patch, and extracted for now two
things which can be refactored first, giving a total of three patches:
- 0001, which adds WaitForOlderSnapshots to snapmgr.c.  I think
that this can be useful for external extensions to have a process wait
for snapshots older than a minimum threshold held by other
transactions.
- 0002, which moves the concurrent index build into its own routine,
index_build_concurrent().  At the same time, index_build() has a
isprimary argument which is not used, so let's remove it.  This
simplifies a bit the refactoring as well.
- 0003 is the core patch, realigned with the rest, fixing some typos I
found on the way.

Here are also some notes for things I am planning to look after with a
second pass:
- The concurrent drop (phase 5) part still shares a lot with DROP
INDEX CONCURRENTLY, and I think that we had better refactor the code
further so that REINDEX CONCURRENTLY shares more with DROP INDEX.  One
thing which I think is incorrect is that we do not clear the invalid
flag of the dropped index before marking it as dead.  This looks like a
missing piece from another concurrent-related bug fix, lost over the
rebases this patch had.
- set_dead could be refactored so that it can handle multiple indexes
as input, using WaitForLockersMultiple().  This way CREATE INDEX
CONCURRENTLY could also use it.
- There are no regression tests for partitioned tables.
- The NOTICE messages showing up when a table has no indexes should be
removed.
- index_create() does not really need a TupleDesc argument, as long as
the caller is able to provide a list of column names.
- At the end of the day, I think that it would be nice to reach a
state where we have a set of low-level routines like
index_build_concurrent, index_set_dead_concurrent which are used by
both paths of CONCURRENTLY and can be called for each phase within a
given transaction.  Those pieces can also be helpful to implement for
example an extension able to do concurrent reindexing out of core.

I think that the refactorings in 0001 and 0002 are committable as-is,
and this shaves some code from the core patch.

Thoughts?
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Vik Fearing
Date:
On 16/01/2019 18:59, Alvaro Herrera wrote:
> On 2019-Jan-16, Michael Paquier wrote:
> 
>> Regarding the grammar, we tend for the last couple of years to avoid
>> complicating the main grammar and move on to parenthesized grammars
>> (see VACUUM, ANALYZE, EXPLAIN, etc).  So in the same vein I think that
>> it would make sense to only support CONCURRENTLY within parentheses
>> and just plug it in alongside the VERBOSE option.
> 
> That's my opinion too, but I was outvoted in another subthread -- see
> https://postgr.es/m/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql
> Stephen Frost, Andrew Gierth and Andres Freund all voted to put
> CONCURRENTLY outside the parens.  It seems we now have three votes to
> put it *in* the parens (you, Peter Eisentraut, me).  I guess more votes
> are needed to settle this issue.

My vote is to have homogeneous syntax for all of this, and so put it in
parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
parentheses for it, too.

I suppose we'll keep what would then be the legacy syntax for a few
decades or more.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote:
> My vote is to have homogeneous syntax for all of this, and so put it in
> parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
> parentheses for it, too.

That would be a new thing as these variants don't exist yet, and WITH
is for storage parameters.  In my opinion, the long-term benefit of doing
such things is that we are then able to reduce the number of reserved
keywords in the grammar.  Even if for the case of CONCURRENTLY we may
see humans on Mars before this actually happens, this does not mean
that we should not do it moving forward for other keywords in the
grammar.
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Vik Fearing
Date:
On 19/01/2019 02:33, Michael Paquier wrote:
> On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote:
>> My vote is to have homogeneous syntax for all of this, and so put it in
>> parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
>> parentheses for it, too.
>
> That would be a new thing as these variants don't exist yet, and WITH
> is for storage parameters.  In my opinion, the long-term benefit of doing
> such things is that we are then able to reduce the number of reserved
> keywords in the grammar.  Even if for the case of CONCURRENTLY we may
> see humans on Mars before this actually happens, this does not mean
> that we should not do it moving forward for other keywords in the
> grammar.


I'm not sure I understand your point.

I don't want a situation like this:
    CREATE INDEX CONCURRENTLY ...
    DROP INDEX CONCURRENTLY ...
    REINDEX INDEX (CONCURRENTLY) ...

All three should be the same, and my suggestion is to add the
parenthesized version to CREATE and DROP and not add the unparenthesized
version to REINDEX.

I never said anything about WITH.
--
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Jan 17, 2019 at 02:11:01PM +0900, Michael Paquier wrote:
> Okay, I have begun digging into the patch, and extracted for now two
> things which can be refactored first, giving a total of three patches:
> - 0001, which adds WaitForOlderSnapshots to snapmgr.c.  I think
> that this can be useful for external extensions to have a process wait
> for snapshots older than a minimum threshold held by other
> transactions.
> - 0002, which moves the concurrent index build into its own routine,
> index_build_concurrent().  At the same time, index_build() has a
> isprimary argument which is not used, so let's remove it.  This
> simplifies a bit the refactoring as well.
> - 0003 is the core patch, realigned with the rest, fixing some typos I
> found on the way.

Are there any objections if I commit 0001?  Introducing
WaitForOlderSnapshots() is quite independent of the rest, and the
refactoring is obvious.  For 0002, I am still not 100% sure if
index_build_concurrent() is the best interface but I am planning to
look more at this stuff next week, particularly the drop portion which
needs more work.
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
Hello

> I don't want a situation like this:
>     CREATE INDEX CONCURRENTLY ...
>     DROP INDEX CONCURRENTLY ...
>     REINDEX INDEX (CONCURRENTLY) ...
>
> All three should be the same, and my suggestion is to add the
> parenthesized version to CREATE and DROP and not add the unparenthesized
> version to REINDEX.

We already have a parenthesized VERBOSE option for REINDEX. So the proposed syntax was:

REINDEX (CONCURRENTLY) INDEX ...
REINDEX (VERBOSE, CONCURRENTLY) INDEX ...

Like the parameters for EXPLAIN and VACUUM, and completely unlike CREATE/DROP INDEX.

So a consistent syntax for CREATE/DROP would be:

CREATE (CONCURRENTLY) INDEX ...
CREATE (UNIQUE, CONCURRENTLY) INDEX ... # or do we want parenthesized CONCURRENTLY, but not UNIQUE? CREATE UNIQUE (CONCURRENTLY) INDEX?
DROP  (CONCURRENTLY) INDEX ...

How about REFRESH MATERIALIZED VIEW? Do not change?

regards, Sergei


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Stephen Frost
Date:
Greetings,

* Vik Fearing (vik.fearing@2ndquadrant.com) wrote:
> On 16/01/2019 18:59, Alvaro Herrera wrote:
> > On 2019-Jan-16, Michael Paquier wrote:
> >
> >> Regarding the grammar, we tend for the last couple of years to avoid
> >> complicating the main grammar and move on to parenthesized grammars
> >> (see VACUUM, ANALYZE, EXPLAIN, etc).  So in the same vein I think that
> >> it would make sense to only support CONCURRENTLY within parenthesis
> >> and just plugin that with the VERBOSE option.
> >
> > That's my opinion too, but I was outvoted in another subthread -- see
> > https://postgr.es/m/20181214144529.wvmjwmy7wxgmgyb3@alvherre.pgsql
> > Stephen Frost, Andrew Gierth and Andres Freund all voted to put
> > CONCURRENTLY outside the parens.  It seems we now have three votes to
> > put it *in* the parens (you, Peter Eisentraut, me).  I guess more votes
> > are needed to settle this issue.
>
> My vote is to have homogeneous syntax for all of this, and so put it in
> parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
> parentheses for it, too.
>
> I supposed we'll keep what would then be the legacy syntax for a few
> decades or more.

I'm still of the opinion that we should have CONCURRENTLY allowed
without the parentheses.  I could see allowing it with them, as well,
but I do feel that we should be using the parentheses-based approach
more as a last-resort kind of thing instead of just baking in everything
to require them.

We have said before that we don't want to have things implemented in a
purely functional way (see the discussions around pglogical and such)
and while this isn't quite the same, I do think it heads in that
direction.  It's certainly harder to have to think about how to
structure these commands so that they look like they belong in SQL but I
think it has benefits too.

Thanks!

Stephen

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:

On January 19, 2019 7:32:55 AM PST, Stephen Frost <sfrost@snowman.net> wrote:
>Greetings,
>
>* Vik Fearing (vik.fearing@2ndquadrant.com) wrote:
>> My vote is to have homogeneous syntax for all of this, and so put it in
>> parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
>> parentheses for it, too.
>>
>> I supposed we'll keep what would then be the legacy syntax for a few
>> decades or more.
>
>I'm still of the opinion that we should have CONCURRENTLY allowed
>without the parentheses.  I could see allowing it with them, as well,
>but I do feel that we should be using the parentheses-based approach
>more as a last-resort kind of thing instead of just baking in everything
>to require them.

+1

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Sat, Jan 19, 2019 at 03:01:07AM +0100, Vik Fearing wrote:
> On 19/01/2019 02:33, Michael Paquier wrote:
>> On Fri, Jan 18, 2019 at 07:58:06PM +0100, Vik Fearing wrote:
>>> My vote is to have homogeneous syntax for all of this, and so put it in
>>> parentheses, but we should also allow CREATE INDEX and DROP INDEX to use
>>> parentheses for it, too.
>>
>> That would be a new thing as these variants don't exist yet, and WITH
>> is for storage parameters.  In my opinion, the long-term take on doing
>> such things is that we are then able to reduce the number of reserved
>> keywords in the grammar.  Even if for the case of CONCURRENTLY we may
>> see humans on Mars before this actually happens, this does not mean
>> that we should not do it moving forward for other keywords in the
>> grammar.
>
> I'm not sure I understand your point.
>
> I don't want a situation like this:
>     CREATE INDEX CONCURRENTLY ...
>     DROP INDEX CONCURRENTLY ...
>     REINDEX INDEX (CONCURRENTLY) ...
>
> All three should be the same, and my suggestion is to add the
> parenthesized version to CREATE and DROP and not add the unparenthesized
> version to REINDEX.

I am not sure what the actual reason is that would force us to decide
that all three queries should have the same grammar, or why this has
anything to do with a thread about REINDEX.  REINDEX can work on many
more object types than an index, so its scope is much larger, contrary
to CREATE/DROP INDEX.  An advantage of using the parenthesized grammar
and prioritizing it is that you don't have to add the keyword to the
list of reserved keywords, and the parser can rely on IDENT for its work.

I personally prefer the parenthesized grammar for that reason.  If the
crowd votes in majority for the other option, that's of course fine to
me too.

> I never said anything about WITH.

Perhaps I have not explained my thoughts clearly here.  My point was
that if some day we decide to drop the non-parenthesized grammar of
CREATE/DROP INDEX, one possibility would be to have a "concurrent"
option as part of WITH, even if that's currently used only for storage
parameters.  That's the only actually extensible part of the grammar.
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Robert Haas
Date:
On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
> I don't want a situation like this:
>     CREATE INDEX CONCURRENTLY ...
>     DROP INDEX CONCURRENTLY ...
>     REINDEX INDEX (CONCURRENTLY) ...
>
> All three should be the same, and my suggestion is to add the
> parenthesized version to CREATE and DROP and not add the unparenthesized
> version to REINDEX.

+1 for all three being the same.  I could see allowing only the
unparenthesized format for all three, or allowing both forms for all
three, but I think having only one form for each and having them not
agree will be too confusing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Pavel Stehule
Date:


On Wed, 23 Jan 2019 at 19:17, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
> I don't want a situation like this:
>     CREATE INDEX CONCURRENTLY ...
>     DROP INDEX CONCURRENTLY ...
>     REINDEX INDEX (CONCURRENTLY) ...
>
> All three should be the same, and my suggestion is to add the
> parenthesized version to CREATE and DROP and not add the unparenthesized
> version to REINDEX.

+1 for all three being the same.  I could see allowing only the
unparenthesized format for all three, or allowing both forms for all
three, but I think having only one form for each and having them not
agree will be too confusing.

+1

Pavel


--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
Hi,

On 2019-01-23 13:17:26 -0500, Robert Haas wrote:
> On Fri, Jan 18, 2019 at 9:01 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
> > I don't want a situation like this:
> >     CREATE INDEX CONCURRENTLY ...
> >     DROP INDEX CONCURRENTLY ...
> >     REINDEX INDEX (CONCURRENTLY) ...
> >
> > All three should be the same, and my suggestion is to add the
> > parenthesized version to CREATE and DROP and not add the unparenthesized
> > version to REINDEX.
> 
> +1 for all three being the same.  I could see allowing only the
> unparenthesized format for all three, or allowing both forms for all
> three, but I think having only one form for each and having them not
> agree will be too confusing.

It seems quite unnecessarily confusing to me to require parens for
REINDEX CONCURRENTLY when we've historically not required that for
CREATE/DROP INDEX CONCURRENTLY.  Besides that, training people that
using parens is the correct form for CIC/DIC creates an unnecessary
version dependency.

I think it actually makes sense to see the CONCURRENTLY versions as
somewhat separate types of statements than the non concurrent
versions. They have significantly different transactional behaviour
(like not being able to be run within one, and leaving gunk behind in
case of error). For me it semantically makes sense to have that denoted
at the toplevel, it's a related but different type of DDL statement.
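That transactional difference can be seen today with the existing concurrent variant (table and index names below are illustrative):

```sql
-- Plain index DDL can run inside a transaction block:
BEGIN;
CREATE INDEX my_idx ON my_tab (col);
ROLLBACK;

-- The concurrent variant cannot, and refuses to start:
BEGIN;
CREATE INDEX CONCURRENTLY my_idx ON my_tab (col);
-- ERROR:  CREATE INDEX CONCURRENTLY cannot run inside a transaction block
ROLLBACK;
```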

Greetings,

Andres Freund


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
Here is an updated patch, which addresses some of your issues below as
well as the earlier reported issue that comments were lost during
REINDEX CONCURRENTLY.

On 16/01/2019 09:27, Michael Paquier wrote:
> On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote:
>> NOTICE seems unnecessary here.
>>
>> Unfortunally concurrently reindex loses comments, reproducer:
> 
> Yes, the NOTICE message makes little sense.

This is existing behavior of reindex-not-concurrently.

> I am getting back in touch with this stuff.  It has been some time but
> the core of the patch has not actually changed in its base concept, so
> I am still very familiar with it as the original author.  There are
> even typos I may have introduced a couple of years back, like
> "contraint".  I have not yet spent much time on that, but there are at
> quick glance a bunch of things that could be retouched to get pieces
> of that committable.
> 
> +    The concurrent index created during the processing has a name ending in
> +    the suffix ccnew, or ccold if it is an old index definiton which we failed
> +    to drop. Invalid indexes can be dropped using <literal>DROP INDEX</literal>,
> +    including invalid toast indexes.
> This needs <literal> markups for "ccnew" and "ccold".  "definiton" is
> not correct.

Fixed those.

> index_create does not actually need its extra argument with the tuple
> descriptor.  I think that we had better grab the column name list from
> indexInfo and just pass that down to index_create() (patched on my
> local branch), so it is an overkill to take a full copy of the index's
> TupleDesc.

Please send a fixup patch.

> The patch, standing as-is, is close to 2k lines long, so let's cut
> that first into more pieces refactoring the concurrent build code.
> Here are some preliminary notes:
> - WaitForOlderSnapshots() could be in its own patch.
> - index_concurrently_build() and index_concurrently_set_dead() can be
> in an independent patch.  set_dead() had better be a wrapper on top of
> index_set_state_flags actually which is able to set any kind of
> flags.
> - A couple of pieces in index_create() could be cut as well.

I'm not a fan of that.  I had already considered all the ways in which
subparts of this patch could get committed, and some of it was
committed, so what's left now is what I thought should stay together.
The patch isn't really that big and most of it is moving code around.  I
would also avoid chopping around in this patch now and focus on getting
it finished instead.  The functionality seems solid, so if it's good,
let's commit it, if it's not, let's get it fixed up.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Jan 29, 2019 at 09:51:35PM +0100, Peter Eisentraut wrote:
> On 16/01/2019 09:27, Michael Paquier wrote:
>> index_create does not actually need its extra argument with the tuple
>> descriptor.  I think that we had better grab the column name list from
>> indexInfo and just pass that down to index_create() (patched on my
>> local branch), so it is an overkill to take a full copy of the index's
>> TupleDesc.
>
> Please send a fixup patch.

Sure.  Attached is a patch which can be applied on top of what you
sent last, based on what I noticed during review, here and there.  You
also forgot to switch two heap_open() calls to table_open() in pg_depend.c.

>> The patch, standing as-is, is close to 2k lines long, so let's cut
>> that first into more pieces refactoring the concurrent build code.
>> Here are some preliminary notes:
>> - WaitForOlderSnapshots() could be in its own patch.
>> - index_concurrently_build() and index_concurrently_set_dead() can be
>> in an independent patch.  set_dead() had better be a wrapper on top of
>> index_set_state_flags actually which is able to set any kind of
>> flags.
>> - A couple of pieces in index_create() could be cut as well.
>
> I'm not a fan of that.  I had already considered all the ways in which
> subparts of this patch could get committed, and some of it was
> committed, so what's left now is what I thought should stay together.
> The patch isn't really that big and most of it is moving code around.  I
> would also avoid chopping around in this patch now and focus on getting
> it finished instead.  The functionality seems solid, so if it's good,
> let's commit it, if it's not, let's get it fixed up.

Well, the feature looks solid to me, and not much of its code has
actually changed over the years, FWIW.

Committing large and complex patches is something you have more
experience with than I do, and I find the exercise very difficult.
So if you feel it's appropriate to keep things grouped together, that
is not an issue for me.  I'll follow the lead.
--
Michael

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 30/01/2019 06:16, Michael Paquier wrote:
> On Tue, Jan 29, 2019 at 09:51:35PM +0100, Peter Eisentraut wrote:
>> On 16/01/2019 09:27, Michael Paquier wrote:
>>> index_create does not actually need its extra argument with the tuple
>>> descriptor.  I think that we had better grab the column name list from
>>> indexInfo and just pass that down to index_create() (patched on my
>>> local branch), so it is an overkill to take a full copy of the index's
>>> TupleDesc.
>>
>> Please send a fixup patch.
> 
> Sure.  Attached is a patch which can be applied on top of what you
> sent last, based on what I noticed at review, here and there.  You
> also forgot to switch two heap_open() to table_open() in pg_depend.c.

OK, applied most of that.

I didn't take your function renaming.  I had deliberately named the
functions index_concurrently_${task}, so their relationship is more
easily visible.

Anyway, that's all cosmetics.  Are there any more functional or
correctness issues to be addressed?

Another thing I was thinking of: We need some database-global tests.
For example, at some point during development, I had broken some variant
of REINDEX DATABASE.  Where could we put those tests?  Maybe with reindexdb?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Feb 07, 2019 at 12:49:43PM +0100, Peter Eisentraut wrote:
> Anyway, that's all cosmetics.  Are there any more functional or
> correctness issues to be addressed?

Not that I can think of.  At least this evening.

> Another thing I was thinking of: We need some database-global tests.
> For example, at some point during development, I had broken some variant
> of REINDEX DATABASE.  Where could we put those tests?  Maybe with
> reindexdb?

Having some coverage in the TAP tests of reindexdb is a good idea.
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            tested, passed

Hello
Sorry for late response. I review latest patch version.

I didn't find any new behavior bugs.

reindex concurrently can be in deadlock with another reindex
concurrently (or a manual vacuum (maybe a wraparound autovacuum) or a
create index concurrently on the same table).  But I think this is not
an issue for this feature; two create index concurrently commands can
be in deadlock too.
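A two-session sketch of the kind of deadlock meant here (the table name is illustrative):

```sql
-- session 1:
REINDEX TABLE CONCURRENTLY tab;

-- session 2, started while session 1 is still in progress:
REINDEX TABLE CONCURRENTLY tab;

-- one of the two sessions may then abort with
-- ERROR:  deadlock detected
```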

Just one small note for documentation:

> +Indexes:
> +    "idx" btree (col)
> +    "idx_cct" btree (col) INVALID

The second index should be idx_ccnew (or idx_ccold), right?

Code looks good to me.
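For reference, after an interrupted run the leftover invalid index looks like this and can simply be dropped (object names are illustrative):

```sql
-- \d tab after a failed REINDEX TABLE CONCURRENTLY tab;
-- Indexes:
--     "idx" btree (col)
--     "idx_ccnew" btree (col) INVALID

DROP INDEX idx_ccnew;
```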

regards, Sergei

Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
On 2019-Feb-07, Peter Eisentraut wrote:

> Another thing I was thinking of: We need some database-global tests.
> For example, at some point during development, I had broken some variant
> of REINDEX DATABASE.  Where could we put those tests?  Maybe with reindexdb?

What's wrong with a new reindex.sql in regress?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Feb 07, 2019 at 12:07:01PM -0300, Alvaro Herrera wrote:
> On 2019-Feb-07, Peter Eisentraut wrote:
>> Another thing I was thinking of: We need some database-global tests.
>> For example, at some point during development, I had broken some variant
>> of REINDEX DATABASE.  Where could we put those tests?  Maybe with reindexdb?
>
> What's wrong with a new reindex.sql in regress?

Depending on the number of objects created and remaining around
before the script is run in the main suite, this would be costly.  I
think that this approach would not scale well in the long term.
Having a TAP test with reindexdb gives you access to a full instance
whose contents are always fully known at test time.
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
Hello

Patch is marked as target version 12, but has been inactive for a few weeks.  I think many users want this feature and
the patch is in good shape.  Do we have any open questions on this thread?

The latest patch still applies cleanly; it builds and passes tests.

regards, Sergei


Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-13 15:13, Sergei Kornilov wrote:
> Patch is marked as target version 12, but has been inactive for a few weeks.  I think many users want this feature
> and the patch is in good shape.  Do we have any open questions on this thread?
> 
> The latest patch still applies cleanly; it builds and passes tests.

Here is an updated patch.  It includes the typo fix in the documentation
from you, some small bug fixes, a new reindexdb --concurrently option,
and based on that some whole-database tests, as discussed recently.

I think this addresses all open issues.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Michael Banck
Date:
Hi,

On Wednesday, 2019-03-13 at 23:10 +0100, Peter Eisentraut wrote:
> Here is an updated patch.

I had a quick look at some of the comments and noticed some possible
nitpicky-level problems:

> +/*
> + * index_concurrently_set_dead
> + *
> + * Perform the last invalidation stage of DROP INDEX CONCURRENTLY or REINDEX
> + * CONCURRENTLY before actually dropping the index. After calling this
> + * function the index is seen by all the backends as dead. Low-level locks
> + * taken here are kept until the end of the transaction doing calling this
> + * function.
> + */

"the transaction doing calling this function." sounds wrong to me.

> +      * Extract the list of indexes that are going to be rebuilt based on the
> +      * list of relation Oids given by caller. For each element in given list,
> +      * if the relkind of given relation Oid is a table, all its valid indexes
> +      * will be rebuilt, including its associated toast table indexes. If
> +      * relkind is an index, this index itself will be rebuilt. The locks taken
> +      * on parent relations and involved indexes are kept until this
> +      * transaction is committed to protect against schema changes that might
> +      * occur until the session lock is taken on each relation, session lock
> +      * used to similarly protect from any schema change that could happen
> +      * within the multiple transactions that are used during this process.
> +      */

I think the last sentence in the above should be split up into several
sentences, maybe at "session lock used..."? Or maybe it should just say
"a session lock is used" instead?

> +                                             else
> +                                             {
> +                                                     /*
> +                                                      * Save the list of relation OIDs in private
> +                                                      * context
> +                                                      */

Missing full stop at end of comment.

> +     /* Definetely no indexes, so leave */

s/Definetely/Definitely/.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer



Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-15 22:32, Michael Banck wrote:
> I had a quick look at some of the comments and noticed some possible
> nitpicky-level problems:

Thanks, I've integrated these changes into my local branch.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
Hello

Yet another review of this patch from me...

>        An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
>        an <quote>invalid</quote> index. Such indexes are useless but it can be
> -      convenient to use <command>REINDEX</command> to rebuild them. Note that
> -      <command>REINDEX</command> will not perform a concurrent build. To build the
> -      index without interfering with production you should drop the index and
> -      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
> +      convenient to use <command>REINDEX</command> to rebuild them.

Not sure we can just say "use REINDEX" since only a non-concurrent reindex can rebuild such an index.  I propose to
not change this part.

> +    The following steps occur in a concurrent index build, each in a separate
> +    transaction except when the new index definitions are created
> 
> +       All the constraints and foreign keys which refer to the index are swapped...
> +       ... This step is done within a single transaction
> +       for each temporary entry.
> 
> +       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
> +       to prevent any new tuple insertions after waiting for running queries which
> +       may reference the old index to complete. This step is done within a single
> +       transaction for each temporary entry.

According to the code, index_concurrently_swap is called in a loop inside one transaction for all processed indexes
of a table.  The same goes for the index_concurrently_set_dead and index_concurrently_drop calls.  So this part of
the documentation seems incorrect.

And a few questions:
- reindexdb has the concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot
reindex the system catalog.  Is this expected?
- should reindexdb check the server version?  For example, a binary from patched HEAD can reindex a v11 cluster and
will obviously fail if --concurrently was requested.
- psql/tab-complete.c vs old releases?  It seems we need to suggest the CONCURRENTLY keyword only for releases with
concurrently support.

Well, I still have no new questions about the backend logic.  Maybe we need to mark the patch as "Ready for
Committer" in order to get more attention from other committers?

regards, Sergei


Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-23 20:04, Sergei Kornilov wrote:
>>        An index build with the <literal>CONCURRENTLY</literal> option failed, leaving
>>        an <quote>invalid</quote> index. Such indexes are useless but it can be
>> -      convenient to use <command>REINDEX</command> to rebuild them. Note that
>> -      <command>REINDEX</command> will not perform a concurrent build. To build the
>> -      index without interfering with production you should drop the index and
>> -      reissue the <command>CREATE INDEX CONCURRENTLY</command> command.
>> +      convenient to use <command>REINDEX</command> to rebuild them.
> 
> Not sure we can just say "use REINDEX" since only a non-concurrent reindex can rebuild such an index.  I propose to
> not change this part.

Yeah, I reverted that and adjusted the wording a bit.

>> +       Old indexes have <literal>pg_index.indisready</literal> switched to <quote>false</quote>
>> +       to prevent any new tuple insertions after waiting for running queries which
>> +       may reference the old index to complete. This step is done within a single
>> +       transaction for each temporary entry.
> 
> According to the code, index_concurrently_swap is called in a loop inside one transaction for all processed indexes
> of a table.  The same goes for the index_concurrently_set_dead and index_concurrently_drop calls.  So this part of
> the documentation seems incorrect.

I rewrote that whole procedure to make it a bit simpler.

> And a few questions:
> - reindexdb has the concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot
> reindex the system catalog.  Is this expected?

If support is ever added, then reindexdb supports it automatically.  It
seems simpler to not have to repeat the same checks in two places.

> - should reindexdb check the server version?  For example, a binary from patched HEAD can reindex a v11 cluster and
> will obviously fail if --concurrently was requested.

Added.

> - psql/tab-complete.c vs old releases?  It seems we need to suggest the CONCURRENTLY keyword only for releases with
> concurrently support.

It seems we don't do version checks for tab completion of keywords.

> Well, I still have no new questions about the backend logic.  Maybe we need to mark the patch as "Ready for
> Committer" in order to get more attention from other committers?

Let's do it. :-)

I've gone over this patch a few more times.  I've read all the
discussion since 2012 again and made sure all the issues were addressed.
 I made particularly sure that during the refactoring nothing in CREATE
INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY was inadvertently
changed.  I checked all the steps again.  I'm happy with it.

One more change I made was in the drop phase.  I had to hack it up a bit
so that we can call index_drop() with a concurrent lock but not actually
do the concurrent processing (that would be a bit recursive).  The
previous patch was actually taking too strong a lock here.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Mon, Mar 25, 2019 at 04:23:34PM +0100, Peter Eisentraut wrote:
> Let's do it. :-)

I am pretty sure that this has been said at least once since 2012.

> I've gone over this patch a few more times.  I've read all the
> discussion since 2012 again and made sure all the issues were addressed.
>  I made particularly sure that during the refactoring nothing in CREATE
> INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY was inadvertently
> changed.  I checked all the steps again.  I'm happy with it.

From my side, I would be happy to look at this patch.  Unfortunately I
won't have room to look at it this week, I think :(

But if you are happy with it, that's fine by me; at least I can fix
anything which is broken :)
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
Hi

Unfortunately the patch does not apply due to recent commits.  Any chance this can be fixed (and even committed in pg12)?

>>  And a few questions:
>>  - reindexdb has the concurrently flag logic even in reindex_system_catalogs, but "reindex concurrently" cannot
>>  reindex the system catalog.  Is this expected?
>
> If support is ever added, then reindexdb supports it automatically. It
> seems simpler to not have to repeat the same checks in two places.

ok, reasonable for me

>>  - psql/tab-complete.c vs old releases?  Seems we need to suggest the CONCURRENTLY keyword only for releases with
>>  concurrently support.
>
> It seems we don't do version checks for tab completion of keywords.

Hmm, yes, I found only a few checks, for the "create trigger" syntax, about the "EXECUTE FUNCTION"/"EXECUTE
PROCEDURE" difference in 11.

regards, Sergei



Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-28 09:07, Sergei Kornilov wrote:
> Unfortunately the patch does not apply due to recent commits.  Any chance this can be fixed (and even committed in pg12)?

Committed :)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Sergei Kornilov
Date:
>>  Unfortunately the patch does not apply due to recent commits.  Any chance this can be fixed (and even committed in pg12)?
>
> Committed :)

Wow!  Congratulations!  This was a very long way.

my favorite pg12 feature

regards, Sergei



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote:
> wow! Congratulations! This was very long way
>
> my favorite pg12 feature

So this has been committed, nice!  Thanks a lot to all for keeping
alive this patch over the ages, with particular thanks to Andreas and
Peter.
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-29 09:04, Michael Paquier wrote:
> On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote:
>> wow! Congratulations! This was very long way
>>
>> my favorite pg12 feature
> 
> So this has been committed, nice!  Thanks a lot to all for keeping
> alive this patch over the ages, with particular thanks to Andreas and
> Peter.

So, we're getting buildfarm failures, only with clang.  I can reproduce
those (with clang).

It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX
CONCURRENTLY".  More eyes welcome.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-29 09:13, Peter Eisentraut wrote:
> On 2019-03-29 09:04, Michael Paquier wrote:
>> On Fri, Mar 29, 2019 at 10:39:23AM +0300, Sergei Kornilov wrote:
>>> wow! Congratulations! This was very long way
>>>
>>> my favorite pg12 feature
>>
>> So this has been committed, nice!  Thanks a lot to all for keeping
>> alive this patch over the ages, with particular thanks to Andreas and
>> Peter.
> 
> So, we're getting buildfarm failures, only with clang.  I can reproduce
> those (with clang).
> 
> It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX
> CONCURRENTLY".  More eyes welcome.

I think I found a fix.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Mar 29, 2019 at 09:13:35AM +0100, Peter Eisentraut wrote:
> So, we're getting buildfarm failures, only with clang.  I can reproduce
> those (with clang).

Indeed, I can reproduce the failures using -O2 with clang.  I am
wondering whether we are missing a volatile flag somewhere and whether
some code reordering is the cause here.

> It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX
> CONCURRENTLY".  More eyes welcome.

Here is a short reproducer:
create materialized view aam as select 1 AS a;
create index aai on aam(a);
reindex table CONCURRENTLY aam;
--
Michael

Attachment

RE: REINDEX CONCURRENTLY 2.0

From
"Shinoda, Noriyoshi (PN Japan A&PS Delivery)"
Date:
Hi hackers,

I tried this great feature with a partitioned index.
The first time the REINDEX TABLE CONCURRENTLY statement is executed on the partition, an error occurs.
The second run succeeds but leaves an index with an INVALID status.
I think this is not the desired behaviour.

# TEST
postgres=> CREATE TABLE part1(c1 INT) PARTITION BY RANGE(c1);
CREATE TABLE
postgres=> CREATE TABLE part1v1 PARTITION OF part1 FOR VALUES FROM (0) TO (100);
CREATE TABLE
postgres=> CREATE INDEX idx1_part1 ON part1(c1);
CREATE INDEX
postgres=> REINDEX TABLE CONCURRENTLY part1v1;
ERROR:  cannot drop index part1v1_c1_idx_ccold because index idx1_part1 requires it
HINT:  You can drop index idx1_part1 instead.
postgres=> \d+ part1v1
                                  Table "public.part1v1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 c1     | integer |           |          |         | plain   |              |
Partition of: part1 FOR VALUES FROM (0) TO (100)
Partition constraint: ((c1 IS NOT NULL) AND (c1 >= 0) AND (c1 < 100))
Indexes:
    "part1v1_c1_idx" btree (c1)
    "part1v1_c1_idx_ccold" btree (c1) INVALID
Access method: heap

postgres=> REINDEX TABLE CONCURRENTLY part1v1;
REINDEX
postgres=> \d+ part1v1
                                  Table "public.part1v1"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 c1     | integer |           |          |         | plain   |              |
Partition of: part1 FOR VALUES FROM (0) TO (100)
Partition constraint: ((c1 IS NOT NULL) AND (c1 >= 0) AND (c1 < 100))
Indexes:
    "part1v1_c1_idx" btree (c1)
    "part1v1_c1_idx_ccold" btree (c1) INVALID
Access method: heap

Regards,
Noriyoshi Shinoda

-----Original Message-----
From: Michael Paquier [mailto:michael@paquier.xyz]
Sent: Friday, March 29, 2019 6:21 PM
To: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Cc: Sergei Kornilov <sk@zsrv.org>; pgsql-hackers@lists.postgresql.org
Subject: Re: REINDEX CONCURRENTLY 2.0

On Fri, Mar 29, 2019 at 09:13:35AM +0100, Peter Eisentraut wrote:
> So, we're getting buildfarm failures, only with clang.  I can
> reproduce those (with clang).

Indeed, I can reproduce the failures using -O2 with clang.  I am wondering whether we are missing a volatile flag
somewhere and whether some code reordering is the cause here.

> It seems the issue is somewhere near indexcmds.c "Phase 6 of REINDEX
> CONCURRENTLY".  More eyes welcome.

Here is a short reproducer:
create materialized view aam as select 1 AS a; create index aai on aam(a); reindex table CONCURRENTLY aam;
--
Michael



Re: REINDEX CONCURRENTLY 2.0

From
Robert Treat
Date:
On Fri, Mar 29, 2019 at 3:28 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
>
> On 2019-03-28 09:07, Sergei Kornilov wrote:
> > Unfortunately the patch does not apply due to recent commits. Any chance this can be fixed (and even committed in pg12)?
>
> Committed :)
>

Given this has been committed I've probably missed the window, but
philosophically speaking, is there any reason not to make the
"concurrently" behavior the default behavior, and require a keyword
for the more heavy-weight old behavior? In most production scenarios
you probably want to avoid exclusive locking, and in the cases where
that isn't an issue, 'concurrently' isn't so much slower that most
users would object to it. I would perhaps give a nod to historical
syntax concerns, but this would more closely align with the behavior
in vacuum vs vacuum full, and we've done behavior modifying changes
such as the recent WITH ... MATERIALIZED change. Thoughts?

Robert Treat
https://xzilla.net



Re: REINDEX CONCURRENTLY 2.0

From
Andres Freund
Date:
Hi,

On 2019-03-29 11:47:10 -0400, Robert Treat wrote:
> On Fri, Mar 29, 2019 at 3:28 AM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
> >
> > On 2019-03-28 09:07, Sergei Kornilov wrote:
> > > Unfortunately the patch does not apply due to recent commits. Any chance this can be fixed (and even committed in pg12)?
> >
> > Committed :)
> >
> 
> Given this has been committed I've probably missed the window, but
> philosophically speaking, is there any reason not to make the
> "concurrently" behavior the default behavior, and require a keyword
> for the more heavy-weight old behavior?

Yes, it increases the total runtime quite considerably. And it adds new
failure modes with partially built invalid indexes hanging around that
need to be dropped manually.
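
To illustrate the cleanup involved, leftover invalid indexes can be located through pg_index; a minimal sketch, with a hypothetical index name:

```sql
-- List all indexes marked invalid in the current database.
SELECT indexrelid::regclass AS invalid_index
FROM pg_index
WHERE NOT indisvalid;

-- Each leftover copy then has to be removed by hand, for example:
DROP INDEX CONCURRENTLY t_a_idx_ccnew;
```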


> In most production scenarios
> you probably want to avoid exclusive locking, and in the cases where
> that isn't an issue, 'concurrently' isn't so much slower that most
> users would object to it.

It does at *least* twice as much IO.

Greetings,

Andres Freund



Re: REINDEX CONCURRENTLY 2.0

From
"Bossart, Nathan"
Date:
I noticed a very small typo in the documentation for this feature.

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index ccabb330cb..e45bf86c8d 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -349,7 +349,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURR
      <listitem>
       <para>
        The old indexes are dropped.  The <literal>SHARE UPDATE
-       EXCLUSIVE</literal> session locks for the indexes and the table ar
+       EXCLUSIVE</literal> session locks for the indexes and the table are
        released.
       </para>
      </listitem>

Nathan



Re: REINDEX CONCURRENTLY 2.0

From
Justin Pryzby
Date:
On Fri, Mar 29, 2019 at 03:53:05PM +0000, Bossart, Nathan wrote:
> I noticed a very small typo in the documentation for this feature.

I submit a bunch more changes for consideration, attached.

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-29 16:53, Bossart, Nathan wrote:
> I noticed a very small typo in the documentation for this feature.

fixed

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-29 17:01, Justin Pryzby wrote:
> On Fri, Mar 29, 2019 at 03:53:05PM +0000, Bossart, Nathan wrote:
>> I noticed a very small typo in the documentation for this feature.
> 
> I submit a bunch more changes for consideration, attached.

fixed, thanks

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Mar 29, 2019 at 03:10:23PM +0000, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote:
> I tried this great feature with a partitioned index.
> The first time the REINDEX TABLE CONCURRENTLY statement is executed
> on the partition, an error occurs.

Yes, that's a problem.  I am adding an open item.

> The second run succeeds but leaves an index with an INVALID status.
> I think this is not the desired behaviour.

This one is partially expected actually.  Invalid indexes are ignored
when processing a relation, as including them would cause at least a
table-level reindex to double its amount of work.  However, it is not
possible to detach an index from its partition tree either, hence the
invalid index cannot be dropped directly.  It seems to me that the root
of the problem is that the partition indexes created as copycats of the
original ones should never be in a state where they are attached to the
index tree.
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Mar 29, 2019 at 08:48:03AM -0700, Andres Freund wrote:
> Yes, it increases the total runtime quite considerably. And it adds new
> failure modes with partially built invalid indexes hanging around that
> need to be dropped manually.

On top of that, CONCURRENTLY needs multiple transactions to perform its
different phases, one transaction per phase: build, validation, swap
and cleanup.  So it cannot run in a transaction block.  Having a
separate option makes the most sense.
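
For instance, wrapping it in an explicit transaction block should be rejected with an error along these lines (hypothetical index name):

```sql
BEGIN;
REINDEX INDEX CONCURRENTLY t_a_idx;
-- ERROR:  REINDEX CONCURRENTLY cannot run inside a transaction block
ROLLBACK;
```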

> It does at *least* twice as much IO.

Yeah, I can guarantee you that it is much slower, at the advantage of
being lock-free.
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-03-29 16:10, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote:
> postgres=> CREATE TABLE part1(c1 INT) PARTITION BY RANGE(c1);
> CREATE TABLE
> postgres=> CREATE TABLE part1v1 PARTITION OF part1 FOR VALUES FROM (0) TO (100);
> CREATE TABLE
> postgres=> CREATE INDEX idx1_part1 ON part1(c1);
> CREATE INDEX
> postgres=> REINDEX TABLE CONCURRENTLY part1v1;
> ERROR:  cannot drop index part1v1_c1_idx_ccold because index idx1_part1 requires it
> HINT:  You can drop index idx1_part1 instead.

The attached patch fixes this.  The issue was that we didn't move all
dependencies from the index (only in the other direction).  Maybe that
was sufficient when the patch was originally written, before partitioned
indexes.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Sat, Mar 30, 2019 at 11:56:27AM +0100, Peter Eisentraut wrote:
> The attached patch fixes this.  The issue was that we didn't move all
> dependencies from the index (only in the other direction).  Maybe that
> was sufficient when the patch was originally written, before partitioned
> indexes.

Hm.  I don't think that it is quite right either because the new index
is missing from the partition tree after the reindex.  Taking the
example from your patch I see that:
=# CREATE TABLE concur_reindex_part1 (c1 int) PARTITION BY RANGE (c1);
CREATE TABLE
=# CREATE TABLE concur_reindex_part1v1 PARTITION OF
     concur_reindex_part1 FOR VALUES FROM (0) TO (100);
CREATE TABLE
=# CREATE INDEX concur_reindex_idx1_part1 ON
     concur_reindex_part1 (c1);
CREATE INDEX
=# SELECT relid, level FROM
   pg_partition_tree('concur_reindex_idx1_part1');
             relid             | level
-------------------------------+-------
 concur_reindex_idx1_part1     |     0
 concur_reindex_part1v1_c1_idx |     1
(2 rows)
=# REINDEX TABLE CONCURRENTLY concur_reindex_part1v1;
REINDEX
=# SELECT relid, level FROM
   pg_partition_tree('concur_reindex_idx1_part1');
           relid           | level
---------------------------+-------
 concur_reindex_idx1_part1 |     0
(1 row)

And I would have expected concur_reindex_part1v1_c1_idx to still be
part of the partition tree.  I think that the issue is in
index_concurrently_create_copy() where we create the new index with
index_create() without setting parentIndexRelid, causing the
dependency to be lost.  This parameter ought to be set to the OID of
the parent index so I think that we need to look at the ancestors of
the index if relispartition is set, and use get_partition_ancestors()
for that purpose.
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Mon, Apr 01, 2019 at 03:43:43PM +0900, Michael Paquier wrote:
> And I would have expected concur_reindex_part1v1_c1_idx to still be
> part of the partition tree.  I think that the issue is in
> index_concurrently_create_copy() where we create the new index with
> index_create() without setting parentIndexRelid, causing the
> dependency to be lost.  This parameter ought to be set to the OID of
> the parent index so I think that we need to look at the ancestors of
> the index if relispartition is set, and use get_partition_ancestors()
> for that purpose.

And here is the patch to address this issue.  It happens that a bit
more than the dependency switch was lacking here:
- At swap time, we need to have the new index definition track
relispartition from the old index.
- Again at swap time, the inheritance link needs to be updated between
the old/new index and its parent when reindexing a partition index.

Tracking the OID of the parent via index_concurrently_create_copy() is
not a bright idea, as we would then be unable to drop invalid indexes
if the REINDEX CONCURRENTLY failed in the middle (I just added some
manual elog(ERROR) calls to test that).  I have added a comment before
making the index duplicate.  I have also expanded the regression tests
so that we have more coverage for all that, finishing with the attached
version, which keeps partition trees consistent across the operations.
Thoughts?
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Apr 09, 2019 at 03:50:27PM +0900, Michael Paquier wrote:
> And here is the patch to address this issue.  It happens that a bit
> more than the dependency switch was lacking here:
> - At swap time, we need to have the new index definition track
> relispartition from the old index.
> - Again at swap time, the inheritance link needs to be updated between
> the old/new index and its parent when reindexing a partition index.

Peter, this is an open item, and I think as the committer of the
feature you are its owner.  Well, in this case, I don't mind taking
ownership if need be, as I know this stuff.  Anyway, could you have
a look at the patch proposed and see if you have any issues with it?
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

> On 2019-03-28 09:07, Sergei Kornilov wrote:
>> Unfortunately patch does not apply due recent commits. Any chance
>> this can be fixed (and even committed in pg12)?
>
> Committed :)

I noticed that the docs for how to recover from a failed CREATE INDEX
CONCURRENTLY say that «REINDEX does not support concurrent builds»,
which is no longer true.  I was going to just remove the caveat, but
then I discovered that REINDEX CONCURRENTLY doesn't work on INVALID
indexes (why?).

Attached is a patch that instead adjusts the claim to say that REINDEX
does not support rebuilding invalid indexes concurrently.

- ilmari
-- 
"A disappointingly low fraction of the human race is,
 at any given time, on fire." - Stig Sandbeck Mathisen

From e2de72b348f8a96e24128fc4188bd542eb676610 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Thu, 11 Apr 2019 10:58:47 +0100
Subject: [PATCH] Correct claim about REINDEX CONCURRENTLY in CREATE INDEX
 CONCURRENTLY docs

REINDEX CONCURRENTLY exists, but cannot reindex invalid indexes.
---
 doc/src/sgml/ref/create_index.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index d9d95b20e3..c458f54ef1 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -557,8 +557,8 @@
     method in such cases is to drop the index and try again to perform
     <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is to rebuild
     the index with <command>REINDEX</command>.  However, since <command>REINDEX</command>
-    does not support concurrent builds, this option is unlikely to seem
-    attractive.)
+    does not support reindexing invalid indexes concurrently, this option is
+    unlikely to seem attractive.)
    </para>
 
    <para>
-- 
2.20.1


Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Apr 11, 2019 at 11:21:29AM +0100, Dagfinn Ilmari Mannsåker wrote:
> I noticed that the docs for how to recover from a failed CREATE INDEX
> CONCURRENTLY say that «REINDEX does not support concurrent builds»,
> which is no longer true.

Good catch.  I'll apply that in a couple of hours.

> I was going to just remove the caveat, but then I discovered that
> REINDEX CONCURRENTLY doesn't work on INVALID indexes (why?).

This is a deliberate choice.  The index built in parallel with the existing
one during a concurrent reindex is marked invalid during most of the
operation.  Hence, if the reindex is interrupted or fails, you finish
with potentially twice the number of original indexes, half being
invalid and the other half being the ones in use.  If the user decides
to rinse and repeat the concurrent reindex, and if we were to also
select invalid indexes for the operation, then you would finish by
potentially doing twice the amount of work when working on a relation,
half of it for nothing.
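
As a sketch of that failure mode, with hypothetical table and index names, an interrupted concurrent rebuild can leave the transient copy behind as an invalid index next to the live one, and cleaning it up is left to the user:

```sql
-- \d t might then show something like:
--     "t_a_idx" btree (a)
--     "t_a_idx_ccnew" btree (a) INVALID
-- A repeated REINDEX TABLE CONCURRENTLY on t ignores the invalid copy,
-- so it has to be dropped manually:
DROP INDEX t_a_idx_ccnew;
```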
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Alvaro Herrera
Date:
On 2019-Apr-11, Michael Paquier wrote:

> > I was going to just remove the caveat, but then I discovered that
> > REINDEX CONCURRENTLY doesn't work on INVALID indexes (why?).
> 
> This is a deliberate choice.  The index built in parallel with the existing
> one during a concurrent reindex is marked invalid during most of the
> operation.  Hence, if the reindex is interrupted or fails, you finish
> with potentially twice the number of original indexes, half being
> invalid and the other half being the ones in use.  If the user decides
> to rinse and repeat the concurrent reindex, and if we were to also
> select invalid indexes for the operation, then you would finish by
> potentially doing twice the amount of work when working on a relation,
> half of it for nothing.

Hmm, I suppose that makes sense for REINDEX TABLE or anything that
reindexes more than one index, but if you do REINDEX INDEX surely it
is reasonable to allow the case?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Thu, Apr 11, 2019 at 09:49:47AM -0400, Alvaro Herrera wrote:
> Hmm, I suppose that makes sense for REINDEX TABLE or anything that
> reindexes more than one index, but if you do REINDEX INDEX surely it
> is reasonable to allow the case?

Yes, we could revisit the REINDEX INDEX portion of the decision;
after sleeping on it, my previous argument makes limited sense for a
REINDEX working on only one index.  One could note that the
header comment of ReindexRelationConcurrently() kind of implies the
same conclusion as you do, but that's perhaps just an accident.

So...  I am coming up with the patch attached.  I have introduced some
tests using a trick with CIC to have an invalid index to work on.
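
For reference, a sketch of that kind of trick, with hypothetical names: force CREATE INDEX CONCURRENTLY to fail so that it leaves an invalid index behind, then rebuild it once the proposed change applies:

```sql
CREATE TABLE t (a int);
INSERT INTO t VALUES (1), (1);
-- The concurrent build fails on the unique violation and leaves
-- t_a_uniq behind, marked INVALID.
CREATE UNIQUE INDEX CONCURRENTLY t_a_uniq ON t (a);
-- Remove one of the duplicates, then rebuild and validate the index.
DELETE FROM t WHERE ctid IN (SELECT ctid FROM t WHERE a = 1 LIMIT 1);
REINDEX INDEX CONCURRENTLY t_a_uniq;
```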
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-04-11 05:59, Michael Paquier wrote:
> On Tue, Apr 09, 2019 at 03:50:27PM +0900, Michael Paquier wrote:
>> And here is the patch to address this issue.  It happens that a bit
>> more than the dependency switch was lacking here:
>> - At swap time, we need to have the new index definition track
>> relispartition from the old index.
>> - Again at swap time, the inheritance link needs to be updated between
>> the old/new index and its parent when reindexing a partition index.
> 
> Peter, this is an open item, and I think as the committer of the
> feature you are its owner.  Well, in this case, I don't mind taking
> ownership if need be, as I know this stuff.  Anyway, could you have
> a look at the patch proposed and see if you have any issues with it?

Looks good, committed.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Apr 12, 2019 at 08:44:18AM +0200, Peter Eisentraut wrote:
> Looks good, committed.

Thanks for committing!
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Michael Paquier <michael@paquier.xyz> writes:

> So...  I am coming up with the patch attached.  I have introduced some
> tests using a trick with CIC to have an invalid index to work on.

I don't have any comments on the code (but the test looks sensible, it's
the same trick I used to discover the issue in the first place).

However, the doc patch lost the trailing paren:

>      The recommended recovery
>      method in such cases is to drop the index and try again to perform
> -    <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is to rebuild
> -    the index with <command>REINDEX</command>.  However, since <command>REINDEX</command>
> -    does not support concurrent builds, this option is unlikely to seem
> -    attractive.)
> +    <command>CREATE INDEX CONCURRENTLY</command>.  (Another possibility is
> +    to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>.
>     </para>


- ilmari
-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.               - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.                      - Calle Dybedahl



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Fri, Apr 12, 2019 at 12:11:12PM +0100, Dagfinn Ilmari Mannsåker wrote:
> I don't have any comments on the code (but the test looks sensible, it's
> the same trick I used to discover the issue in the first place).

After thinking some more on it, this behavior looks rather sensible to
me.  Are there any objections?

> However, the doc patch lost the trailing paren:

Fixed on my branch, thanks.
--
Michael

Attachment

Re: REINDEX CONCURRENTLY 2.0

From
Peter Eisentraut
Date:
On 2019-04-16 08:19, Michael Paquier wrote:
> On Fri, Apr 12, 2019 at 12:11:12PM +0100, Dagfinn Ilmari Mannsåker wrote:
>> I don't have any comments on the code (but the test looks sensible, it's
>> the same trick I used to discover the issue in the first place).
> 
> After thinking some more on it, this behavior looks rather sensible to
> me.  Are there any objections?

Looks good to me.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

From
Michael Paquier
Date:
On Tue, Apr 16, 2019 at 08:50:31AM +0200, Peter Eisentraut wrote:
> Looks good to me.

Thanks, committed.  If there are additional discussions on various
points of the feature, let's move to a new thread, please.  This one
has already been used quite extensively ;)
--
Michael

Attachment