Re: [HACKERS] REINDEX CONCURRENTLY 2.0 - Mailing list pgsql-hackers

From Andreas Karlsson
Subject Re: [HACKERS] REINDEX CONCURRENTLY 2.0
Msg-id 1da61300-31e8-d416-1d41-56c15cd4753d@proxel.se
In response to Re: [HACKERS] REINDEX CONCURRENTLY 2.0  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] REINDEX CONCURRENTLY 2.0
List pgsql-hackers
On 02/14/2017 04:56 AM, Michael Paquier wrote:
> On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andreas@proxel.se> wrote:
>> On 02/13/2017 06:31 AM, Michael Paquier wrote:
>>> Er, something like that as well, no?
>>> DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
>>
>> REINDEX (VERBOSE) currently prints one such line per index, which does not
>> really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes
>> on a relation at the same time. It is not immediately obvious how this
>> should work. Maybe one such detail line per table?
>
> Hard to recall this thing in details with the time and the fact that a
> relation is reindexed by processing all the indexes once at each step.
> Hm... What if ReindexRelationConcurrently() actually is refactored in
> such a way that it processes all the steps for each index
> individually? This way you can monitor the time it takes to build
> completely each index, including its . This operation would consume
> more transactions but in the event of a failure the amount of things
> to clean up is really reduced particularly for relations with many
> indexes. This would as well reduce VERBOSE to print one line per index
> rebuilt.

I am actually thinking about going in the opposite direction (by reducing 
the number of times we call WaitForLockers), because it is not just 
about consuming transaction IDs; we also do not want to wait too many 
times for transactions to commit. I am leaning towards calling 
WaitForLockersMultiple only three times per table:

1. Between building and validating the new indexes.
2. Between setting the old indexes to invalid and setting them to dead.
3. Between setting the old indexes to dead and dropping them.

Right now my patch loops over the indexes in steps 2 and 3 and waits for 
lockers once per index. This seems rather wasteful.
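To make the batched approach concrete, here is a rough sketch. It is not compilable as-is: WaitForLockersMultiple() is PostgreSQL's internal lmgr API, but build_new_index(), validate_new_index(), set_invalid(), set_dead(), and drop_old_index() are hypothetical stand-ins for the per-index steps in the patch.

```c
/*
 * Sketch only: one WaitForLockersMultiple() call per phase for the
 * whole table, instead of one wait per index. Helper names below are
 * hypothetical placeholders, not functions from the actual patch.
 */

/* Phase 1: build every new index, then wait once before validating. */
foreach(lc, newIndexIds)
    build_new_index(lfirst_oid(lc));
WaitForLockersMultiple(lockTags, ShareLock);
foreach(lc, newIndexIds)
    validate_new_index(lfirst_oid(lc));

/* Phase 2: mark every old index invalid, then wait once. */
foreach(lc, oldIndexIds)
    set_invalid(lfirst_oid(lc));
WaitForLockersMultiple(lockTags, AccessExclusiveLock);

/* Phase 3: mark every old index dead, wait once, then drop them all. */
foreach(lc, oldIndexIds)
    set_dead(lfirst_oid(lc));
WaitForLockersMultiple(lockTags, AccessExclusiveLock);
foreach(lc, oldIndexIds)
    drop_old_index(lfirst_oid(lc));
```

With this structure the number of waits is fixed at three per table regardless of how many indexes the table has, whereas waiting inside the per-index loops scales the number of waits with the index count.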

I have considered that the code might be cleaner if we just looped 
over the indexes one at a time (and as a bonus the VERBOSE output would 
be more obvious), but I do not think it is worth waiting for lockers 
all those extra times.

Andreas


