Re: [HACKERS] REINDEX CONCURRENTLY 2.0 - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [HACKERS] REINDEX CONCURRENTLY 2.0
Date
Msg-id 20190117051101.GD2036@paquier.xyz
Whole thread Raw
In response to Re: [HACKERS] REINDEX CONCURRENTLY 2.0  (Andreas Karlsson <andreas@proxel.se>)
Responses Re: [HACKERS] REINDEX CONCURRENTLY 2.0  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On Wed, Jan 16, 2019 at 05:56:15PM +0100, Andreas Karlsson wrote:
> On 1/16/19 9:27 AM, Michael Paquier wrote:
>> Does somebody mind if I jump into the ship after so long?  I was the
>> original author of the monster after all...
>
> Fine by me. Peter?

Okay, I have begun digging into the patch, and extracted for now two
things which can be refactored first, giving a total of three patches:
- 0001, which creates WaitForOlderSnapshots to snapmgr.c.  I think
that this can be useful for external extensions to have a process wait
for snapshots older than a minimum threshold hold by other
transactions.
- 0002, which moves the concurrent index build into its own routine,
index_build_concurrent().  At the same time, index_build() has a
isprimary argument which is not used, so let's remove it.  This
simplifies a bit the refactoring as well.
- 0003 is the core patch, realigned with the rest, fixing some typos I
found on the way.

Here are also some notes for things I am planning to look after with a
second pass:
- The concurrent drop (phase 5) part, still shares a lot with DROP
INDEX CONCURRENTLY, and I think that we had better refactor more the
code so as REINDEX CONCURRENTLY shared more with DROP INDEX.  One
thing which I think is incorrect is that we do not clear the invalid
flag of the drop index before marking it as dead.  This looks like a
missing piece from another concurrent-related bug fix lost over the
rebases this patch had.
- set_dead could be refactored so as it is able to handle in input
multiple indexes, using WaitForLockersMultiple().  This way CREATE
INDEX CONCURRENTLY could also use it.
- There are no regression tests for partitioned tables.
- The NOTICE messages showing up when a table has no indexes should be
removed.
- index_create() does not really need a TupleDesc argument, as long as
the caller is able to provide a list of column names.
- At the end of the day, I think that it would be nice to reach a
state where we have a set of low-level routines like
index_build_concurrent, index_set_dead_concurrent which are used by
both paths of CONCURRENTLY and can be called for each phase within a
given transaction.  Those pieces can also be helpful to implement for
example an extension able to do concurrent reindexing out of core.

I think that the refactorings in 0001 and 0002 are committable as-is,
and this shaves some code from the core patch.

Thoughts?
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Donald Dong
Date:
Subject: Re: Ryu floating point output patch
Next
From: Andrew Gierth
Date:
Subject: Re: Ryu floating point output patch