Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date
Msg-id CAEze2WgUkK29JbpSa72yN5JT4Ao7vez+vbsrPFyK1og5zXtF_Q@mail.gmail.com
Whole thread Raw
In response to Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Álvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers
On Mon, 18 Jan 2021, 21:25 Álvaro Herrera, <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Jan-18, Matthias van de Meent wrote:
>
> > Example:
> >
> > 1.) RI starts
> > 2.) PHASE 2: filling the index:
> > 2.1.) scanning the heap (live tuple is cached)
> > < tuple is deleted
> > < last transaction other than RI commits, only snapshot of RI exists
> > < vacuum drops the tuple, and cannot remove it from the new index
> > because this new index is not yet populated.
> > 2.2.) sorting tuples
> > 2.3.) index filled with tuples, incl. deleted tuple
> > 3.) PHASE 3: wait for transactions
> > 4.) PHASE 4: validate does not remove the tuple from the index,
> > because it is not built to do so: it will only insert new tuples.
> > Tuples that are marked for deletion are removed from the index only
> > through VACUUM (and optimistic ALL_DEAD detection).
> >
> > According to my limited knowledge of RI, it requires VACUUM to not run
> > on the table during the initial index build process (which is
> > currently guaranteed through the use of a snapshot).
>
> VACUUM cannot run concurrently with CIC or RI in a table -- both acquire
> ShareUpdateExclusiveLock, which conflicts with itself, so this cannot
> occur.

Yes, you are correct. Vacuum indeed has a ShareUpdateExclusiveLock.
Are there no other ways that pages are optimistically pruned?

But the base case still stands, ignoring CIC snapshots in  would give
the semantic of all_dead to tuples that are actually still considered
alive in some context, and should not yet be deleted (you're deleting
data from an in-use snapshot). Any local pruning optimizations using
all_dead mechanics now cannot be run on the table unless they hold an
ShareUpdateExclusiveLock; though I'm unaware of any such mechanisms
(other than below).

> I do wonder if the problem you suggest (or something similar) can occur
> via HOT pruning, though.

It could not, at least not at the current HEAD, as only one tuple in a
HOT-chain can be alive at one point, and all indexes point to the root
of the HOT-chain, which is never HOT-pruned. See also the
src/backend/access/heap/README.HOT.

Regards,

Matthias van de Meent



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Printing backtrace of postgres processes
Next
From: Tom Lane
Date:
Subject: Re: [PATCH v2 1/1] Fix detection of pwritev support for OSX.