Re: Support for REINDEX CONCURRENTLY - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Support for REINDEX CONCURRENTLY
Date
Msg-id CAB7nPqTuR_7gqxU3UCt_oYNHsjRydmm4KxGoPs=_DtvPtrhk9w@mail.gmail.com
Whole thread Raw
In response to Re: Support for REINDEX CONCURRENTLY  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Support for REINDEX CONCURRENTLY
Re: Support for REINDEX CONCURRENTLY
List pgsql-hackers
OK. Patches updated... Please see attached.
With all the work done on those patches, I suppose this is close to being something clean...

On Wed, Mar 6, 2013 at 5:50 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-03-06 13:21:27 +0900, Michael Paquier wrote:
> Hum... OK. I changed that using a method based on get_index_constraint for
> a given index. So if the constraint Oid is invalid, it means that this
> index has no constraints and its concurrent entry won't create an index in
> consequence. It is more stable this way.

Sounds good. Just to make that clear:
To get a unique index without constraint:
CREATE TABLE table_u(id int, data int);
CREATE UNIQUE INDEX table_u__data ON table_u(data);
To get a constraint:
ALTER TABLE table_u ADD CONSTRAINT table_u__id_unique UNIQUE(id);
OK no problem. Thanks for the clarification.
 
> > > > > +     /*
> > > > > +      * Phase 3 of REINDEX CONCURRENTLY
> > > > > +      *
> > > > > +      * During this phase the concurrent indexes catch up with the
> > > > INSERT that
> > > > > +      * might have occurred in the parent table and are marked as
> > valid
> > > > once done.
> > > > > +      *
> > > > > +      * We once again wait until no transaction can have the table
> > open
> > > > with
> > > > > +      * the index marked as read-only for updates. Each index
> > > > validation is done
> > > > > +      * with a separate transaction to avoid opening transaction
> > for an
> > > > > +      * unnecessary too long time.
> > > > > +      */
> > > >
> > > > Maybe I am being dumb because I have the feeling I said differently in
> > > > the past, but why do we not need a WaitForMultipleVirtualLocks() here?
> > > > The comment seems to say we need to do so.
> > > >
> > > Yes you said the contrary in a previous review. The purpose of this
> > > function is to first gather the locks and then wait for everything at
> > once
> > > to reduce possible conflicts.
> >
> > you say:
> >
> > +        * We once again wait until no transaction can have the table open
> > with
> > +        * the index marked as read-only for updates. Each index
> > validation is done
> > +        * with a separate transaction to avoid opening transaction for an
> > +        * unnecessary too long time.
> >
> > Which doesn't seem to be done?
> >
> > I read back and afaics I only referred to CacheInvalidateRelcacheByRelid
> > not being necessary in this phase. Which I think is correct.
> >
> Regarding CacheInvalidateRelcacheByRelid at phase 3, I think that it is
> needed. If we don't use it the pg_index entries will be updated but not the
> cache, what is incorrect.

A heap_update will cause cache invalidations to be sent.
Ok. removed it.
 
> Anyway, if I claimed otherwise, I think I was wrong:
> >
> > The reason - I think - we need to wait here is that otherwise its not
> > guaranteed that all other backends see the index with ->isready
> > set. Which means they might add tuples which are invisible to the mvcc
> > snapshot passed to validate_index() (just created beforehand) which are
> > not yet added to the new index because those backends think the index is
> > not ready yet.
> > Any flaws in that logic?
> >
> Not that I think. In consequence, and I think we will agree on that: I am
> removing WaitForMultipleVirtualLocks and add a WaitForVirtualLock on the
> parent relation for EACH index before building and validating it.

I have the feeling we are talking past each other. Unless I miss
something *there is no* WaitForMultipleVirtualLocks between phase 2 and
3. But one WaitForMultipleVirtualLocks for all would be totally
sufficient.
OK, sorry for the confusion. I added a call to WaitForMultipleVirtualLocks also before phase 3.
Honestly, I am still not very comfortable with the fact that the ShareLock wait on parent relation is done outside each index transaction for build and validation... Changed as requested though...
--
Michael
Attachment

pgsql-hackers by date:

Previous
From: Shigeru Hanada
Date:
Subject: Re: Writable foreign tables: how to identify rows
Next
From: Andres Freund
Date:
Subject: Re: Support for REINDEX CONCURRENTLY