Re: Bugs in CREATE/DROP INDEX CONCURRENTLY - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bugs in CREATE/DROP INDEX CONCURRENTLY
Date
Msg-id 26129.1354041908@sss.pgh.pa.us
Whole thread Raw
In response to Re: Bugs in CREATE/DROP INDEX CONCURRENTLY  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Bugs in CREATE/DROP INDEX CONCURRENTLY  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
> On 2012-11-26 22:44:49 -0500, Tom Lane wrote:
>> I looked through the code a bit, and I think we should be able to make
>> this work, but note there are also places that update indcheckxmin using
>> heap_update, and that's just as dangerous as changing the other two
>> flags via heap_update, if you don't have exclusive lock on the table.

> Isn't setting indexcheckxmin already problematic in the case of CREATE
> INDEX CONCURRENTLY? index_build already runs in a separate transaction
> there.

Yeah, you are right, except that AFAICS indcheckxmin is really only
needed for regular non-concurrent CREATE INDEX, which needs it because
it commits without waiting for readers that might be bothered by broken
HOT chains.  In a concurrent CREATE INDEX, we handle that problem by
waiting out all such readers before setting indisvalid.  So the
concurrent code path should not be bothering to set indcheckxmin at all,
I think.  (This is underdocumented.)

Looking closer at the comment in reindex_index, what it's really full of
angst about is that simple_heap_update will update the tuple's xmin
*when we would rather that it didn't*.  So switching to update-in-place
there will solve a problem, not create one.

In short, all flag changes in pg_index should be done by
update-in-place, and the tuple's xmin will never change from the
original creating transaction (until frozen).

What we want the xmin to do, for indcheckxmin purposes, is reflect the
time at which the index started to be included in HOT-safety decisions.
Since we're never going to move the xmin, that means that *we cannot
allow REINDEX to mark a dead index live again*.  Once DROP INDEX
CONCURRENTLY has reached the final state, you can't revalidate the index
by reindexing it, you'll have to drop it and then make a brand new one.
That seems like a pretty minor compromise.

What I propose to do next is create a patch for HEAD that includes a
new pg_index flag column, since I think the logic will be clearer
with that.  Once we're happy with that, we can back-port the patch
into a form that uses the existing flag columns.

Anybody feel like bikeshedding the flag column name?  I'm thinking
"indislive" but maybe there's a better name.

Note: I'm not impressed by the proposal to replace these columns with
a single integer flag column.  Aside from any possible incompatibility
with existing client code, it just isn't going to be easy to read the
index's state manually if we do that.  We could maybe dodge that
complaint with a char (pseudo-enum) status column but I don't think
that will simplify the code at all, and it's got the same or worse
compatibility issues.

> Btw, even if we manage to find a sensible fix for this I would suggest
> postponing it after the next back branch release.

AFAICS this is a data loss/corruption problem, and as such a "must fix".
If we can't have it done by next week, I'd rather postpone the releases
until it is done.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Next
From: Jeff Davis
Date:
Subject: Re: MySQL search query is not executing in Postgres DB