Re: Bugs in CREATE/DROP INDEX CONCURRENTLY - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Bugs in CREATE/DROP INDEX CONCURRENTLY |
Date | |
Msg-id | 26129.1354041908@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Bugs in CREATE/DROP INDEX CONCURRENTLY (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: Bugs in CREATE/DROP INDEX CONCURRENTLY
|
List | pgsql-hackers |
Andres Freund <andres@2ndquadrant.com> writes: > On 2012-11-26 22:44:49 -0500, Tom Lane wrote: >> I looked through the code a bit, and I think we should be able to make >> this work, but note there are also places that update indcheckxmin using >> heap_update, and that's just as dangerous as changing the other two >> flags via heap_update, if you don't have exclusive lock on the table. > Isn't setting indexcheckxmin already problematic in the case of CREATE > INDEX CONCURRENTLY? index_build already runs in a separate transaction > there. Yeah, you are right, except that AFAICS indcheckxmin is really only needed for regular non-concurrent CREATE INDEX, which needs it because it commits without waiting for readers that might be bothered by broken HOT chains. In a concurrent CREATE INDEX, we handle that problem by waiting out all such readers before setting indisvalid. So the concurrent code path should not be bothering to set indcheckxmin at all, I think. (This is underdocumented.) Looking closer at the comment in reindex_index, what it's really full of angst about is that simple_heap_update will update the tuple's xmin *when we would rather that it didn't*. So switching to update-in-place there will solve a problem, not create one. In short, all flag changes in pg_index should be done by update-in-place, and the tuple's xmin will never change from the original creating transaction (until frozen). What we want the xmin to do, for indcheckxmin purposes, is reflect the time at which the index started to be included in HOT-safety decisions. Since we're never going to move the xmin, that means that *we cannot allow REINDEX to mark a dead index live again*. Once DROP INDEX CONCURRENTLY has reached the final state, you can't revalidate the index by reindexing it, you'll have to drop it and then make a brand new one. That seems like a pretty minor compromise. What I propose to do next is create a patch for HEAD that includes a new pg_index flag column, since I think the logic will be clearer with that. Once we're happy with that, we can back-port the patch into a form that uses the existing flag columns. Anybody feel like bikeshedding the flag column name? I'm thinking "indislive" but maybe there's a better name. Note: I'm not impressed by the proposal to replace these columns with a single integer flag column. Aside from any possible incompatibility with existing client code, it just isn't going to be easy to read the index's state manually if we do that. We could maybe dodge that complaint with a char (pseudo-enum) status column but I don't think that will simplify the code at all, and it's got the same or worse compatibility issues. > Btw, even if we manage to find a sensible fix for this I would suggest > postponing it after the next back branch release. AFAICS this is a data loss/corruption problem, and as such a "must fix". If we can't have it done by next week, I'd rather postpone the releases until it is done. regards, tom lane
pgsql-hackers by date: