Re: Should I implement DROP INDEX CONCURRENTLY? - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Should I implement DROP INDEX CONCURRENTLY? |
Date | |
Msg-id | CA+TgmoahY-RJFBAe2x5MDGrXmAMLJAD48fBLnnbM9GD1PX34eQ@mail.gmail.com Whole thread Raw |
In response to | Re: Should I implement DROP INDEX CONCURRENTLY? (Simon Riggs <simon@2ndquadrant.com>) |
Responses |
Re: Should I implement DROP INDEX CONCURRENTLY?
|
List | pgsql-hackers |
On Thu, Jan 19, 2012 at 10:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Wed, Jan 18, 2012 at 11:12 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Jan 18, 2012 at 5:49 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Can I just check with you that the only review comment is a one line >>> change? Seems better to make any additional review comments in one go. >> >> No, I haven't done a full review yet - I just noticed that right at >> the outset and wanted to be sure that got addressed. I can look it >> over more carefully, and test it. > > Corrected your point, and another found during retest. > > Works as advertised, docs corrected. This comment needs updating: /* * To drop an index safely, we must grab exclusive lock on its parent * table. Exclusive lock on the index aloneis insufficient because * another backend might be about to execute a query on the parent table. * If it relieson a previously cached list of index OIDs, then it could * attempt to access the just-dropped index. We must thereforetake a * table lock strong enough to prevent all queries on the table from * proceeding until we commit andsend out a shared-cache-inval notice * that will make them update their index lists. */ Speaking of that comment, why does a concurrent index drop need any lock at all on the parent table? If, as the current comment text says, the point of locking the parent table is to make sure that no new backends can create relcache entries until our transaction commits, then it's not needed here, or at least not for that purpose. We're going to out-wait anyone who has the index open *after* we've committed our changes to the pg_index entry, so there shouldn't be a race condition there. There may very well be a reason why we need a lock on the parent, or there may not, but that's not it. On the flip side, it strikes me that there's considerable danger that ShareUpdateExclusiveLock on the index might not be strong enough. In particular, it would fall down if there's anyone who takes an AccessShareLock, RowShareLock, or RowExclusiveLock and then proceeds to access the index data despite its being marked !indisready and !indisvalid. There are certainly instances of this in contrib, such as in pgstatindex(), which I'm pretty sure will abort with a nastygram if somebody truncates away the file under it. That might not be enough reason to reject the patch, but I do think it would be the only place in the system where we allow something like that to happen. I'm not sure whether there are any similar risks in core - I am a bit worried about get_actual_variable_range() and gincostestimate(), but I can't tell for sure whether there is any actual problem there, or elsewhere. I wonder if we should only do the *first* phase of the drop with ShareUpdateExcusliveLock, and then after outwaiting everyone, take an AccessExclusiveLock on the index (but not the table) to do the rest of the work. If that doesn't allow the drop to proceed concurrently, then we go find the places that block against it and teach them not to lock/use/examine indexes that are !indisready. That way, the worst case is that D-I-C is less concurrent than we'd like, rather than making random other things fall over - specifically, anything that count on our traditional guarantee that AccessShareLock is enough to keep the file from disappearing. In the department of minor nitpicks, the syntax synopsis you've added looks wrong: DROP INDEX [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...] [ CASCADE | RESTRICT ] CONCURRENTLY <replaceable class="PARAMETER">name</replaceable> That implies that you may write if exists, followed by 1 or more names, followed by cascade or restrict. After that you must write CONCURRENTLY, followed by exactly one name. I think what you want is: DROP INDEX [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable> [, ...] [ CASCADE | RESTRICT ] DROP INDEX CONCURRENTLY <replaceable class="PARAMETER">name</replaceable> ...but I also wonder if we shouldn't support DROP INDEX [ IF EXISTS ] CONCURRENTLY. It could also be very useful to be able to concurrently drop multiple indexes at once, since that would require waiting out all concurrent transactions only once instead of once per drop. Either way, I think we should have only one syntax, and then reject combinations we can't handle in the code. So I think we should have the parser accept: DROP INDEX [ IF EXISTS ] [ CONCURRENTLY ] <replaceable class="PARAMETER">name</replaceable> [, ...] [ CASCADE | RESTRICT ] And then, if you try to use cascade, or try to drop multiple indexes at once, we should throw an error saying that those options aren't supported in combination with CONCURRENTLY, rather than rejecting them as a syntax error at parse time. That gives a better error message and will make it easier for anyone who wants to extend this in the future - plus it will allow things like DROP INDEX CONCURRENTLY bob RESTRICT, which ought to be legal even if CASCADE isn't supported. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: