Re: Support for REINDEX CONCURRENTLY - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Support for REINDEX CONCURRENTLY
Date
Msg-id CAB7nPqRJq6UAjL3ORUW5cCN5+NYud-NoGnk_T5h7C2c_sJrMUg@mail.gmail.com
Whole thread Raw
In response to Re: Support for REINDEX CONCURRENTLY  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Support for REINDEX CONCURRENTLY  (Jim Nasby <jim@nasby.net>)
List pgsql-hackers


On Sat, Oct 6, 2012 at 8:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Sat, Oct 6, 2012 at 6:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> OK. That is a different approach and would limit strictly the amount of
> code necessary for the feature, but I feel that it breaks the nature of
> CONCURRENTLY which should run without any exclusive locks.

Hm?  The whole point is that the CONCURRENTLY commands don't require
exclusive locks.  Only the SWAP command would.
Yes, but my point is that it is more user-friendly to have such a functionality with a single command.
By having something without locks, you could use the concurrent APIs to perform a REINDEX automatically in autovacuum for example.
Also, the possibility to perform concurrent operations entirely without exclusive locks is not a problem limited to REINDEX, there would be for sure similar problems if CLUSTER CONCURRENTLY or ALTER TABLE CONCURRENTLY are wanted.
 

> Until now all the approaches investigated (switch of relfilenode, switch of
> index OID) need to have an exclusive lock because we try to maintain index
> OID as consistent. In the patch I submitted, the new index created has a
> different OID than the old index, and simply switches names. So after the
> REINDEX CONCURRENTLY the OID of index on the table is different, but seen
> from user the name is the same. Is it acceptable to consider that a reindex
> concurrently could change the OID of the index rebuild?

That is not going to work without ex-lock somewhere.  If you change the
index's OID then you will have to change pg_constraint and pg_depend
entries referencing it, and that creates race condition hazards for
other processes looking at those catalogs.  I'm not convinced that you
can even do a rename safely without ex-lock.  Basically, any DDL update
on an active index is going to be dangerous and probably impossible
without lock, IMO.
In the current version of the patch, at the beginning of process a new index is created. It is a twin of the index it has to replace, meaning that it copies the dependencies of old index and creates twin entries of the old index even in pg_depend and pg_constraint also if necessary. So the old index and the new index have exactly the same data in catalog, they are completely decoupled, and you do not need to worry about the OID replacements and the visibility consequences.
Knowing that both indexes are completely separate entities, isn't this enough to change the new index as the old one with a low-level lock? In the case of my patch only the names are simply exchanged and make the user unaware of what is happening in background. This behaves similarly to pg_reorg, explaining why the OIDs of tables reorganized are changed after being pg_reorg'ed.

To answer your question, I don't think anyone would object to the
index's OID changing if the operation were safe otherwise.  But I don't
think that allowing that gets us to a safe solution.
OK thanks.
--
Michael Paquier
http://michael.otacoo.com

pgsql-hackers by date:

Previous
From: John R Pierce
Date:
Subject: Re: Bad Data back Door
Next
From: Marc Balmer
Date:
Subject: Add the FET timetone abbreviation