Re: [HACKERS] REINDEX CONCURRENTLY 2.0 - Mailing list pgsql-hackers

From Andreas Karlsson
Subject Re: [HACKERS] REINDEX CONCURRENTLY 2.0
Date
Msg-id 025c1e3d-1586-c678-922c-c01e12bb013e@proxel.se
In response to Re: [HACKERS] REINDEX CONCURRENTLY 2.0  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] REINDEX CONCURRENTLY 2.0  (Jim Nasby <jim.nasby@openscg.com>)
List pgsql-hackers
On 03/08/2017 03:48 AM, Robert Haas wrote:
> On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andreas@proxel.se> wrote:
>> And I would argue that this feature is useful for quite many, based on my
>> experience running a semi-large database. Index bloat happens and without
>> REINDEX CONCURRENTLY it can be really annoying to solve, especially for
>> primary keys. Certainly more people have problems with index bloat than the
>> number of people who store index oids in their database.
>
> Yeah, but that's not the only wart, I think.

The only two potential issues I see with the patch are:

1) That the index oid changes visibly to external users.

2) That the code for moving the dependencies will need to be updated 
when adding new things which refer to an index oid.
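
To make point 1 concrete, a client that records an index oid could 
compare the value before and after a rebuild. This is only a minimal 
sketch: the index name ind is a placeholder, and the expectation that 
the value differs afterwards assumes the rebuild swaps in a newly 
created index, as this patch does.

```sql
-- Look up the oid currently assigned to the index named "ind"
-- ("ind" is a hypothetical name used only for illustration).
-- Run once before and once after the rebuild; if the rebuild replaces
-- the index with a newly created one, the two values will differ.
SELECT oid
FROM pg_class
WHERE relname = 'ind'
  AND relkind = 'i';
```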

Given how useful I find REINDEX CONCURRENTLY, I think these warts are 
worth it, since their impact is quite limited. I am of course biased: 
if I did not believe this, I would not be pursuing this solution in the 
first place.

> For example, I believe
> (haven't looked at this patch series in a while) that the patch takes
> a lock and later escalates the lock level.  If so, that could lead to
> doing a lot of work to build the index and then getting killed by the
> deadlock detector.

This version of the patch no longer does that. For my use case, 
escalating the lock would make this patch much less interesting. The 
highest lock level taken is the same as the initial one (SHARE 
UPDATE EXCLUSIVE). At a high level (and very simplified), the current 
patch does this:

1. CREATE INDEX CONCURRENTLY ind_new;
2. Atomically move all dependencies from ind to ind_new, rename ind to 
ind_old, and rename ind_new to ind.
3. DROP INDEX CONCURRENTLY ind_old;

The actual implementation is a bit more complicated, but no part 
escalates the lock level beyond what the steps for creating and 
dropping indexes concurrently already require.
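
For comparison, the three steps above roughly correspond to the manual 
workaround that can be run by hand today. This is a sketch, not what 
the patch executes internally: the table name tbl and column col are 
hypothetical, and the dependency move in step 2 has no SQL-level 
equivalent, so only the renames are shown. The locks taken by the 
manual renames may also differ from what the patch takes.

```sql
-- Step 1: build a replacement index without blocking writes.
-- (tbl and col are placeholder names; CREATE INDEX CONCURRENTLY
-- cannot be run inside a transaction block.)
CREATE INDEX CONCURRENTLY ind_new ON tbl (col);

-- Step 2: swap the names in a single transaction.
-- Unlike the patch, this does NOT move dependencies (for example
-- constraints) from ind to ind_new.
BEGIN;
ALTER INDEX ind RENAME TO ind_old;
ALTER INDEX ind_new RENAME TO ind;
COMMIT;

-- Step 3: drop the old index, again without blocking writes.
DROP INDEX CONCURRENTLY ind_old;
```

Because the manual version cannot move dependencies, it does not work 
for an index that backs a primary key, which is one of the cases 
REINDEX CONCURRENTLY is meant to handle.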

> Also, if by any chance you think (or use any
> software that thinks) that OIDs for system objects are a stable
> identifier, this will be the first case where that ceases to be true.
> If the system is shut down or crashes or the session is killed, you'll
> be left with stray objects with names that you've never typed into the
> system.  I'm sure you're going to say "don't worry, none of that is
> any big deal" and maybe you're right.

Hm, based on my personal experience with PostgreSQL I cannot think of 
any real-life scenario where this would be an issue, but if you can 
think of one, please provide it. I will ponder this some more myself.

Andreas


