Thread: Duplicate entries in pg_depend after REINDEX CONCURRENTLY

Duplicate entries in pg_depend after REINDEX CONCURRENTLY

From
Michael Paquier
Date:
Hi all,

While digging into a separate issue, I have found a new bug with
REINDEX CONCURRENTLY.  Once the new index is built and validated,
a couple of things are done at the swap phase, like switching
constraints, comments, and dependencies.  The current code moves all
the dependency entries of pg_depend from the old index to the new
index, but it never counted on the fact that the new index may have
some entries already.  So, once the swapping is done, pg_depend
finishes with duplicated entries: the ones coming from the old index
and the ones of the index freshly-created.  For example, take an index
which uses an attribute or an expression and has dependencies with the
parent's columns.

Attached is a patch to fix the issue.  As we know that the old index
will have a definition and dependencies that match with the old one, I
think that we should just remove any dependency records on the new
index before moving the new set of dependencies from the old to the
new index.  The patch includes regression tests that scan pg_depend to
check that everything remains consistent after REINDEX CONCURRENTLY.

Any thoughts?
--
Michael

Attachment

Re: Duplicate entries in pg_depend after REINDEX CONCURRENTLY

From
Michael Paquier
Date:
On Fri, Oct 25, 2019 at 03:43:18PM +0900, Michael Paquier wrote:
> Attached is a patch to fix the issue.  As we know that the old index
> will have a definition and dependencies that match with the old one, I
> think that we should just remove any dependency records on the new
> index before moving the new set of dependencies from the old to the
> new index.  The patch includes regression tests that scan pg_depend to
> check that everything remains consistent after REINDEX CONCURRENTLY.
>
> Any thoughts?

I have done more tests for this one through the day, and committed the
patch.  There is still one bug pending related to partitioned indexes
where REINDEX CONCURRENTLY is cancelled after phase 4 (swap) has
committed.  I am still looking more into that.
--
Michael

Attachment

Re: Duplicate entries in pg_depend after REINDEX CONCURRENTLY

From
Bruce Momjian
Date:
On Mon, Oct 28, 2019 at 03:01:31PM +0900, Michael Paquier wrote:
> On Fri, Oct 25, 2019 at 03:43:18PM +0900, Michael Paquier wrote:
> > Attached is a patch to fix the issue.  As we know that the old index
> > will have a definition and dependencies that match with the old one, I
> > think that we should just remove any dependency records on the new
> > index before moving the new set of dependencies from the old to the
> > new index.  The patch includes regression tests that scan pg_depend to
> > check that everything remains consistent after REINDEX CONCURRENTLY.
> > 
> > Any thoughts?
> 
> I have done more tests for this one through the day, and committed the
> patch.  There is still one bug pending related to partitioned indexes
> where REINDEX CONCURRENTLY is cancelled after phase 4 (swap) has
> committed.  I am still looking more into that.

Are there any bad effects of this bug on PG 12?

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



Re: Duplicate entries in pg_depend after REINDEX CONCURRENTLY

From
Michael Paquier
Date:
On Tue, Nov 05, 2019 at 06:26:56PM -0500, Bruce Momjian wrote:
> Are there any bad effects of this bug on PG 12?

Not that I could guess, except a bloat of pg_depend...  The more you
issue REINDEX CONCURRENTLY on an index, the more duplicated entries
accumulate in pg_depend as the dependencies of the old index are
passed to the new one, say:
=# create table aa (a int);
CREATE TABLE
=# create index aai on aa(a);
CREATE INDEX
=# select count(pg_describe_object(classid, objid, objsubid))
   from pg_depend
   where classid = 'pg_class'::regclass AND
         objid in ('aai'::regclass);
 count
-------
     1
(1 row)
=# reindex index concurrently aai;
REINDEX
=# reindex index concurrently aai;
REINDEX
=# select count(pg_describe_object(classid, objid, objsubid))
   from pg_depend
   where classid = 'pg_class'::regclass AND
         objid in ('aai'::regclass);
 count
-------
     3
(1 row)

After that, if for example one drops a column the rebuilt index
depends on or just drops the index, then all the duplicated entries
get removed as well with the index.  Note that we have also cases
where it is legit to have multiple entries in pg_depend.  For example
take the case of one index which lists two times the same column.
--
Michael

Attachment