Thread: Duplicate entries in pg_depend after REINDEX CONCURRENTLY
Hi all, While digging into a separate issue, I have found a new bug with REINDEX CONCURRENTLY. Once the new index is built and validated, a couple of things are done at the swap phase, like switching constraints, comments, and dependencies. The current code moves all the dependency entries of pg_depend from the old index to the new index, but it never counted on the fact that the new index may have some entries already. So, once the swapping is done, pg_depend finishes with duplicated entries: the ones coming from the old index and the ones of the index freshly-created. For example, take an index which uses an attribute or an expression and has dependencies with the parent's columns. Attached is a patch to fix the issue. As we know that the old index will have a definition and dependencies that match with the old one, I think that we should just remove any dependency records on the new index before moving the new set of dependencies from the old to the new index. The patch includes regression tests that scan pg_depend to check that everything remains consistent after REINDEX CONCURRENTLY. Any thoughts? -- Michael
Attachment
On Fri, Oct 25, 2019 at 03:43:18PM +0900, Michael Paquier wrote: > Attached is a patch to fix the issue. As we know that the old index > will have a definition and dependencies that match with the old one, I > think that we should just remove any dependency records on the new > index before moving the new set of dependencies from the old to the > new index. The patch includes regression tests that scan pg_depend to > check that everything remains consistent after REINDEX CONCURRENTLY. > > Any thoughts? I have done more tests for this one through the day, and committed the patch. There is still one bug pending related to partitioned indexes where REINDEX CONCURRENTLY is cancelled after phase 4 (swap) has committed. I am still looking more into that. -- Michael
Attachment
On Mon, Oct 28, 2019 at 03:01:31PM +0900, Michael Paquier wrote: > On Fri, Oct 25, 2019 at 03:43:18PM +0900, Michael Paquier wrote: > > Attached is a patch to fix the issue. As we know that the old index > > will have a definition and dependencies that match with the old one, I > > think that we should just remove any dependency records on the new > > index before moving the new set of dependencies from the old to the > > new index. The patch includes regression tests that scan pg_depend to > > check that everything remains consistent after REINDEX CONCURRENTLY. > > > > Any thoughts? > > I have done more tests for this one through the day, and committed the > patch. There is still one bug pending related to partitioned indexes > where REINDEX CONCURRENTLY is cancelled after phase 4 (swap) has > committed. I am still looking more into that. Are there any bad effects of this bug on PG 12? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Tue, Nov 05, 2019 at 06:26:56PM -0500, Bruce Momjian wrote: > Are there any bad effects of this bug on PG 12? Not that I could guess, except a bloat of pg_depend... The more you issue REINDEX CONCURRENTLY on an index, the more duplicated entries accumulate in pg_depend as the dependencies of the old index are passed to the new one, say: =# create table aa (a int); CREATE TABLE =# create index aai on aa(a); CREATE INDEX =# select count(pg_describe_object(classid, objid, objsubid)) from pg_depend where classid = 'pg_class'::regclass AND objid in ('aai'::regclass); count ------- 1 (1 row) =# reindex index concurrently aai; REINDEX =# reindex index concurrently aai; REINDEX =# select count(pg_describe_object(classid, objid, objsubid)) from pg_depend where classid = 'pg_class'::regclass AND objid in ('aai'::regclass); count ------- 3 (1 row) After that, if for example one drops a column the rebuilt index depends on or just drops the index, then all the duplicated entries get removed as well with the index. Note that we have also cases where it is legit to have multiple entries in pg_depend. For example take the case of one index which lists two times the same column. -- Michael