Thread: v12.0: interrupt reindex CONCURRENTLY: ccold: ERROR: could not find tuple for parent of relation ...

On a badly-overloaded VM, we hit the previously-reported segfault in
progress reporting.  This left behind some *_ccold indexes.  I tried to
drop them, but:

sentinel=# DROP INDEX child.alarms_null_alarm_id_idx1_ccold;
-- also child.alarms_null_alarm_time_idx_ccold, alarms_null_alarm_id_idx_ccold
ERROR:  could not find tuple for parent of relation 41351896

Those are children of a relkind=I index on a relkind=p table (i.e. a
partitioned index on a partitioned table).
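
For reference, the relkind and partition flag of the objects involved can be
checked directly in pg_class, for example (using the object names from the
reproduction below rather than the production ones):

SELECT oid::regclass, relkind, relispartition
FROM pg_class
WHERE relname IN ('t', 't1', 't_i_idx', 't1_i_idx');
-- relkind is 'p' for the partitioned table, 'I' for the partitioned index,
-- and 'r'/'i' for the leaf partition and its index.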

postgres=# CREATE TABLE t (i int) PARTITION BY RANGE (i);
postgres=# CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (1) TO (100);
postgres=# INSERT INTO t1 SELECT 1 FROM generate_series(1,99999);
postgres=# CREATE INDEX ON t(i);

postgres=# begin; SELECT * FROM t; -- DO THIS IN ANOTHER SESSION

postgres=# REINDEX INDEX CONCURRENTLY t1_i_idx; -- cancel this one
^CCancel request sent
ERROR:  canceling statement due to user request

postgres=# \d t1
...
    "t1_i_idx" btree (i)
    "t1_i_idx_ccold" btree (i) INVALID

postgres=# SELECT inhrelid::regclass FROM pg_inherits WHERE inhparent='t_i_idx'::regclass;
 inhrelid
----------
 t1_i_idx
(1 row)
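
The catalog flags of the leftover index can also be looked at directly, for
instance:

SELECT c.relname, c.relispartition, i.indisvalid, i.indisready, i.indislive
FROM pg_class c
JOIN pg_index i ON i.indexrelid = c.oid
WHERE c.relname IN ('t1_i_idx', 't1_i_idx_ccold');
-- t1_i_idx_ccold comes back with indisvalid = false, matching the INVALID
-- marker shown by \d above.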

Not only can I not DROP the _ccold indexes, but dropping the table
doesn't cause them to be dropped either, and after that I can't even
\d them anymore:

jtp=# DROP INDEX t1_i_idx_ccold;
ERROR:  could not find tuple for parent of relation 290818869

jtp=# DROP TABLE t; -- does not fail, but ..

jtp=# \d t1_i_idx_ccold
ERROR:  cache lookup failed for relation 290818865

jtp=# SELECT indrelid::regclass, * FROM pg_index WHERE indexrelid='t1_i_idx_ccold'::regclass;
indrelid       | 290818865
indexrelid     | 290818869
indrelid       | 290818865
[...]

Justin



On Tue, Oct 15, 2019 at 11:40:47AM -0500, Justin Pryzby wrote:
> Not only can I not DROP the _ccold indexes, but dropping the table
> doesn't cause them to be dropped either, and after that I can't even
> \d them anymore:

Yes, I can confirm the report.  In this scenario the reindex is
waiting for the first transaction to finish before step 5, and the
cancellation prevents the follow-up steps from being done (set_dead
and the next ones).  So at this stage the swap has actually happened.
I am still analyzing the report in depth, but you don't have any
problem with a plain index when interrupting at this stage, and the
old index can be cleanly dropped with the new one present, so my
first thought is that we are just missing some more dependency
cleanup at the swap phase when dealing with a partition index.
--
Michael

Attachment
On Thu, Oct 24, 2019 at 01:59:29PM +0900, Michael Paquier wrote:
> Yes, I can confirm the report.  In this scenario the reindex is
> waiting for the first transaction to finish before step 5, and the
> cancellation prevents the follow-up steps from being done (set_dead
> and the next ones).  So at this stage the swap has actually happened.
> I am still analyzing the report in depth, but you don't have any
> problem with a plain index when interrupting at this stage, and the
> old index can be cleanly dropped with the new one present, so my
> first thought is that we are just missing some more dependency
> cleanup at the swap phase when dealing with a partition index.

Okay, I have found this one.  The issue is that at the swap phase
pg_class.relispartition of the new index is updated to use the value
of the old index (true for a partition index); however, relispartition
also needs to be updated for the old index, or we get failures when
trying to interact with it because the old index is no longer part of
any inheritance tree.  We could just use false, as the index created
concurrently is not attached to a partition and its inheritance links
are not updated until the swap phase, but it feels more natural to
just swap relispartition between the old and the new index, as per
the attached.
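
With the reproduction from upthread (before dropping the table), the
mismatch is visible with a query along these lines:

SELECT c.oid::regclass AS idx, c.relispartition,
       EXISTS (SELECT 1 FROM pg_inherits i
               WHERE i.inhrelid = c.oid) AS has_pg_inherits_row
FROM pg_class c
WHERE c.relname IN ('t1_i_idx', 't1_i_idx_ccold');
-- t1_i_idx_ccold keeps relispartition = true but has no pg_inherits entry,
-- which is what the failing parent lookup stumbles over.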

This also brings up the point that you could just update pg_class by
hand to fix things if you have a broken cluster.
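
If you are stuck with such a broken cluster, the manual repair would look
roughly like this (a direct catalog update, so double-check the index name
and take a backup first; the name below is from the reproduction upthread):

UPDATE pg_class SET relispartition = false
WHERE oid = 't1_i_idx_ccold'::regclass;
DROP INDEX t1_i_idx_ccold;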

In short, the attached fixes the issue for me, and that's the last bug
I know of in what has been reported.
--
Michael

Attachment
On Mon, Oct 28, 2019 at 04:14:41PM +0900, Michael Paquier wrote:
> This also brings up the point that you could just update pg_class by
> hand to fix things if you have a broken cluster.
>
> In short, the attached fixes the issue for me, and that's the last bug
> I know of in what has been reported.

This one is now done.  Justin has also confirmed to me offline that it
fixed his problems.
--
Michael

Attachment