On Thu, Oct 24, 2019 at 01:59:29PM +0900, Michael Paquier wrote:
> Yes, I can confirm the report. In the case of this scenario the
> reindex is waiting for the first transaction to finish before step 5,
> the cancellation causing the follow-up process to not be done
> (set_dead & the next ones). So at this stage the swap has actually
> happened. I am still analyzing the report in depths, but you don't
> have any problems with a plain index when interrupting at this stage,
> and the old index can be cleanly dropped with the new one present, so
> my first thoughts are that we are just missing some more dependency
> cleanup at the swap phase when dealing with a partition index.

Okay, I have found this one. The issue is that at the swap phase
pg_class.relispartition of the new index is updated to use the value
of the old index (true for a partition index), but relispartition of
the old index is left untouched. It needs to be updated as well,
because after the swap the old index is part of no inheritance tree
and attempts to interact with it fail. We could just set it to false
for the old index, as the index created concurrently is not attached
to a partition (its inheritance links are not updated) until the swap
phase, but it feels more natural to just swap relispartition between
the old and the new index, as per the attached.
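
As a rough illustration of the difference, here is a toy model in
plain C. It is not the attached patch and does not use any backend
code: the struct ToyIndexEntry and both functions are hypothetical
stand-ins for the relispartition field of the two pg_class rows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the pg_class row of an index. */
typedef struct ToyIndexEntry
{
    const char *relname;
    bool        relispartition;
} ToyIndexEntry;

/*
 * Behavior described as buggy above: only the new index takes over the
 * flag of the old one; the old index keeps its previous value.
 */
void
toy_swap_one_sided(ToyIndexEntry *old_idx, ToyIndexEntry *new_idx)
{
    assert(old_idx != NULL && new_idx != NULL);
    new_idx->relispartition = old_idx->relispartition;
}

/*
 * Proposed behavior: exchange the flag between both entries, so the
 * old index ends up with the value the new one had before the swap.
 */
void
toy_swap_both(ToyIndexEntry *old_idx, ToyIndexEntry *new_idx)
{
    bool        tmp;

    assert(old_idx != NULL && new_idx != NULL);
    tmp = old_idx->relispartition;
    old_idx->relispartition = new_idx->relispartition;
    new_idx->relispartition = tmp;
}
```

With an old entry starting at true and a new entry starting at false,
the one-sided version leaves both at true, while the two-sided
version leaves the old entry at false and the new one at true.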

Since the problem is only the stale relispartition value, you could
also just update pg_class to fix things if you have a broken cluster.

In short, the attached fixes the issue for me, and that's the last bug
I know of among those reported.
--
Michael