Thread: REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

From

Michael Paquier

Date:

13 August 2020, 04:38:05

Hi all,

While working on support for REINDEX for partitioned relations, I have
noticed an old bug in the logic of ReindexMultipleTables(): the list
of relations to process is built in a first transaction, and then each
table is done in an independent transaction, but we don't actually
check that the relation still exists when doing the real work.  I
think that a complete fix involves two things:
1) Addition of one SearchSysCacheExists1() at the beginning of the
loop processing each item in the list in ReindexMultipleTables().
This would protect from a relation dropped, but that would not be
enough if ReindexMultipleTables() is looking at a relation dropped
before we lock it in reindex_table() or ReindexRelationConcurrently().
Still that's simple, cheaper, and would protect from most problems.
2) Be completely water-proof and adopt a logic close to what we do for
VACUUM where we try to open a relation, but leave if it does not
exist.  This can be achieved with just some try_relation_open() calls
with the correct lock used, and we also need to have a new
REINDEXOPT_* flag to control this behavior conditionally.

2) and 1) are complementary, but 2) is invasive, so based on the lack
of complaints we have seen that does not seem really worth
backpatching to me, and I think that we could also just have 1) on
stable branches as a minimal fix, to take care of most of the
problems that could show up to users.

Attached is a patch to fix all that, with a cheap isolation test that
fails on HEAD with a cache lookup failure.  I am adding that to the
next CF for now, and I would rather fix this issue before moving on
with REINDEX for partitioned relations as fixing this issue reduces
the use of session locks for partition trees.

Any thoughts?
--
Michael

Attachment

Re: REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

From

Anastasia Lubennikova

Date:

31 August 2020, 15:10:46

On 13.08.2020 07:38, Michael Paquier wrote:
> Hi all,
>
> While working on support for REINDEX for partitioned relations, I have
> noticed an old bug in the logic of ReindexMultipleTables(): the list
> of relations to process is built in a first transaction, and then each
> table is done in an independent transaction, but we don't actually
> check that the relation still exists when doing the real work.  I
> think that a complete fix involves two things:
> 1) Addition of one SearchSysCacheExists1() at the beginning of the
> loop processing each item in the list in ReindexMultipleTables().
> This would protect from a relation dropped, but that would not be
> enough if ReindexMultipleTables() is looking at a relation dropped
> before we lock it in reindex_table() or ReindexRelationConcurrently().
> Still that's simple, cheaper, and would protect from most problems.
> 2) Be completely water-proof and adopt a logic close to what we do for
> VACUUM where we try to open a relation, but leave if it does not
> exist.  This can be achieved with just some try_relation_open() calls
> with the correct lock used, and we also need to have a new
> REINDEXOPT_* flag to control this behavior conditionally.
>
> 2) and 1) are complementary, but 2) is invasive, so based on the lack
> of complaints we have seen that does not seem really worth
> backpatching to me, and I think that we could also just have 1) on
> stable branches as a minimal fix, to take care of most of the
> problems that could show up to users.
>
> Attached is a patch to fix all that, with a cheap isolation test that
> fails on HEAD with a cache lookup failure.  I am adding that to the
> next CF for now, and I would rather fix this issue before moving on
> with REINDEX for partitioned relations as fixing this issue reduces
> the use of session locks for partition trees.
>
> Any thoughts?
> --
> Michael

Hi,
I reviewed the patch. It does work and the code is clean and sane. It 
implements a logic that we already had in CLUSTER, so I think it was 
simply an oversight. Thank you for fixing this.

I noticed that REINDEXOPT_MISSING_OK can be passed to the TOAST table 
reindex. I think it would be better to reset the flag in this 
reindex_relation() call, as we don't expect a concurrent DROP here.

     /*
      * If the relation has a secondary toast rel, reindex that too while we
      * still hold the lock on the main table.
      */
     if ((flags & REINDEX_REL_PROCESS_TOAST) && OidIsValid(toast_relid))
         result |= reindex_relation(toast_relid, flags, options);


-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

From

Michael Paquier

Date:

01 September 2020, 01:38:44

On Mon, Aug 31, 2020 at 06:10:46PM +0300, Anastasia Lubennikova wrote:
> I reviewed the patch. It does work and the code is clean and sane. It
> implements a logic that we already had in CLUSTER, so I think it was simply
> an oversight. Thank you for fixing this.

Thanks Anastasia for the review.

> I noticed that REINDEXOPT_MISSING_OK can be passed to the TOAST table
> reindex. I think it would be better to reset the flag in this
> reindex_relation() call, as we don't expect a concurrent DROP here.

Good point, and fixed by resetting the flag here if it is set.

I have added some extra comments.  There is one in
ReindexRelationConcurrently() to mention that there should be no extra
use of MISSING_OK once the list of indexes is built as session locks
are taken where needed.

Does the version attached look fine to you?  I have done one round of
indentation while on it.
--
Michael

Attachment

Re: REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

From

Anastasia Lubennikova

Date:

01 September 2020, 10:25:27

On 01.09.2020 04:38, Michael Paquier wrote:
> I have added some extra comments.  There is one in
> ReindexRelationConcurrently() to mention that there should be no extra
> use of MISSING_OK once the list of indexes is built as session locks
> are taken where needed.
Great, it took me a moment to understand the logic around index list 
check at first pass.
> Does the version attached look fine to you?  I have done one round of
> indentation while on it.

Yes, this version is good.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: REINDEX SCHEMA/DATABASE/SYSTEM weak with dropped relations

From

Michael Paquier

Date:

02 September 2020, 00:20:13

On Tue, Sep 01, 2020 at 01:25:27PM +0300, Anastasia Lubennikova wrote:
> Yes, this version is good.

Thanks.  I have added an extra comment for the case of RELKIND_INDEX
with REINDEXOPT_MISSING_OK while on it, as it was not really obvious
why the parent relation needs to be locked (at least attempted to) at
this stage.  And applied the change.  Thanks for the review,
Anastasia.
--
Michael

Attachment

signature.asc