Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Date
Msg-id 30579.1556639470@sss.pgh.pa.us
Whole thread Raw
In response to Re: REINDEX INDEX results in a crash for an index of pg_class since9.6  (Andres Freund <andres@anarazel.de>)
Responses Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Re: REINDEX INDEX results in a crash for an index of pg_class since9.6
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2019-04-30 00:50:20 -0400, Tom Lane wrote:
>> Hm?  REINDEX INDEX is deadlock-prone by definition, because it starts
>> by opening/locking the index and then it has to open/lock the index's
>> table.  Every other operation locks tables before their indexes.

> We claim to have solved that:

Oh, okay ...

> I suspect the problem isn't REINDEX INDEX in general, it's REINDEX INDEX
> over catalog tables modified during reindex.

So far, every one of the failures in the buildfarm looks like the REINDEX
is deciding that it needs to wait for some other transaction, eg here

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2019-04-30%2014%3A43%3A11

the relevant bit of postmaster log is

2019-04-30 14:44:13.478 UTC [16135:450] pg_regress/create_index LOG:  statement: REINDEX TABLE pg_class;
2019-04-30 14:44:14.478 UTC [16137:430] pg_regress/create_view LOG:  process 16137 detected deadlock while waiting for
AccessShareLockon relation 2662 of database 16384 after 1000.148 ms 
2019-04-30 14:44:14.478 UTC [16137:431] pg_regress/create_view DETAIL:  Process holding the lock: 16135. Wait queue: .
2019-04-30 14:44:14.478 UTC [16137:432] pg_regress/create_view STATEMENT:  DROP SCHEMA temp_view_test CASCADE;
2019-04-30 14:44:14.478 UTC [16137:433] pg_regress/create_view ERROR:  deadlock detected
2019-04-30 14:44:14.478 UTC [16137:434] pg_regress/create_view DETAIL:  Process 16137 waits for AccessShareLock on
relation2662 of database 16384; blocked by process 16135. 
    Process 16135 waits for ShareLock on transaction 2875; blocked by process 16137.
    Process 16137: DROP SCHEMA temp_view_test CASCADE;
    Process 16135: REINDEX TABLE pg_class;
2019-04-30 14:44:14.478 UTC [16137:435] pg_regress/create_view HINT:  See server log for query details.
2019-04-30 14:44:14.478 UTC [16137:436] pg_regress/create_view STATEMENT:  DROP SCHEMA temp_view_test CASCADE;

I haven't been able to reproduce this locally yet, but my guess is that
the REINDEX wants to update some row that was already updated by the
concurrent transaction, so it has to wait to see if the latter commits
or not.  And, of course, waiting while holding AccessExclusiveLock on
any index of pg_class is a Bad Idea (TM).  But I can't quite see why
we'd be doing something like that during the reindex ...

            regards, tom lane



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Heap lock levels for REINDEX INDEX CONCURRENTLY not quite right?
Next
From: Teodor Sigaev
Date:
Subject: Re: Failure in contrib test _int on loach