Thread: BUG #17578: undetected (on one side) deadlock with reindex CONCURRENTLY partitioned index vs drop index

The following bug has been logged on the website:

Bug reference:      17578
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 14.4
Operating system:   Linux
Description:

Hi,

This is of course a very synthetic case, but it might be a symptom of
issues with deadlock detection in some cases.
My expectation was that the session with the longer deadlock timeout should
always win, and the session with the shorter deadlock timeout should lose the
locking race (and error out with a deadlock message).

However, there is a case where deadlock detection doesn't work that way in
SESSION TWO:

--prepare data
create table test (id integer) partition by range(id);
create table test_part_1000000 partition of test for values from (0) to (1000000);
insert into test_part_1000000 select (random()*999999)::integer from generate_series(1, 10000000);
create index test_id_key on only test(id);
create index CONCURRENTLY test_part_1000000_id_key on test_part_1000000(id);
alter index test_id_key attach partition test_part_1000000_id_key;

SESSION ONE
set deadlock_timeout to '1000s';
SET
reindex index CONCURRENTLY test_id_key;


SESSION TWO 
show deadlock_timeout ;
 deadlock_timeout 
------------------
 1s
(run while REINDEX CONCURRENTLY is running in session 1)
drop index test_part_1000000_id_key_ccnew;


expected behavior:
SESSION TWO detects within one second that it is in a deadlock and errors out
actual behavior:
SESSION TWO waits until SESSION ONE reaches its 1000s deadlock timeout, and
SESSION ONE errors out with a deadlock message
(
session one:
reindex index CONCURRENTLY test_id_key;
ERROR:  deadlock detected
DETAIL:  Process 266093 waits for ShareLock on virtual transaction 4/22381; blocked by process 266107.
Process 266107 waits for AccessExclusiveLock on relation 40986 of database 24579; blocked by process 266093.
HINT:  See server log for query details.

session two:
drop index test_part_1000000_id_key_ccnew;
DROP INDEX
Time: 1005052.210 ms (16:45.052)
)

PS: there seem to be more cases of odd behavior between different
concurrent operations on partitioned indexes
(this is the easiest one to reproduce).


On Fri, Aug 05, 2022 at 09:16:57PM +0000, PG Bug reporting form wrote:
> My expectation was that the session with the longer deadlock timeout should
> always win, and the session with the shorter deadlock timeout should lose the
> locking race (and error out with a deadlock message).

That holds if all edges of the deadlock are in lock waits by the time of the
shorter deadlock timeout.  Otherwise, since each session runs the deadlock
detector only once, the longer timeout will be the one to error out.
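
One way to check which edges of the cycle exist at a given moment is to look
at the lock waits from a third session. The query below is only a sketch,
assuming PostgreSQL 10 or later (for the backend_type column) and that
SESSION ONE and SESSION TWO are the only other client backends of interest:

```sql
-- Run in a third session while SESSION ONE and SESSION TWO are active.
-- pg_blocking_pids() lists the backends holding or awaiting locks
-- that block the given backend's heavyweight lock request.
select a.pid,
       a.wait_event_type,
       a.wait_event,
       pg_blocking_pids(a.pid) as blocked_by,
       left(a.query, 60)       as query
from pg_stat_activity a
where a.backend_type = 'client backend'
  and a.pid <> pg_backend_pid();

-- Lock requests that have not been granted yet, one row per waiting request.
select pid, locktype, mode, relation::regclass as relation, virtualxid
from pg_locks
where not granted;
```

If this is run about one second after the DROP INDEX starts, one would expect
only SESSION TWO to show a non-empty blocked_by, because SESSION ONE is still
building the index and is not yet waiting on anything. The second edge should
appear only later, once SESSION ONE starts waiting for the ShareLock on
SESSION TWO's virtual transaction.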