DROP TABLESPACE waits for checkpoint with lwlock held - Mailing list pgsql-hackers

From Andres Freund
Subject DROP TABLESPACE waits for checkpoint with lwlock held
Date
Msg-id rhjhn2f3ckqdspvkmvbs6sgmbksq5fsy2ds4fzcxigbzxh75kx@6cv55inphzhv
Whole thread Raw
List pgsql-hackers
Hi,

For a while I thought there was a hard deadlock involving DROP TABLESPACE, due
to doing RequestCheckpoint() while holding an lwlock. That turned out to not
be the case, the fd.c LRU was corrupted, causing checkpointer to spin
endlessly in _dump_lru(), due to [1]. After putting HOLD/RESUME_INTERRUPTS in
all the smgr functions I cannot repro the hard deadlock.


But it nonetheless does seem like a rather bad idea to do a blocking
RequestCheckpoint() while holding an lwlock blocking interrupts processing. At
the very least it leaves you with a query that can't be interrupted for a long
time and for a long time global barriers aren't processed.


At first I thought this was introduced in

commit 4eb2176318d
Author: Thomas Munro <tmunro@postgresql.org>
Date:   2022-02-12 10:21:23 +1300

    Fix DROP {DATABASE,TABLESPACE} on Windows.

but that was just due to the code being moved, made more confusing by my git
show not showing that due to having git configured to use a different diff
algorithm...

Interestingly that commit *does* drop the lwlock around the
WaitForProcSignalBarrier() (which would otherwise trivially lead to
deadlocks).


Looks like instead that goes back to

commit 6cc4451b5c4
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   2007-11-15 20:36:40 +0000

    Prevent re-use of a deleted relation's relfilenode until after the next
    checkpoint.  This guards against an unlikely data-loss scenario in which
    we re-use the relfilenode, then crash, then replay the deletion and
    recreation of the file.  Even then we'd be OK if all insertions into the
    new relation had been WAL-logged ... but that's not guaranteed given all
    the no-WAL-logging optimizations that have recently been added.

    Patch by Heikki Linnakangas, per a discussion last month.


Somewhat relatedly, I think it'd be good to put an
    Assert(InterruptHoldoffCount == 0);
or
        Assert(INTERRUPTS_CAN_BE_PROCESSED());

in WaitForProcSignalBarrier() and something like
    Assert(!(flags & CHECKPOINT_WAIT) || INTERRUPTS_CAN_BE_PROCESSED());
in RequestCheckpoint().

Greetings,

Andres Freund

[1] https://postgr.es/m/3vae7l5ozvqtxmd7rr7zaeq3qkuipz365u3rtim5t5wdkr6f4g%40vkgf2fogjirl



pgsql-hackers by date:

Previous
From: Sami Imseih
Date:
Subject: Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET
Next
From: Florents Tselai
Date:
Subject: like pg_shmem_allocations, but fine-grained for DSM registry ?