Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?
Date
Msg-id 20210201190228.s2lxs5frjefbjqyf@alap3.anarazel.de
Whole thread Raw
In response to Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?
List pgsql-hackers
Hi,

Thanks for developing this.

On 2021-01-31 02:11:06 +1300, Thomas Munro wrote:
> --- a/src/backend/commands/tablespace.c
> +++ b/src/backend/commands/tablespace.c
> @@ -520,15 +520,23 @@ DropTableSpace(DropTableSpaceStmt *stmt)
>           * but we can't tell them apart from important data files that we
>           * mustn't delete.  So instead, we force a checkpoint which will clean
>           * out any lingering files, and try again.
> -         *
> -         * XXX On Windows, an unlinked file persists in the directory listing
> -         * until no process retains an open handle for the file.  The DDL
> -         * commands that schedule files for unlink send invalidation messages
> -         * directing other PostgreSQL processes to close the files.  DROP
> -         * TABLESPACE should not give up on the tablespace becoming empty
> -         * until all relevant invalidation processing is complete.
>           */
>          RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
> +        /*
> +         * On Windows, an unlinked file persists in the directory listing until
> +         * no process retains an open handle for the file.  The DDL
> +         * commands that schedule files for unlink send invalidation messages
> +         * directing other PostgreSQL processes to close the files, but nothing
> +         * guarantees they'll be processed in time.  So, we'll also use a
> +         * global barrier to ask all backends to close all files, and wait
> +         * until they're finished.
> +         */
> +#if defined(WIN32) || defined(USE_ASSERT_CHECKING)
> +        LWLockRelease(TablespaceCreateLock);
> +        WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
> +        LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);
> +#endif
> +        /* And now try again. */
>          if (!destroy_tablespace_directories(tablespaceoid, false))
>          {
>              /* Still not empty, the files must be important then */

It's probably rare enough to care, but this still made me thing whether
we could avoid the checkpoint at all somehow. Requiring an immediate
checkpoint for dropping relations is quite a heavy hammer that
practically cannot be used in production without causing performance
problems. But it seems hard to process the fsync deletion queue in
another way.


> diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
> index 4dc24649df..0f8548747c 100644
> --- a/src/backend/storage/smgr/smgr.c
> +++ b/src/backend/storage/smgr/smgr.c
> @@ -298,6 +298,12 @@ smgrcloseall(void)
>          smgrclose(reln);
>  }
>  
> +void
> +smgrrelease(void)
> +{
> +    mdrelease();
> +}

Probably should be something like
    for (i = 0; i < NSmgr; i++)
    {
        if (smgrsw[i].smgr_release)
            smgrsw[i].smgr_release();
    }

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: "Mead, Scott"
Date:
Subject: Running autovacuum dynamic update to cost_limit and delay
Next
From: "Euler Taveira"
Date:
Subject: Re: row filtering for logical replication