Re: In-placre persistance change of a relation - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: In-placre persistance change of a relation
Date
Msg-id 20230425.165457.193946345712188069.horikyota.ntt@gmail.com
In response to Re: In-placre persistance change of a relation  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: In-placre persistance change of a relation  (Jakub Wartak <jakub.wartak@enterprisedb.com>)
List pgsql-hackers
At Fri, 17 Mar 2023 15:16:34 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Mmm. It took longer than I said, but this is the patch set that
> includes all three parts.
> 
> 1. "Mark files" to prevent orphan storage files for in-transaction
>   created relations after a crash.
> 
> 2. In-place persistence change: For ALTER TABLE SET LOGGED/UNLOGGED
>   with wal_level minimal, and ALTER TABLE SET UNLOGGED with other
>   wal_levels, the commands don't require a file copy for the relation
>   storage. ALTER TABLE SET LOGGED with non-minimal wal_level emits
>   bulk FPIs instead of a bunch of individual INSERTs.
> 
> 3. An extension to ALTER TABLE SET (UN)LOGGED that can handle all
>   tables in a tablespace at once.
> 
> 
> As a side note, I quickly go over the behavior of the mark files
> introduced by the first patch, particularly what happens when deletion
> fails.
> 
> (1) The mark file for MAIN fork ("<oid>.u") corresponds to all forks,
>     while the mark file for INIT fork ("<oid>_init.u") corresponds to
>     INIT fork alone.
> 
> (2) The mark file is created just before the corresponding storage
>     file is made. This is always logged in the WAL.
> 
> (3) The mark file is deleted after removing the corresponding storage
>     file during the commit and rollback. This action is logged in the
>     WAL, too. If the deletion fails, an ERROR is output and the
>     transaction aborts.
> 
> (4) If a crash leaves a mark file behind, the server tries to delete
>     it after successfully removing the corresponding storage file
>     during the subsequent startup that runs recovery. If the deletion
>     fails, the server leaves the mark file in place and emits a
>     WARNING. (Non-mark files are handled the same way.)
> 
> (5) If the deletion of the mark file fails, the leftover mark file
>     prevents the creation of the corresponding storage file (causing
>     an ERROR).  The leftover mark files don't result in the removal of
>     the wrong files due to that behavior.
> 
> (6) The mark file for an INIT fork is created only when ALTER TABLE
>     SET UNLOGGED is executed (not for CREATE UNLOGGED TABLE) to signal
>     the crash-cleanup code to remove the INIT fork. (Otherwise the
>     cleanup code removes the main fork instead. This is the main
>     objective of introducing the mark files.)
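To make rules (1), (4) and (6) above concrete, here is a toy model of the crash-cleanup decision. It is only a sketch in Python, not code from the patch set: the function name and the regular expressions are hypothetical, and it assumes the usual "<oid>", "<oid>_fsm", "<oid>_vm", "<oid>_init" storage file names with optional ".N" segment suffixes. It only decides which storage files a leftover mark file condemns; deleting the mark file afterwards is left out.

```python
import re

# Hypothetical model, not PostgreSQL code. Mark files are assumed to be
# named "<oid>.u" (MAIN fork, covers all forks) or "<oid>_init.u"
# (INIT fork only), as described in the list above.
MARK_RE = re.compile(r"^(\d+)(_init)?\.u$")
# Storage files: "<oid>", "<oid>_fsm", "<oid>_vm", "<oid>_init",
# each optionally followed by a ".N" segment number.
STORAGE_RE = re.compile(r"^(\d+)(?:_(fsm|vm|init))?(?:\.\d+)?$")


def files_to_remove_after_crash(filenames):
    """Return the storage files condemned by leftover mark files."""
    all_forks = set()   # oids with a surviving MAIN mark: drop every fork
    init_only = set()   # oids with a surviving INIT mark: drop INIT fork only
    for name in filenames:
        m = MARK_RE.match(name)
        if m:
            (init_only if m.group(2) else all_forks).add(m.group(1))

    doomed = set()
    for name in filenames:
        m = STORAGE_RE.match(name)
        if not m:
            continue  # mark files and unrelated names are not storage files
        oid, fork = m.group(1), m.group(2)
        if oid in all_forks or (oid in init_only and fork == "init"):
            doomed.add(name)
    return doomed
```

For example, with "16384.u" left behind every file of relation 16384 is selected, while "16390_init.u" selects only "16390_init" and leaves the main fork of 16390 untouched, which is the distinction rule (6) is about.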

Rebased.

I fixed some code comments and commit messages, and corrected the
arrangement of some changes that had ended up in the wrong patches.
Most importantly, I fixed a bug caused by the wrong assumption that an
init fork never resides in shared buffers. Now smgrDoPendingCleanups
drops the buffers of an init fork that is about to be removed.

The new fourth patch is a temporary fix for recently added code; it
will soon no longer be needed.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
