Re: replay of CREATE TABLESPACE eats data at wal_level=minimal - Mailing list pgsql-hackers

From Noah Misch
Subject Re: replay of CREATE TABLESPACE eats data at wal_level=minimal
Date
Msg-id 20210825052140.GB1916849@rfd.leadboat.com
Whole thread Raw
In response to Re: replay of CREATE TABLESPACE eats data at wal_level=minimal  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: replay of CREATE TABLESPACE eats data at wal_level=minimal
List pgsql-hackers
On Mon, Aug 23, 2021 at 09:08:44AM -0400, Robert Haas wrote:
> On Sun, Aug 22, 2021 at 6:59 PM Noah Misch <noah@leadboat.com> wrote:
> > Here's what I plan to push.  Besides adding a test, I modified things so
> > CREATE TABLESPACE redo continues to report an error if a non-directory exists
> > under the name we seek to create.  One could argue against covering that
> > corner case, but TablespaceCreateDbspace() does cover it.
> 
> By and large, LGTM, though perhaps it would be appropriate to also
> credit me as the reporter of the issue.
> 
> I feel it might be slightly better to highlight somewhere, either in
> the commit message or in the comments, that removing the old directory
> is unsafe, because if wal_level=minimal, we may have no other copy of
> the data. For me that's the key point here. I feel that the commit
> message and comments inside the patch explain rather thoroughly the
> possible consequences of the bug and why this particular fix was
> chosen, but they're not real explicit about why there was a bug at
> all.

Sounds good.  I think the log message is the optimal place:

===
Fix data loss in wal_level=minimal crash recovery of CREATE TABLESPACE.

If the system crashed between CREATE TABLESPACE and the next checkpoint,
the result could be some files in the tablespace unexpectedly containing
no rows.  Affected files would be those for which the system did not
write WAL; see the wal_skip_threshold documentation.  Before v13, a
different set of conditions governed the writing of WAL; see v12's
<sect2 id="populate-pitr">.  (The v12 conditions were broader in some
ways and narrower in others.)  Users may want to audit non-default
tablespaces for unexpected short files.  The bug could have truncated an
index without affecting the associated table, and reindexing the index
would fix that particular problem.

This fixes the bug by making create_tablespace_directories() more like
TablespaceCreateDbspace().  create_tablespace_directories() was
recursively removing tablespace contents, reasoning that WAL redo would
recreate everything removed that way.  That assumption holds for other
wal_level values.  Under wal_level=minimal, the old approach could
delete files for which no other copy existed.  Back-patch to 9.6 (all
supported versions).

Reviewed by Robert Haas and Prabhat Sahu.  Reported by Robert Haas.

Discussion: https://postgr.es/m/CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com
===



pgsql-hackers by date:

Previous
From: "wangsh.fnst@fujitsu.com"
Date:
Subject: RE: ECPG bug fix: DECALRE STATEMENT and DEALLOCATE, DESCRIBE
Next
From: Amit Kapila
Date:
Subject: Re: row filtering for logical replication