Hi,
On 2023-02-22 17:01:47 -0800, Andres Freund wrote:
> One way to to defend against this would be to make mdextend(), whenever it
> extends into the last block of a segment, unlink the next segment - it can't
> be a validly existing contents. But it seems scary to just unlink entire
> segments.
Another way might be for XLOG_SMGR_TRUNCATE record, as well as smgr unlinks in
commit/abort records, to include not just the "target size", as we do today,
but to also include the current size.
I'm not sure that'd fix all potential issues, but it seems like it'd fix a lot
of the more obvious issues, because it'd prevent scenarios like a base backup
copying segment N, without copying N - 1, due to a concurrent truncate/drop,
from causing harm. Due to the range being included in the WAL record, replay
would know that N needs to be unlinked, even if smgrnblocks() thinks the
relation is much smaller.
Greetings,
Andres Freund