In the release team's discussion leading up to commit 0e758ae89,
Andres opined that what commit 4ab5dae94 had done to mdunlinkfork
was a mess, and I concur. It invented an entirely new code path
through that function, and required two different behaviors from the
segment-deletion loop. I think a very straight line can be drawn
between that extra complexity and the introduction of a nasty bug.
It's all unnecessary too, because AFAICS all we really need is to
apply the pre-existing behavior for temp tables and REDO mode
to binary-upgrade mode as well.
Hence, the attached reverts everything 4ab5dae94 did to this function,
and most of 0e758ae89 too, and instead makes IsBinaryUpgrade an
additional reason to take the immediate-unlink path.
Barring objections, I'll push this after the release freeze lifts.
regards, tom lane
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index b1a2cb575d..0f642df3c1 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -263,6 +263,12 @@ mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
* The fact that temp rels and regular rels have different file naming
* patterns provides additional safety.
*
+ * We also don't do it while performing a binary upgrade. There is no reuse
+ * hazard in that case, since after a crash or even a simple ERROR, the
+ * upgrade fails and the whole cluster must be recreated from scratch.
+ * Furthermore, it is important to remove the files from disk immediately,
+ * because we may be about to reuse the same relfilenumber.
+ *
* All the above applies only to the relation's main fork; other forks can
* just be removed immediately, since they are not needed to prevent the
* relfilenumber from being recycled. Also, we do not carefully
@@ -319,14 +325,15 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo)
{
char *path;
int ret;
- BlockNumber segno = 0;
path = relpath(rlocator, forknum);
/*
- * Delete or truncate the first segment.
+ * Truncate and then unlink the first segment, or just register a request
+ * to unlink it later, as described in the comments for mdunlink().
*/
- if (isRedo || forknum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
+ if (isRedo || IsBinaryUpgrade || forknum != MAIN_FORKNUM ||
+ RelFileLocatorBackendIsTemp(rlocator))
{
if (!RelFileLocatorBackendIsTemp(rlocator))
{
@@ -351,7 +358,6 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo)
ereport(WARNING,
(errcode_for_file_access(),
errmsg("could not remove file \"%s\": %m", path)));
- segno++;
}
}
else
@@ -359,42 +365,25 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo)
/* Prevent other backends' fds from holding on to the disk space */
ret = do_truncate(path);
- /*
- * Except during a binary upgrade, register request to unlink first
- * segment later, rather than now.
- *
- * If we're performing a binary upgrade, the dangers described in the
- * header comments for mdunlink() do not exist, since after a crash or
- * even a simple ERROR, the upgrade fails and the whole new cluster
- * must be recreated from scratch. And, on the other hand, it is
- * important to remove the files from disk immediately, because we may
- * be about to reuse the same relfilenumber.
- */
- if (!IsBinaryUpgrade)
- {
- register_unlink_segment(rlocator, forknum, 0 /* first seg */ );
- segno++;
- }
+ /* Register request to unlink first segment later */
+ register_unlink_segment(rlocator, forknum, 0 /* first seg */ );
}
/*
- * Delete any remaining segments (we might or might not have dealt with
- * the first one above).
+ * Delete any additional segments.
*/
if (ret >= 0)
{
char *segpath = (char *) palloc(strlen(path) + 12);
+ BlockNumber segno;
/*
* Note that because we loop until getting ENOENT, we will correctly
* remove all inactive segments as well as active ones.
*/
- for (;; segno++)
+ for (segno = 1;; segno++)
{
- if (segno == 0)
- strcpy(segpath, path);
- else
- sprintf(segpath, "%s.%u", path, segno);
+ sprintf(segpath, "%s.%u", path, segno);
if (!RelFileLocatorBackendIsTemp(rlocator))
{