Re: Wrong usage of RelationNeedsWAL - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Wrong usage of RelationNeedsWAL
Date
Msg-id 20210120.173444.1427112689198751884.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Wrong usage of RelationNeedsWAL  (Noah Misch <noah@leadboat.com>)
Responses Re: Wrong usage of RelationNeedsWAL  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
At Tue, 19 Jan 2021 01:31:52 -0800, Noah Misch <noah@leadboat.com> wrote in 
> On Tue, Jan 19, 2021 at 01:48:31PM +0900, Kyotaro Horiguchi wrote:
> > I understand that you are suggesting that at least
> > TransactionIdLimitedForOldSnapshots should follow not only relation
> > persistence but RelationNeedsWAL, based on the discussion on the
> > check on LSN of TestForOldSnapshot().
> > 
> > I still don't think that LSN in the WAL-skipping-created relfilenode
> > harm.  ALTER TABLE SET TABLESPACE always makes a dead copy of every
> > block (except checksum) including page LSN regardless of wal_level. In
> > the scenario above, the last select at H correctly reads page LSN set
> > by E then copied by G, which is larger than the snapshot LSN at B. So
> > doesn't go the fast-return path before actual check performed by
> > RelationAllowsEarlyPruning.
> 
> I agree the above procedure doesn't have a problem under your patch versions.
> See https://postgr.es/m/20210116043816.GA1644261@rfd.leadboat.com for a
> different test case.  In more detail:
> 
A> setup: CREATE TABLE t AS SELECT ...;
B> xact1: BEGIN; DELETE FROM t;
C> xact2: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT 1;  -- take snapshot
D> xact1: COMMIT;
> (plenty of time passes, rows of t are now eligible for early pruning)
E> xact3: BEGIN;
F> xact3: ALTER TABLE t SET TABLESPACE something;  -- start skipping WAL
G> xact3: SELECT count(*) FROM t;  -- early pruning w/o WAL, w/o LSN changes

H> xact3: COMMIT;
J> xact2: SELECT count(*) FROM t;  -- no error, wanted "snapshot too old"
> 
> I expect that shows no bug for git master, but I expect it does show a bug

I didn't see "snapshot too old" at J with the patch, but at the same
time the SELECT at G didn't prune tuples almost all of the trial. (the
last SELECT returns the correct answer.) I saw it get pruned only once
among many trials but I'm not sure how I could get to the situation
and haven't succeed to reproduce that.

The reason that the SELECT at G doesn't prune is that limited_xmin in
heap_page_prune_opt at G is limited up to the oldest xid among active
sessions. In this case the xmin of the session 2 is the same with the
xid of the session 1. So xmin of the session 3 at G is the same with
the xmin of the session 2, which is the same with the xid of the
session 1.  So pruning doesn't happen. However that is very fragile as
the base for asserting that pruning won't happen.

Anyway, it seems actually dangerous that cause pruning on wal-skipped
relation.

> with your patch versions.  Could you try implementing both test procedures in
> src/test/modules/snapshot_too_old?  There's no need to make the test use
> wal_level=minimal explicitly; it's sufficient to catch these bugs when
> wal_level=minimal is in the TEMP_CONFIG file.

In the attached, TestForOldSnapshot() considers XLogIsNeeded(),
instead of moving the condition on LSN to TestForOldSnapshot_impl for
performance.

I'll add the test part in the next version.

regards.
-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 5976674e48fdce3c6e911f0ae63485d92941bfd8 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Mon, 18 Jan 2021 14:47:21 +0900
Subject: [PATCH v4] Do not use RelationNeedsWAL to identify relation
 persistence

RelationNeedsWAL() may return false for a permanent relation when
wal_level=minimal and the relation is created or truncated in the
current transaction. Directly examine relpersistence instead of using
the function to know relation persistence.
---
 src/backend/catalog/pg_publication.c | 2 +-
 src/backend/optimizer/util/plancat.c | 3 ++-
 src/include/storage/bufmgr.h         | 8 +++++++-
 src/include/utils/rel.h              | 2 +-
 src/include/utils/snapmgr.h          | 2 +-
 5 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 5f8e1c64e1..84d2efcfd2 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -67,7 +67,7 @@ check_publication_add_relation(Relation targetrel)
                  errdetail("System tables cannot be added to publications.")));
 
     /* UNLOGGED and TEMP relations cannot be part of publication. */
-    if (!RelationNeedsWAL(targetrel))
+    if (targetrel->rd_rel->relpersistence != RELPERSISTENCE_PERMANENT)
         ereport(ERROR,
                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("table \"%s\" cannot be replicated",
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index da322b453e..177e6e336a 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -126,7 +126,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
     relation = table_open(relationObjectId, NoLock);
 
     /* Temporary and unlogged relations are inaccessible during recovery. */
-    if (!RelationNeedsWAL(relation) && RecoveryInProgress())
+    if (relation->rd_rel->relpersistence != RELPERSISTENCE_PERMANENT &&
+        RecoveryInProgress())
         ereport(ERROR,
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                  errmsg("cannot access temporary or unlogged relations during recovery")));
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index fb00fda6a7..e641174798 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -14,6 +14,7 @@
 #ifndef BUFMGR_H
 #define BUFMGR_H
 
+#include "access/xlog.h"
 #include "storage/block.h"
 #include "storage/buf.h"
 #include "storage/bufpage.h"
@@ -278,12 +279,17 @@ TestForOldSnapshot(Snapshot snapshot, Relation relation, Page page)
 {
     Assert(relation != NULL);
 
+    /*
+     * While wal_level=minimal, early pruning can happen without updating page
+     * LSN. Don't take the fast-return path while wal_level=minimal so that we
+     * don't miss early pruning.
+     */
     if (old_snapshot_threshold >= 0
         && (snapshot) != NULL
         && ((snapshot)->snapshot_type == SNAPSHOT_MVCC
             || (snapshot)->snapshot_type == SNAPSHOT_TOAST)
         && !XLogRecPtrIsInvalid((snapshot)->lsn)
-        && PageGetLSN(page) > (snapshot)->lsn)
+        && (!XLogIsNeeded() || PageGetLSN(page) > (snapshot)->lsn))
         TestForOldSnapshot_impl(snapshot, relation);
 }
 
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 10b63982c0..f58d65cf28 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -554,7 +554,7 @@ typedef struct ViewOptions
 
 /*
  * RelationNeedsWAL
- *        True if relation needs WAL.
+ *        True if relation needs WAL at the time.
  *
  * Returns false if wal_level = minimal and this relation is created or
  * truncated in the current transaction.  See "Skipping WAL for New
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index 579be352c5..c21ee3c289 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -37,7 +37,7 @@
  */
 #define RelationAllowsEarlyPruning(rel) \
 ( \
-     RelationNeedsWAL(rel) \
+     (rel)->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT    \
   && !IsCatalogRelation(rel) \
   && !RelationIsAccessibleInLogicalDecoding(rel) \
 )
-- 
2.27.0


pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [bug?] EXPLAIN outputs 0 for rows and width in cost estimate for update nodes
Next
From: "tsunakawa.takay@fujitsu.com"
Date:
Subject: RE: [bug?] EXPLAIN outputs 0 for rows and width in cost estimate for update nodes