Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation
Date
Msg-id 20200908.091353.1505164218636547516.horikyota.ntt@gmail.com
In response to Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation  (Noah Misch <noah@leadboat.com>)
Responses Re: v13: CLUSTER segv with wal_level=minimal and parallel index creation
List pgsql-hackers
At Mon, 7 Sep 2020 02:32:55 -0700, Noah Misch <noah@leadboat.com> wrote in 
> On Mon, Sep 07, 2020 at 05:40:36PM +0900, Kyotaro Horiguchi wrote:
> > At Mon, 07 Sep 2020 13:45:28 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> > > The cause is that the worker had received the pending-sync entry
> > > correctly but never created a relcache entry for the relation using
> > > RelationBuildDesc. So rd_firstRelfilenodeSubid is not correctly set.
> > > 
> > > I'm investigating it.
> > 
> > Relcache entries are loaded from a file with stale content at parallel
> > worker startup. The relcache entry is corrected by invalidation when a
> > lock is taken, but pending syncs are not considered.
> >
> > Since parallel workers don't access the files, we could safely just
> > ignore the assertion, but I want to set the rd_firstRelfilenodeSubid
> > flag at invalidation, as in the attached PoC patch.
> 
> > [patch: When RelationInitPhysicalAddr() handles a mapped relation, re-fill
> > rd_firstRelfilenodeSubid from RelFileNodeSkippingWAL(), like
> > RelationBuildDesc() would do.]
> 
> As a PoC, this looks promising.  Thanks.  Would you add a test case such that
> the following demonstrates the bug in the absence of your PoC?
> 
>   printf '%s\n%s\n%s\n' 'log_statement = all' 'wal_level = minimal' 'max_wal_senders = 0' >/tmp/minimal.conf
>   make check TEMP_CONFIG=/tmp/minimal.conf

Mmm. I was close to adding some tests to 018_wal_optimize.pl, but your
suggestion seems better.  I added several lines to create_index.sql.
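For reference, the added lines boil down to the following sketch (the
exact hunks are in the attached patch; it assumes the wal_level = minimal
TEMP_CONFIG above and an assert-enabled build):

  -- Force parallel index builds even for small catalog tables, so a
  -- worker rebuilds the relcache entry after the leader assigned a new
  -- relfilenode that is registered for pending sync.
  RESET ROLE;
  SET min_parallel_table_scan_size = 0;
  REINDEX INDEX pg_attribute_relid_attnum_index;  -- nailed and mapped
  REINDEX INDEX pg_amproc_fam_proc_index;         -- nailed but not mapped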

> Please have the test try both a nailed-and-mapped relation and a "nailed, but
> not mapped" relation.  I am fairly confident that your PoC fixes the former
> case, but the latter may need additional code.

Mmm. You're right. I chose pg_amproc_fam_proc_index as the
nailed-but-not-mapped index.
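(Whether an index is mapped can be double-checked from pg_class, since
mapped relations keep 0 in relfilenode and resolve the real file through
the relation mapper. A quick sanity check, not part of the patch:

  SELECT relname, relfilenode = 0 AS is_mapped
    FROM pg_class
   WHERE relname IN ('pg_attribute_relid_attnum_index',
                     'pg_amproc_fam_proc_index');

This should report pg_attribute_relid_attnum_index as mapped and
pg_amproc_fam_proc_index as not mapped.)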

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From c4d53acd64011f1ed5652177923d010f4c50fa89 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Mon, 7 Sep 2020 20:30:30 +0900
Subject: [PATCH] Fix assertion failure during reindex while wal_level=minimal.

When wal_level=minimal, parallel workers fail to set
rd_firstRelfilenodeSubid correctly for nailed relations. Fix it.
---
 src/backend/utils/cache/relcache.c         | 16 ++++++++++++++++
 src/test/regress/expected/create_index.out |  5 +++++
 src/test/regress/sql/create_index.sql      |  6 ++++++
 3 files changed, 27 insertions(+)

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 96ecad02dd..cd6c8bbc70 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1273,6 +1273,8 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
 static void
 RelationInitPhysicalAddr(Relation relation)
 {
+    Oid oldnode = relation->rd_node.relNode;
+
     /* these relations kinds never have storage */
     if (!RELKIND_HAS_STORAGE(relation->rd_rel->relkind))
         return;
@@ -1330,6 +1332,20 @@ RelationInitPhysicalAddr(Relation relation)
             elog(ERROR, "could not find relation mapping for relation \"%s\", OID %u",
                  RelationGetRelationName(relation), relation->rd_id);
     }
+
+    /*
+     * Our leader process might have replaced storage before starting this
+     * worker. In that case we need to set rd_firstRelfilenodeSubid according
+     * to the pending-sync hash. Currently the information is not actually
+     * used in worker processes, but set it anyway to keep things tidy.
+     */
+    if (IsParallelWorker() && oldnode != relation->rd_node.relNode)
+    {
+        if (RelFileNodeSkippingWAL(relation->rd_node))
+            relation->rd_firstRelfilenodeSubid = TopSubTransactionId;
+        else
+            relation->rd_firstRelfilenodeSubid = InvalidSubTransactionId;
+    }
 }
 
 /*
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 814416d936..6468418639 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2598,6 +2598,11 @@ REINDEX TABLE pg_toast.pg_toast_1260;
 ERROR:  must be owner of table pg_toast_1260
 REINDEX INDEX pg_toast.pg_toast_1260_index;
 ERROR:  must be owner of index pg_toast_1260_index
+-- parallel reindex
+RESET ROLE;
+SET min_parallel_table_scan_size = 0;
+REINDEX INDEX pg_attribute_relid_attnum_index;    -- nailed and mapped
+REINDEX INDEX pg_amproc_fam_proc_index;         -- nailed but not mapped
 -- Clean up
 RESET ROLE;
 REVOKE USAGE ON SCHEMA pg_toast FROM regress_reindexuser;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index f3667bacdc..d9cb3f14f6 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1115,6 +1115,12 @@ SET SESSION ROLE regress_reindexuser;
 REINDEX TABLE pg_toast.pg_toast_1260;
 REINDEX INDEX pg_toast.pg_toast_1260_index;
 
+-- parallel reindex
+RESET ROLE;
+SET min_parallel_table_scan_size = 0;
+REINDEX INDEX pg_attribute_relid_attnum_index;    -- nailed and mapped
+REINDEX INDEX pg_amproc_fam_proc_index;         -- nailed but not mapped
+
 -- Clean up
 RESET ROLE;
 REVOKE USAGE ON SCHEMA pg_toast FROM regress_reindexuser;
-- 
2.18.4

